Tip
This page is for reproducibility. For results where methods have been tuned for optimal performance, please refer to the community/leaderboard.
The scripts below execute standard baseline unlearning experiments on the TOFU and MUSE datasets, evaluated using their corresponding benchmarks.
bash scripts/tofu_unlearn.sh
bash scripts/muse_unlearn.sh
For all the experiments below, we used the following setup:
Category | Details |
---|---|
Hardware | 2 × L40s GPUs (48GB each) |
Distributed Computing | DeepSpeed ZeRO Stage 3 (Accelerate) |
Hyperparameters | Learning rate (lr) = 1e-5; α = 1, γ = 1, β = 0.1 (where applicable); effective batch size = 32 (8 per device, 4 gradient accumulation steps); number of epochs = 10; optimizer: paged_adamw_32bit |
Note
- The results in the next section show only a subset of the important metrics for each benchmark. For examples of the other available evaluation metrics, see the `muse*/*_SUMMARY.json` and `tofu*/evals*/*_SUMMARY.json` files on the HuggingFace space.
- Results may vary even with the same effective hyperparameters when the distributed training setup is modified, including when training on a single GPU. For example, methods such as SimNPO and RMU can be significantly improved with careful tuning. Please use the numbers below only for reproducibility purposes.
- NPO inconsistency: for NPO, the MUSE implementation is inconsistent with the original paper as discussed here. This inconsistency is carried over into implementations like SimNPO. Here, we use the original NPO implementation with the same loss function expression across datasets.
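
To browse the metrics that the tables on this page leave out, the summary files mentioned in the note can be loaded with a few lines of Python. This is a minimal sketch, not part of the repository: it assumes the files have already been downloaded to a local directory, and that each `*_SUMMARY.json` file is a flat JSON object mapping metric names to numbers. The function names `load_summaries` and `print_summaries` are hypothetical.

```python
import glob
import json
import os


def load_summaries(root, pattern="**/*_SUMMARY.json"):
    """Return {relative_path: metrics} for every summary file found under root."""
    summaries = {}
    for path in sorted(glob.glob(os.path.join(root, pattern), recursive=True)):
        with open(path, encoding="utf-8") as fh:
            # Assumption: each file holds a flat object of metric name -> value.
            summaries[os.path.relpath(path, root)] = json.load(fh)
    return summaries


def print_summaries(summaries):
    """Print every metric of every loaded summary, grouped by file."""
    for rel_path, metrics in summaries.items():
        print(rel_path)
        for name, value in sorted(metrics.items()):
            print(f"  {name}: {value}")


# Usage (the directory name is a placeholder for wherever the files were saved):
# print_summaries(load_summaries("downloaded_results"))
```

The recursive pattern matches summary files at any depth, so the same call covers both the `muse*` and `tofu*/evals*` layouts.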

Method | forget01: forget_quality | forget01: model_utility | forget01: forget_truth_ratio | forget05: forget_quality | forget05: model_utility | forget05: forget_truth_ratio | forget10: forget_quality | forget10: model_utility | forget10: forget_truth_ratio |
---|---|---|---|---|---|---|---|---|---|
Finetuned | 1.27e-03 | 0.63 | 0.53 | 5.87e-14 | 0.63 | 0.51 | 4.35e-25 | 0.63 | 0.52 |
Retain | 1.0 | 0.63 | 0.68 | 1.0 | 0.63 | 0.67 | 1.0 | 0.61 | 0.68 |
GradAscent | 1.88e-04 | 0.55 | 0.36 | 1.94e-119 | 0.00e+00 | 8.82e-96 | 1.06e-239 | 0.00e+00 | 2.21e-32 |
GradDiff | 3.02e-03 | 0.57 | 0.41 | 1.94e-119 | 0.56 | 4.14e-95 | 1.80e-229 | 0.58 | 1.46e-07 |
IdkDPO | 0.1 | 0.56 | 0.67 | 4.02e-06 | 0.04 | 0.67 | 5.42e-13 | 0.04 | 0.64 |
NPO | 0.4 | 0.58 | 0.65 | 0.09 | 0.53 | 0.71 | 0.42 | 0.54 | 0.73 |
SimNPO | 1.27e-03 | 0.58 | 0.41 | 1.06e-106 | 0.6 | 3.94e-05 | 1.47e-198 | 0.6 | 3.17e-04 |
RMU | 0.4 | 0.62 | 0.64 | 9.59e-10 | 0.02 | 0.81 | 6.92e-21 | 0.03 | 0.81 |

Method | forget01: forget_quality | forget01: model_utility | forget01: forget_truth_ratio | forget05: forget_quality | forget05: model_utility | forget05: forget_truth_ratio | forget10: forget_quality | forget10: model_utility | forget10: forget_truth_ratio |
---|---|---|---|---|---|---|---|---|---|
Finetuned | 0.01 | 0.6 | 0.47 | 1.33e-13 | 0.6 | 0.47 | 1.66e-21 | 0.6 | 0.48 |
Retain | 1.0 | 0.60 | 0.65 | 1.0 | 0.6 | 0.64 | 1.0 | 0.59 | 0.63 |
GradAscent | 0.27 | 0.33 | 0.59 | 1.94e-119 | 0 | 2.52e-23 | 1.06e-239 | 0 | 2.25e-18 |
GradDiff | 0.77 | 0.43 | 0.57 | 1.94e-119 | 0.53 | 3.87e-34 | 1.06e-239 | 0.49 | 3.53e-27 |
IdkDPO | 0.01 | 0.51 | 0.60 | 1.12e-05 | 0.07 | 0.62 | 4.64e-12 | 0.23 | 0.6 |
NPO | 0.92 | 0.56 | 0.66 | 0.14 | 0.45 | 0.7 | 0.02 | 0.46 | 0.7 |
SimNPO | 0.58 | 0.46 | 0.55 | 5.01e-100 | 0.58 | 4.19e-03 | 2.47e-203 | 0.54 | 1.07e-05 |
RMU | 0.16 | 0.55 | 0.70 | 4.87e-10 | 0.58 | 0.77 | 3.15e-15 | 0.59 | 0.76 |

Method | News: forget_knowmem_ROUGE | News: forget_verbmem_ROUGE | News: privleak | News: retain_knowmem_ROUGE | Books: forget_knowmem_ROUGE | Books: forget_verbmem_ROUGE | Books: privleak | Books: retain_knowmem_ROUGE |
---|---|---|---|---|---|---|---|---|
Finetuned | 0.64 | 0.58 | -99.81 | 0.56 | 0.47 | 1.0 | -57.26 | 0.69 |
Retain | 0.33 | 0.20 | 0 | 0.56 | 0.3 | 0.14 | 0 | 0.69 |
GradAscent | 0 | 0 | 52.11 | 0 | 0 | 0 | -0.67 | 0 |
GradDiff | 0.41 | 8.92e-03 | 93.23 | 0.37 | 0.18 | 0.16 | -37.79 | 0.3 |
NPO | 0.56 | 0.35 | -86.00 | 0.51 | 0.32 | 0.84 | -54.24 | 0.55 |
SimNPO | 0.54 | 0.36 | -86.11 | 0.51 | 0.32 | 0.84 | -54.26 | 0.54 |
RMU | 0.48 | 0.05 | 56.36 | 0.51 | 0.29 | 0.79 | -60.52 | 0.48 |
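
Because results can vary with the training setup, a reproduction run is unlikely to match the tables digit for digit. The sketch below shows one possible way to compare a run against the reported numbers with a tolerance. It is an illustration, not a project tool: the helper names and all tolerance values are assumptions chosen for the example. Very small positive values (several `forget_quality` entries are below 1e-100) are compared by order of magnitude, since a relative tolerance is not meaningful for them.

```python
import math


def close_enough(reproduced, reference, rel_tol=0.1, log10_tol=1.0, tiny=1e-3):
    """Heuristic check that a reproduced metric is close to the reported one."""
    # Assumption: when either value is a very small positive number,
    # agreeing within about one order of magnitude counts as a match.
    if reproduced > 0 and reference > 0 and min(reproduced, reference) < tiny:
        return abs(math.log10(reproduced) - math.log10(reference)) <= log10_tol
    # Otherwise use a relative tolerance, with a small absolute floor for zeros.
    return math.isclose(reproduced, reference, rel_tol=rel_tol, abs_tol=0.02)


def compare(reproduced, reference, **tols):
    """Return {metric: (reproduced, reference)} for metrics that do not match."""
    mismatches = {}
    for name, ref_value in reference.items():
        value = reproduced.get(name)
        if value is None or not close_enough(value, ref_value, **tols):
            mismatches[name] = (value, ref_value)
    return mismatches


# Reported values: NPO on forget10, from the first TOFU table above.
reference = {"forget_quality": 0.42, "model_utility": 0.54, "forget_truth_ratio": 0.73}
# Hypothetical numbers standing in for a local run; not real results.
reproduced = {"forget_quality": 0.40, "model_utility": 0.55, "forget_truth_ratio": 0.60}

print(compare(reproduced, reference))
```

With these made-up inputs only `forget_truth_ratio` is flagged. The thresholds should be adjusted to how much run-to-run variation is acceptable for a given setup.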