
Commit e9cfd78

Align fsdp_sft_trainer warmup_steps_ratio->lr_warmup_steps_ratio
1 parent f4c7285 commit e9cfd78

File tree

3 files changed: +8 -4 lines changed


docs/examples/config.rst

Lines changed: 2 additions & 2 deletions

@@ -618,7 +618,7 @@ Optim
 optimizer_impl: torch.optim
 lr: 1e-5
 weight_decay: 0.01
-warmup_steps_ratio: 0.1
+lr_warmup_steps_ratio: 0.1
 clip_grad: 1.0
 lr_scheduler: cosine
 override_optimizer_config: null
@@ -627,7 +627,7 @@ Optim
 - ``optimizer_impl``: Module path to import optimizer from (e.g., ``"torch.optim"``, ``"torchao.optim"``, ``"bitsandbytes.optim"``).
 - ``optim.lr``: Learning rate for the optimizer.
 - ``optim.weight_decay``: Weight decay for the optimizer.
-- ``optim.warmup_steps_ratio``: Ratio of warmup steps to total training steps.
+- ``optim.lr_warmup_steps_ratio``: Ratio of warmup steps to total training steps.
 - ``optim.clip_grad``: Gradient clipping value.
 - ``optim.lr_scheduler``: Learning rate scheduler type. Options:

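For context on the documented options above: optimizer_impl is an import path, while lr, weight_decay, and clip_grad are plain optimizer settings. Below is a minimal, illustrative sketch of how fields like these could drive optimizer construction; the OmegaConf usage, the AdamW choice, and the placeholder model are assumptions for the example, not verl's actual code.

import importlib

import torch
from omegaconf import OmegaConf

# Illustrative config mirroring the documented fields (values taken from the docs above).
cfg = OmegaConf.create(
    {
        "optimizer_impl": "torch.optim",  # module to import the optimizer class from
        "lr": 1e-5,
        "weight_decay": 0.01,
        "lr_warmup_steps_ratio": 0.1,     # key renamed by this commit
        "clip_grad": 1.0,
    }
)

model = torch.nn.Linear(8, 8)  # placeholder model for the sketch

# Resolve the optimizer class from the module path; AdamW is an assumed choice here.
optim_module = importlib.import_module(cfg.optimizer_impl)
optimizer = optim_module.AdamW(model.parameters(), lr=cfg.lr, weight_decay=cfg.weight_decay)

# clip_grad bounds the gradient norm before each optimizer step.
loss = model(torch.randn(2, 8)).sum()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=cfg.clip_grad)
optimizer.step()
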
verl/trainer/config/sft_trainer.yaml

Lines changed: 5 additions & 1 deletion

@@ -1,3 +1,7 @@
+defaults:
+  - optim: fsdp
+  - _self_
+
 data:
   train_batch_size: 256
   micro_batch_size: null # will be deprecated, use micro_batch_size_per_gpu
@@ -45,7 +49,7 @@ optim:
   lr: 1e-5
   betas: [0.9, 0.95]
   weight_decay: 0.01
-  warmup_steps_ratio: 0.1
+  lr_warmup_steps_ratio: 0.1
   clip_grad: 1.0
   lr_scheduler: cosine
   ulysses_sequence_parallel_size: 1

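The new defaults list makes Hydra compose an optim config group (fsdp) before the file's own values (_self_), so the optim block written in this file overrides the group defaults. A hedged sketch of inspecting that composition with Hydra's compose API; the config directory path is a placeholder, and programmatic composition like this is just one way to look at the merged result.

from hydra import compose, initialize_config_dir

# Placeholder path: point this at verl/trainer/config in a local checkout.
CONFIG_DIR = "/abs/path/to/verl/trainer/config"

with initialize_config_dir(config_dir=CONFIG_DIR, version_base=None):
    cfg = compose(config_name="sft_trainer")

# With _self_ last in the defaults list, values written in sft_trainer.yaml
# win over the "optim: fsdp" group defaults, so this should print the
# renamed key's value, 0.1.
print(cfg.optim.lr_warmup_steps_ratio)
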
verl/trainer/fsdp_sft_trainer.py

Lines changed: 1 addition & 1 deletion

@@ -331,7 +331,7 @@ def _build_model_optimizer(self):
             f"{self.config.trainer.total_epochs}, total number of steps {self.total_steps}"
         )

-        num_warmup_steps = int(self.total_steps * self.config.optim.warmup_steps_ratio)
+        num_warmup_steps = int(self.total_steps * self.config.optim.lr_warmup_steps_ratio)

         if not hasattr(self.config.optim, "lr_scheduler") or self.config.optim.lr_scheduler == "cosine":
             self.lr_scheduler = get_cosine_schedule_with_warmup(

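The changed line only converts the ratio into an absolute warmup step count before the scheduler is built. A standalone sketch of the same arithmetic follows; it uses the get_cosine_schedule_with_warmup helper from transformers (whether verl imports it from there or from its own utilities is not shown in this diff), and the concrete step count, model, and learning rate are made-up values for illustration.

import torch
from transformers import get_cosine_schedule_with_warmup

total_steps = 1000                   # assumed total number of training steps
lr_warmup_steps_ratio = 0.1          # value from the config above
num_warmup_steps = int(total_steps * lr_warmup_steps_ratio)  # -> 100

model = torch.nn.Linear(4, 4)        # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)

# Cosine schedule: LR rises linearly for the first 100 steps, then decays.
lr_scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=total_steps,
)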