
Commit 7f27789

bxyangweiqi.li and weiqi.li authored
[fsdp,doc] refactor: rename warmup_style@FSDPOptimizerConfig -> lr_scheduler_type (#3739)
### What does this PR do?

Rename `warmup_style` in `FSDPOptimizerConfig` to `lr_scheduler_type` to align with the Hugging Face Trainer API. #3656 refactors the optimizer, but the naming issue persists there.

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (this will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,`, like `[megatron, fsdp, doc]`
  - `{type}` is one of `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results such as training curve plots, evaluation results, etc.

### API and Usage Example

> Demonstrate how the API changes, if any, and provide usage example(s) if possible.

```python
# Add code snippet or script demonstrating how to use this
```

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [x] Add / update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

---

Co-authored-by: weiqi.li <[email protected]>
1 parent e9ee6b3 commit 7f27789
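For context, a minimal sketch of the Hugging Face Trainer argument whose naming this commit adopts; `output_dir` and the values below are placeholders, not part of this PR:

```python
# Sketch only: the Hugging Face Trainer API that FSDPOptimizerConfig
# now mirrors by renaming warmup_style -> lr_scheduler_type.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",            # placeholder directory
    lr_scheduler_type="cosine",  # verl's constant/cosine values are a subset of HF's choices
    warmup_ratio=0.1,            # roughly analogous to verl's lr_warmup_steps_ratio
)
```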

File tree

16 files changed (+64, -38 lines)


docs/examples/config.rst

Lines changed: 2 additions & 2 deletions

@@ -132,7 +132,7 @@ Actor/Rollout/Reference Policy
     lr_warmup_steps_ratio: 0. # the total steps will be injected during runtime
     min_lr_ratio: 0.0 # only used with cosine lr scheduler, default to 0.0
     num_cycles: 0.5 # only used with cosine lr scheduler, default to 0.5
-    warmup_style: constant # select from constant/cosine
+    lr_scheduler_type: constant # select from constant/cosine
     total_training_steps: -1 # must be override by program
     fsdp_config:
       wrap_policy:

@@ -415,7 +415,7 @@ ____________________________________________________

 Notice that there are some differences in APIs between Megatron optimizer and FSDP optimizer.

-- Megatron optimizer scheduler names the period after lr_warmup as lr_decay_steps, so the ``warmup_style`` actually means the style of lr decay after warmup.
+- Megatron optimizer scheduler names the period after lr_warmup as lr_decay_steps, so the ``lr_scheduler_type`` actually means the style of lr decay after warmup.
 - Megatron optimizer also support weight decay decay mechanism
 - ``use_checkpoint_opt_param_scheduler`` determines whether to use the checkpoint optimizer parameter scheduler. If set to True, the optimizer parameter scheduler will be saved in the checkpoint and loaded from the checkpoint during resuming training.
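The cosine option documented above combines `min_lr_ratio` and `num_cycles`. As an illustrative sketch (not verl's exact implementation), the learning-rate multiplier it describes can be computed like this:

```python
import math

def cosine_lr_multiplier(step: int, warmup_steps: int, total_steps: int,
                         min_lr_ratio: float = 0.0, num_cycles: float = 0.5) -> float:
    """Illustrative lr_scheduler_type=cosine: linear warmup, then cosine
    decay from 1.0 down to min_lr_ratio over the remaining steps."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # linear warmup from 0 to 1
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress))
    return min_lr_ratio + (1.0 - min_lr_ratio) * cosine

# With the defaults shown in the hunk above (num_cycles=0.5), the multiplier
# falls monotonically from 1.0 to min_lr_ratio at total_steps.
```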

examples/split_placement/config/ppo_trainer_split.yaml

Lines changed: 2 additions & 2 deletions

@@ -51,7 +51,7 @@ actor_rollout_ref:
     lr_warmup_steps: -1 # Prioritized. Negative values mean delegating to lr_warmup_steps_ratio.
     lr_warmup_steps_ratio: 0. # the total steps will be injected during runtime
     min_lr_ratio: null # only useful for warmup with cosine
-    warmup_style: constant # select from constant/cosine
+    lr_scheduler_type: constant # select from constant/cosine
     total_training_steps: -1 # must be override by program
     fsdp_config:
       wrap_policy:

@@ -105,7 +105,7 @@ critic:
     lr: 1e-5
     lr_warmup_steps_ratio: 0. # the total steps will be injected during runtime
     min_lr_ratio: null # only useful for warmup with cosine
-    warmup_style: constant # select from constant/cosine
+    lr_scheduler_type: constant # select from constant/cosine
     total_training_steps: -1 # must be override by program
     model:
       path: ~/models/deepseek-llm-7b-chat

recipe/entropy/32b_clip_cov.sh

Lines changed: 1 addition & 1 deletion

@@ -103,7 +103,7 @@ HYDRA_FULL_ERROR=1 python -m recipe.entropy.main_entropy \
     actor_rollout_ref.model.enable_gradient_checkpointing=True \
     actor_rollout_ref.actor.optim.lr=1e-6 \
     actor_rollout_ref.actor.optim.weight_decay=0 \
-    actor_rollout_ref.actor.optim.warmup_style=constant \
+    actor_rollout_ref.actor.optim.lr_scheduler_type=constant \
     actor_rollout_ref.actor.ppo_mini_batch_size=${train_prompt_mini_bsz} \
     actor_rollout_ref.actor.ppo_micro_batch_size=${train_micro_batch_size} \
     actor_rollout_ref.actor.fsdp_config.param_offload=${offload} \

recipe/entropy/32b_kl_cov.sh

Lines changed: 1 addition & 1 deletion

@@ -100,7 +100,7 @@ HYDRA_FULL_ERROR=1 python -m recipe.entropy.main_entropy \
     actor_rollout_ref.model.enable_gradient_checkpointing=True \
     actor_rollout_ref.actor.optim.lr=1e-6 \
     actor_rollout_ref.actor.optim.weight_decay=0 \
-    actor_rollout_ref.actor.optim.warmup_style=constant \
+    actor_rollout_ref.actor.optim.lr_scheduler_type=constant \
     actor_rollout_ref.actor.ppo_mini_batch_size=${train_prompt_mini_bsz} \
     actor_rollout_ref.actor.ppo_micro_batch_size=${train_micro_batch_size} \
     actor_rollout_ref.actor.fsdp_config.param_offload=${offload} \

recipe/entropy/32b_kl_cov_mininbsz.sh

Lines changed: 1 addition & 1 deletion

@@ -99,7 +99,7 @@ HYDRA_FULL_ERROR=1 python -m recipe.entropy.main_entropy \
     actor_rollout_ref.model.enable_gradient_checkpointing=True \
     actor_rollout_ref.actor.optim.lr=1e-6 \
     actor_rollout_ref.actor.optim.weight_decay=0 \
-    actor_rollout_ref.actor.optim.warmup_style=constant \
+    actor_rollout_ref.actor.optim.lr_scheduler_type=constant \
     actor_rollout_ref.actor.ppo_mini_batch_size=${train_prompt_mini_bsz} \
     actor_rollout_ref.actor.ppo_micro_batch_size=${train_micro_batch_size} \
     actor_rollout_ref.actor.fsdp_config.param_offload=${offload} \

recipe/entropy/7b_clip_cov.sh

Lines changed: 1 addition & 1 deletion

@@ -103,7 +103,7 @@ HYDRA_FULL_ERROR=1 python -m recipe.entropy.main_entropy \
     actor_rollout_ref.model.enable_gradient_checkpointing=True \
     actor_rollout_ref.actor.optim.lr=1e-6 \
     actor_rollout_ref.actor.optim.weight_decay=0 \
-    actor_rollout_ref.actor.optim.warmup_style=constant \
+    actor_rollout_ref.actor.optim.lr_scheduler_type=constant \
     actor_rollout_ref.actor.ppo_mini_batch_size=${train_prompt_mini_bsz} \
     actor_rollout_ref.actor.ppo_micro_batch_size=${train_micro_batch_size} \
     actor_rollout_ref.actor.fsdp_config.param_offload=${offload} \

recipe/entropy/7b_kl_cov.sh

Lines changed: 1 addition & 1 deletion

@@ -99,7 +99,7 @@ HYDRA_FULL_ERROR=1 python -m recipe.entropy.main_entropy \
     actor_rollout_ref.model.enable_gradient_checkpointing=True \
     actor_rollout_ref.actor.optim.lr=1e-6 \
     actor_rollout_ref.actor.optim.weight_decay=0 \
-    actor_rollout_ref.actor.optim.warmup_style=constant \
+    actor_rollout_ref.actor.optim.lr_scheduler_type=constant \
     actor_rollout_ref.actor.ppo_mini_batch_size=${train_prompt_mini_bsz} \
     actor_rollout_ref.actor.ppo_micro_batch_size=${train_micro_batch_size} \
     actor_rollout_ref.actor.fsdp_config.param_offload=${offload} \
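All five entropy recipes pass the renamed key as a Hydra command-line override. A rough programmatic equivalent, sketched under the assumption of a local `conf` directory and a hypothetical `entropy_trainer` config name:

```python
# Hypothetical sketch of the recipes' CLI override done via Hydra's compose API.
from hydra import compose, initialize

with initialize(version_base=None, config_path="conf"):  # assumed config dir
    cfg = compose(
        config_name="entropy_trainer",  # hypothetical config name
        overrides=["actor_rollout_ref.actor.optim.lr_scheduler_type=constant"],
    )
    print(cfg.actor_rollout_ref.actor.optim.lr_scheduler_type)
```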

recipe/prime/config/prime_trainer.yaml

Lines changed: 2 additions & 1 deletion

@@ -48,7 +48,8 @@ reward_model:
     lr_warmup_steps: -1 # Prioritized. Negative values mean delegating to lr_warmup_steps_ratio.
     lr_warmup_steps_ratio: 0. # the total steps will be injected during runtime
     min_lr_ratio: null
-    warmup_style: constant
+    warmup_style: null # deprecated
+    lr_scheduler_type: constant
     total_training_steps: -1 # must be overridden by program
     weight_decay: 0.
     grad_clip: 10.0
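Note that this file keeps a `warmup_style: null` placeholder rather than deleting the key outright. A hypothetical shim for tolerating the deprecated key (the function and dict layout here are illustrative, not verl's actual code) might look like:

```python
import warnings

def resolve_lr_scheduler_type(optim_cfg: dict) -> str:
    """Hypothetical shim: prefer lr_scheduler_type, but fall back to the
    deprecated warmup_style key with a warning if it is still set."""
    deprecated = optim_cfg.get("warmup_style")
    if deprecated is not None:
        warnings.warn(
            "optim.warmup_style is deprecated; use optim.lr_scheduler_type instead.",
            DeprecationWarning,
        )
        return deprecated
    return optim_cfg.get("lr_scheduler_type", "constant")

# With the YAML above, warmup_style is null, so no warning fires:
assert resolve_lr_scheduler_type(
    {"warmup_style": None, "lr_scheduler_type": "constant"}
) == "constant"
```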

tests/special_e2e/sft/run_sft_engine_gsm8k.sh

Lines changed: 1 addition & 1 deletion

@@ -42,7 +42,7 @@ FSDP_ENGINE_CONFIG="\
     optim.betas="[0.9,0.95]" \
     optim.clip_grad=1.0 \
     optim.min_lr_ratio=0.1 \
-    optim.warmup_style=cosine \
+    optim.lr_scheduler_type=cosine \
     engine.ulysses_sequence_parallel_size=${SP_SIZE} \
     engine.strategy=${FSDP_STRATEGY} \
     engine.fsdp_size=${FSDP_SIZE}"

tests/trainer/config/legacy_ppo_trainer.yaml

Lines changed: 4 additions & 4 deletions

@@ -301,8 +301,8 @@ actor_rollout_ref:
     # Number of cosine cycles in LR schedule
     num_cycles: 0.5

-    # LR warmup style: "constant" or "cosine"
-    warmup_style: constant
+    # LR scheduler type: "constant" or "cosine"
+    lr_scheduler_type: constant

     # Total training steps (must be overridden at runtime)
     total_training_steps: -1

@@ -605,8 +605,8 @@ critic:
     # Minimum LR ratio for cosine schedule
     min_lr_ratio: 0.0

-    # LR warmup style: "constant" or "cosine"
-    warmup_style: constant
+    # LR scheduler type: "constant" or "cosine"
+    lr_scheduler_type: constant

     # Total training steps (must be overridden at runtime)
     total_training_steps: -1
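A minimal sketch of how a downstream override migrates, assuming an OmegaConf-style config as verl uses with Hydra (the fragment below only mirrors the keys shown in the hunks above):

```python
from omegaconf import OmegaConf

# Config fragment mirroring legacy_ppo_trainer.yaml after this commit.
cfg = OmegaConf.create({"optim": {"lr_scheduler_type": "constant", "num_cycles": 0.5}})

# Old override key (now invalid): optim.warmup_style=cosine
# New override key:
OmegaConf.update(cfg, "optim.lr_scheduler_type", "cosine")
assert cfg.optim.lr_scheduler_type == "cosine"
```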
