@dalgarak dalgarak commented Oct 14, 2025

This pull request fixes a missing condition in the FP8 delayed scaling check related to set_save_original_input().

When FP8 delayed scaling is enabled (--fp8-recipe 'delayed'), set_save_original_input() should not be called. The condition that guards against this call was accidentally omitted in commit 08814e8 (ADLR/megatron-lm!4030 - perf(MoE): Support recomputation for FP8 layernorm/moe_act/shared_experts).

This PR adds the missing condition to restore the correct behavior. It fixes the "AssertionError: DelayedScaling recipe is not supported with save_original_input" error seen in the released core_v0.14.0 version.
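
For readers unfamiliar with the failure mode, here is a minimal, self-contained sketch of the kind of guard described above. It is not the actual Megatron-LM code: Config, Module, and maybe_save_original_input are hypothetical names, the set_save_original_input stub only imitates the assertion quoted in this description, and the "hybrid" and "tensorwise" values are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    # Hypothetical stand-in for the training config; field names are assumptions.
    fp8: Optional[str] = None      # None means FP8 is disabled
    fp8_recipe: str = "delayed"    # mirrors the --fp8-recipe flag
    recompute: bool = True         # whether activation recomputation is requested


@dataclass
class Module:
    # Hypothetical module that records whether the original input is saved.
    save_original_input: bool = False


def set_save_original_input(module: Module, config: Config) -> None:
    # Stub that imitates the assertion message quoted in the PR description.
    assert not (config.fp8 and config.fp8_recipe == "delayed"), (
        "DelayedScaling recipe is not supported with save_original_input"
    )
    module.save_original_input = True


def maybe_save_original_input(module: Module, config: Config) -> bool:
    """Call set_save_original_input only when it is safe; return True if called."""
    if not config.recompute:
        return False
    # The guard: skip the call when FP8 delayed scaling is active.
    if config.fp8 and config.fp8_recipe == "delayed":
        return False
    set_save_original_input(module, config)
    return True


if __name__ == "__main__":
    delayed = Config(fp8="hybrid", fp8_recipe="delayed")
    other = Config(fp8="hybrid", fp8_recipe="tensorwise")
    print(maybe_save_original_input(Module(), delayed))  # guard skips the call
    print(maybe_save_original_input(Module(), other))    # call goes through
```

In this sketch, calling the stub directly with the delayed recipe raises the AssertionError, while the guarded helper simply skips the call.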

copy-pr-bot bot commented Oct 14, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@dalgarak dalgarak changed the title from "Fix set_save_original_input not being used in fp8 delayed scaling recipe" to "Avoid calling set_save_original_input with FP8 delayed scaling" on Oct 14, 2025