nagarajankarthik

This pull request attempts to add support for running the Muon optimizer with tensor parallelism. It builds on the code introduced in the pull request "Dist_Muon optimizer support" (#1813).

BoxiangW and others added 15 commits September 15, 2025 14:52
Signed-off-by: Boxiang Wang <[email protected]>

copy-pr-bot bot commented Oct 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@BoxiangW
Contributor

Hi, thanks for the contribution. We have already merged this in our new dev branch: https://github.com/NVIDIA/Megatron-LM/blob/dev/megatron/core/optimizer/muon.py Please feel free to take a look.

2 participants