@shuhuayu shuhuayu commented Nov 7, 2025

Rebased on main to merge this PR: #1964

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 7, 2025

shuhuayu commented Nov 7, 2025

Added a test training run from the Qwen3 4B Hugging Face checkpoint, which omits lm_head.weight.

            self.model_args.enable_weight_tying
            and "lm_head.weight" not in hf_state_dict
        ):
            if "model.embed_tokens.weight" in hf_state_dict:
Contributor

Why is this if needed? Shouldn't we assert that the embedding exists?

My guess is that this if check was copied from somewhere pipeline parallelism (PP) can be enabled, so the embedding is on some ranks but not others. But with PP, we'd also require the embedding and lm_head to be on the same rank. Otherwise, how would you be able to load the lm_head weights?
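
To make the PP concern concrete, here is a minimal sketch, not torchtitan code. It assumes a hypothetical layout in which each pipeline rank is described by the set of parameter names it owns, and the helper name `tied_weights_colocated` is invented for illustration. It checks that every rank holding lm_head.weight also holds the embedding it would be tied to.

```python
EMBED_KEY = "model.embed_tokens.weight"
LM_HEAD_KEY = "lm_head.weight"


def tied_weights_colocated(keys_per_rank):
    """Return True if every rank that owns lm_head also owns the embedding.

    keys_per_rank is a list of sets of parameter names, one set per
    pipeline rank. This layout is an assumption made for the sketch.
    """
    embed_ranks = {r for r, keys in enumerate(keys_per_rank) if EMBED_KEY in keys}
    head_ranks = {r for r, keys in enumerate(keys_per_rank) if LM_HEAD_KEY in keys}
    # A rank can only fill lm_head from the embedding if it has both.
    return head_ranks <= embed_ranks


# Single rank owning both tensors: tying can be resolved locally.
single_rank = [{EMBED_KEY, LM_HEAD_KEY}]

# Embedding on the first rank, lm_head on the last: the last rank has
# nothing to copy lm_head from when the checkpoint omits it.
split_ranks = [{EMBED_KEY}, {LM_HEAD_KEY}]

print(tied_weights_colocated(single_rank))
print(tied_weights_colocated(split_ranks))
```

In the split case the check fails, which is the situation the question above is pointing at: a silent if would skip the load on that rank instead of surfacing the problem.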

Contributor Author

Thanks for pointing this out. Added an assertion here.
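
For readers following along, the assertion-based handling could look roughly like the sketch below. This is an illustration under assumptions, not the merged code: the function name `resolve_tied_lm_head` is hypothetical, the state dict is modeled as a plain Python dict, and only the key names and the `enable_weight_tying` flag come from the snippet under review.

```python
def resolve_tied_lm_head(hf_state_dict, enable_weight_tying):
    """Return a copy of hf_state_dict with lm_head.weight filled in when tied.

    Hypothetical helper for illustration only.
    """
    state_dict = dict(hf_state_dict)  # shallow copy; values are shared
    if enable_weight_tying and "lm_head.weight" not in state_dict:
        # Fail loudly instead of silently skipping when the embedding is absent.
        assert "model.embed_tokens.weight" in state_dict, (
            "weight tying is enabled but model.embed_tokens.weight is missing"
        )
        # Tied weights: lm_head reuses the embedding object itself.
        state_dict["lm_head.weight"] = state_dict["model.embed_tokens.weight"]
    return state_dict


# Stand-in for a tensor; any object works for the purpose of the sketch.
embed = [[0.1, 0.2], [0.3, 0.4]]
tied = resolve_tied_lm_head(
    {"model.embed_tokens.weight": embed}, enable_weight_tying=True
)
```

With the assertion in place, a checkpoint that lacks both keys raises an AssertionError instead of leaving lm_head.weight unset.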

@tianyu-l tianyu-l left a comment

sgtm

@tianyu-l tianyu-l merged commit 5ecc871 into main Nov 7, 2025
9 checks passed
jquesnelle pushed a commit to NousResearch/torchtitan that referenced this pull request Nov 10, 2025
…pytorch#1999)

Rebased on main to merge this pr:
pytorch#1964

---------

Co-authored-by: William <[email protected]>
Co-authored-by: Achazwl <[email protected]>