Pass chunk size to moe op #2264

Open
wants to merge 1 commit into
base: aice/v122

Conversation

yiliu30
Contributor

@yiliu30 yiliu30 commented Aug 15, 2025

@yiliu30 yiliu30 requested a review from Copilot August 16, 2025 05:55
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR modifies the FP8 quantization MoE (Mixture of Experts) forward pass to support passing chunk size information to the underlying MoE operation. The change enables dynamic chunk size configuration by extracting tokens_num from hidden_states and delegating to the original module for additional kwargs.

  • Adds a helper method to extract extra kwargs from the original module
  • Modifies forward_quant to pass chunk size information via extra_kwargs
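The two changes listed above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the PR's actual code: the class names `OriginalMoE` and `PatchedMoE`, the method `get_extra_kwargs`, the `chunk_size` key, the chunk-size policy, and the stand-in `fake_moe_op` are all hypothetical, since the overview does not show the real names.

```python
class OriginalMoE:
    """Hypothetical stand-in for the original (unquantized) MoE module."""

    def __init__(self, max_chunk_size=64):
        self.max_chunk_size = max_chunk_size

    def get_extra_kwargs(self, tokens_num):
        # Assumed policy for illustration: cap the chunk size at the token count.
        return {"chunk_size": min(tokens_num, self.max_chunk_size)}


def fake_moe_op(hidden_states, **kwargs):
    """Hypothetical stand-in for the underlying MoE op; echoes what it received."""
    return {"rows": len(hidden_states), **kwargs}


class PatchedMoE:
    """Hypothetical stand-in for the quantized wrapper around the original module."""

    def __init__(self, orig_mod):
        self.orig_mod = orig_mod

    def _get_extra_kwargs(self, tokens_num):
        # Delegate to the original module if it offers the hook, else pass nothing.
        getter = getattr(self.orig_mod, "get_extra_kwargs", None)
        return getter(tokens_num) if callable(getter) else {}

    def forward_quant(self, hidden_states, activation="silu"):
        tokens_num = len(hidden_states)  # rows of a 2-D input
        extra_kwargs = self._get_extra_kwargs(tokens_num)
        return fake_moe_op(hidden_states, activation=activation, **extra_kwargs)


# 16 tokens with a hidden size of 8, modeled as a list of rows.
hidden_states = [[0.0] * 8 for _ in range(16)]
result = PatchedMoE(OriginalMoE()).forward_quant(hidden_states)
fallback = PatchedMoE(object()).forward_quant(hidden_states)
```

Falling back to an empty dict when the original module has no such hook keeps the call valid for modules that do not need a chunk size; whether the PR does the same is not stated in the overview.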

def forward_quant(self,
                  hidden_states,
                  expert_routing_table,
                  router_weights,
                  permuted_weights=True,
                  activation="silu"):
    tokens_num, hidden_dim = hidden_states.shape

Copilot AI Aug 16, 2025

This assumes hidden_states has exactly 2 dimensions, but tensors could have more dimensions (e.g., batch_size, sequence_length, hidden_dim). Consider using hidden_states.shape[0] or hidden_states.shape[-2] depending on the expected tensor layout.

Suggested change
- tokens_num, hidden_dim = hidden_states.shape
+ tokens_num = hidden_states.shape[-2]
+ hidden_dim = hidden_states.shape[-1]
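The difference between the two forms can be checked with plain shape tuples, which unpack and index like a tensor's `shape`. The sketch below is an illustration only: the helper `token_and_hidden_dims` is hypothetical, and whether 3-D inputs ever reach `forward_quant` is not stated in this PR. It also shows that, for a `(batch_size, sequence_length, hidden_dim)` layout, `shape[-2]` yields the sequence length rather than the total token count, so multiplying all leading dimensions is one possible alternative.

```python
from math import prod


def token_and_hidden_dims(shape):
    """Hypothetical helper: return (tokens_num, hidden_dim) for 2 or more dims."""
    if len(shape) < 2:
        raise ValueError(f"expected at least 2 dims, got shape {tuple(shape)}")
    # Treat every leading dimension as contributing to the token count.
    return prod(shape[:-1]), shape[-1]


shape_2d = (16, 4096)    # (tokens_num, hidden_dim)
shape_3d = (2, 8, 4096)  # (batch_size, sequence_length, hidden_dim)

# The original two-name unpacking only works for exactly two dimensions.
unpack_failed = False
try:
    tokens_num, hidden_dim = shape_3d
except ValueError:
    unpack_failed = True

# The suggested indexing works for both, but on 3-D input it picks out the
# sequence length (8) rather than the total number of tokens (2 * 8 = 16).
suggested_tokens_3d = shape_3d[-2]
flattened_tokens_3d, hidden_dim_3d = token_and_hidden_dims(shape_3d)
```

If the op always receives an already flattened 2-D tensor, the original unpacking doubles as an implicit shape check, which may be the intended behavior.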
