Adding context parallel support to eager attention implementation #1859

nrailg · 2025-10-13T15:15:51Z

Some attention and mask variants are hard to implement as fused or optimized kernels on short notice, yet we still need to run experiments to verify their effectiveness.
In those cases we need to fall back to eager mode, so I added a switch that falls back to the eager attention implementation:

--fallback-to-eager-attn
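
For readers unfamiliar with how such a switch is typically wired up, here is a minimal sketch using plain `argparse`. Only the flag name comes from this PR; `build_parser`, `pick_attention_impl`, and the `"eager"` / `"fused"` labels are hypothetical and are not the actual Megatron argument plumbing.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="eager-attention fallback sketch")
    # Flag name taken from this PR; everything else in this block is hypothetical.
    parser.add_argument(
        "--fallback-to-eager-attn",
        action="store_true",
        help="Use the unfused (eager) attention implementation.",
    )
    return parser


def pick_attention_impl(args):
    # Hypothetical selector: returns a label instead of a real attention module.
    return "eager" if args.fallback_to_eager_attn else "fused"


args = build_parser().parse_args(["--fallback-to-eager-attn"])
print(pick_attention_impl(args))
```

Since the flag is a `store_true` switch, the fused path stays the default when it is omitted.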

In addition, because Megatron Core's eager attention does not support context parallelism, I added a distributed attention implementation similar to the one described in the Llama 3 paper.
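
To illustrate the idea, here is a single-process sketch of all-gather-based context parallelism in dependency-free Python. It is not the code in this PR: the function names, the contiguous sharding, and the list-based "all-gather" are assumptions made for illustration. A real implementation would operate on tensors and use a collective such as `torch.distributed.all_gather`, and it may shard the sequence differently to balance causal work across ranks. Each simulated rank keeps only its own query shard, reconstructs the full K and V, and runs eager causal attention with the mask shifted by the shard's global offset.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def eager_causal_attention(q, k, v, q_offset=0):
    """Unfused causal attention; q rows sit at global positions q_offset, q_offset + 1, ..."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, q_row in enumerate(q):
        last_key = q_offset + i  # causal mask: attend to keys 0..last_key only
        scores = [
            scale * sum(a * b for a, b in zip(q_row, k[j]))
            for j in range(last_key + 1)
        ]
        probs = softmax(scores)
        out.append(
            [sum(p * v[j][c] for j, p in enumerate(probs)) for c in range(len(v[0]))]
        )
    return out


def context_parallel_attention(q, k, v, cp_size):
    """Simulate cp_size ranks, each owning one contiguous shard of the sequence."""
    seq_len = len(q)
    assert seq_len % cp_size == 0, "sequence length must divide evenly across ranks"
    chunk = seq_len // cp_size
    bounds = [(r * chunk, (r + 1) * chunk) for r in range(cp_size)]
    k_shards = [k[s:e] for s, e in bounds]
    v_shards = [v[s:e] for s, e in bounds]
    # Stand-in for an all-gather: every rank ends up with the full K and V.
    k_full = [row for shard in k_shards for row in shard]
    v_full = [row for shard in v_shards for row in shard]
    out = []
    for s, e in bounds:
        # Each rank computes attention for its local queries only.
        out.extend(eager_causal_attention(q[s:e], k_full, v_full, q_offset=s))
    return out


seq_len, dim = 8, 4
q = [[math.sin(i + j) for j in range(dim)] for i in range(seq_len)]
k = [[math.cos(i * j + 1) for j in range(dim)] for i in range(seq_len)]
v = [[float(i - j) for j in range(dim)] for i in range(seq_len)]

reference = eager_causal_attention(q, k, v)
sharded = context_parallel_attention(q, k, v, cp_size=4)
max_diff = max(
    abs(a - b) for r1, r2 in zip(reference, sharded) for a, b in zip(r1, r2)
)
print(max_diff)
```

The comparison against the unsharded reference is the useful part: because the mask is expressed in global positions, the sharded result should match the single-device eager output.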

copy-pr-bot · 2025-10-13T15:15:54Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

nrailg changed the title ~~Add context parallel support to eager attention implementation~~ Adding context parallel support to eager attention implementation Oct 13, 2025

yanring requested a review from yuzhongw-nvidia October 13, 2025 15:21

nrailg force-pushed the nrwu/eagercp branch 2 times, most recently from 10ddbd8 to 5c981b7 October 14, 2025 08:48

nrailg added 2 commits October 15, 2025 17:30

fallback to eager attn config

9568a15

adding cp support to eager attn

b4c7979

nrailg force-pushed the nrwu/eagercp branch from 5c981b7 to b4c7979 October 15, 2025 09:30

yuzhongw-nvidia added 2 commits October 16, 2025 15:46

refine the code

ac0fc23

Refine dot_product_attention_context_parallel.py

370aa4a

sbhavani added the enhancement label Oct 21, 2025
