
Conversation

sarithad-meta

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2013

Added paged attention support to the FMHA FWD Blackwell kernel.

  1. Added support for the fixed-length case.
  2. Added support for two page-size cases: a) page_block_size = N tile size, b) page_block_size > N.
  3. Added a unit test, test_paged_forward.

Next steps:

  1. Test the performance of the fixed-length case.
  2. Add support for the variable-length case to the FWD kernel.
  3. Add support for small page sizes to the FWD kernel.
  4. Add paged attention support for decode.

Differential Revision: D84023396
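As context for the two page-size cases above, the core idea of paged attention is that each sequence's KV cache lives in fixed-size physical pages located through a page table. The following is a minimal NumPy sketch of that indexing, not the actual FBGEMM kernel API; `gather_kv`, `kv_pool`, and `page_table` are illustrative names.

```python
import numpy as np

def gather_kv(kv_pool, page_table, seq_len, page_block_size):
    """Reassemble one sequence's contiguous K (or V) tensor from paged storage.

    kv_pool:    [num_pages, page_block_size, head_dim] pool of physical pages
    page_table: logical block i of this sequence lives in kv_pool[page_table[i]]
    """
    num_blocks = (seq_len + page_block_size - 1) // page_block_size
    parts = [kv_pool[page_table[i]] for i in range(num_blocks)]
    # The last page may be only partially filled, so trim to seq_len.
    return np.concatenate(parts, axis=0)[:seq_len]
```

In case (a), page_block_size equals the kernel's N tile size, so each KV tile maps to exactly one page-table entry; in case (b), page_block_size is larger than N, so one page spans several consecutive N-sized tiles that share a single page-table lookup.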


netlify bot commented Oct 13, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

🔨 Latest commit: 298beb6
🔍 Latest deploy log: https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68f2687f4403ca0008842ead
😎 Deploy Preview: https://deploy-preview-4999--pytorch-fbgemm-docs.netlify.app


meta-codesync bot commented Oct 13, 2025

@sarithad-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84023396.

@meta-cla bot added the cla signed label Oct 13, 2025
sarithad-meta added a commit to sarithad-meta/FBGEMM-1 that referenced this pull request Oct 13, 2025
…d length (pytorch#4999)
sarithad-meta added a commit to sarithad-meta/FBGEMM-1 that referenced this pull request Oct 16, 2025
…d length (pytorch#4999)

Reviewed By: Aya-ZIbra, sijiac
sarithad-meta added a commit to sarithad-meta/FBGEMM-1 that referenced this pull request Oct 17, 2025
…d length (pytorch#4999)

Pull Request resolved: pytorch#4999

Reviewed By: Aya-ZIbra, sijiac