Add Paged Attention to FMHA Cutlass Blackwell Forward kernel for fixed length #4999
Open
sarithad-meta wants to merge 1 commit into pytorch:main from sarithad-meta:export-D84023396
Conversation
✅ Deploy Preview for pytorch-fbgemm-docs ready!
@sarithad-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84023396.
Force-pushed from 7ddeab4 to d007740.
sarithad-meta added a commit to sarithad-meta/FBGEMM-1 that referenced this pull request on Oct 13, 2025:
Add Paged Attention to FMHA Cutlass Blackwell Forward kernel for fixed length (pytorch#4999)

Summary:
X-link: facebookresearch/FBGEMM#2013

Added paged attention support to the FMHA FWD Blackwell kernel.
1. Added support for the fixed-length case.
2. Added support for two cases: a) page_block_size equal to the N tile size; b) page_block_size greater than N.
3. Added a unit test, test_paged_forward.

Next steps:
1. Test the performance of the fixed-length case.
2. Add support for the variable-length case to the FWD kernel.
3. Add support for small page sizes to the FWD kernel.
4. Add paged attention support for decode.

Differential Revision: D84023396
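For context on what the two supported cases mean, here is a minimal Python sketch of the indexing a paged KV cache implies: a page table maps each logical block of the sequence to a physical page, and each N-column tile of the attention computation reads either exactly one page (case a, page_block_size = N) or a slice inside a larger page (case b, page_block_size > N). All names here (gather_paged_kv, kv_cache, page_table, tile_to_page) are illustrative assumptions, not FBGEMM's actual API.

```python
# Minimal sketch of paged-KV indexing; names are assumptions, not FBGEMM's API.
import torch

def gather_paged_kv(kv_cache, page_table, seq_len, page_block_size):
    """Reassemble one sequence's contiguous K (or V) tensor from a paged cache.

    kv_cache:   [num_physical_pages, page_block_size, head_dim]
    page_table: [num_logical_blocks] mapping logical block -> physical page
    """
    num_blocks = (seq_len + page_block_size - 1) // page_block_size
    pages = kv_cache[page_table[:num_blocks]]       # gather physical pages
    return pages.reshape(-1, kv_cache.shape[-1])[:seq_len]

def tile_to_page(tile_start, page_block_size):
    """Locate an N-wide attention tile inside the paged cache.

    Case a) page_block_size == N: offset_in_page is always 0, one page per tile.
    Case b) page_block_size  > N: the tile reads a slice within one page,
            assuming page_block_size is a multiple of N so a tile never
            straddles a page boundary.
    """
    logical_block = tile_start // page_block_size
    offset_in_page = tile_start % page_block_size
    return logical_block, offset_in_page
```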
Force-pushed from d007740 to 3dde689.
sarithad-meta added a commit to sarithad-meta/FBGEMM-1 that referenced this pull request on Oct 16, 2025:
(Same commit message as the Oct 13 entry, now with Reviewed By: Aya-ZIbra, sijiac.)
Force-pushed from 3dde689 to 0f54e39.
sarithad-meta added a commit to sarithad-meta/FBGEMM-1 that referenced this pull request on Oct 17, 2025:
(Same commit message as above, with Pull Request resolved: pytorch#4999 added and the X-link given as https://github.com/facebookresearch/FBGEMM/pull/2013.)
Force-pushed from 0f54e39 to 298beb6.
Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2013

Added paged attention support to the FMHA FWD Blackwell kernel.
1. Added support for the fixed-length case.
2. Added support for two cases: a) page_block_size equal to the N tile size; b) page_block_size greater than N.
3. Added a unit test, test_paged_forward.

Next steps:
1. Test the performance of the fixed-length case.
2. Add support for the variable-length case to the FWD kernel.
3. Add support for small page sizes to the FWD kernel.
4. Add paged attention support for decode.

Differential Revision: D84023396
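As a rough illustration of the kind of invariant a unit test like test_paged_forward can check, the sketch below scatters contiguous K/V into shuffled physical pages and verifies that a plain attention forward computed through the page table matches the forward over the original contiguous tensors. This is an assumed reference check in plain PyTorch, not the actual FBGEMM test or kernel.

```python
# Illustrative reference check; all names are assumptions, not the real test.
import torch

def reference_attention(q, k, v):
    # Plain softmax(Q K^T / sqrt(D)) V on q: [L, D], k/v: [S, D].
    scores = q @ k.T / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

torch.manual_seed(0)
L, S, D, page_block_size = 128, 256, 64, 128    # S divisible by the page size
q, k, v = torch.randn(L, D), torch.randn(S, D), torch.randn(S, D)

# Scatter contiguous K/V into shuffled physical pages.
num_pages = S // page_block_size
page_table = torch.randperm(num_pages)          # logical block -> physical page
k_cache = torch.zeros(num_pages, page_block_size, D)
v_cache = torch.zeros(num_pages, page_block_size, D)
for logical in range(num_pages):
    lo = logical * page_block_size
    k_cache[page_table[logical]] = k[lo:lo + page_block_size]
    v_cache[page_table[logical]] = v[lo:lo + page_block_size]

# Gathering back through the page table must reproduce the contiguous result.
k_paged = k_cache[page_table].reshape(S, D)
v_paged = v_cache[page_table].reshape(S, D)
torch.testing.assert_close(reference_attention(q, k_paged, v_paged),
                           reference_attention(q, k, v))
print("paged forward matches contiguous forward")
```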