Add several features. #4998
jiayus-nvidia commented on Oct 13, 2025
- Add support for arbitrary mask.
- Add support for paged kv in Ampere.
- Add support for fp8 bwd with 5 quantization modes in Hopper.
- Add support for all fp16/bf16 masks in fp8 fwd and bwd.
Hi @jiayus-nvidia, the default values parsed in generate_kernels.py appear to be broken, I see
when I run the script. Could you update the script so that reasonable defaults are applied when no environment variables are set?
Hi @q10, sorry for not considering this case. It is fixed now.
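For readers following the thread, the request above (apply reasonable defaults when no environment variables are set) could look roughly like the sketch below. This is not the code from generate_kernels.py: the variable names `EXAMPLE_HEAD_DIMS` and `EXAMPLE_DTYPES`, their default values, and the helper `read_list_setting` are all hypothetical and only illustrate the fallback pattern.

```python
import os

# Hypothetical setting names and defaults, for illustration only.
# The real knobs read by generate_kernels.py are not shown in this thread.
DEFAULTS = {
    "EXAMPLE_HEAD_DIMS": "64,128",
    "EXAMPLE_DTYPES": "fp16,bf16",
}


def read_list_setting(name, environ=None):
    """Return a comma-separated setting as a list of strings.

    Falls back to DEFAULTS[name] when the variable is unset or blank,
    so the script still produces a usable configuration with a clean
    environment.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "").strip()
    if not raw:
        raw = DEFAULTS[name]
    return [item.strip() for item in raw.split(",") if item.strip()]
```

Treating an unset variable and a blank one the same way avoids the case where an empty string is parsed into an empty list and no kernels are generated at all.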
Hi @jiayus-nvidia, it appears there are some undefined symbols:
Maybe the code generation step didn't generate all the template instantiations?
Hi @q10, you're right. The cause was a mismatch between the instantiations that were generated and the kernels called in main. It now compiles with no environment variables set, and the tests run.
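A mismatch of this kind (kernels referenced by the caller but never instantiated by the generator) can in principle be caught before linking. The following is a minimal sketch under assumed names: the `(dtype, head_dim)` tuples and the function `missing_instantiations` are hypothetical and do not reflect how this PR tracks its kernel configurations.

```python
from itertools import product


def missing_instantiations(generated, required):
    """Return the configurations that are required but were not generated.

    Each configuration is any hashable, sortable key; here a
    (dtype, head_dim) tuple is used purely as an example.
    """
    return sorted(set(required) - set(generated))


# Hypothetical example: the caller dispatches on every dtype/head_dim pair,
# but the generator skipped one of them.
required = set(product(["fp16", "bf16"], [64, 128]))
generated = {("fp16", 64), ("fp16", 128), ("bf16", 64)}

missing = missing_instantiations(generated, required)
if missing:
    print("missing template instantiations:", missing)
```

Running a check like this at the end of code generation turns an undefined-symbol error at link time into an explicit message naming the missing configurations.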