Is it possible to support NVFP4 on Blackwell? #1543
edisonchan started this conversation in General
Replies: 1 comment
https://docs.nvidia.com/deeplearning/cudnn/frontend/latest/operations/BlockScaling.html
https://docs.nvidia.com/cuda/cuda-math-api/cuda_math_api/group__CUDA__MATH__FP4__MISC.html
"The NVFP4 recipe quantizes across 16 FP32 elements along the rows to produce 16 FP4 output values (E2M1) and 1 FP8 scaling factor (E4M3)."
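To make the quoted recipe concrete, here is a rough pure-Python sketch of quantizing one block of 16 values to E2M1 with a single shared scale. It is not the cuDNN implementation. The function names are hypothetical, the scale choice (block maximum divided by 6, the largest E2M1 magnitude) is an assumption, ties are resolved toward the smaller magnitude, and the scale is kept as a Python float instead of being rounded to E4M3.

```python
# Magnitudes representable in FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # elements sharing one scaling factor, per the quoted recipe


def quantize_block(block):
    """Quantize 16 floats to E2M1 values plus one shared scale (hypothetical helper)."""
    if len(block) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} elements, got {len(block)}")
    amax = max(abs(x) for x in block)
    # Assumption: map the block maximum onto the largest E2M1 magnitude (6.0).
    # A real implementation would also round this scale to FP8 E4M3.
    scale = amax / E2M1_VALUES[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        mag = abs(x) / scale
        # Nearest representable magnitude; ties go to the smaller value here.
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        codes.append(-nearest if x < 0 else nearest)
    return codes, scale


def dequantize_block(codes, scale):
    """Recover approximate floats from E2M1 values and the block scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    block = [12.0, -6.0, 1.0] + [0.0] * 13
    codes, scale = quantize_block(block)
    print(codes[:3], scale)
    print(dequantize_block(codes, scale)[:3])
```

Because every element in the block shares one scale, a single large outlier reduces the resolution available to the other 15 values, which is why the block is kept as small as 16 elements.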