Lower Quantization Operations #5418

@nvMelissa

Description

This includes defining the lowering criteria.

Metadata

Assignees

Labels

MoE Inference nvFP4: Get Llama 4, GPT OSS, DeepSeek R1, and Qwen3-Next running with performant 4-bit inference