Welcome to LeanModels, an organization founded by Tianyi Zhang and dedicated to making foundation models, such as LLMs and diffusion models, more memory- and compute-efficient through practical compression and inference optimization techniques.
Explore our key projects:
- DFloat11: A lossless LLM compression framework enabling efficient GPU inference (see the usage sketch after this list)
- Bagel-DFloat11: DFloat11-compressed version of Bagel, a unified multimodal model
- LeanQuant: Scalable, loss-error-aware quantization for LLMs
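To give a sense of how these projects are meant to fit into existing workflows, here is a minimal sketch of loading a DFloat11-compressed model for inference. It assumes the `dfloat11` package exposes a Transformers-style `DFloat11Model.from_pretrained` entry point and that a compressed checkpoint such as `DFloat11/Llama-3.1-8B-Instruct-DF11` is published on the Hugging Face Hub; both the API and the checkpoint name are assumptions here, so consult the DFloat11 repository for authoritative usage.

```python
# Hypothetical usage sketch: the `dfloat11` package name, the `DFloat11Model`
# class, and the checkpoint ID below are assumptions, not the confirmed API.
import torch
from transformers import AutoTokenizer
from dfloat11 import DFloat11Model

model_id = "DFloat11/Llama-3.1-8B-Instruct-DF11"  # assumed compressed checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = DFloat11Model.from_pretrained(model_id, device_map="auto")

# Generation is intended to work like a standard Transformers model,
# with the compressed weights handled transparently during inference.
prompt = "Model compression makes LLMs"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```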
We welcome contributors, collaborators, and feedback! If you're working on model compression or efficient inference, feel free to reach out.