🏢 Group: Owner. @xlite-dev | @vipshop | Prev. @PaddlePaddle 🏰
🛠 Creator: lite.ai.toolkit | Awesome-LLM-Inference | LeetCUDA | ffpa-attn 🎧
🎉 Contributor: FastDeploy | vLLM | SGLang | Many Others ⚙
❤ I love open source, bro, and I think you do too. ❤
📚LeetCUDA: 200+ CUDA/Tensor Cores Kernels, HGEMM, FA2 MMA.
🛠 A lite C++ AI toolkit: 100+🎉 models with MNN, ORT and TRT.
📚 A curated list of Awesome LLM Inference Papers with Code.
Large Language Model Deployment Toolkit
📚FFPA: Extend FA2 with Split-D for large headdim, 2x↑ vs SDPA.
A high-throughput and memory-efficient inference and serving engine for LLMs