- Hong Kong (UTC +08:00)

Highlights
- Pro

Pinned
- fastllm (C++; fork of ztxz16/fastllm)
  fastllm is a high-performance LLM inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and mixed-mode inference for MoE models, and any GPU with 10 GB+ of VRAM can run the full-size DeepSeek model. A dual-socket 9004/9005-series server plus a single GPU can serve the original full-precision, full-size DeepSeek model at 20 tps for a single request; the INT4-quantized model reaches 30 tps single-request and 60+ tps under concurrency (see the OpenAI-style client sketch after this list).
- kvcache-ai/ktransformers
  A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
- chitu (Python; fork of thu-pacman/chitu)
  High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
- sglang (Python; fork of sgl-project/sglang)
  SGLang is a fast serving framework for large language models and vision language models (see the frontend sketch after this list).
- vllm (Python; fork of vllm-project/vllm)
  A high-throughput and memory-efficient inference and serving engine for LLMs (see the offline-inference sketch after this list).
- vllm-ascend (Python; fork of vllm-project/vllm-ascend)
  Community-maintained hardware plugin for running vLLM on Huawei Ascend NPUs.
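
The vllm entry above advertises high-throughput offline inference; the sketch below shows that workflow. The model ID, prompt, and sampling settings are illustrative assumptions, not values from this profile.

```python
# Minimal vLLM offline-inference sketch. The model ID, prompt, and
# sampling parameters below are illustrative placeholders.
from vllm import LLM, SamplingParams

prompts = ["Explain tensor parallelism in one sentence."]
sampling = SamplingParams(temperature=0.7, max_tokens=64)

# tensor_parallel_size shards the weights across GPUs; 1 means a single GPU.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", tensor_parallel_size=1)

for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```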
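
The sglang entry mentions fast serving of language and vision-language models; SGLang's frontend language composes generation calls as ordinary Python functions. A minimal sketch, assuming a server is already running locally on port 30000; the prompt and variable names are illustrative.

```python
# SGLang frontend sketch; assumes a server is already running at port 30000,
# e.g. `python -m sglang.launch_server --model-path <model> --port 30000`.
import sglang as sgl

@sgl.function
def qa(s, question):
    s += sgl.user(question)
    # gen() asks the backend to generate a span and binds it to "answer".
    s += sgl.assistant(sgl.gen("answer", max_tokens=128))

sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
state = qa.run(question="What is a KV cache?")
print(state["answer"])
```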
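
Several of these engines, vLLM and SGLang included, also expose OpenAI-compatible HTTP endpoints, so one client can drive whichever backend is deployed; check each project's docs for its exact launch command. A hedged sketch, assuming a server is already listening on localhost:8000:

```python
# OpenAI-compatible client sketch. Assumes an engine (e.g. `vllm serve <model>`)
# is already listening on localhost:8000; the URL, model name, and API key
# are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Summarize INT4 quantization tradeoffs."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```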