ambitiousCC
  • Hong Kong
  • UTC +08:00

Highlights

  • Pro

ambitiousCC/README.md

Hi there 👋

Pinned Loading

  1. fastllm Public

    Forked from ztxz16/fastllm

    fastllm is a high-performance LLM inference library with no backend dependencies. It supports tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB+ VRAM can serve the full DeepSeek model. A dual-socket 9004/9005 server with a single GPU can run the original full-precision DeepSeek model at 20 tps single-concurrency; the INT4-quantized model reaches 30 tps single-concurrency and 60+ tps under multiple concurrent requests.

    C++

  2. kvcache-ai/ktransformers Public

    A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

    Python · 14.4k stars · 1k forks

  3. chitu Public

    Forked from thu-pacman/chitu

    High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

    Python

  4. sglang Public

    Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python

  5. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  6. vllm-ascend Public

    Forked from vllm-project/vllm-ascend

    Community maintained hardware plugin for vLLM on Ascend

    Python