
FastLM

We develop fast, lightweight LMs for large-scale, distributed, parallel, and sparse scenarios.

Popular repositories

  1. TinyServe Public

    [ACM MM 2025 Oral] TinyServe Page Allocation Kernel Optimization

    Cuda 8 2

  2. CSV-Decode Public

    CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference

    Python 7

  3. FastCache Public

    Forked from NoakLiu/FastCache-xDiT

    FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]

    Python 6

  4. PiKV Public

    Forked from NoakLiu/PiKV

    PiKV: KV Cache Management System for MoE [Efficient ML System]

    Python 5

  5. HSGM Public

    [ICPADS 2025 Oral, *SEM 2025 Oral] HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

    Python 5

  6. SemToken Public

    [IWCS 2025 Oral] SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling

    Python 4

Repositories

Showing 10 of 11 repositories
