Pull requests: vllm-project/tpu-inference

Pull requests list

[WIP] remove wip models from model_loader
#1143 opened Nov 21, 2025 by mailvijayasingh
Fix numerical issue on hybrid kv cache allocation
#1139 opened Nov 20, 2025 by Chenyaaang
[DP] Functional DP for GPT-OSS
#1137 opened Nov 20, 2025 by wenxindongwork
DP support for GPT OSS
#1096 opened Nov 13, 2025 by wenxindongwork Draft
Enable Pipeline Parallelism on Jax models
#1077 opened Nov 12, 2025 by Chenyaaang
1 of 8 tasks
Exposes graphdef for flax models.
#1059 opened Nov 10, 2025 by wang2yn84
Enable Pipeline Parallelism on Jax runner
#1053 opened Nov 8, 2025 by Chenyaaang
1 of 8 tasks
Update v7's jax requirements
#1037 opened Nov 7, 2025 by qihqi
[Docs] fix dead links in multiple documentation pages
#1027 opened Nov 6, 2025 by mattheliu
3 tasks done
Support float8_e4m3fn dtype weight loading for jax
#1024 opened Nov 6, 2025 by inho9606
Remove SKIP_JAX_PRECOMPILE
#1018 opened Nov 5, 2025 by kyuyeunk
Support Embedding Model/Task
#1015 opened Nov 5, 2025 by carlesoctav
Update tpu_worker_jax.py
#982 opened Oct 30, 2025 by fenyuan-gg
[Spec Decoding] Reduce TPU <-> CPU data transfer
#961 opened Oct 28, 2025 by Lumosis
Update README.md
#956 opened Oct 27, 2025 by bvrockwell