@wenbinc-Bin wenbinc-Bin commented Sep 25, 2025

Purpose

Port qwen3-vl and qwen3-omni model support to vllm-fork.

Test Plan

PT_HPU_LAZY_MODE=1 VLLM_SKIP_WARMUP=true python examples/offline_inference/vision_language.py --modality image --model-type qwen3_vl_moe
PT_HPU_LAZY_MODE=1 VLLM_SKIP_WARMUP=true python examples/offline_inference/vision_language.py --modality video --model-type qwen3_vl_moe
PT_HPU_LAZY_MODE=1 VLLM_SKIP_WARMUP=true python examples/offline_inference/vision_language.py --modality image --model-type qwen3_omni_moe
PT_HPU_LAZY_MODE=1 VLLM_SKIP_WARMUP=true python examples/offline_inference/vision_language.py --modality video --model-type qwen3_omni_moe
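
For reference, a minimal offline-inference sketch of what the vision_language.py example exercises for these models. The model ID, image path, and chat template below are assumptions for illustration, not taken from this PR; run with the same PT_HPU_LAZY_MODE=1 VLLM_SKIP_WARMUP=true environment as the commands above.

    # Hypothetical standalone repro; model ID and prompt template are placeholders
    # modeled on the Qwen2-VL style, not confirmed for qwen3-vl.
    from PIL import Image
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="Qwen/Qwen3-VL-30B-A3B-Instruct",  # placeholder model ID
        trust_remote_code=True,
    )

    image = Image.open("example.jpg")  # placeholder input image
    prompt = (
        "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
        "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
    )

    # Pass the image through multi_modal_data alongside the text prompt.
    outputs = llm.generate(
        {"prompt": prompt, "multi_modal_data": {"image": image}},
        SamplingParams(max_tokens=128),
    )
    print(outputs[0].outputs[0].text)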

ywang96 and others added 7 commits September 25, 2025 09:20
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Co-authored-by: Huang Jie <[email protected]>
Co-authored-by: 松灵 <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Signed-off-by: Chen, Wenbin <[email protected]>
Signed-off-by: Chen, Wenbin <[email protected]>
Avoid using instance variable when using hpu_graph.

Signed-off-by: Chen, Wenbin <[email protected]>
Avoid using instance variable when using hpu_graph.

Signed-off-by: Chen, Wenbin <[email protected]>
