Open
857 commits
8dbc899
New linters
gshtras Feb 3, 2025
76b8163
Merge branch 'upstream_merge_25_02_03' of github.com:ROCm/vllm into u…
gshtras Feb 3, 2025
c887bc9
Custom params for mla attention backend
gshtras Feb 3, 2025
b43c8d1
Merge pull request #403 from ROCm/upstream_merge_25_02_03
gshtras Feb 3, 2025
d586c39
Mbatch p3l (#401)
Alexei-V-Ivanov-AMD Feb 4, 2025
ed3337d
Fix quark fp8 format loading. (#395)
fxmarty-amd Feb 4, 2025
f65ecc9
The code assumes WARP_SIZE to be equal to 32, which is not the case o…
gshtras Feb 5, 2025
13434bd
Update README.md 20250205_aiter (#407)
arakowsk-amd Feb 6, 2025
3f610f0
fix rocm get_device name for moe configs (#359)
divakar-amd Feb 7, 2025
29499bb
Fixing the output formatting (#414)
gshtras Feb 10, 2025
6a0deb7
Merge remote-tracking branch 'upstream/main'
gshtras Feb 10, 2025
e2dc610
Add tuned moe config for qwen1.5_moe_A2.7B (#398)
sky0530 Feb 11, 2025
c536ed5
Removing non-existent parameter
gshtras Feb 11, 2025
869a461
Merge remote-tracking branch 'upstream/main' into upstream_merge_25_0…
gshtras Feb 11, 2025
9917cda
Update Benchmark Profiling Scripts (#417)
AdrianAbeyta Feb 12, 2025
b06c154
DS V2V3 fix for same file
Concurrensee Feb 12, 2025
c9a338f
Merge remote-tracking branch 'origin/DS_V2V3_FP16_fix' into upstream_…
gshtras Feb 12, 2025
0ad02c3
Merge branch 'main' into upstream_merge_25_02_10
gshtras Feb 12, 2025
a657220
Lint
gshtras Feb 12, 2025
46476bd
updating manfiest (#416)
arakowsk-amd Feb 12, 2025
42e17aa
Merge branch 'main' into upstream_merge_25_02_10
gshtras Feb 12, 2025
d92dea8
Merge pull request #418 from ROCm/upstream_merge_25_02_10
gshtras Feb 12, 2025
cbbbecb
Aiter base (#419)
gshtras Feb 12, 2025
5f8d758
Initial attempt to adjust codeowners to the ROCm fork (#420)
gshtras Feb 13, 2025
aa63571
Applying weight padding to deepseek (#421)
gshtras Feb 13, 2025
66ee774
[Model] DeepSeek Tunings (#423)
rasmith Feb 13, 2025
2679970
Removing bad config (#425)
gshtras Feb 14, 2025
b96c11c
The order in the file is important. One needs to be explicitly be add…
gshtras Feb 14, 2025
ccaff7f
avoid calling hf_list_repo_files for local model
Isotr0py Feb 16, 2025
7cc05dd
annotation
Isotr0py Feb 16, 2025
ce342c7
Merge remote-tracking branch 'upstream/main' into upstream_merge_25_0…
gshtras Feb 17, 2025
669fc3f
Merge remote-tracking branch 'Isotr0py/local-lookup' into upstream_me…
gshtras Feb 17, 2025
365687d
Merge pull request #430 from ROCm/upstream_merge_25_02_17
gshtras Feb 17, 2025
4fd2f5b
Updating PR template to point people to the upstream repo. Updating c…
gshtras Feb 17, 2025
17b26bd
Enabling the ROCm-vLLM CI on MI250 machines (#432)
Alexei-V-Ivanov-AMD Feb 18, 2025
955ba64
Optimization for quantized gemm skinny sizes (#411)
amd-hhashemi Feb 19, 2025
b63a984
Restricting FP8 wvSplitk to MI300x (#439)
gshtras Feb 19, 2025
39456f3
Remove mi300a (#440)
gshtras Feb 19, 2025
5a6afcc
resolve diff for mixtral8x7B configs (#437)
divakar-amd Feb 20, 2025
ff13c7a
Torch version bump to fix tunable ops (#442)
gshtras Feb 20, 2025
cea7419
Using AITER branch with fixed whl. Disabling PREBUILD_KERNELS until i…
gshtras Feb 21, 2025
118296d
Bump hipblaslt version. Minor fixes to printing the versions (#447)
gshtras Feb 25, 2025
18689d8
Bumping the version in the right place (#448)
gshtras Feb 25, 2025
07336d2
init
SageMoore Feb 25, 2025
c226a30
init
SageMoore Feb 25, 2025
ae3594e
update logs
SageMoore Feb 25, 2025
92a2279
Merge remote-tracking branch 'upstream/main' into upstream_merge_25_0…
gshtras Feb 25, 2025
8230388
Merge remote-tracking branch 'nm/sage/deepseek-rocm-fix' into upstrea…
gshtras Feb 25, 2025
d619b41
Merge branch 'main' into upstream_merge_25_02_24
gshtras Feb 25, 2025
46c1c97
Fix test that was missed by local linters
gshtras Feb 25, 2025
ba6f019
Merge pull request #449 from ROCm/upstream_merge_25_02_24
gshtras Feb 25, 2025
b5a4a37
Stable aiter build (#450)
gshtras Feb 26, 2025
f932181
Remove batch padding on ROCm (#451)
gshtras Feb 26, 2025
386763c
Aiter whl fix branch (#452)
gshtras Feb 27, 2025
fd70f59
tuning adjustment for quantized skinny gemm. (#444)
amd-hhashemi Feb 28, 2025
24c6283
Merge remote-tracking branch 'upstream/main' into upstream_merge_25_0…
gshtras Mar 3, 2025
87bf00a
Revert "[core] Perf improvement for DSv3 on AMD GPUs (#13718)"
gshtras Mar 4, 2025
7cd9ea1
using list for typing
gshtras Mar 4, 2025
caa2810
Merge pull request #458 from ROCm/upstream_merge_25_03_03
gshtras Mar 4, 2025
f501118
cython doesn't support type (#460)
gshtras Mar 5, 2025
27f6c7b
Building the base images for MI and Navi; Using aiter hotfix (#461)
gshtras Mar 5, 2025
ae056e1
init
SageMoore Mar 5, 2025
0feb91a
Building hipblaslt including the clients (#462)
gshtras Mar 5, 2025
f1dbffb
cleanup boolean logic
SageMoore Mar 6, 2025
8f9664d
comments
SageMoore Mar 7, 2025
3ee6551
Fixing the shape to use in padding calculation (#464)
gshtras Mar 7, 2025
9ef3d37
Merge remote-tracking branch 'upstream/main'
gshtras Mar 10, 2025
ff60bf3
Merge remote-tracking branch 'nm/sage/amd-deepseek' into upstream_mer…
gshtras Mar 10, 2025
1095cff
Merge pull request #471 from ROCm/upstream_merge_25_03_10
gshtras Mar 10, 2025
34dbe31
V1 rocm support (#469)
maleksan85 Mar 11, 2025
0f2300e
nightly_fixed_aiter_integration_final_20250305 README update (#470)
Mcirino1 Mar 11, 2025
16c8185
Updated README.md with config info and header font size (#473)
Mcirino1 Mar 12, 2025
1aec156
Bump aiter version (#476)
gshtras Mar 12, 2025
7be8c1f
Merge remote-tracking branch 'upstream/main'
gshtras Mar 12, 2025
d7657c2
Remove CMake FP8 conditioning
gshtras Mar 12, 2025
b758abb
Merge pull request #478 from ROCm/unified_fp8
gshtras Mar 12, 2025
2df0e9b
Rocm vllm ci fix (new design) (#475)
Alexei-V-Ivanov-AMD Mar 13, 2025
f9d626f
Add @hongxiayang (#481)
gshtras Mar 13, 2025
9d4368d
use current_platform.fp8_dtype() for FA (#483)
divakar-amd Mar 14, 2025
af40d33
Removing the padding again after it had been overwritten by upstream …
gshtras Mar 17, 2025
34ece77
Merge remote-tracking branch 'upstream/main'
gshtras Mar 21, 2025
8ef1ed7
fix process weights after loading for qkv-x linear
Isotr0py Mar 22, 2025
b0b2936
fix weight loader
Isotr0py Mar 22, 2025
e130220
fix unquantize
Isotr0py Mar 22, 2025
cb5d391
Remove duplicated line
gshtras Mar 24, 2025
f6bf144
Bring back fallback to eager mode removed in #14917, but for ROCm only
gshtras Mar 24, 2025
9bedee8
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Mar 24, 2025
763044d
Merge remote-tracking branch 'Isotr0py/fix-xqkv-quant' into upstream_…
gshtras Mar 24, 2025
514a0b8
Merge remote-tracking branch 'origin/mllama_rocm_eager' into upstream…
gshtras Mar 24, 2025
3f5c97a
Fixed attention kernel test
gshtras Mar 24, 2025
4a0afe5
Merge pull request #489 from ROCm/upstream_merge_2025_03_24
gshtras Mar 24, 2025
7210905
Avoid saving files to the current folder as it does not always have w…
gshtras Mar 25, 2025
a1c35e7
Using torch commit that supports running scaled_mm on Radeon (#492)
gshtras Mar 25, 2025
94c662f
Merge remote-tracking branch 'upstream/main'
gshtras Mar 26, 2025
970a0b4
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Mar 26, 2025
f782c66
add AITER MLA implementation in attention backend
vllmellm Mar 28, 2025
42d5c62
remove unused arguments in aiter mla decode fwd kernel
vllmellm Mar 28, 2025
f274194
Remove duplicate code in config.py (#494)
sstamenk Mar 28, 2025
812a41c
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Mar 28, 2025
82ade59
Merge remote-tracking branch 'origin/main' into upstream_merge_2025_0…
gshtras Mar 28, 2025
a4dba75
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Mar 28, 2025
51c9e6d
In light of the breaking cmake v4 release (#495)
gshtras Mar 28, 2025
25070a1
Docs_update_20250327 (#493)
arakowsk-amd Mar 28, 2025
565a3fd
add unittest for AITER MLA backend in attention selector
vllmellm Mar 29, 2025
e294861
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Mar 31, 2025
f264100
Merge branch 'main' into upstream_merge_2025_03_31
gshtras Mar 31, 2025
be404be
Merge pull request #497 from ROCm/upstream_merge_2025_03_31
gshtras Mar 31, 2025
bc05166
Triton MLA parameter tweak for AMD GPU
qli88 Mar 31, 2025
645f400
add unittest for MLA attention backend selector
vllmellm Apr 1, 2025
22c8726
code cleaning
vllmellm Apr 1, 2025
5dc1348
update AITER version
vllmellm Apr 1, 2025
12f8023
Merge remote-tracking branch 'origin/main' into aiter-mla-integration
vllmellm Apr 1, 2025
da8c69f
add ck flash attn in prefill mla computation
vllmellm Apr 2, 2025
1ea5718
further code cleaning
vllmellm Apr 2, 2025
681d777
Merge remote-tracking branch 'origin/main' into aiter-mla-integration
vllmellm Apr 2, 2025
0ad8064
Merge remote-tracking branch 'upstream/main'
gshtras Apr 2, 2025
9ada055
fix mypy typing errors
vllmellm Apr 3, 2025
1ceb3b9
Merge remote-tracking branch 'origin/main' into aiter-mla-integration
vllmellm Apr 3, 2025
20a3f07
fix mypy error on Iterable typing error
vllmellm Apr 3, 2025
7153046
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Apr 3, 2025
e3f03b7
Disable fp8_out_scale on V1
gshtras Apr 3, 2025
eaecf03
Merge remote-tracking branch 'embedded/aiter-mla-integration' into up…
gshtras Apr 3, 2025
c045f59
Merge pull request #499 from ROCm/upstream_merge_2025_04_02
gshtras Apr 3, 2025
b101125
Bump aiter version (#500)
gshtras Apr 3, 2025
6d258fa
Adding 2stage MoE support separately until it is added upstream (#501)
gshtras Apr 3, 2025
732455b
Fused FP8 conversion in attention for v1 (#502)
gshtras Apr 7, 2025
f657987
Merge remote-tracking branch 'upstream/main'
gshtras Apr 7, 2025
2b6e9c9
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Apr 7, 2025
d17d4df
Merge pull request #503 from ROCm/upstream_merge_2025_04_07
gshtras Apr 7, 2025
8826599
Fix fused moe (#506)
gshtras Apr 7, 2025
97b78bf
Update moe_tune_script.sh (#507)
divakar-amd Apr 8, 2025
f68829f
doubled size to wa issue and preserve CAR perf (#510)
maleksan85 Apr 10, 2025
b8498bc
re-enable custom paged attention for V0 (#511)
charlifu Apr 10, 2025
f4b308f
Add gfx950 to the attention archs
jpvillam-amd Apr 3, 2025
e201e58
Linter
jpvillam-amd Apr 10, 2025
c43debd
Updated README.md with April 10 results (#512)
Mcirino1 Apr 14, 2025
9025082
Update README.md (#514)
faisalgulfam32 Apr 16, 2025
1c0a1ae
update base image (#515)
charlifu Apr 17, 2025
44c9580
Merge remote-tracking branch 'upstream/main'
gshtras Apr 21, 2025
40f2157
Update test-template.j2 to enable building (#517)
Alexei-V-Ivanov-AMD Apr 21, 2025
60cd57b
Update test-template.j2 to fix new location of run-amd-test.sh (#518)
Alexei-V-Ivanov-AMD Apr 21, 2025
e26141f
Rocm 6.4 docker (#519)
gshtras Apr 22, 2025
8ad1c44
Update README.md (#521)
t-parry Apr 22, 2025
49e4719
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Apr 23, 2025
a9af7a9
Remove leftovers from 2stage
gshtras Apr 23, 2025
105e655
Re-add 2stage moe
gshtras Apr 23, 2025
ae144d6
custom all-reduce, gfx950
seungrokj Apr 24, 2025
cfda5b3
Merge remote-tracking branch 'origin/main' into upstream_merge_2025_0…
gshtras Apr 24, 2025
c5b41dc
Missing parameter for sdpa
gshtras Apr 24, 2025
c383e6c
Update README.md (#523)
t-parry Apr 24, 2025
cfc530a
Merge branch 'main' into upstream_merge_2025_04_21
gshtras Apr 24, 2025
c3f61dd
Merge pull request #522 from ROCm/upstream_merge_2025_04_21
gshtras Apr 24, 2025
8c211e5
Merge remote-tracking branch 'upstream/main'
gshtras Apr 25, 2025
a9e7a00
Fix API typo and remove FP8 on V1 restriction
gshtras Apr 25, 2025
28007b0
Upstream merge 2025 04 25 (#524)
gshtras Apr 25, 2025
8bd7ee1
Bump hiblaslt (#528)
gshtras Apr 28, 2025
328b04d
Merge branch 'main' into jpvillam/fa_gfx950
jpvillam-amd Apr 28, 2025
550b072
Update rocm.py
jpvillam-amd Apr 28, 2025
1fbb019
Restrict setuptools version (#529)
gshtras Apr 28, 2025
ad806ba
Linter
jpvillam-amd Apr 28, 2025
dc6c46b
lint
gshtras Apr 28, 2025
1f4e00c
Revert aiter commit (#530)
gshtras Apr 29, 2025
e8766c6
Merge remote-tracking branch 'upstream/main'
gshtras Apr 29, 2025
7a9f58a
Update README.md (#531)
t-parry Apr 30, 2025
41b85b6
Restrict ray version due to https://github.com/ray-project/ray/issues…
gshtras Apr 30, 2025
8e45f88
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Apr 30, 2025
285ac51
Merge remote-tracking branch 'upstream/main' into jpvillam/fa_gfx950
gshtras Apr 30, 2025
0bc1d7c
No vllm.vllm_flash_attn.layers.rotary on ROCm
gshtras Apr 30, 2025
134d285
Merge remote-tracking branch 'origin/rocm_fix' into upstream_merge_20…
gshtras Apr 30, 2025
2921150
Merge remote-tracking branch 'origin/jpvillam/fa_gfx950' into upstrea…
gshtras Apr 30, 2025
f3a5bf0
Restore the function that is used elsewhere
gshtras Apr 30, 2025
8334e54
Merge remote-tracking branch 'origin/jpvillam/fa_gfx950' into upstrea…
gshtras Apr 30, 2025
c1cb05e
Fix Quark API use
gshtras May 1, 2025
2c68ff9
Merge branch 'main' into upstream_merge_2025_04_29
gshtras May 2, 2025
29241ca
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras May 2, 2025
0b8eaec
Re-fix Quark API
gshtras May 2, 2025
f3f620a
Using the right torch API
gshtras May 2, 2025
2fea69f
Merge pull request #536 from ROCm/upstream_merge_2025_04_29
gshtras May 2, 2025
d283632
Merge remote-tracking branch 'upstream/main'
gshtras May 6, 2025
8e62073
Fix for the condition to accept empty encoder inputs for mllama
gshtras May 6, 2025
a0b4ef2
Cherry-pick skinny gemm fix
gshtras May 6, 2025
166d0ef
Merge pull request #538 from ROCm/upstream_merge_2025_05_06
gshtras May 6, 2025
b526478
Aiter mla cherrypick (#543)
gshtras May 9, 2025
c791a85
Cherry pick skinny gemms (#544)
gshtras May 9, 2025
59f1b15
add gfx950 support for skinny gemms
charlifu May 12, 2025
5b1895e
Merge branch 'main' into amd/gfx950_skinny_gemm
charlifu May 13, 2025
6f5df79
Merge remote-tracking branch 'upstream/main'
gshtras May 13, 2025
6b08324
Merge remote-tracking branch 'origin/main'
gshtras May 13, 2025
d9da93f
fix on_mi3xx
charlifu May 14, 2025
bb1f213
Merge remote-tracking branch 'upstream/main'
gshtras May 15, 2025
0c6ce45
Merge remote-tracking branch 'upstream/main'
gshtras May 15, 2025
222fa01
Remove gradlib
gshtras May 15, 2025
34483a3
Fix P3L Arg parser
gshtras May 15, 2025
c13eddf
pre-commit
gshtras May 15, 2025
1466c79
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras May 16, 2025
ccd96e8
Toggle for v1 attention
gshtras May 16, 2025
262ed1e
Merge pull request #547 from ROCm/upstream_merge_2025_05_15
gshtras May 16, 2025
d1d3ff9
Remove gradlib mention from pyproject (#549)
gshtras May 16, 2025
db892e7
Fix input layer norm mismatch for Eagle Speculative Decoding compatib…
mmkamani7 May 16, 2025
16d2b92
Updated README.md (#546)
Mcirino1 May 19, 2025
662127a
Merge remote-tracking branch 'upstream/main'
gshtras May 19, 2025
9b131ae
Caching the env variable in the __init__
gshtras May 19, 2025
e34fd18
Restrict FP8 attention output to non unified backend until the accura…
gshtras May 19, 2025
e94c760
Merge pull request #550 from ROCm/upstream_merge_2025_05_19
gshtras May 19, 2025
8a67a53
Reduce diff from upstream (#551)
gshtras May 20, 2025
e950b15
Fixing a bug from transformers==4.52. config.head_dim is now explicit…
gshtras May 20, 2025
258d2d3
Remove the option to compile cython during the docker build. It hasn'…
gshtras May 20, 2025
a31e5d8
Fixing pre-commit in github. Not sure why this issue does not affect …
gshtras May 20, 2025
16af49c
Merge remote-tracking branch 'upstream/main'
gshtras May 21, 2025
91a5600
Fused FP8 attention output is now only possible for both flash and pa…
gshtras May 21, 2025
7c1213e
Remove incorrect env value
gshtras May 21, 2025
1c450a5
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras May 27, 2025
d5e35a9
Merge remote-tracking branch 'origin/main' into upstream_merge_2025_0…
gshtras May 27, 2025
1900335
Upstream merge 2025 05 27 (#557)
gshtras May 27, 2025
307d8bc
Removing redundant parameters from the MIs side and fixing Navi build…
gshtras May 27, 2025
12447b9
Merge branch 'main' into amd/gfx950_skinny_gemm
charlifu May 28, 2025
630ed84
cache get_lds_size()
charlifu May 28, 2025
f4a992c
Removing RPD in favor of torch profiler for V1 (#558)
gshtras May 29, 2025
bee14ca
Merge remote-tracking branch 'upstream/main'
gshtras May 29, 2025
7bb0618
Added benchmark results and commit hash (#556)
Mcirino1 May 29, 2025
0286875
Merge branch 'main' into amd/gfx950_skinny_gemm
charlifu May 29, 2025
7bf92f9
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras May 29, 2025
421c498
Merge leftover
gshtras May 29, 2025
628db8d
Merge remote-tracking branch 'origin/amd/gfx950_skinny_gemm' into ups…
gshtras May 29, 2025
d92c04b
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Jun 2, 2025
9c22cdd
Remove redundant configs
gshtras Jun 2, 2025
9d4c238
Merge branch 'main' into upstream_merge_2025_06_02
gshtras Jun 2, 2025
3712649
Merge pull request #565 from ROCm/upstream_merge_2025_06_02
gshtras Jun 2, 2025
ab92741
Merge remote-tracking branch 'upstream/main'
gshtras Jun 3, 2025
aee731f
cleanup
gshtras Jun 3, 2025
8cde510
Merge pull request #566 from ROCm/upstream_merge_2025_06_03
gshtras Jun 3, 2025
8377189
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Jun 3, 2025
29bef2c
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Jun 4, 2025
71cbfe5
Fix attention fp8 output fusion for split attention path in v1 (#569)
gshtras Jun 5, 2025
ccfa3b8
Merge remote-tracking branch 'origin/main' into upstream_merge_2025_0…
gshtras Jun 5, 2025
bcbb7a6
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Jun 5, 2025
1a254d8
Merge pull request #570 from ROCm/upstream_merge_2025_06_05
gshtras Jun 5, 2025
cdfe72b
Rocm 6.4.1 as base (#571)
gshtras Jun 5, 2025
4908f2c
Merge remote-tracking branch 'upstream/main'
gshtras Jun 9, 2025
6ec2533
CAR check is done elsewhere, as in upstream
gshtras Jun 9, 2025
93d7a4d
Merge pull request #575 from ROCm/upstream_merge_2025_06_09
gshtras Jun 9, 2025
68af055
Updated README.md for June 10 release (#574)
Mcirino1 Jun 11, 2025
4b03baa
Merge remote-tracking branch 'upstream/main'
gshtras Jun 11, 2025
d3fc29f
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Jun 11, 2025
5b601bb
Cleanup
gshtras Jun 12, 2025
b376029
Merge remote-tracking branch 'upstream/main' into upstream_merge_2025…
gshtras Jun 12, 2025
0eb854c
New typos checker
gshtras Jun 12, 2025
9b43d47
Merge pull request #577 from ROCm/upstream_merge_2025_06_12
gshtras Jun 12, 2025
7296ad6
Update test-template.j2
okakarpa Jun 16, 2025
274d64a
Update test-template.j2
okakarpa Jun 16, 2025
8 changes: 4 additions & 4 deletions .buildkite/scripts/hardware_ci/run-amd-test.sh
@@ -10,7 +10,7 @@ export PYTHONPATH=".."
echo "--- Confirming Clean Initial State"
while true; do
sleep 3
if grep -q clean /opt/amdgpu/etc/gpu_state; then
if grep -q clean ${BUILDKITE_AGENT_META_DATA_RESET_TARGET}; then
echo "GPUs state is \"clean\""
break
fi
@@ -49,18 +49,18 @@ cleanup_docker

echo "--- Resetting GPUs"

echo "reset" > /opt/amdgpu/etc/gpu_state
echo "reset" > ${BUILDKITE_AGENT_META_DATA_RESET_TARGET}

while true; do
sleep 3
if grep -q clean /opt/amdgpu/etc/gpu_state; then
if grep -q clean ${BUILDKITE_AGENT_META_DATA_RESET_TARGET}; then
echo "GPUs state is \"clean\""
break
fi
done

echo "--- Pulling container"
image_name="rocm/vllm-ci:${BUILDKITE_COMMIT}"
image_name="rocm/vllm-ci-private:${BUILDKITE_COMMIT}"
container_name="rocm_${BUILDKITE_COMMIT}_$(tr -dc A-Za-z0-9 < /dev/urandom | head -c 10; echo)"
docker pull "${image_name}"
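
Review note: the hunks above switch the state file from a hard-coded path to `${BUILDKITE_AGENT_META_DATA_RESET_TARGET}`, which is used unquoted and polled without an upper bound. The following is a minimal sketch, not code from this PR, of how the same wait could be wrapped with quoting and a timeout. The function name `wait_for_clean`, its arguments, and its default values are hypothetical. It also assumes the variable is populated per agent (presumably from an agent tag), which the diff does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: poll a state file until it contains "clean".
# Arguments: state file path, timeout in seconds (default 300), poll interval (default 3).
wait_for_clean() {
  local state_file="$1"
  local timeout="${2:-300}"
  local interval="${3:-3}"
  local waited=0

  # An empty path would otherwise make grep read stdin and block forever.
  if [ -z "${state_file}" ]; then
    echo "state file path is empty" >&2
    return 2
  fi

  # Quote the path so spaces or glob characters cannot change how grep parses it.
  until grep -q clean "${state_file}" 2>/dev/null; do
    if [ "${waited}" -ge "${timeout}" ]; then
      echo "timed out after ${waited}s waiting for ${state_file}" >&2
      return 1
    fi
    sleep "${interval}"
    waited=$((waited + interval))
  done
  echo "GPUs state is \"clean\""
}
```

Under these assumptions, both loops in the script could be replaced by `wait_for_clean "${BUILDKITE_AGENT_META_DATA_RESET_TARGET}"`. Unlike the original loops, this sketch checks the file before the first sleep and returns a non-zero status instead of spinning indefinitely.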

58 changes: 58 additions & 0 deletions .buildkite/test-template.j2
@@ -0,0 +1,58 @@
{% set docker_image = "public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:$BUILDKITE_COMMIT" %}
{% set docker_image_amd = "rocm/vllm-ci-private:$BUILDKITE_COMMIT" %}
{% set default_working_dir = "vllm/tests" %}
{% set hf_home = "/root/.cache/huggingface" %}

steps:
- label: ":docker: build image"
depends_on: ~
commands:
- "docker build --build-arg max_jobs=16 --tag {{ docker_image_amd }} -f docker/Dockerfile.rocm --build-arg ARG_PYTORCH_ROCM_ARCH='gfx90a;gfx942' --target test --progress plain ."
- "docker push {{ docker_image_amd }}"
key: "amd-build"
env:
DOCKER_BUILDKIT: "1"
retry:
automatic:
- exit_status: -1 # Agent was lost
limit: 5
- exit_status: -10 # Agent was lost
limit: 5
agents:
queue: amd-cpu
soft_fail: false

{% for step in steps %}
{% if step.mirror_hardwares and mirror_hw in step.mirror_hardwares %}
- label: "AMD MI300: {{ step.label }}"
depends_on: amd-build
agents:
{% if step.label and step.label=="Benchmarks" or step.label=="Kernels Attention Test %N" or step.label=="Kernels Quantization Test %N" %}
queue: amd_mi300_8
{% elif step.label=="Distributed Tests (4 GPUs)" or step.label=="2 Node Tests (4 GPUs in total)" or step.label=="Multi-step Tests (4 GPUs)" or step.label=="Pipeline Parallelism Test" or step.label=="LoRA TP Test (Distributed)" %}
queue: amd_mi300_4
{% elif step.label=="Distributed Comm Ops Test" or step.label=="Distributed Tests (2 GPUs)" or step.label=="Plugin Tests (2 GPUs)" or step.label=="Weight Loading Multiple GPU Test" or step.label=="Weight Loading Multiple GPU Test - Large Models" %}
queue: amd_mi300_2
{% else %}
queue: amd_mi300_1
{% endif%}
command: bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd {{ (step.working_dir or default_working_dir) | safe }} ; {{ step.command or (step.commands | join(" && ")) | safe }}"
env:
DOCKER_BUILDKIT: "1"
priority: 100
soft_fail: true
{% endif %}
{% endfor %}
{% for step in steps %}
{% if step.mirror_hardwares and mirror_hw in step.mirror_hardwares and (step.label and step.label=="Benchmarks" or step.label=="LoRA Test %N" or step.label=="Kernels Attention Test %N" or step.label=="Kernels Quantization Test %N" or step.label=="Distributed Tests (4 GPUs)" or step.label=="Distributed Comm Ops Test" or step.label=="2 Node Tests (4 GPUs in total)" or step.label=="Distributed Tests (2 GPUs)" or step.label=="Plugin Tests (2 GPUs)" or step.label=="Multi-step Tests (4 GPUs)" or step.label=="Pipeline Parallelism Test" or step.label=="LoRA TP Test (Distributed)" or step.label=="Weight Loading Multiple GPU Test" or step.label=="Weight Loading Multiple GPU Test - Large Models") %}
- label: "AMD MI250: {{ step.label }}"
depends_on: amd-build
agents:
queue: amd_mi250_8
command: bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd {{ (step.working_dir or default_working_dir) | safe }} ; {{ step.command or (step.commands | join(" && ")) | safe }}"
env:
DOCKER_BUILDKIT: "1"
priority: 100
soft_fail: true
{% endif %}
{% endfor %}
52 changes: 8 additions & 44 deletions .github/CODEOWNERS
@@ -1,50 +1,14 @@
# See https://help.github.com/articles/about-codeowners/
# for more info about CODEOWNERS file

# This lists cover the "core" components of vLLM that require careful review
/vllm/attention/backends/abstract.py @WoosukKwon @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/core @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/engine/llm_engine.py @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/executor/executor_base.py @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/worker/worker_base.py @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/worker/worker.py @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/model_executor/layers/sampler.py @zhuohan123 @youkaichao @alexm-redhat @comaniac @njhill
/vllm/model_executor/layers/quantization @mgoin @robertgshaw2-redhat @tlrmchlsmth
/vllm/model_executor/guided_decoding @mgoin @russellb @aarnphm
/vllm/multimodal @DarkLight1337 @ywang96
/vllm/vllm_flash_attn @LucasWilkinson
/vllm/lora @jeejeelee
/vllm/reasoning @aarnphm
/vllm/entrypoints @aarnphm
CMakeLists.txt @tlrmchlsmth
* @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang

# vLLM V1
/vllm/v1 @WoosukKwon @robertgshaw2-redhat @njhill @ywang96 @comaniac @alexm-redhat
/vllm/v1/structured_output @mgoin @russellb @aarnphm
/csrc/ @charlifu @mawong-amd @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang
/vllm/ @charlifu @mawong-amd @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang

# Test ownership
/.buildkite/lm-eval-harness @mgoin @simon-mo
/tests/async_engine @njhill @robertgshaw2-redhat @simon-mo
/tests/basic_correctness/test_chunked_prefill @rkooo567 @comaniac
/tests/distributed/test_multi_node_assignment.py @youkaichao
/tests/distributed/test_pipeline_parallel.py @youkaichao
/tests/distributed/test_same_node.py @youkaichao
/tests/entrypoints @DarkLight1337 @robertgshaw2-redhat @simon-mo @aarnphm
/tests/entrypoints/llm/test_guided_generate.py @mgoin @russellb @aarnphm
/tests/kernels @tlrmchlsmth @WoosukKwon
/tests/model_executor/test_guided_processors.py @mgoin @russellb
/tests/models @DarkLight1337 @ywang96
/tests/multi_step @alexm-redhat @comaniac
/tests/multimodal @DarkLight1337 @ywang96
/tests/prefix_caching @comaniac @KuntaiDu
/tests/quantization @mgoin @robertgshaw2-redhat
/tests/spec_decode @njhill @LiuXiaoxuanPKU
/tests/test_inputs.py @DarkLight1337 @ywang96
/tests/v1/entrypoints/llm/test_struct_output_generate.py @mgoin @russellb @aarnphm
/tests/v1/structured_output @mgoin @russellb @aarnphm
/tests/weight_loading @mgoin @youkaichao
/tests/lora @jeejeelee
fused_moe @divakar-amd @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang

# Docs
/docs @hmellor
mkdocs.yaml @hmellor
/tests/ @Alexei-V-Ivanov-AMD @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang
/.buildkite/ @Alexei-V-Ivanov-AMD @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang

/benchmarks/profiling @AdrianAbeyta @dllehr-amd @shajrawi @gshtras @maleksan85 @sunway513 @hongxiayang
19 changes: 2 additions & 17 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,18 +1,3 @@
## Essential Elements of an Effective PR Description Checklist
- [ ] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [ ] The test plan, such as providing test command.
- [ ] The test results, such as pasting the results comparison before and after, or e2e results
- [ ] (Optional) The necessary documentation update, such as updating `supported_models.md` and `examples` for a new model.
Please direct your PRs to the upstream vllm (https://github.com/vllm-project/vllm.git)

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS ABOVE HAVE BEEN CONSIDERED.

## Purpose

## Test Plan

## Test Result

## (Optional) Documentation Update

<!--- pyml disable-next-line no-emphasis-as-heading -->
**BEFORE SUBMITTING, PLEASE READ <https://docs.vllm.ai/en/latest/contributing>** (anything written below this line will be removed by GitHub Actions)
Accepting PRs into the ROCm fork (https://github.com/ROCm/vllm) will require a clear previously communicated exception
85 changes: 0 additions & 85 deletions .github/workflows/lint-and-deploy.yaml

This file was deleted.

105 changes: 38 additions & 67 deletions .github/workflows/publish.yml
@@ -16,7 +16,9 @@ jobs:
release:
# Retrieve tag and create release
name: Create Release
runs-on: ubuntu-latest
runs-on: self-hosted
container:
image: rocm/pytorch:rocm6.2_ubuntu20.04_py3.9_pytorch_release_2.3.0
outputs:
upload_url: ${{ steps.create_release.outputs.upload_url }}
steps:
@@ -39,73 +41,42 @@ jobs:
const script = require('.github/workflows/scripts/create_release.js')
await script(github, context, core)

# NOTE(simon): No longer build wheel using GitHub Actions. See buildkite's release workflow.
# wheel:
# name: Build Wheel
# runs-on: ${{ matrix.os }}
# needs: release
wheel:
name: Build Wheel
runs-on: self-hosted
container:
image: rocm/pytorch:rocm6.2_ubuntu20.04_py3.9_pytorch_release_2.3.0
needs: release

# strategy:
# fail-fast: false
# matrix:
# os: ['ubuntu-20.04']
# python-version: ['3.9', '3.10', '3.11', '3.12']
# pytorch-version: ['2.4.0'] # Must be the most recent version that meets requirements/cuda.txt.
# cuda-version: ['11.8', '12.1']
strategy:
fail-fast: false

# steps:
# - name: Checkout
# uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

# - name: Setup ccache
# uses: hendrikmuhs/ccache-action@ed74d11c0b343532753ecead8a951bb09bb34bc9 # v1.2.14
# with:
# create-symlink: true
# key: ${{ github.job }}-${{ matrix.python-version }}-${{ matrix.cuda-version }}

# - name: Set up Linux Env
# if: ${{ runner.os == 'Linux' }}
# run: |
# bash -x .github/workflows/scripts/env.sh

# - name: Set up Python
# uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
# with:
# python-version: ${{ matrix.python-version }}

# - name: Install CUDA ${{ matrix.cuda-version }}
# run: |
# bash -x .github/workflows/scripts/cuda-install.sh ${{ matrix.cuda-version }} ${{ matrix.os }}

# - name: Install PyTorch ${{ matrix.pytorch-version }} with CUDA ${{ matrix.cuda-version }}
# run: |
# bash -x .github/workflows/scripts/pytorch-install.sh ${{ matrix.python-version }} ${{ matrix.pytorch-version }} ${{ matrix.cuda-version }}

# - name: Build wheel
# shell: bash
# env:
# CMAKE_BUILD_TYPE: Release # do not compile with debug symbol to reduce wheel size
# run: |
# bash -x .github/workflows/scripts/build.sh ${{ matrix.python-version }} ${{ matrix.cuda-version }}
# wheel_name=$(find dist -name "*whl" -print0 | xargs -0 -n 1 basename)
# asset_name=${wheel_name//"linux"/"manylinux1"}
# echo "wheel_name=${wheel_name}" >> "$GITHUB_ENV"
# echo "asset_name=${asset_name}" >> "$GITHUB_ENV"
steps:
- name: Prepare
run: |
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
pip3 install -U triton

# - name: Upload Release Asset
# uses: actions/upload-release-asset@e8f9f06c4b078e705bd2ea027f0926603fc9b4d5 # v1.0.2
# env:
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# with:
# upload_url: ${{ needs.release.outputs.upload_url }}
# asset_path: ./dist/${{ env.wheel_name }}
# asset_name: ${{ env.asset_name }}
# asset_content_type: application/*
- name: Checkout
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

# (Danielkinz): This last step will publish the .whl to pypi. Warning: untested
# - name: Publish package
# uses: pypa/gh-action-pypi-publish@release/v1.8
# with:
# repository-url: https://test.pypi.org/legacy/
# password: ${{ secrets.PYPI_API_TOKEN }}
# skip-existing: true
- name: Build wheel
shell: bash
env:
CMAKE_BUILD_TYPE: Release # do not compile with debug symbol to reduce wheel size
run: |
bash -x .github/workflows/scripts/build.sh
wheel_name=$(find dist -name "*whl" -print0 | xargs -0 -n 1 basename)
asset_name=${wheel_name//"linux"/"manylinux1"}
echo "wheel_name=${wheel_name}" >> "$GITHUB_ENV"
echo "asset_name=${asset_name}" >> "$GITHUB_ENV"

- name: Upload vllm Release Asset
uses: actions/upload-release-asset@e8f9f06c4b078e705bd2ea027f0926603fc9b4d5 # v1.0.2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.release.outputs.upload_url }}
asset_path: ./dist/${{ env.wheel_name }}
asset_name: ${{ env.asset_name }}
asset_content_type: application/*