Merged
1314 commits
a0be71e
[MM] Enable FlashInfer metadata support for Qwen2.5-VL vision attenti…
huanghua1994 May 23, 2026
7c2ff1f
[Docs] Fix stale version number in token_embed.md (#43488)
fuergaosi233 May 23, 2026
8737e4a
[Docs] Fix stale version number in token_classify.md (#43489)
fuergaosi233 May 23, 2026
4438b6e
[MoE] Migrate W4A8 CT to oracle kernel setup (#42680)
bedeks May 23, 2026
819c610
[Mooncake] Add metrics for MooncakeStoreConnector operations (#43392)
Dao007forever May 23, 2026
46f95b2
[ROCm][Critical] Fix the GDN import bug (#43486)
tjtanaa May 23, 2026
10d264a
Revert "[Misc] add humming to dependencies" (#43492)
mgoin May 23, 2026
b32fe41
[Bugfix] Fix reasoning dropped on streaming boundary deltas (#42691)
sfeng33 May 23, 2026
33d7cbe
[Model Runner v2] Force v1 runner for tests (#43233)
yewentao256 May 23, 2026
0902d8e
[KV Connector] Keep MooncakeStore full hits block-aligned (#43494)
Dao007forever May 24, 2026
357fddf
[kv_offload]: Add DSv4 support (#43142)
orozery May 24, 2026
5940590
[ROCm][CI] Stabilize 400 error return code for invalid schema inputs …
AndreasKaratzas May 24, 2026
1806d1a
[ROCm] [DSv4] [Perf] Support DeepSeek v4 MTP (#43385)
tjtanaa May 24, 2026
d56285c
Tuning script and configs for Triton Mamba SSU kernel (#43083)
danisereb May 24, 2026
d0a100c
File system secondary tier implemented in python (#41735)
rshavitt May 24, 2026
b06813e
[Kernel] Add mhc_pre_big_fuse_with_norm_tilelang (#43474)
jeejeelee May 25, 2026
6cbe448
fix: MoE model using shared routed experts crashes on AMD GPUs (#42373)
weizhoublue May 25, 2026
1b26fa3
[Docs] Reorganize offline inference docs. (#43552)
noooop May 25, 2026
3df1c7c
[Docker] Non-root support for vllm-openai; add opt-in vllm-openai-non…
TheDuyIT May 25, 2026
81252d4
[Feat][KVConnector] Support DSV4 in SimpleCPUOffloadBackend (#42296)
ivanium May 25, 2026
0c942c6
[Doc] Add section on escalating stalled contributions (#43568)
esmeetu May 25, 2026
5c1aec3
Reduce memory usage for granite_speech. (#42933)
Yihuki May 25, 2026
873758c
[KV Connector] Handle Mooncake finish after preemption (#43281)
zhewenl May 25, 2026
716d529
[Misc] Print accuracy value for PD tests even on success (#43583)
NickLucche May 25, 2026
d400445
[Kernel] Remove NormGateLinear (#43554)
jeejeelee May 25, 2026
71d810b
[XPU] Ensure RNG offset alignment with PyTorch requirements in XPU sa…
chaojun-zhang May 26, 2026
ec5de7f
[LoRA] Add one shot triton kernel For MoE LoRA (#42290)
jeejeelee May 26, 2026
aa2b56f
[DeepSeek V4] Move MegaMoE input prep kernel to nvidia/ops (#43632)
WoosukKwon May 26, 2026
7966fc7
[KV Connector][Bugfix] MooncakeStore: don't double-apply Eagle prune …
Dao007forever May 26, 2026
c2a4005
[KV Connector] Propagate MooncakeStore load failures (#42788)
Dao007forever May 26, 2026
f815c99
[Bugfix] fix device mismatch in MiniCPM-o-4_5 resampler (#43194)
yma11 May 26, 2026
d5cf7b4
[Frontend] Split the offline inference APIs and utils. (#43553)
noooop May 26, 2026
6f95598
[Bugfix][Model] Fix GPT2ForSequenceClassification sub-module prefix (…
QingZhou-YangHY May 26, 2026
d56612c
[GDN] GDN Prefill kernel for SM100 (#43273)
gau-nernst May 26, 2026
771e1e4
[CPU] Enable non-divisible GQA for decode workitems in mixed batches …
zhejiangxiaomai May 26, 2026
e6adbd7
Upgrade tpu-inference to v0.20.0 (#43394)
CienetStingLin May 26, 2026
a37e471
Add CuTe DSL sparse compressor support (#43584)
Jie-Fang May 26, 2026
b326945
[chores][log] change registry log from `warning` to `debug` (#43045)
ILikeIneine May 26, 2026
97e4022
[Bugfix] Apply fc_norm in Eagle3DeepseekV2 combine_hidden_states (#43…
yubofredwang May 26, 2026
755043c
[KV Transfer] Enable HMA by default for connectors that support it (#…
chfeng-cs May 26, 2026
681d7dd
[Misc][Refactor][ROCm] Convert MoRI-related envvars to extra config a…
simondanielsson May 26, 2026
5d09f47
[Misc] Support interleaved custom image benchmark datasets (#43636)
ThibaultCastells May 26, 2026
739af5c
[Reasoning] [Bugfix] Reject invalid thinking_token_budget values (#43…
linzm1007 May 26, 2026
ebd0692
[Model] Use AutoWeightsLoader for InternLM2 (#38278)
javierdejesusda May 26, 2026
861b977
[XPU] Fix fused MoE LoRA kernel crash on XPU by using platform-agnos …
chaojun-zhang May 26, 2026
a970fb5
Fix CuPy runtime deps and restore humming (#43530)
mmangkad May 26, 2026
d565357
[Docs][ROCm] MoRI-IO Connector Usage Guide (#43603)
simondanielsson May 26, 2026
445ded1
[ROCm][CI] Extend ROCm quick reduce coverage (#40990)
AndreasKaratzas May 26, 2026
6ab6ffb
[Feat][DSV4] Fuse q pad into deepseek v4 fused kernel (#43162)
zyongye May 26, 2026
b226dda
[MoE Refactor] Migrate ModelOptMxFp8FusedMoE to oracle (#42768)
bnellnm May 26, 2026
f51bbc6
[MoE Refactor] W4a8 int8 oracle (#42789)
bnellnm May 26, 2026
c8414a8
[ROCm] Remove MegaMoE integration in deepseek v4 (#43629)
WoosukKwon May 26, 2026
6f5b533
Add LM head quantization support for ModelOpt (#42124)
meenchen May 26, 2026
3aea37d
[Doc] Add line limit to AGENTS.md (#43635)
WoosukKwon May 26, 2026
193ce88
[DSv4] Drop _get_compressed_kv_buffer in DeepseekCompressor (#43690)
WoosukKwon May 26, 2026
49b4882
[CI] Soft-fail AMD entrypoints mirror tests (#43709)
khluu May 26, 2026
6e50386
[Kernel] Porting fuse_minimax_qk_norm to manual fusion (#43410)
jeejeelee May 26, 2026
d98cbf4
[KV Connector] MooncakeStore: drop dead discard_partial_chunks parame…
zhewenl May 26, 2026
812e7e7
[Bugfix][V1] Fix TOCTOU race causing intermittent `EADDRINUSE` on mul…
vadiklyutiy May 26, 2026
e19b9b1
[ci] Add arm64 ci image (#41303)
khluu May 26, 2026
dede691
[Bugfix] Split attention groups by num_heads_q for spec-decode drafts…
lucianommartins May 27, 2026
0b68f21
[Rust Frontend] Add reasoning/tool parser & renderer roundtrip tests …
BugenZhao May 27, 2026
5bdb181
[ROCm][CI] Fix ROCm multimodal Qwen2.5-VL activation compile and Phi4…
AndreasKaratzas May 27, 2026
d8eebe6
[Perf] Optimize Fp8BlockScaledMMLinearKernel input_scale tensor using…
xyang16 May 27, 2026
7e33081
[Attention] Make FlexAttention and FlashAttention use num-blocks firs…
LucasWilkinson May 27, 2026
aa61381
[MLA][Attention] Add OOT MLA prefill backend registration mechanism (…
MatthewBonanni May 27, 2026
c02c758
[Deprecation] Deprecate functions as scheduled for v0.21.0 (#43358)
yewentao256 May 27, 2026
adaa5e4
[DSv4] Refactor compressor & Fix ROCm compatibility (#43710)
WoosukKwon May 27, 2026
0fa3114
Fix test_aot_compile for torch 2.12 (#43695)
angelayi May 27, 2026
1fc2cee
[KVConnector][Mooncake] Wire reset_cache cascade end-to-end (#42694)
aoshen02 May 27, 2026
7b54690
[ROCm][Perf] Expose AITER MoE sorting dispatch policy via env var (#3…
nholmber May 27, 2026
8c94938
[MRV2][BugFix] Fix KV connector handling in spec decode case (#43719)
njhill May 27, 2026
683033d
[Frontend] Add MiniCPM5 XML tool call parser (#43175)
zhangtao2-1 May 27, 2026
de12f5c
[ROCm][GPT-OSS] Avoid repeated compile-time `cos_sin_cache.to(bf16)` …
akii96 May 27, 2026
ad464e1
[Doc] Add Ascend NPU tab to the quickstart installation guide (#43550)
adityasingh2400 May 27, 2026
396c8fe
[Rust Frontend] Align tool parser fallback behavior between streaming…
BugenZhao May 27, 2026
158289e
[Docs] Fix MLA prefill backend default docs (#43697)
mmangkad May 27, 2026
2272062
[Kernel] Enable TritonW4A16LinearKernel as CUDA fallback for non-Marl…
lucianommartins May 27, 2026
52a31cc
[Bugfix] Map reasoning_effort to enable_thinking in chat template kwa…
ashwing May 27, 2026
03d9cc2
[misc] Bump cutedsl version to 4.5.2 (#43745)
zyongye May 27, 2026
1654609
[BugFix] HFValidationError with cloud storage URIs when HF_HUB_OFFLIN…
sts07142 May 27, 2026
49a3510
[Docs] Fix the duplicate doc icon issue (#43546)
chunyang-wen May 27, 2026
41688e2
Fix early CUDA init (#43791)
hmellor May 27, 2026
05c50c7
[ROCm] mori: add InterNodeV1LL inter-node kernel selection via VLLM_M…
jatseng-ai May 27, 2026
284e6f5
[8/n] Migrate merge_attn_states, mamba, sampler to torch stable ABI (…
cleonard530 May 27, 2026
206b72c
[Quantization] Fix Humming RoutedExperts import (#43540)
fallintoplace May 27, 2026
2616f67
Remove Transformers forward/backward compatibility tests (#43785)
hmellor May 27, 2026
2c2c966
Validate against some config fields being set to 0 (#43794)
hmellor May 27, 2026
7fb9c01
[Bugfix][DFlash]allocate the proper number of lookahead slots (#43733)
benchislett May 27, 2026
5963c19
Fix Qwen3-VL and Qwen3-omni-thinker accuracy degradation from deepsta…
andakai May 27, 2026
094124a
Add @AndreasKaratzas to CODEOWNERS (#43740)
AndreasKaratzas May 27, 2026
381edde
[Bugfix][Kernel] TRTLLM NVFP4 MoE chunking (#43599)
amitz-nv May 28, 2026
1223732
[ModelRunnerV2][Hybrid model] Support kernel block size in hybrid mod…
MengqingCao May 28, 2026
c87f62c
[Rust Frontend] Introduce mock engine for benchmark baseline (#43469)
BugenZhao May 28, 2026
05eec71
Fix RunAI streamer tensor buffer reuse during weight loading (#43464)
bbartels May 28, 2026
2d2c660
[MoE] Remove inplace fused experts mechanism (#43727)
zyongye May 28, 2026
413ac5c
[Misc][Rocm] Remove redundant `AiterUnifiedAttentionBackend` block si…
NickLucche May 28, 2026
33e94fc
[ROCm][CI] Stabilize Cargo cache and pre-test image checks (#43815)
AndreasKaratzas May 28, 2026
05ac829
fix: parse Qwen3 XML JSON arguments first (#43243)
he-yufeng May 28, 2026
e54eff7
[Bugfix] Pass `routed_scaling_factor` to FlashInfer TRTLLM BF16 MoE (…
gau-nernst May 28, 2026
626fa9b
[BugFix] Fix blocked reasoning parsing with MRV2 (#43808)
njhill May 28, 2026
7909f82
[Bugfix][Frontend] streaming tool-call serializer drops first args ch…
ignaciosica May 28, 2026
e1814f8
minor docs: fix incorrect example path (#43830)
JINO-ROHIT May 28, 2026
0ba46d4
[ROCm][DSV4] Enable Tilelang MHC replacing torch/triton mhc (#43679)
tjtanaa May 28, 2026
1b16f2d
change name of fs_python secondary tier to fs. (#43600)
rshavitt May 28, 2026
d6b48f9
[BugFix] Fix hard-coded timeout for multi-API-server startup (#43768)
vadiklyutiy May 28, 2026
6cc8577
[Kernel] Marlin MoE: include SM 12.x in default arch list (#40923)
tonyliu312 May 28, 2026
a04afd7
[DSV4] Remove AMD/XPU path in deepseek_v4/nvidia (#43829)
WoosukKwon May 28, 2026
2a78175
Restore `Literal` for `WeightTransferConfig.backend` (#43183)
hmellor May 28, 2026
b372ad3
[Bugfix] Stream DeepSeek DSML tool-call argument deltas incrementally…
QwertyJack May 28, 2026
a9bc0ad
[ROCm][CI] Move workload from MI300 to MI325 (#43824)
AndreasKaratzas May 28, 2026
bfb9ebc
[Feature] Add support for timed trace replay in `vllm bench serve` to…
animeshtrivedi May 28, 2026
f2caefe
[UX] Increase DP Coordinator startup timeout from 30s to 120s (#42343)
wzhao18 May 28, 2026
4ec2817
[Model][Bugfix] Rename weight_mapper to hf_to_vllm_mapper in LlamaNem…
jzakrzew May 28, 2026
a583c84
[Bugfix][ROCm] Fix Accuracy Drop in Sparse Indexer on gfx950 (#43781)
kliuae May 28, 2026
61288b5
[Bugfix] Fix HyperCLOVAX CI failure after upstream removed remote cod…
khluu May 28, 2026
8e0580f
[CI] Auto-apply `rust` label to relevant PRs (#43866)
BugenZhao May 28, 2026
d692b89
[Feature] Add structured output and effort support to Anthropic Messa…
chaunceyjiang May 28, 2026
c1c4db8
Log dummy DP step in iteration details (#41406)
vadiklyutiy May 28, 2026
811d805
[EC Connector] Add shutdown API to EC Connector. (#42423)
omerpaz95 May 28, 2026
19af4e6
Fix `OlmoHybridForCausalLM` not initialising (#43846)
hmellor May 28, 2026
02606b0
[BUGFIX] Multimodal benchmark with MistralTokenizer (#42965)
juliendenize May 28, 2026
64e1218
[Perf] Optimize moe permute by pre-allocate buffer, 9~14% kernel perf…
yewentao256 May 28, 2026
f3b2a81
[Perf][KDA] Fuse gate softplus, chunk-local cumsum, and RCP_LN2 scali…
zexplorerhj May 28, 2026
864990e
Add token-offset based selective offload in OffloadConnector (#39983)
ruocco May 28, 2026
9957e4d
[Model Refactoring] Remove torch compile dependency in DSv4 (#43746)
WoosukKwon May 28, 2026
552eb81
[Bugfix][ROCm] Resolve MoRI connector hangs at high concurrency (#40344)
simondanielsson May 28, 2026
20d69d1
[CPU] Migrate cpu_awq into awq_marlin (#43841)
bigPYJ1151 May 28, 2026
3a28223
[Rust Frontend] Add `hy_v3` tool parser (#43872)
BugenZhao May 28, 2026
61a1e30
[Rust Frontend] Reduce Gemma4 tool parser args scan complexity (#43850)
BugenZhao May 28, 2026
577d693
[rust] fix: aggregate `is_sleeping` and `reset_prefix_cache` across D…
willamhou May 28, 2026
be4062f
[Bug] Fix `tests/distributed/test_elastic_ep.py - assert False` (#43…
yewentao256 May 28, 2026
c08ebeb
[Perf] Add do_not_specialize to Mamba SSD chunk kernels (#43803)
Majid-Taheri May 28, 2026
5d126dd
[Bugfix] Exclude Ray DP from #42585's deferred port allocation (#43864)
vadiklyutiy May 28, 2026
4bfa0f2
[KV Offload] Rename `SecondaryTierManager.get_finished()` to `get_fin…
ronensc May 28, 2026
a9ec46d
[ROCm][Perf] Support N=5 in wvSplitK skinny GEMM kernels for speculat…
mgehre-amd May 28, 2026
3207e76
[XPU][MoE] Add WNA16 oracle backend for GPTQ sym-int4 (xpu_fused_moe)…
jasonboukheir May 28, 2026
1b5437c
[ROCm] Bump ROCm to 7.2.3 (#43136)
micah-wil May 28, 2026
9aa131f
Add Cosmos3 Reasoner model (#43356)
MaciejBalaNV May 28, 2026
0990247
[Rust Frontend] Optimize multimodal prompt expansion (#43670)
ricky-chaoju May 28, 2026
53a2088
Allow native KV cache dtype in Triton cache update (#43330)
mikekg May 28, 2026
5b115bb
[Attention][AMD] Standardize kv layout to blocks first for AMD (#43660)
NickLucche May 28, 2026
ed7fe83
[ROCm] Enable the aiter top-k/top-p sampler by default (#43331)
JohnQinAMD May 28, 2026
9006204
[MM][CG] Avoid over-padding Qwen2.5-VL encoder cudagraph window metad…
huanghua1994 May 28, 2026
085ac22
Deprecate `JAISLMHeadModel` (#43784)
hmellor May 28, 2026
9090368
[Feat] Add support for per GPU worker RDMA NIC selection (#42083)
rajkiranjoshi May 28, 2026
7e53283
[Core] Cleanup KVConnector handling with PP + fix MRV2 (#43732)
njhill May 28, 2026
a3ed5ab
[KV Offload] Add per-request offloading policy via `on_new_request` l…
ronensc May 28, 2026
69b8956
[Model Refactoring] Remove unncessary torch op registration for DSv4 …
WoosukKwon May 28, 2026
9202ea6
[Spec Decode] Allow causal DFlash (#43445)
benchislett May 28, 2026
03f03f9
Refactor output filename handling in ci-fetch-log.sh (#43901)
mgoin May 28, 2026
9769e2d
[AMD][CI][BugFix] Fix Distributed Compile Unit Tests (2xH100-2xMI300…
rasmith May 28, 2026
69c9f19
fix(frontend): Add multimodal placeholders to Gemma4 tool message tem…
harshaljanjani May 28, 2026
325a1ec
[CI] Enable prefix caching in BFCL benchmark (#43925)
yzong-rh May 28, 2026
b690b2b
[Model]Support Step-3.7-Flash (#43859)
ltd0924 May 29, 2026
1521173
[Rust Frontend] Add `/version` endpoint using engine-reported value (…
BugenZhao May 29, 2026
bf18d7e
[Misc][NUMA] Auto-bind to PCT priority cores on DGX B300 + widen Engi…
vadiklyutiy May 29, 2026
7bd45da
[DSv4] Move mHC tilelang kernels & Don't use CustomOP in dsv4/nvidia …
WoosukKwon May 29, 2026
212deff
[feat] add GlmgaProcessor specific logits in `glm4_1v.py` (#43575)
JaredforReal May 29, 2026
dfe8ba7
Adjust design around encoder_cudagraph_forward (#42288)
wdhongtw May 29, 2026
9636709
[XPU] add scale transpose to prepare_fp8_moe_layer_for_xpu and bump u…
mayuyuace May 29, 2026
d63108f
[kv_offload] Skip decode-phase blocks in CPU offload (#43797)
Etelis May 29, 2026
710f077
[Refactor] Remove dead code (#43234)
yewentao256 May 29, 2026
22a5864
[9/n] Migrate attention and cache kernels to torch stable ABI (contin…
cleonard530 May 29, 2026
648c3eb
[CI] Separate non-root smoke tests from image build step (#43712)
khluu May 29, 2026
04516ea
[XPU] add gelu_tanh to xpu moe backend supported activations (#42822)
yintong-lu May 29, 2026
94d3f4d
[CPU Backend] CPU top-k and top-p sampling kernels using Triton (#43633)
tianmu-li May 29, 2026
ab7521d
[ROCm][DSv4] Remove device pipeline stall in sparse attention (#43898)
kliuae May 29, 2026
87f12e5
[Frontend]Responses API supports chat_template_kwargs (#43761)
chaunceyjiang May 29, 2026
ff990d0
[ROCm][CI] Fix AITER unified attention for encoder-decoder cross-atte…
AndreasKaratzas May 29, 2026
30c6289
[XPU] fix xpu install document triton-xpu version (#43947)
jikunshang May 29, 2026
b7fb747
[CI][ROCm] Don't skip MoRI-IO Connector tests (#43703)
simondanielsson May 29, 2026
e8b5199
[XPU] support MTP of gdn attention (#43565)
mayuyuace May 29, 2026
7ebc0ec
[CI] Nixl+SimpleCPUOffloadingConnector unit tests (#43871)
NickLucche May 29, 2026
60a7a22
[Bugfix] Fix Step3 pipeline parallel KeyError for residual tensor (#3…
JMonde May 29, 2026
0cff074
[Kernel][ROCm] Native W4A16 kernel for AMD RDNA3 (gfx1100) — fp16 + b…
JartX May 29, 2026
ab12aab
[Bugfix] [ROCm] [DSV4] Fix AITER MXFP4 MoE weight loading and shuffle…
MHYangAMD May 29, 2026
0b56815
[ROCm][Perf] DSv3.2 MI355X TP4 decode-step orchestration cleanup (3 m…
frida-andersson May 29, 2026
d288972
[Bugfix] Corrupted MLA + linear attention (#43961)
gau-nernst May 29, 2026
0585b5b
Skip docs build if PR doesn't affect docs (#43972)
hmellor May 29, 2026
3f6f508
[Bugfix][CPU] Remove invalid extra deps (#43977)
bigPYJ1151 May 29, 2026
11dfa31
Add vLLM library info to Hugging Face Hub requests (#43857)
Wauplin May 29, 2026
f191d56
docs: clarify ITL acronym in optimization docs (#43922)
chunyang-wen May 29, 2026
5502c3b
[Misc] added unit tests for the core pooling methods (#43818)
taneem-ibrahim May 29, 2026
4ff865c
[Bugfix] Disable allreduce_rms_fusion when pipeline_parallel_size > 1…
zixi-qi May 29, 2026
84b2a8a
[MoE Refactor] WNA16 MoE backend selection into oracle module (#42553)
bnellnm May 29, 2026
4aaba00
[EPLB] Make async EPLB default (#43219)
ilmarkov May 29, 2026
d07ad06
[Bugfix] Use storage_block_size in KV cache reshape for compressed sp…
zixi-qi May 29, 2026
8b9deee
[Bugfix] Fix Ray placement group allocation with grouped nodes (#43998)
czhu-cohere May 29, 2026
739096a
[Bug] Fix torch device issue for MOE permute (#44005)
yewentao256 May 29, 2026
6aabe22
[CI] Make Model Executor test hangs fail fast with a traceback (#43971)
khluu May 29, 2026
6de08e8
[CI] Remove redundant test_chat_with_tool_reasoning.py (#44011)
sfeng33 May 29, 2026
acbc203
Add @khluu to CODEOWNERS (#44019)
khluu May 29, 2026
5dbf160
[Feature] SSL support for dp supervisor (#43688)
yewentao256 May 29, 2026
38b864d
[Metrics] Exclude KV transfer tokens from iteration_tokens_total (#43…
tlrmchlsmth May 29, 2026
46409fd
[Fronten] Clean up stop_token_ids override for Harmony (#44009)
yzong-rh May 29, 2026
106aa92
[MoE Refactor] Migrate MoeWNA16Method quantization to MK oracle (#42647)
bnellnm May 29, 2026
7b98f49
[MoE Refactor] Remove supports_expert_map (#43108)
bnellnm May 29, 2026
8c6daf6
[CI] Remove duplicate Harmony test coverage (#44023)
sfeng33 May 29, 2026
8fad266
[CI] Fix smoke test step key to bypass block gate (#43974)
khluu May 29, 2026
187457a
Revert "[MoE Refactor] Migrate MoeWNA16Method quantization to MK orac…
bnellnm May 29, 2026
559d671
[PERF]MiniMax-M2 gate kernel (#38445)
jeejeelee May 30, 2026
1e2ce5d
offload prompt_embeds decode in render_prompts_async to avoid blockin…
gagandhakrey May 30, 2026
1a096d8
[Refactor] Remove dead current_tool_name_sent assignments from tool p…
sfeng33 May 30, 2026
ef8840a
[ROCm][CI] Fix failure in the Phi3V pooling test (#44028)
AndreasKaratzas May 30, 2026
c0056b1
[ROCm] cmake: support PYTORCH_FOUND_HIP for torch 2.13 native HIP lan…
nemanjaudovic May 30, 2026
e949999
[BugFix][Platform] Fix import vllm.platforms.rocm error on non-CUDA t…
Liangliang-Ma May 30, 2026
124fac1
[Bugfix] Fix RMSNorm kernels to multiply in weight's native dtype (#4…
liulanze May 30, 2026
3becc5d
[ROCm] Add attention sink support to AITer flash attention backend (#…
sphinx07 May 30, 2026
50c80d7
[Governance] Add @BugenZhao as Rust frontend code owner (#44047)
BugenZhao May 30, 2026
e110506
[Bug] Fix gemma4 MTP IMA issue when TP>1, `CUDA error: an illegal mem…
yewentao256 May 30, 2026
27fa5aa
[MRV2] Support breakable CUDA graph (#44050)
WoosukKwon May 30, 2026
3fd9d2d
[CPU][Zen] Route W8A8 and W4A16 linear inference through zentorch on …
aadwived May 30, 2026
6bdabba
[CI/Build] Enable Step3p7ForConditionalGeneration testing (#43956)
jeejeelee May 31, 2026
8b8546d
docs: fix MLA attention docstring examples (#44118)
nightcityblade May 31, 2026
f46e6be
[Misc] Use VLLMValidationError consistently in chat completion and co…
umut-polat Jun 1, 2026
4721bb3
[MRV2] Remove Eagle's dedicated CUDA graph pool (#44078)
LucasWilkinson Jun 1, 2026
29d6933
[BugFix] Fix `_has_module` to verify native deps via trial import (#4…
jeffreywang-anyscale Jun 1, 2026
1fd8bd0
[Docs] Replace broken video url in examples (#44159)
Isotr0py Jun 1, 2026
98f1279
[CPU][RISC-V] Add missing RVV cpu_types helpers for WNA16 (#42730)
wcynb1023 Jun 1, 2026
1f6048a
fix: glm5.1 pp model loading (#42944)
UranusSeven Jun 1, 2026
0910f7e
[Frontend] Resettle generative scoring entrypoint. (#44153)
noooop Jun 1, 2026
de21863
[Rust Frontend] Add InternLM2 tool parser (#43481)
willamhou Jun 1, 2026
8796838
[Bugfix] fix wrong partial_rotary_factor calculation for bailing_moe …
zzt93 Jun 1, 2026
bd0aecd
[XPU][CI] Fix test_audio_in_video flake by using module-scoped server…
chaojun-zhang Jun 1, 2026
985c97a
[Perf] Optimize cutlass fp8 scaled mm bypassing padding, 20% kernel p…
yewentao256 Jun 1, 2026
023808c
[Feature] Add support for JetBrains' Mellum v2 code generation model …
shadeMe Jun 1, 2026
0357335
[Kernel][DSv4] Optimize sparse FP8 compressor kernels (#44161)
zyongye Jun 1, 2026
fd9e91d
[ROCm][CI] Fix and stabilize EAGLE3 acceptance tests (#41294)
AndreasKaratzas Jun 1, 2026
182c67d
[Rust Frontend] Support streaming `generate` endpoint (#43779)
Xunzhuo Jun 1, 2026
266b9d9
[Frontend][Core] Add sparse NCCL weight transfer support for in-place…
bedeks Jun 1, 2026
6f8b40a
[BugFix][CI] Fix added `_has_module` tests (#44248)
njhill Jun 1, 2026
e4cbc43
[Test][BugFix] Fix double-BOS in PD+specdec acceptance test (#44234)
njhill Jun 1, 2026
8c3cc98
[DSV4] Remove unncessary classes & functions (#44246)
WoosukKwon Jun 1, 2026
48c0d13
[ROCm][CI] Skip unbacked dynamic shapes tests on PyTorch < 2.11 (#44256)
JartX Jun 2, 2026
517e74a
[DSV4] Refactor RoPE initialization (#44262)
WoosukKwon Jun 2, 2026
d68f0b2
[Bugfix][Mooncake] Release GPU pin on failed store in MooncakeStoreCo…
Dao007forever Jun 2, 2026
2588ec4
[ROCm] Upgrade AITER to v0.1.13.post1 (#44265)
micah-wil Jun 2, 2026
816cc73
[Bugfix][CI] Normalize NIXL connector CUDA wheel installs (#44266)
alec-flowers Jun 2, 2026
9affc17
[Refactor] Move unstreamed tool-arg flush from serving layer to parse…
sfeng33 Jun 2, 2026
54d0c36
[CI] Stabilize OpenAI schema fuzzing for malformed structural tags (#…
AndreasKaratzas Jun 2, 2026
279d25f
[BugFix] Fix TypeError in MiniCPM-O audio feature unpadding (#38053)
Krishnachaitanyakc Jun 2, 2026
480fada
[BugFix][kv_offload]: Prevent offloading stale sliding window blocks …
orozery Jun 2, 2026
a3a5a5e
[XPU][Bugfix] Fix per_token_group_fp8_quant missing dummy args on XPU…
chaojun-zhang Jun 2, 2026
a045c74
[MM][CG] Profile encoder CUDA graph pool memory (#41714)
BWAAEEEK Jun 2, 2026
f91fb2f
[Bugfix] Convert Gemma4-MM ViT linear layers to vllm native impl (#43…
Isotr0py Jun 2, 2026
8a9eb40
[Model Runner V2] Support zeroing freshly allocated KV blocks for hyb…
izhuhaoran Jun 2, 2026
1edfd09
[Model Runner V2] Use actual batch max_seq_len for attn metadata (#43…
izhuhaoran Jun 2, 2026
68dafcc
[Refactor] Unify reasoning + tool-call parsing behind Parser.parse() …
sfeng33 Jun 2, 2026
abfeede
Merge upstream/main into macrodata main
hynky1999 Jun 2, 2026
1 change: 1 addition & 0 deletions .buildkite/ci_config.yaml
@@ -8,6 +8,7 @@ run_all_patterns:
- "CMakeLists.txt"
- "requirements/common.txt"
- "requirements/cuda.txt"
- "requirements/kv_connectors.txt"
- "requirements/build/cuda.txt"
- "requirements/test/cuda.txt"
- "setup.py"
20 changes: 20 additions & 0 deletions .buildkite/hardware_tests/amd.yaml
@@ -17,6 +17,26 @@ steps:
--target test
--no-cache
--progress plain .
- |
docker run --rm --network=none --entrypoint /bin/bash "rocm/vllm-ci:${BUILDKITE_COMMIT}" -ec '
if [ ! -d /vllm-workspace ]; then echo Missing directory: /vllm-workspace >&2; exit 1; fi
if [ ! -d /vllm-workspace/tests ]; then echo Missing directory: /vllm-workspace/tests >&2; exit 1; fi
if [ ! -d /vllm-workspace/src/vllm ]; then echo Missing directory: /vllm-workspace/src/vllm >&2; exit 1; fi
if [ ! -x /vllm-workspace/src/vllm/vllm-rs ]; then echo Missing executable: /vllm-workspace/src/vllm/vllm-rs >&2; exit 1; fi
command -v python3
command -v uv
command -v pytest
if ! command -v amd-smi >/dev/null 2>&1 && ! command -v rocminfo >/dev/null 2>&1; then
echo No ROCm CLI found in image >&2
exit 1
fi
python3 - <<PY
import torch, vllm
print(torch.__version__)
print(vllm.__version__)
PY
echo AMD image smoke OK
'
- docker push "rocm/vllm-ci:${BUILDKITE_COMMIT}"
env:
DOCKER_BUILDKIT: "1"
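Note on the AMD smoke check above: the step fails only when neither `amd-smi` nor `rocminfo` is on the PATH. A minimal sketch of that "at least one of these tools exists" pattern follows. The helper name `require_any_command` is hypothetical and does not appear in the PR; it only illustrates the `command -v` test that the inline script uses.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): succeed if at least one of the
# named commands can be resolved by the shell, fail otherwise.
require_any_command() {
  local cmd
  for cmd in "$@"; do
    # `command -v` exits 0 when the name resolves to an executable,
    # builtin, function, or alias.
    if command -v "$cmd" >/dev/null 2>&1; then
      return 0
    fi
  done
  echo "None of the expected commands found: $*" >&2
  return 1
}
```

Under these assumptions, the ROCm check in the step would correspond to `require_any_command amd-smi rocminfo || exit 1`.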
54 changes: 44 additions & 10 deletions .buildkite/hardware_tests/cpu.yaml
@@ -12,15 +12,19 @@ steps:
- vllm/_custom_ops.py
- tests/kernels/attention/test_cpu_attn.py
- tests/kernels/moe/test_cpu_fused_moe.py
- tests/kernels/moe/test_cpu_quant_fused_moe.py
- tests/kernels/test_onednn.py
- tests/kernels/test_awq_int4_to_int8.py
- tests/kernels/quantization/test_cpu_fp8_scaled_mm.py
commands:
- |
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 20m "
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 30m "
pytest -x -v -s tests/kernels/attention/test_cpu_attn.py
pytest -x -v -s tests/kernels/moe/test_cpu_fused_moe.py
pytest -x -v -s tests/kernels/moe/test_cpu_quant_fused_moe.py
pytest -x -v -s tests/kernels/test_onednn.py
pytest -x -v -s tests/kernels/test_awq_int4_to_int8.py"
pytest -x -v -s tests/kernels/test_awq_int4_to_int8.py
pytest -x -v -s tests/kernels/quantization/test_cpu_fp8_scaled_mm.py"

- label: CPU-Compatibility Tests
depends_on: []
@@ -50,30 +54,49 @@ steps:
pytest -x -v -s tests/models/language/generation -m cpu_model
pytest -x -v -s tests/models/language/pooling -m cpu_model"

- label: CPU-ModelRunnerV2 Tests
depends_on: []
device: intel_cpu
no_plugin: true
soft_fail: true
source_file_dependencies:
- vllm/v1/worker/cpu/
- vllm/v1/worker/gpu/
- vllm/v1/sample/ops/topk_topp_triton.py
- vllm/v1/sample/ops/topk_topp_sampler.py
- tests/v1/sample/test_topk_topp_sampler.py
commands:
- |
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 45m "
uv pip install git+https://github.com/triton-lang/triton-cpu.git@270e696d
VLLM_USE_V2_MODEL_RUNNER=1 pytest -x -v -s tests/models/language/generation/test_granite.py -m cpu_model
# TODO: move to CPU-Kernel Tests once triton-cpu has a pre-built wheel
pytest -x -v -s tests/v1/sample/test_topk_topp_sampler.py::TestTritonTopkTopp"

- label: CPU-Quantization Model Tests
depends_on: []
device: intel_cpu
no_plugin: true
source_file_dependencies:
- csrc/cpu/
- vllm/model_executor/layers/quantization/cpu_wna16.py
- vllm/model_executor/layers/quantization/gptq_marlin.py
- vllm/model_executor/layers/quantization/auto_gptq.py
- vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w8a8_int8.py
- vllm/model_executor/layers/quantization/kernels/scaled_mm/cpu.py
- vllm/model_executor/layers/quantization/kernels/mixed_precision/cpu.py
- vllm/model_executor/kernels/linear/mixed_precision/cpu.py
- vllm/model_executor/kernels/linear/scaled_mm/cpu.py
- vllm/model_executor/layers/fused_moe/experts/cpu_moe.py
- tests/quantization/test_compressed_tensors.py
- tests/quantization/test_cpu_wna16.py
commands:
- |
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 20m "
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 30m "
pytest -x -v -s tests/quantization/test_compressed_tensors.py::test_compressed_tensors_w8a8_logprobs
pytest -x -v -s tests/quantization/test_cpu_wna16.py"

- label: CPU-Distributed Tests
- label: CPU-Distributed Tests (PP+TP)
depends_on: []
device: intel_cpu
no_plugin: true
source_file_dependencies:
source_file_dependencies: &cpu_distributed_deps
- csrc/cpu/shm.cpp
- vllm/v1/worker/cpu_worker.py
- vllm/v1/worker/gpu_worker.py
@@ -82,10 +105,21 @@ steps:
- vllm/platforms/cpu.py
- vllm/distributed/parallel_state.py
- vllm/distributed/device_communicators/cpu_communicator.py
- .buildkite/scripts/hardware_ci/run-cpu-distributed-smoke-test.sh
commands:
- |
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 10m "
bash .buildkite/scripts/hardware_ci/run-cpu-distributed-smoke-test.sh tp_pp"

- label: CPU-Distributed Tests (DP+TP)
depends_on: []
device: intel_cpu
no_plugin: true
source_file_dependencies: *cpu_distributed_deps
commands:
- |
bash .buildkite/scripts/hardware_ci/run-cpu-test.sh 10m "
bash .buildkite/scripts/hardware_ci/run-cpu-distributed-smoke-test.sh"
bash .buildkite/scripts/hardware_ci/run-cpu-distributed-smoke-test.sh dp_tp"

- label: CPU-Multi-Modal Model Tests %N
depends_on: []
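Note on the split distributed steps above: both steps call the same script with a different first argument (`tp_pp` or `dp_tp`). The body of `run-cpu-distributed-smoke-test.sh` is not shown in this diff, so the following is only a sketch of how such a mode argument might be dispatched. The function name `select_mode` and the printed descriptions are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical dispatcher; the real run-cpu-distributed-smoke-test.sh
# is not visible in this diff and may be structured differently.
select_mode() {
  case "$1" in
    tp_pp) echo "pipeline-parallel + tensor-parallel" ;;
    dp_tp) echo "data-parallel + tensor-parallel" ;;
    *)
      # Reject anything else so a typo in the CI config fails loudly.
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}
```

Rejecting unknown modes explicitly is a design choice in this sketch: a misspelled argument in the pipeline YAML then fails the step instead of silently running a default configuration.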
7 changes: 0 additions & 7 deletions .buildkite/hardware_tests/intel.yaml
@@ -8,10 +8,3 @@ steps:
commands:
- bash .buildkite/scripts/hardware_ci/run-hpu-test.sh

- label: "Intel GPU Test"
depends_on: []
soft_fail: true
device: intel_gpu
no_plugin: true
commands:
- bash .buildkite/scripts/hardware_ci/run-xpu-test.sh
5 changes: 3 additions & 2 deletions .buildkite/image_build/image_build.sh
@@ -92,8 +92,8 @@ check_and_skip_if_image_exists() {
}

ecr_login() {
- aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY"
- aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 936637512419.dkr.ecr.us-east-1.amazonaws.com
+ aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY" || true
+ aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 936637512419.dkr.ecr.us-east-1.amazonaws.com || true
}

prepare_cache_tags() {
@@ -192,6 +192,7 @@ export BUILDKITE_COMMIT
export PARENT_COMMIT
export IMAGE_TAG
export IMAGE_TAG_LATEST
+ export COMMIT="${COMMIT:-${BUILDKITE_COMMIT}}"
export CACHE_FROM
export CACHE_FROM_BASE_BRANCH
export CACHE_FROM_MAIN
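Note on the two shell idioms this file's changes rely on: appending `|| true` to the login pipelines keeps a failed login from aborting the script, and `${COMMIT:-${BUILDKITE_COMMIT}}` falls back to `BUILDKITE_COMMIT` when `COMMIT` is unset or empty. The sketch below shows both in isolation. It assumes the script runs under `set -e` (the diff does not show the top of `image_build.sh`), and `fake_login` and `resolve_commit` are hypothetical names used only for illustration.

```shell
#!/bin/bash
set -e

# Hypothetical stand-in for a registry login that fails,
# for example because credentials are missing.
fake_login() {
  return 1
}

# Without `|| true`, the non-zero status would stop the script under `set -e`.
fake_login || true
echo "still running after failed login"

# `${COMMIT:-fallback}` keeps COMMIT when it is set and non-empty,
# and otherwise expands to the fallback value.
resolve_commit() {
  echo "${COMMIT:-${BUILDKITE_COMMIT}}"
}
```

One trade-off of `|| true` is that a genuine credential problem is no longer reported at login time; it would only surface later, when a push or pull against the registry fails.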
72 changes: 72 additions & 0 deletions .buildkite/image_build/image_build.yaml
@@ -13,6 +13,60 @@ steps:
- exit_status: -10 # Agent was lost
limit: 2

- label: ":docker: :smoking: Non-root smoke tests"
key: image-build-smoke-test
depends_on:
- image-build
commands:
# Smoke 1: the default (root) image must still be importable
# under a non-root UID via `--user 2000:0`. Validates the `vllm` passwd
# entry + group-0-writable /home/vllm + uv path cleanup from #31959.
# Uses `import vllm` rather than `vllm serve --help` because the latter
# instantiates `VllmConfig` which requires a GPU attached to the
# container.
- docker run --rm --user 2000:0 --entrypoint python3 "$IMAGE_TAG" -c "import vllm; print(vllm.__version__)"
# Smoke 2: assert the non-root enabling invariants are baked
# into the image. Runs as UID 2000:0 via a shell so we can verify
# filesystem perms + passwd/group file state + wrapper presence without
# triggering vLLM's GPU-requiring config-init path. The opt-in
# `vllm-openai-nonroot` target adds only `USER vllm`, `WORKDIR
# /home/vllm`, and an `ENTRYPOINT` override on top of these invariants;
# its build correctness is reviewed at the Dockerfile level. Wrapper
# logic is covered separately by the pre-commit hook
# `test-nonroot-entrypoint` (see .pre-commit-config.yaml).
- |
docker run --rm --user 2000:0 --entrypoint /bin/sh "$IMAGE_TAG" -ec '
if ! getent passwd 2000 | grep -q ^vllm:; then
echo FAIL: UID 2000 != vllm
exit 1
fi
if ! id -gn 2>/dev/null | grep -qx root; then
echo FAIL: GID 0 not root group
exit 1
fi
touch /home/vllm/.smoke && rm /home/vllm/.smoke
touch /opt/uv/cache/.smoke && rm /opt/uv/cache/.smoke
if ! test -x /usr/local/bin/vllm-nonroot-entrypoint.sh; then
echo FAIL: wrapper missing
exit 1
fi
if ! test -w /etc/passwd; then
echo FAIL: /etc/passwd not group-writable
exit 1
fi
if ! test -w /etc/group; then
echo FAIL: /etc/group not group-writable
exit 1
fi
echo non-root invariants OK
'
retry:
automatic:
- exit_status: -1 # Agent was lost
limit: 2
- exit_status: -10 # Agent was lost
limit: 2
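
The invariant checks in the smoke step share one pattern: probe a property, print a `FAIL:` line, and stop with a non-zero status. A minimal sketch of that pattern as a reusable function, assuming a hypothetical helper name `require_writable` that does not exist in the pipeline:

```shell
#!/bin/sh
# Hypothetical helper, not part of the pipeline: returns non-zero and prints
# a FAIL line when the current user cannot write to the given path.
require_writable() {
    if ! test -w "$1"; then
        echo "FAIL: $1 not writable"
        return 1
    fi
}

# Example use inside the container (paths taken from the smoke test):
#   require_writable /home/vllm && require_writable /etc/passwd
```

Note that `test -w` reports writability for the calling user, so the result only reflects the non-root invariants when the function runs under the same `--user 2000:0` as the smoke test.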

- label: ":docker: Build CPU image"
key: image-build-cpu
depends_on: []
@@ -56,3 +110,21 @@ steps:
limit: 2
- exit_status: -10 # Agent was lost
limit: 2

- label: ":docker: Build arm64 image"
key: arm64-image-build
depends_on: []
source_file_dependencies:
- ".buildkite/image_build/image_build.yaml"
- ".buildkite/image_build/image_build_arm64.sh"
- "docker/Dockerfile"
commands:
- .buildkite/image_build/image_build_arm64.sh $REGISTRY $REPO $BUILDKITE_COMMIT
env:
DOCKER_BUILDKIT: "1"
retry:
automatic:
- exit_status: -1 # Agent was lost
limit: 2
- exit_status: -10 # Agent was lost
limit: 2
37 changes: 37 additions & 0 deletions .buildkite/image_build/image_build_arm64.sh
@@ -0,0 +1,37 @@
#!/bin/bash
set -e

if [[ $# -lt 3 ]]; then
echo "Usage: $0 <registry> <repo> <commit>"
exit 1
fi

REGISTRY=$1
REPO=$2
BUILDKITE_COMMIT=$3

# authenticate with AWS ECR
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY" || true

# skip build if image already exists
if [[ -z $(docker manifest inspect "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-arm64) ]]; then
echo "Image not found, proceeding with build..."
else
echo "Image found"
exit 0
fi

# build (Grace/GH200 is the arm64 GPU target; sm_90)
docker build --file docker/Dockerfile \
--platform linux/arm64 \
--build-arg max_jobs=16 \
--build-arg nvcc_threads=4 \
--build-arg torch_cuda_arch_list="9.0" \
--build-arg USE_SCCACHE=1 \
--build-arg buildkite_commit="$BUILDKITE_COMMIT" \
--tag "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-arm64 \
--target test \
--progress plain .

# push
docker push "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-arm64
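
The "skip build if image already exists" step in this script tests whether `docker manifest inspect` printed anything, while `image_build_xpu.sh` in the same change set tests the command's exit status instead. A minimal sketch of the exit-status form, assuming a hypothetical helper name `image_exists` that is not part of either script:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when `docker manifest inspect` succeeds
# for the given image reference; all output is discarded.
image_exists() {
    docker manifest inspect "$1" > /dev/null 2>&1
}

# Example use (REGISTRY, REPO and BUILDKITE_COMMIT as in the script):
#   if image_exists "${REGISTRY}/${REPO}:${BUILDKITE_COMMIT}-arm64"; then
#       echo "Image found"
#       exit 0
#   fi
```

Checking the exit status keeps registry error messages out of the build log and does not depend on what the command prints.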
2 changes: 1 addition & 1 deletion .buildkite/image_build/image_build_cpu.sh
@@ -11,7 +11,7 @@ REPO=$2
BUILDKITE_COMMIT=$3

# authenticate with AWS ECR
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY"
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY" || true

# skip build if image already exists
if [[ -z $(docker manifest inspect "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-cpu) ]]; then
2 changes: 1 addition & 1 deletion .buildkite/image_build/image_build_cpu_arm64.sh
@@ -11,7 +11,7 @@ REPO=$2
BUILDKITE_COMMIT=$3

# authenticate with AWS ECR
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY"
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY" || true

# skip build if image already exists
if [[ -z $(docker manifest inspect "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-arm64-cpu) ]]; then
2 changes: 1 addition & 1 deletion .buildkite/image_build/image_build_hpu.sh
@@ -11,7 +11,7 @@ REPO=$2
BUILDKITE_COMMIT=$3

# authenticate with AWS ECR
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY"
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY" || true

# skip build if image already exists
if [[ -z $(docker manifest inspect "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-hpu) ]]; then
2 changes: 1 addition & 1 deletion .buildkite/image_build/image_build_torch_nightly.sh
@@ -46,7 +46,7 @@ echo "Image not found, proceeding with build..."

# --- CUDA 13.0 for nightly builds ---
# Nightly CI uses CUDA 13.0 while regular CI stays on CUDA 12.9
NIGHTLY_CUDA_VERSION="13.0.0"
NIGHTLY_CUDA_VERSION="13.0.2"
NIGHTLY_BUILD_BASE_IMAGE="nvidia/cuda:${NIGHTLY_CUDA_VERSION}-devel-ubuntu22.04"
NIGHTLY_FINAL_BASE_IMAGE="nvidia/cuda:${NIGHTLY_CUDA_VERSION}-base-ubuntu22.04"

4 changes: 2 additions & 2 deletions .buildkite/image_build/image_build_xpu.sh
@@ -11,8 +11,8 @@ REPO=$2
BUILDKITE_COMMIT=$3

# authenticate with AWS ECR
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY"
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 936637512419.dkr.ecr.us-east-1.amazonaws.com
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$REGISTRY" || true
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 936637512419.dkr.ecr.us-east-1.amazonaws.com || true

# skip build if image already exists
if ! docker manifest inspect "$REGISTRY"/"$REPO":"$BUILDKITE_COMMIT"-xpu &> /dev/null; then
21 changes: 21 additions & 0 deletions .buildkite/intel_jobs/engine_intel.yaml
@@ -0,0 +1,21 @@
group: Engine Intel
depends_on:
- image-build-xpu
steps:
- label: Engine (1 GPU)
timeout_in_minutes: 30
device: intel_gpu
no_plugin: true
working_dir: "."
env:
REGISTRY: "public.ecr.aws/q9t5s3a7"
REPO: "vllm-ci-test-repo"
VLLM_TEST_DEVICE: "xpu"
source_file_dependencies:
- vllm/v1/engine/
- tests/v1/engine/
commands:
- >-
bash .buildkite/scripts/hardware_ci/run-intel-test.sh
'cd tests &&
pytest -v -s v1/engine --ignore v1/engine/test_preprocess_error_handling.py'
21 changes: 21 additions & 0 deletions .buildkite/intel_jobs/kernels_intel.yaml
@@ -0,0 +1,21 @@
group: Kernels Intel
depends_on:
- image-build-xpu
steps:
- label: vLLM IR Tests
timeout_in_minutes: 30
device: intel_gpu
no_plugin: true
working_dir: "."
env:
REGISTRY: "public.ecr.aws/q9t5s3a7"
REPO: "vllm-ci-test-repo"
VLLM_TEST_DEVICE: "xpu"
source_file_dependencies:
- vllm/ir
- vllm/kernels
commands:
- >-
bash .buildkite/scripts/hardware_ci/run-intel-test.sh
'cd tests &&
pytest -v -s kernels/ir'