
[ROCm] [CI] Add new fusion test cases that are relevant to vLLM IR Ops #34244

Closed
tjtanaa wants to merge 18 commits into vllm-project:main from EmbeddedLLM:fusionpassci

Collaborator

@tjtanaa tjtanaa commented Feb 10, 2026

Purpose

To support the vLLM IR Ops effort and ensure that the fusion passes are also validated on ROCm, this PR enables some of the fusion pass tests on ROCm. Enabling these tests is non-trivial, so we will roll out the remaining unit tests across multiple PRs.

Test Plan

After confirming that the tests pass locally, we will use the AMD CI to validate them.

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

tjtanaa and others added 8 commits February 9, 2026 08:28
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
@mergify mergify Bot added ci/build rocm Related to AMD ROCm labels Feb 10, 2026
@github-project-automation github-project-automation Bot moved this to Todo in AMD Feb 10, 2026
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds new CI test cases for fusion passes on ROCm, which is a valuable addition for ensuring correctness on AMD hardware. The changes primarily involve modifications to the Buildkite CI configuration and updates to several test files to enable and parametrize tests for ROCm. I've found two critical issues in the .buildkite/test-amd.yaml file: a duplicated key that will break the CI pipeline, and an incorrect file path in a test command. Please address these issues to ensure the CI runs correctly.

I am having trouble creating individual review comments. Click here to see my feedback.

.buildkite/test-amd.yaml (1712-1735)

critical

The source_file_dependencies key is duplicated in this test job definition. This is invalid YAML and will likely cause issues in the CI pipeline, as the second definition will overwrite the first. It seems the second block with paths under tests/compile/passes/ is the intended one. Please remove the first source_file_dependencies block to resolve the duplication.

  source_file_dependencies:
  - csrc/quantization/fp4/
  - vllm/model_executor/layers/quantization/
  - vllm/model_executor/layers/layernorm.py
  - vllm/model_executor/layers/activation.py
  - vllm/model_executor/layers/attention/attention.py
  - vllm/v1/attention/backends/flashinfer.py
  - vllm/compilation/ # TODO(luka) limit to vllm/compilation/passes
  - tests/compile/passes/test_fusion_attn.py
  - tests/compile/passes/test_silu_mul_quant_fusion.py
  - tests/compile/passes/distributed/test_fusion_all_reduce.py
  - tests/compile/fullgraph/test_full_graph.py
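
A duplicated key like this can be caught before the pipeline runs. The sketch below is not part of this PR: `find_duplicate_keys` is a hypothetical helper that scans simple block-style YAML and reports any key that appears twice in the same mapping. It assumes space indentation and plain `key:` lines, and it does not understand flow mappings, anchors, or block scalars, so treat it as a quick lint rather than a YAML parser. The `SAMPLE` step is illustrative and is not the actual content of `.buildkite/test-amd.yaml`.

```python
import re

# Matches "key:" or "- key:" at the start of a line in simple block-style YAML.
KEY_RE = re.compile(r"^(?P<indent> *)(?P<dash>- +)?(?P<key>[A-Za-z_][\w-]*) *:(?: |$)")


def find_duplicate_keys(text):
    """Return (line_number, key) for keys repeated within the same mapping."""
    duplicates = []
    scopes = []  # stack of (key_indent, keys_seen_in_this_mapping)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if match is None:
            continue  # comments, blank lines, and plain list items
        key = match.group("key")
        key_indent = len(match.group("indent"))
        dash = match.group("dash")
        if dash:
            # "- key:" starts a new mapping whose keys align after the dash.
            key_indent += len(dash)
            while scopes and scopes[-1][0] >= key_indent:
                scopes.pop()
        else:
            # Leaving deeper mappings closes their scopes.
            while scopes and scopes[-1][0] > key_indent:
                scopes.pop()
        if scopes and scopes[-1][0] == key_indent:
            seen = scopes[-1][1]
            if key in seen:
                duplicates.append((lineno, key))
            seen.add(key)
        else:
            scopes.append((key_indent, {key}))
    return duplicates


# Illustrative step with the same shape as the problem described above.
SAMPLE = """\
- label: Fusion tests
  source_file_dependencies:
  - csrc/quantization/fp4/
  source_file_dependencies:
  - tests/compile/passes/
  commands:
  - pytest -v -s tests/compile/passes
"""

print(find_duplicate_keys(SAMPLE))  # [(4, 'source_file_dependencies')]
```

Each reported line number points at the second occurrence of the key, which is the one that would silently win in parsers that accept duplicates.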

.buildkite/test-amd.yaml (1843)

critical

The path to the test file appears to be incorrect. The vllm/ prefix is likely a mistake and will cause the test to fail. Please correct the path to tests/compile/fusions_e2e/test_tp2_async_tp.py.

    - pytest -v -s tests/compile/fusions_e2e/test_tp2_async_tp.py -k "inductor_partition and not +rms_norm and not +quant_fp8"
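
A wrong test path of this kind only fails once the job is already running on CI hardware. As a rough sketch of a cheaper check, the hypothetical helper below (`missing_test_paths`, not an existing vLLM or Buildkite utility) splits a pytest command with `shlex` and reports every `.py` argument that does not exist relative to a given repository root. It assumes the command is run from that root and ignores directories and non-Python arguments.

```python
import shlex
from pathlib import Path


def missing_test_paths(command, repo_root):
    """Return the .py paths named in `command` that do not exist under repo_root."""
    missing = []
    for token in shlex.split(command):
        # Drop a pytest node id suffix such as "::test_name" if present.
        file_part = token.split("::", 1)[0]
        if file_part.endswith(".py") and not (Path(repo_root) / file_part).is_file():
            missing.append(file_part)
    return missing


if __name__ == "__main__":
    cmd = (
        "pytest -v -s tests/compile/fusions_e2e/test_tp2_async_tp.py "
        '-k "inductor_partition and not +rms_norm and not +quant_fp8"'
    )
    # Run from the repository root; an empty list means every path resolves.
    print(missing_test_paths(cmd, "."))
```

The quoted `-k` expression is kept as a single token by `shlex.split`, so it is never mistaken for a path.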

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Contributor

mergify Bot commented Feb 11, 2026

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @tjtanaa.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Feb 11, 2026
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
@tjtanaa tjtanaa closed this Mar 10, 2026
@github-project-automation github-project-automation Bot moved this from Todo to Done in AMD Mar 10, 2026