[ROCm] [CI] Add new fusion test cases that are relevant to vLLM IR Ops#34244
tjtanaa wants to merge 18 commits into vllm-project:main
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Code Review
This pull request adds new CI test cases for fusion passes on ROCm, which is a valuable addition for ensuring correctness on AMD hardware. The changes primarily involve modifications to the Buildkite CI configuration and updates to several test files to enable and parametrize tests for ROCm. I've found two critical issues in the .buildkite/test-amd.yaml file: a duplicated key that will break the CI pipeline, and an incorrect file path in a test command. Please address these issues to ensure the CI runs correctly.
I am having trouble creating individual review comments. My feedback follows.
.buildkite/test-amd.yaml (1712-1735)
The source_file_dependencies key is duplicated in this test job definition. This is invalid YAML and will likely cause issues in the CI pipeline, as the second definition will overwrite the first. It seems the second block with paths under tests/compile/passes/ is the intended one. Please remove the first source_file_dependencies block to resolve the duplication.
source_file_dependencies:
- csrc/quantization/fp4/
- vllm/model_executor/layers/quantization/
- vllm/model_executor/layers/layernorm.py
- vllm/model_executor/layers/activation.py
- vllm/model_executor/layers/attention/attention.py
- vllm/v1/attention/backends/flashinfer.py
- vllm/compilation/ # TODO(luka) limit to vllm/compilation/passes
- tests/compile/passes/test_fusion_attn.py
- tests/compile/passes/test_silu_mul_quant_fusion.py
- tests/compile/passes/distributed/test_fusion_all_reduce.py
- tests/compile/fullgraph/test_full_graph.py
.buildkite/test-amd.yaml (1843)
The path to the test file appears to be incorrect. The vllm/ prefix is likely a mistake and will cause the test to fail. Please correct the path to tests/compile/fusions_e2e/test_tp2_async_tp.py.
- pytest -v -s tests/compile/fusions_e2e/test_tp2_async_tp.py -k "inductor_partition and not +rms_norm and not +quant_fp8"
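The duplicated-key problem flagged in the review above is easy to miss because many YAML parsers silently keep the last definition. As a rough illustration, the sketch below scans block-style YAML for keys repeated within the same mapping. It is a minimal, stdlib-only approximation and not how Buildkite or vLLM's CI validates pipelines: `find_duplicate_keys` is a hypothetical helper, the sample pipeline is invented for the example (only the `source_file_dependencies` key comes from the review), and flow mappings, anchors, and multi-line scalars are not handled.

```python
import re

# Matches "key:" or "- key:" at the start of a line (simple block style only).
KEY_RE = re.compile(r"^(\s*)(-\s+)?([A-Za-z_][\w\-]*):(\s|$)")


def find_duplicate_keys(yaml_text):
    """Return (line_number, key) pairs for keys repeated within one mapping.

    Hypothetical helper for illustration; it is not a full YAML parser.
    """
    duplicates = []
    seen = {}  # key column -> keys seen so far in the mapping at that column
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if not match:
            continue  # plain list items, comments, and scalars are ignored
        indent, dash, key = match.group(1), match.group(2), match.group(3)
        column = len(indent) + (len(dash) if dash else 0)
        # A key at this column closes any more deeply nested mappings.
        for deeper in [c for c in seen if c > column]:
            del seen[deeper]
        if dash:
            # "- key:" opens a new list item, which is a fresh mapping.
            seen[column] = set()
        keys = seen.setdefault(column, set())
        if key in keys:
            duplicates.append((lineno, key))
        keys.add(key)
    return duplicates


# Invented sample pipeline; the first step repeats source_file_dependencies.
sample = """\
steps:
- label: Fusion tests
  source_file_dependencies:
  - vllm/compilation/
  source_file_dependencies:
  - tests/compile/passes/
  commands:
  - pytest -v -s tests/compile/passes
- label: Other tests
  source_file_dependencies:
  - tests/
"""

print(find_duplicate_keys(sample))  # [(5, 'source_file_dependencies')]
```

The second step reuses the same key name without being reported, because each `- label:` line starts a new mapping. A check like this could run before pushing changes to a pipeline file, although a real YAML linter is the more robust choice.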
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Purpose
To support the vLLM IR Ops effort and ensure that the fusion passes are also validated on ROCm, this PR enables some of the fusion pass tests on ROCm. Because enabling these tests is non-trivial, we will roll out the remaining unit tests across multiple PRs.
Test Plan
After confirming that the tests pass locally, we will use the AMD CI to validate them.
Test Result