[CI] Migrate Attention Backend tests to test/registered/attention/#15563

Merged
Kangyan-Zhou merged 12 commits into main from migrate-attention-backend-tests on Dec 23, 2025

alisonshao (Collaborator) commented Dec 21, 2025

Summary

  • Migrate 12 attention-related test files to the new registry-based CI structure (test/registered/attention/)
  • Add new stage-c-test-large-4-gpu suite for 4-GPU H100 tests
  • Part of [Roadmap] CI suites organization #13808
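
The PR does not show how the registry reads each test file, so the following is only a sketch of one plausible design: every test file declares its suite and estimated time in a module-level call, and the runner collects those declarations by scanning the source statically. The names `register_ci` and `collect_registrations` are hypothetical and are not taken from the repository.

```python
import ast

# Hypothetical module-level registration line, as a test file might declare it.
# The function name and keyword names are assumptions for illustration only.
EXAMPLE_TEST_SOURCE = '''
register_ci(est_time=30, suite="stage-b-test-small-1-gpu")
'''


def collect_registrations(source, func_name="register_ci"):
    """Return the keyword arguments of every top-level call to func_name."""
    found = []
    for node in ast.parse(source).body:
        is_call = isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)
        if not is_call:
            continue
        func = node.value.func
        if isinstance(func, ast.Name) and func.id == func_name:
            # literal_eval only accepts plain literals, so nothing is executed.
            found.append(
                {kw.arg: ast.literal_eval(kw.value) for kw in node.value.keywords}
            )
    return found


registrations = collect_registrations(EXAMPLE_TEST_SOURCE)
print(registrations)
```

Scanning the syntax tree instead of importing the file means the runner can build a suite without loading GPU-dependent modules; whether the actual registry works this way is not stated in this PR.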

Test Files Migrated

Per-commit (stage-b-test-small-1-gpu): ~8.8 min total

| Test File | Est. Time |
| --- | --- |
| test_radix_cache_unit.py | ~5s |
| test_create_kvindices.py | ~10s |
| test_triton_attention_kernels.py | ~30s |
| test_wave_attention_kernels.py | ~60s |
| test_radix_attention.py | ~60s |
| test_torch_native_attention_backend.py | ~90s |
| test_triton_attention_backend.py | ~120s |
| test_triton_sliding_window.py | ~150s |

Per-commit (stage-b-test-large-1-gpu): ~11.7 min total

| Test File | Est. Time |
| --- | --- |
| test_fa3.py (SM 90+, skips on non-H100) | ~300s |
| test_hybrid_attn_backend.py (SM 90+, skips on non-H100) | ~200s |
| test_flash_attention_4.py (SM 100+, skips on non-Blackwell) | ~200s |

Per-commit (stage-c-test-large-4-gpu - NEW): ~3.3 min total

| Test File | Est. Time |
| --- | --- |
| test_local_attn.py (requires 4x H100, SM 90+) | ~200s |
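
The per-suite totals quoted above follow directly from the per-file estimates. A small check, using only the numbers listed in this description (the dictionary layout is just for illustration):

```python
# Estimated times in seconds, copied from the tables in this PR description.
EST_TIME_SECONDS = {
    "stage-b-test-small-1-gpu": {
        "test_radix_cache_unit.py": 5,
        "test_create_kvindices.py": 10,
        "test_triton_attention_kernels.py": 30,
        "test_wave_attention_kernels.py": 60,
        "test_radix_attention.py": 60,
        "test_torch_native_attention_backend.py": 90,
        "test_triton_attention_backend.py": 120,
        "test_triton_sliding_window.py": 150,
    },
    "stage-b-test-large-1-gpu": {
        "test_fa3.py": 300,
        "test_hybrid_attn_backend.py": 200,
        "test_flash_attention_4.py": 200,
    },
    "stage-c-test-large-4-gpu": {
        "test_local_attn.py": 200,
    },
}


def suite_total_seconds(tests):
    """Sum the estimated run time of every test in one suite."""
    return sum(tests.values())


for suite, tests in EST_TIME_SECONDS.items():
    total = suite_total_seconds(tests)
    print(f"{suite}: {total}s (~{total / 60:.1f} min)")
```

The sums are 525s, 700s, and 200s, which match the ~8.8, ~11.7, and ~3.3 minute totals given for the three suites.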

Infrastructure Changes

  • Added stage-c-test-large-4-gpu suite to test/run_suite.py
  • Added stage-c-test-large-4-gpu job to .github/workflows/pr-test.yml (uses 4-gpu-h100 runner)
  • Added stage to scripts/ci/slash_command_handler.py for /rerun-stage support
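
For readers unfamiliar with the /rerun-stage command, here is a rough sketch of how a handler could validate the requested stage before triggering a rerun. It is not the code in scripts/ci/slash_command_handler.py: `KNOWN_STAGES`, `RERUN_PATTERN`, and `parse_rerun_stage` are hypothetical names, and the stage list contains only the stages mentioned in this PR.

```python
import re

# Stage names mentioned in this PR; the real handler presumably knows more.
KNOWN_STAGES = {
    "stage-b-test-small-1-gpu",
    "stage-b-test-large-1-gpu",
    "stage-c-test-large-4-gpu",
}

# The whole comment must be the command followed by exactly one stage name.
RERUN_PATTERN = re.compile(r"^/rerun-stage\s+(\S+)$")


def parse_rerun_stage(comment_body):
    """Return the requested stage, or None if the comment is not a valid command."""
    match = RERUN_PATTERN.match(comment_body.strip())
    if match is None:
        return None
    stage = match.group(1)
    # Reject stages the handler has not been told about.
    return stage if stage in KNOWN_STAGES else None


print(parse_rerun_stage("/rerun-stage stage-c-test-large-4-gpu"))
print(parse_rerun_stage("/rerun-stage stage-z-unknown"))
```

Under this design, a new suite such as stage-c-test-large-4-gpu has to be added to the allow-list before the command accepts it, which is consistent with the handler being touched in this PR.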

Test plan

  • CI passes on stage-b-test-small-1-gpu
  • CI passes on stage-b-test-large-1-gpu
  • CI passes on stage-c-test-large-4-gpu (new)
  • Verify est_time values are accurate after CI run

Migrate 12 attention-related test files to the new registry-based CI structure.

Per-commit tests (stage-b-test-small-1-gpu):
- test_radix_cache_unit.py, test_create_kvindices.py
- test_triton_attention_kernels.py, test_wave_attention_kernels.py
- test_radix_attention.py, test_torch_native_attention_backend.py
- test_triton_attention_backend.py, test_triton_sliding_window.py

Per-commit tests (stage-c-test-large-4-gpu - NEW):
- test_local_attn.py (requires 4x H100 GPUs)

Nightly tests (nightly-1-gpu):
- test_fa3.py, test_hybrid_attn_backend.py, test_flash_attention_4.py

Infrastructure changes:
- Add stage-c-test-large-4-gpu suite to run_suite.py
- Add stage-c-test-large-4-gpu job to pr-test.yml workflow
- Add stage to slash_command_handler.py for /rerun-stage support

Part of #13808

@gemini-code-assist (Contributor)

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@alisonshao (Collaborator, Author)

/tag-and-rerun-ci

- Remove 12 attention test entries from legacy test/srt/run_suite.py
  - per-commit-1-gpu: 10 tests removed
  - per-commit-4-gpu: test_local_attn.py removed
  - per-commit-4-gpu-b200: test_flash_attention_4.py removed
  - per-commit-amd: 6 tests removed
- Add CI_MIGRATION_PLAN.md documenting the migration process
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 21, 2025

alisonshao (Collaborator, Author) commented Dec 21, 2025

/rerun-stage stage-b-test-small-1-gpu

alisonshao and others added 4 commits December 21, 2025 14:54
…1-gpu

These tests have SM 90+/100+ skip decorators and will skip on runners
without appropriate hardware.
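
The skip decorators themselves are not shown in this PR. A minimal sketch of what an SM-version gate can look like, assuming PyTorch is used to query the device; `detect_sm_version` and `requires_sm` are hypothetical helper names, and `ExampleHopperOnlyTest` is a placeholder:

```python
import unittest


def detect_sm_version():
    """Return compute capability as an int such as 90, or 0 if no CUDA GPU is visible."""
    try:
        import torch
    except ImportError:
        return 0
    if not torch.cuda.is_available():
        return 0
    major, minor = torch.cuda.get_device_capability()
    return major * 10 + minor


def requires_sm(min_sm, sm_version=None):
    """Skip the decorated test or class when the GPU is older than min_sm.

    sm_version can be passed explicitly, which is useful for testing the gate itself.
    """
    current = detect_sm_version() if sm_version is None else sm_version
    return unittest.skipIf(
        current < min_sm, f"requires SM {min_sm}+, found SM {current}"
    )


@requires_sm(90)
class ExampleHopperOnlyTest(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)
```

With a gate like this, placing an SM 90+ or SM 100+ test in a suite that runs on older hardware produces a skip, not a failure.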
…u-b200 suite

The test file was migrated to test/registered/attention/ but the reference
was re-introduced through a merge from main, causing the quantization-test
CI job to fail with 'test file does not exist on disk' error.
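
A stale reference of this kind can be caught before the suite runs by checking every listed file against the disk. The sketch below is illustrative only: `find_missing_tests` and the demo suite name are hypothetical, not the check that produced the error above.

```python
import tempfile
from pathlib import Path


def find_missing_tests(suites, root):
    """Map each suite name to the listed test files that do not exist under root."""
    root = Path(root)
    missing = {}
    for suite, files in suites.items():
        absent = [name for name in files if not (root / name).is_file()]
        if absent:
            missing[suite] = absent
    return missing


# Demo with a temporary directory: one listed file exists, one was moved away.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "test_present.py").touch()
    demo_suites = {"example-suite": ["test_present.py", "test_moved_away.py"]}
    missing = find_missing_tests(demo_suites, tmp)

print(missing)
```

Running such a check after merging main would flag a re-introduced entry immediately.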

@Kangyan-Zhou Kangyan-Zhou merged commit 883747c into main Dec 23, 2025
199 of 205 checks passed
@Kangyan-Zhou Kangyan-Zhou deleted the migrate-attention-backend-tests branch December 23, 2025 06:17
Labels

documentation (Improvements or additions to documentation), run-ci

2 participants