Revert "ci: refactor PR tests to hide failed spot jobs from PR status… by yongwww · Pull Request #2524 · flashinfer-ai/flashinfer

yongwww · 2026-02-08T22:00:18Z

This reverts commit d5eaa42.

Revert PR #2500. The workflow_dispatch architecture requires actions:write on GITHUB_TOKEN, which is read-only for fork PRs. This breaks CI for all fork PR contributors even after run-ci approval.

📌 Description

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

I have installed pre-commit by running pip install pre-commit (or used your preferred method).
I have installed the hooks with pre-commit install.
I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

Tests have been added or updated as needed.
All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

Chores
- Restructured the PR validation pipeline with enhanced permission gating and staging.
- Improved automated failure handling for infrastructure interruptions with dynamic rerun mechanisms.
- Streamlined test result aggregation and reporting across multiple test environments.

gemini-code-assist · 2026-02-08T22:00:23Z

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

coderabbitai · 2026-02-08T22:00:34Z

📝 Walkthrough

Walkthrough

The PR consolidates two GitHub Actions workflows by deleting the separate pr-test-runner.yml file and integrating its orchestration logic into pr-test.yml. The refactored workflow introduces a permission gate, multi-stage testing pipeline with spot/on-demand rerun handling, and centralized test results aggregation across AOT and GPU test jobs.

Changes

Cohort / File(s)	Summary
Workflow Consolidation and Reorganization `.github/workflows/pr-test-runner.yml`, `.github/workflows/pr-test.yml`	Deleted pr-test-runner.yml and integrated its multi-job orchestration logic into pr-test.yml. Introduced new permission gating, PR-diff-based skip logic in the setup stage, spot termination detection with dynamic rerun matrices, and centralized test results aggregation. Changed job permissions from actions:write to actions:read. Reorganized the workflow from a single-orchestrator pattern into a modular, multi-stage pipeline with explicit job prerequisites and outputs.

Sequence Diagram

sequenceDiagram
    participant GH as GitHub
    participant Gate as Gate Job
    participant Setup as Setup Job
    participant Test as Test Jobs<br/>(AOT/GPU)
    participant Analyze as Failure Analysis
    participant Rerun as Rerun Jobs
    participant Summary as Summary Job
    
    GH->>Gate: Trigger PR workflow
    Gate->>Gate: Check PR labels &<br/>contributor status
    Gate-->>GH: Output authorization
    
    alt Gate passes
        GH->>Setup: Run setup stage
        Setup->>GH: Compute skip_build &<br/>docker_tag from PR diff
        Setup-->>GH: Output setup config
        
        GH->>Test: Execute test matrix<br/>(spot runners)
        Test->>Test: AOT build & GPU<br/>JIT tests
        Test-->>GH: Test results
        
        GH->>Analyze: Analyze job logs
        Analyze->>Analyze: Detect spot<br/>termination markers
        Analyze-->>GH: Output rerun_matrix
        
        alt Spot termination detected
            GH->>Rerun: Execute rerun matrix<br/>(on-demand instances)
            Rerun->>Rerun: Re-execute tests
            Rerun-->>GH: Updated results
        end
        
        GH->>Summary: Aggregate results
        Summary->>Summary: Compose test summary<br/>& update check runs
        Summary-->>GH: Final status
    else Gate fails
        Gate-->>GH: Skip downstream jobs
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: introduce GitHub Actions workflow for PR testing #2326: Both PRs modify pr-test.yml to overhaul AOT build and GPU JIT test jobs with similar multi-stage orchestration patterns.
ci: refactor PR tests to hide failed spot jobs from PR status #2500: Both PRs refactor CI workflows to separate orchestration logic and implement spot/on-demand rerun handling with check-run updates.
ci: add permission control for public ci tests #2397: Both PRs introduce permission gating in pr-test.yml and gate-controlled job wiring for PR CI execution control.

Suggested labels

run-ci

Suggested reviewers

yzh119
dierksen
bkryu

Poem

🐰 A workflow refined with a gated permission door,
Spot termination worries haunt us no more!
Setup detects changes, tests split and rerun,
With checks aggregated—CI harmony's won! 🎉

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately identifies the primary change—reverting a previous CI workflow refactor—and provides meaningful context through the commit reference.
Description check	✅ Passed	The description explains the revert rationale (workflow_dispatch requires actions:write which breaks fork PRs) but lacks detail in template sections like Related Issues, detailed PR context, and test information.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.github/workflows/pr-test.yml:
- Around line 669-694: Remove the duplicated step-summary header by deleting the
final echo that writes "Test Results Summary" (the line echo "Test Results
Summary" >> $GITHUB_STEP_SUMMARY) so only the initial "## Test Results Summary"
remains; ensure you do not modify the check_status function or other echo lines
that add job entries to $GITHUB_STEP_SUMMARY.

🧹 Nitpick comments (2)

.github/workflows/pr-test.yml (2)
159-160: TODO: Re-add ^\.github/ to skip patterns before merging.

The comment on Line 159 explicitly states this pattern needs to be added back. Without it, CI changes (like this PR itself) won't be skippable, wasting spot instance time on unnecessary full builds. This is easy to forget during merge—consider adding it now.
Proposed fix
-          # TODO (yongwww): Add back ^\.github/ before merging to main
-          SKIP_PATTERNS="README.md|^docs/|^docker/|^licenses/|^LICENSE$|^NOTICE$|^version\.txt$"
+          SKIP_PATTERNS="README.md|^docs/|^docker/|^licenses/|^LICENSE$|^NOTICE$|^version\.txt$|^\.github/"
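To illustrate the effect of the proposed pattern, the sketch below checks a list of changed paths against `SKIP_PATTERNS` with `grep -E`. The `should_skip_build` helper and the sample paths in the usage note are hypothetical; the actual setup job may compute `skip_build` differently.

```shell
#!/usr/bin/env bash
# Pattern list taken from the proposed fix above (including ^\.github/).
SKIP_PATTERNS='README.md|^docs/|^docker/|^licenses/|^LICENSE$|^NOTICE$|^version\.txt$|^\.github/'

# Hypothetical helper: succeeds (exit 0) only when every changed path
# matches one of the skip patterns, i.e. a full build could be skipped.
should_skip_build() {
  local changed_files=$1
  # An empty change list is treated as "do not skip" to stay on the safe side.
  [ -n "$changed_files" ] || return 1
  # grep -v selects paths that match none of the patterns; if any such
  # path exists, real code changed and the build must run.
  if printf '%s\n' "$changed_files" | grep -Evq -- "$SKIP_PATTERNS"; then
    return 1
  fi
  return 0
}
```

With this sketch, a change list containing only `docs/index.md` and `.github/workflows/pr-test.yml` would be skippable, while adding a source file such as `csrc/kernel.cu` would force a build.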
235-294: Significant duplication across the three analyze-failure jobs.

The log-analysis logic (download logs, detect zip format, grep for spot termination markers) is repeated nearly verbatim in analyze-aot-failure, analyze-gpu-a10g-failure, and analyze-gpu-t4-failure. Consider extracting this into a shared composite action or a reusable shell script (e.g., scripts/analyze_spot_failure.sh) that accepts the job-name filter as a parameter. This would reduce ~150 lines of duplication and make the spot-termination detection patterns easier to maintain in one place.
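One possible shape for such a shared script is sketched below. The marker strings, the function name, and the log-file argument are assumptions for illustration; the workflow's real detection patterns and log download steps are not shown in this review.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/analyze_spot_failure.sh.
# The marker list is illustrative only, not the workflow's real patterns.
SPOT_MARKERS='spot instance terminat|runner has received a shutdown signal|lost communication with the server'

# is_spot_termination LOG_FILE
# Prints "true" if the log contains one of the markers, otherwise "false".
is_spot_termination() {
  local log_file=$1
  # A missing log is reported as "false" so a real failure is never hidden.
  if [ -f "$log_file" ] && grep -Eiq -- "$SPOT_MARKERS" "$log_file"; then
    echo "true"
  else
    echo "false"
  fi
}
```

Each analyze job could then call `is_spot_termination` on its own downloaded log and forward the result as a job output, keeping the patterns in a single place.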

coderabbitai · 2026-02-08T22:05:00Z

.github/workflows/pr-test.yml

+          echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY
+
+          # Check if CI was skipped due to permissions
+          if [ "${{ needs.gate.outputs.authorized }}" != "true" ]; then
+            echo "CI skipped (pending authorization)" >> $GITHUB_STEP_SUMMARY
+            echo "A contributor in @flashinfer-ai/ci-users can comment \`@flashinfer-bot run\` to approve." >> $GITHUB_STEP_SUMMARY
+            exit 0
+          fi
+          # Helper function to check job status
+          check_status() {
+            local name=$1 skip=$2 spot=$3 spot_term=$4 rerun=$5
+            echo "$name" >> $GITHUB_STEP_SUMMARY
+            if [ "$skip" == "true" ]; then
+              echo "- Status: Skipped" >> $GITHUB_STEP_SUMMARY
+            elif [ "$spot" == "success" ]; then
+              echo "- Status: Passed (spot)" >> $GITHUB_STEP_SUMMARY
+            elif [ "$spot_term" == "true" ] && [ "$rerun" == "success" ]; then
+              echo "- Status: Passed (on-demand rerun)" >> $GITHUB_STEP_SUMMARY
+            else
+              echo "- Status: Failed" >> $GITHUB_STEP_SUMMARY
+              return 1
+            fi
+            return 0
+          }
+
+          echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY


⚠️ Potential issue | 🟡 Minor

Duplicate "Test Results Summary" header in step summary.

Line 669 writes ## Test Results Summary and Line 694 writes Test Results Summary again. The second one appears to be a leftover.

Proposed fix

           echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY
           # Check if CI was skipped due to permissions
           ...
           fi
-          echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
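To see how the helper and a single header fit together, here is a self-contained sketch that runs outside GitHub Actions by pointing `GITHUB_STEP_SUMMARY` at a temporary file. The `check_status` body follows the diff above with variable quoting added; the job names and status values passed to it are made up for illustration.

```shell
#!/usr/bin/env bash
# Local dry run: point the step summary at a temporary file.
GITHUB_STEP_SUMMARY=$(mktemp)

# Same logic as check_status in the diff above, with quoting added.
check_status() {
  local name=$1 skip=$2 spot=$3 spot_term=$4 rerun=$5
  echo "$name" >> "$GITHUB_STEP_SUMMARY"
  if [ "$skip" = "true" ]; then
    echo "- Status: Skipped" >> "$GITHUB_STEP_SUMMARY"
  elif [ "$spot" = "success" ]; then
    echo "- Status: Passed (spot)" >> "$GITHUB_STEP_SUMMARY"
  elif [ "$spot_term" = "true" ] && [ "$rerun" = "success" ]; then
    echo "- Status: Passed (on-demand rerun)" >> "$GITHUB_STEP_SUMMARY"
  else
    echo "- Status: Failed" >> "$GITHUB_STEP_SUMMARY"
    return 1
  fi
  return 0
}

# The header is written exactly once, followed by a blank line.
echo "## Test Results Summary" >> "$GITHUB_STEP_SUMMARY"
echo "" >> "$GITHUB_STEP_SUMMARY"

# Hypothetical job names and results, for illustration only.
failed=0
check_status "AOT Build" "false" "success" "false" "skipped" || failed=1
check_status "GPU Tests (A10G)" "false" "failure" "true" "success" || failed=1
check_status "GPU Tests (T4)" "false" "failure" "false" "skipped" || failed=1

cat "$GITHUB_STEP_SUMMARY"
```

In this run the first two jobs count as passed (one on spot, one after an on-demand rerun), and the third is reported as failed because its spot job failed without a detected spot termination.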

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

           echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY

           # Check if CI was skipped due to permissions
           if [ "${{ needs.gate.outputs.authorized }}" != "true" ]; then
             echo "CI skipped (pending authorization)" >> $GITHUB_STEP_SUMMARY
             echo "A contributor in @flashinfer-ai/ci-users can comment \`@flashinfer-bot run\` to approve." >> $GITHUB_STEP_SUMMARY
             exit 0
           fi
           # Helper function to check job status
           check_status() {
             local name=$1 skip=$2 spot=$3 spot_term=$4 rerun=$5
             echo "$name" >> $GITHUB_STEP_SUMMARY
             if [ "$skip" == "true" ]; then
               echo "- Status: Skipped" >> $GITHUB_STEP_SUMMARY
             elif [ "$spot" == "success" ]; then
               echo "- Status: Passed (spot)" >> $GITHUB_STEP_SUMMARY
             elif [ "$spot_term" == "true" ] && [ "$rerun" == "success" ]; then
               echo "- Status: Passed (on-demand rerun)" >> $GITHUB_STEP_SUMMARY
             else
               echo "- Status: Failed" >> $GITHUB_STEP_SUMMARY
               return 1
             fi
             return 0
           }

-          echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY

Revert "ci: refactor PR tests to hide failed spot jobs from PR status (…

05455c0

…#2500)" This reverts commit d5eaa42.

coderabbitai bot reviewed Feb 8, 2026

yongwww mentioned this pull request Feb 11, 2026

feat: Add TRTLLM-Gen Skip-Softmax kernels for prefill and decode #2477

Merged

5 tasks

yzh119 approved these changes Feb 11, 2026

yzh119 merged commit afcdd80 into main Feb 11, 2026
29 of 33 checks passed

yzh119 deleted the revert-pr-2500 branch February 11, 2026 16:37

coderabbitai bot mentioned this pull request Feb 19, 2026

ci: fix H100 cleanup #2590

Merged

5 tasks

coderabbitai bot mentioned this pull request Mar 11, 2026

fix: block PR merge when CI is skipped due to pending authorization #2761

Merged

5 tasks
