Skip to content

Revert "ci: refactor PR tests to hide failed spot jobs from PR status…#2524

Merged
yzh119 merged 1 commit intomainfrom
revert-pr-2500
Feb 11, 2026
Merged

Revert "ci: refactor PR tests to hide failed spot jobs from PR status…#2524
yzh119 merged 1 commit intomainfrom
revert-pr-2500

Conversation

@yongwww
Copy link
Member

@yongwww yongwww commented Feb 8, 2026

… (#2500)"

This reverts commit d5eaa42.

Revert PR #2500. The workflow_dispatch architecture requires actions:write on GITHUB_TOKEN, which is read-only for fork PRs. This breaks CI for all fork PR contributors even after run-ci approval.

📌 Description

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Chores
    • Restructured the PR validation pipeline with enhanced permission gating and staging.
    • Improved automated failure handling for infrastructure interruptions with dynamic rerun mechanisms.
    • Streamlined test result aggregation and reporting across multiple test environments.

@gemini-code-assist
Copy link
Contributor

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 8, 2026

📝 Walkthrough

Walkthrough

The PR consolidates two GitHub Actions workflows by deleting the separate pr-test-runner.yml file and integrating its orchestration logic into pr-test.yml. The refactored workflow introduces a permission gate, multi-stage testing pipeline with spot/on-demand rerun handling, and centralized test results aggregation across AOT and GPU test jobs.

Changes

Cohort / File(s) Summary
Workflow Consolidation and Reorganization
\.github/workflows/pr-test-runner.yml, \.github/workflows/pr-test.yml
Deleted pr-test-runner.yml and integrated its multi-job orchestration logic into pr-test.yml. Introduced new permission gating, PR-diff-based skip logic in setup stage, spot termination detection with dynamic rerun matrices, and centralized test results aggregation. Modified job permissions from actions:write to actions:read. Reorganized workflow from single orchestrator pattern to modular, multi-stage pipeline with explicit job prerequisites and outputs.

Sequence Diagram

sequenceDiagram
    participant GH as GitHub
    participant Gate as Gate Job
    participant Setup as Setup Job
    participant Test as Test Jobs<br/>(AOT/GPU)
    participant Analyze as Failure Analysis
    participant Rerun as Rerun Jobs
    participant Summary as Summary Job
    
    GH->>Gate: Trigger PR workflow
    Gate->>Gate: Check PR labels &<br/>contributor status
    Gate-->>GH: Output authorization
    
    alt Gate passes
        GH->>Setup: Run setup stage
        Setup->>GH: Compute skip_build &<br/>docker_tag from PR diff
        Setup-->>GH: Output setup config
        
        GH->>Test: Execute test matrix<br/>(spot runners)
        Test->>Test: AOT build & GPU<br/>JIT tests
        Test-->>GH: Test results
        
        GH->>Analyze: Analyze job logs
        Analyze->>Analyze: Detect spot<br/>termination markers
        Analyze-->>GH: Output rerun_matrix
        
        alt Spot termination detected
            GH->>Rerun: Execute rerun matrix<br/>(on-demand instances)
            Rerun->>Rerun: Re-execute tests
            Rerun-->>GH: Updated results
        end
        
        GH->>Summary: Aggregate results
        Summary->>Summary: Compose test summary<br/>& update check runs
        Summary-->>GH: Final status
    else Gate fails
        Gate-->>GH: Skip downstream jobs
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

run-ci

Suggested reviewers

  • yzh119
  • dierksen
  • bkryu

Poem

🐰 A workflow refined with a gated permission door,
Spot termination worries haunt us no more!
Setup detects changes, tests split and rerun,
With checks aggregated—CI harmony's won! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately identifies the primary change—reverting a previous CI workflow refactor—and provides meaningful context through the commit reference.
Description check ✅ Passed The description explains the revert rationale (workflow_dispatch requires actions:write which breaks fork PRs) but lacks detail in template sections like Related Issues, detailed PR context, and test information.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch revert-pr-2500

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/pr-test.yml:
- Around line 669-694: Remove the duplicated step-summary header by deleting the
final echo that writes "Test Results Summary" (the line echo "Test Results
Summary" >> $GITHUB_STEP_SUMMARY) so only the initial "## Test Results Summary"
remains; ensure you do not modify the check_status function or other echo lines
that add job entries to $GITHUB_STEP_SUMMARY.
🧹 Nitpick comments (2)
.github/workflows/pr-test.yml (2)

159-160: TODO: Re-add ^\.github/ to skip patterns before merging.

The comment on Line 159 explicitly states this pattern needs to be added back. Without it, CI changes (like this PR itself) won't be skippable, wasting spot instance time on unnecessary full builds. This is easy to forget during merge—consider adding it now.

Proposed fix
-          # TODO (yongwww): Add back ^\.github/ before merging to main
-          SKIP_PATTERNS="README.md|^docs/|^docker/|^licenses/|^LICENSE$|^NOTICE$|^version\.txt$"
+          SKIP_PATTERNS="README.md|^docs/|^docker/|^licenses/|^LICENSE$|^NOTICE$|^version\.txt$|^\.github/"

235-294: Significant duplication across the three analyze-failure jobs.

The log-analysis logic (download logs, detect zip format, grep for spot termination markers) is repeated nearly verbatim in analyze-aot-failure, analyze-gpu-a10g-failure, and analyze-gpu-t4-failure. Consider extracting this into a shared composite action or a reusable shell script (e.g., scripts/analyze_spot_failure.sh) that accepts the job-name filter as a parameter. This would reduce ~150 lines of duplication and make the spot-termination detection patterns easier to maintain in one place.

Comment on lines +669 to +694
echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY

# Check if CI was skipped due to permissions
if [ "${{ needs.gate.outputs.authorized }}" != "true" ]; then
echo "CI skipped (pending authorization)" >> $GITHUB_STEP_SUMMARY
echo "A contributor in @flashinfer-ai/ci-users can comment \`@flashinfer-bot run\` to approve." >> $GITHUB_STEP_SUMMARY
exit 0
fi
# Helper function to check job status
check_status() {
local name=$1 skip=$2 spot=$3 spot_term=$4 rerun=$5
echo "$name" >> $GITHUB_STEP_SUMMARY
if [ "$skip" == "true" ]; then
echo "- Status: Skipped" >> $GITHUB_STEP_SUMMARY
elif [ "$spot" == "success" ]; then
echo "- Status: Passed (spot)" >> $GITHUB_STEP_SUMMARY
elif [ "$spot_term" == "true" ] && [ "$rerun" == "success" ]; then
echo "- Status: Passed (on-demand rerun)" >> $GITHUB_STEP_SUMMARY
else
echo "- Status: Failed" >> $GITHUB_STEP_SUMMARY
return 1
fi
return 0
}

echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicate "Test Results Summary" header in step summary.

Line 669 writes ## Test Results Summary and Line 694 writes Test Results Summary again. The second one appears to be a leftover.

Proposed fix
          echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY

          # Check if CI was skipped due to permissions
          ...
          fi
-          echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY
# Check if CI was skipped due to permissions
if [ "${{ needs.gate.outputs.authorized }}" != "true" ]; then
echo "CI skipped (pending authorization)" >> $GITHUB_STEP_SUMMARY
echo "A contributor in @flashinfer-ai/ci-users can comment \`@flashinfer-bot run\` to approve." >> $GITHUB_STEP_SUMMARY
exit 0
fi
# Helper function to check job status
check_status() {
local name=$1 skip=$2 spot=$3 spot_term=$4 rerun=$5
echo "$name" >> $GITHUB_STEP_SUMMARY
if [ "$skip" == "true" ]; then
echo "- Status: Skipped" >> $GITHUB_STEP_SUMMARY
elif [ "$spot" == "success" ]; then
echo "- Status: Passed (spot)" >> $GITHUB_STEP_SUMMARY
elif [ "$spot_term" == "true" ] && [ "$rerun" == "success" ]; then
echo "- Status: Passed (on-demand rerun)" >> $GITHUB_STEP_SUMMARY
else
echo "- Status: Failed" >> $GITHUB_STEP_SUMMARY
return 1
fi
return 0
}
echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY
echo "## Test Results Summary" >> $GITHUB_STEP_SUMMARY
# Check if CI was skipped due to permissions
if [ "${{ needs.gate.outputs.authorized }}" != "true" ]; then
echo "CI skipped (pending authorization)" >> $GITHUB_STEP_SUMMARY
echo "A contributor in `@flashinfer-ai/ci-users` can comment \`@flashinfer-bot run\` to approve." >> $GITHUB_STEP_SUMMARY
exit 0
fi
# Helper function to check job status
check_status() {
local name=$1 skip=$2 spot=$3 spot_term=$4 rerun=$5
echo "$name" >> $GITHUB_STEP_SUMMARY
if [ "$skip" == "true" ]; then
echo "- Status: Skipped" >> $GITHUB_STEP_SUMMARY
elif [ "$spot" == "success" ]; then
echo "- Status: Passed (spot)" >> $GITHUB_STEP_SUMMARY
elif [ "$spot_term" == "true" ] && [ "$rerun" == "success" ]; then
echo "- Status: Passed (on-demand rerun)" >> $GITHUB_STEP_SUMMARY
else
echo "- Status: Failed" >> $GITHUB_STEP_SUMMARY
return 1
fi
return 0
}
echo "" >> $GITHUB_STEP_SUMMARY
🤖 Prompt for AI Agents
In @.github/workflows/pr-test.yml around lines 669 - 694, Remove the duplicated
step-summary header by deleting the final echo that writes "Test Results
Summary" (the line echo "Test Results Summary" >> $GITHUB_STEP_SUMMARY) so only
the initial "## Test Results Summary" remains; ensure you do not modify the
check_status function or other echo lines that add job entries to
$GITHUB_STEP_SUMMARY.

@yzh119 yzh119 merged commit afcdd80 into main Feb 11, 2026
29 of 33 checks passed
@yzh119 yzh119 deleted the revert-pr-2500 branch February 11, 2026 16:37
@coderabbitai coderabbitai bot mentioned this pull request Feb 19, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants