cicd / testing: Add xfails tracker script #2227
Conversation
Walkthrough

A new Python script for scanning test suites to detect and aggregate pytest xfail markers. It uses AST-based analysis to identify decorator-based xfails, parameterized xfails, and runtime xfail calls, then generates a formatted report grouped by failure reason with metadata for each occurrence.
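The three xfail shapes the walkthrough mentions can be sketched with a stdlib-only scan. Both the sample test source and the counting heuristic below are hypothetical illustrations, not the PR's actual implementation:

```python
import ast

# Hypothetical sample showing the three xfail forms the script looks for.
SAMPLE = '''
import pytest

@pytest.mark.xfail(reason="flaky on CI")
def test_decorated():
    assert False

@pytest.mark.parametrize("x", [1, pytest.param(2, marks=pytest.mark.xfail(reason="known bug"))])
def test_parametrized(x):
    assert x < 3

def test_runtime():
    pytest.xfail("unsupported backend")
'''


def count_xfails(source):
    """Rough per-kind count of xfail occurrences, using only the stdlib ast module."""
    tree = ast.parse(source)
    counts = {"decorator": 0, "param": 0, "runtime": 0}

    def is_mark_xfail(call):
        f = call.func
        return (isinstance(f, ast.Attribute) and f.attr == "xfail"
                and isinstance(f.value, ast.Attribute) and f.value.attr == "mark")

    # First pass: remember pytest.mark.xfail calls nested inside pytest.param(...)
    param_marks = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
                and node.func.attr == "param"):
            for kw in node.keywords:
                for inner in ast.walk(kw.value):
                    if isinstance(inner, ast.Call) and is_mark_xfail(inner):
                        param_marks.add(id(inner))

    # Second pass: classify every call
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        if is_mark_xfail(node):
            counts["param" if id(node) in param_marks else "decorator"] += 1
        elif (isinstance(node.func, ast.Attribute) and node.func.attr == "xfail"
              and isinstance(node.func.value, ast.Name)
              and node.func.value.id == "pytest"):
            counts["runtime"] += 1
    return counts
```

Running `count_xfails(SAMPLE)` classifies one occurrence of each kind, which matches the three detection branches described above.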
Sequence Diagram

```mermaid
sequenceDiagram
    participant Main as main()
    participant Discovery as find_test_files()
    participant Parser as AST Parser
    participant Collector as XFailCollector
    participant Aggregator as collect_xfails()
    participant Grouper as group_xfails_by_reason()
    participant Formatter as format_table()
    participant Output as Output Report
    Main->>Discovery: Scan tests directory
    Discovery-->>Main: Return test file paths
    Main->>Aggregator: Process all test files
    loop For each test file
        Aggregator->>Parser: Parse Python file
        Parser->>Collector: Visit AST nodes
        alt Detect Decorator Xfail
            Collector->>Collector: Extract @pytest.mark.xfail
        else Detect Parameter Xfail
            Collector->>Collector: Extract pytest.param marks
        else Detect Runtime Xfail
            Collector->>Collector: Extract pytest.xfail() calls
        end
        Collector-->>Aggregator: XFailInfo instances
    end
    Aggregator-->>Main: List of all XFailInfo
    Main->>Grouper: Group by reason
    Grouper-->>Main: Grouped xfails dict
    Main->>Formatter: Format for output
    Formatter->>Formatter: Build summary & tables
    Formatter-->>Main: Formatted report string
    Main->>Output: Print report
```
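The grouping step in the diagram can be sketched as follows. This uses plain dicts as a simplified stand-in for the script's `XFailInfo` data model:

```python
from collections import defaultdict


def group_xfails_by_reason(xfails):
    """Group xfail records by reason, most frequent reason first.

    `xfails` is a list of dicts with a "reason" key; records with no
    reason are bucketed under a placeholder so they still show up.
    """
    groups = defaultdict(list)
    for info in xfails:
        groups[info.get("reason") or "<no reason given>"].append(info)
    # Sort groups by occurrence count, descending, for the report
    return dict(sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True))
```

A caller can then iterate the returned dict to emit one table section per reason, largest group first.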
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Summary of Changes

Hello @kahyunnam, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new utility designed to improve the tracking and management of technical debt within the test suite by adding an automated script to report xfail markers.
Code Review
This pull request introduces a very useful script for tracking pytest.mark.xfail markers, which will be a great tool for managing technical debt in tests. The implementation using Python's ast module is solid. I've provided a few suggestions to improve robustness, maintainability, and readability of the script. The most important one is to handle aliased imports of pytest to make the script more resilient to different coding styles. Other suggestions focus on refactoring for clarity, adding type hints, and improving compatibility messages for older Python versions. Overall, great work on this utility!
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/xfails_tracker.py (1)
1-10: Document Python version requirement. The script uses `ast.unparse()` (line 209), which requires Python 3.9+. Consider adding this requirement to the docstring to help users understand compatibility. Apply this diff to document the requirement:
```diff
 #!/usr/bin/env python3
 """
 XFails Tracker - Report Generator for pytest.mark.xfail markers

 This script scans the test suite for xfail markers and generates a report
 showing the total number of xfails and their reasons.

+Requires Python 3.9+ for full functionality (falls back to ast.dump on older versions).
+
 Usage:
     python scripts/xfails_tracker.py
 """
```
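A possible shape for the fallback mentioned in the suggested docstring (the helper name is illustrative; the actual script may structure this differently):

```python
import ast


def safe_unparse(node):
    """Render an AST node as source text where possible."""
    # ast.unparse exists only on Python 3.9+; fall back to a structural dump
    # so the report still contains something useful on older interpreters.
    if hasattr(ast, "unparse"):
        return ast.unparse(node)
    return ast.dump(node)
```

On 3.9+ this returns readable source like `x + 1`; on 3.8 it degrades to the `ast.dump` representation rather than crashing.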
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/xfails_tracker.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
scripts/xfails_tracker.py
272-272: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Deploy Docs
🔇 Additional comments (10)
scripts/xfails_tracker.py (10)
20-34: LGTM! The dataclass design is clean and captures all necessary metadata for tracking xfails. The type hint for `xfail_type` with inline documentation is helpful.
39-66: LGTM! The state management for nested functions is well-implemented with proper save/restore patterns. The test function detection logic correctly handles both naming conventions and decorator-based tests.
68-111: LGTM! The AST pattern matching for pytest calls and decorators is implemented correctly. The comment on line 78 helpfully explains why runtime xfails are checked outside test functions.
113-143: LGTM! The decorator handling correctly distinguishes between parameterized and bare decorators. The marks argument processing properly handles both single markers and collections.
157-188: LGTM! The xfail information extraction correctly handles pytest.mark.xfail semantics, where the first positional argument is the condition. The keyword argument processing properly handles reason, strict, and **kwargs cases.
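The keyword-argument handling described here might look roughly like this simplified sketch (not the script's actual code):

```python
import ast


def extract_xfail_kwargs(call):
    """Pull constant reason/strict values out of an ast.Call such as
    pytest.mark.xfail(reason="...", strict=True).

    Non-constant values and **kwargs (where kw.arg is None) are skipped.
    """
    info = {"reason": None, "strict": False}
    for kw in call.keywords:
        if kw.arg in info and isinstance(kw.value, ast.Constant):
            info[kw.arg] = kw.value.value
    return info
```

Parsing a decorator expression with `mode="eval"` yields the `ast.Call` directly, which makes the helper easy to exercise in isolation.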
214-241: LGTM! The runtime xfail extraction correctly handles the different semantics, where the first positional argument is the reason (not the condition, as in decorator xfails). The comment explaining why condition is None is helpful.
244-253: LGTM! The test file discovery logic correctly identifies test files by pattern and includes common helper files that may contain xfail calls. The use of `set()` to remove duplicates is appropriate.
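A hedged sketch of such discovery logic follows; the glob patterns here are illustrative guesses, not necessarily the PR's exact ones:

```python
from pathlib import Path


def find_test_files(root):
    """Collect test modules plus conftest helpers under `root`.

    Overlapping glob patterns can yield the same path twice, so a set()
    removes duplicates before the sorted list is returned.
    """
    root = Path(root)
    files = set(root.rglob("test_*.py"))
    files |= set(root.rglob("*_test.py"))
    files |= set(root.rglob("conftest.py"))  # helpers may call pytest.xfail()
    return sorted(files)
```

Returning a sorted list keeps the report order deterministic across runs.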
256-275: Broad exception catching is acceptable here. The broad `Exception` catch on line 272 (flagged by static analysis) is appropriate for a reporting tool that should continue processing remaining files even if one fails. Error messages are properly logged to stderr.
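The pattern being defended could be sketched like this (the function name and message format are hypothetical):

```python
import ast
import sys
from pathlib import Path


def parse_file(path):
    """Parse one file; log failures to stderr instead of aborting the scan.

    A report generator should survive one unreadable or syntactically
    broken file and keep processing the rest, hence the broad catch.
    """
    try:
        return ast.parse(Path(path).read_text(), filename=str(path))
    except Exception as exc:  # noqa: BLE001 - intentional: keep scanning
        print(f"warning: skipping {path}: {exc}", file=sys.stderr)
        return None
```

Callers simply skip `None` results, so a single bad file costs one warning line rather than the whole report.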
286-351: LGTM! The report formatting is well-structured, with both summary and detailed sections. The sorting by count, path relativization, and reason truncation all enhance readability.
354-377: LGTM! The main function correctly determines paths, handles missing directories, and separates progress messages (stderr) from the actual report (stdout), which is good practice for scripts that may be piped.
/bot run
yzh119 left a comment:
I tried the script and it looks cool. The next step could be creating a weekly job that runs this script and creates a GitHub issue.
A reference can be found at https://github.com/flashinfer-ai/flashinfer/blob/main/.github/workflows/update-codeowners.yml
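A hypothetical weekly workflow along those lines could look like the sketch below. The schedule, job names, and issue title are placeholders, not a proposal vetted against the referenced `update-codeowners.yml`:

```yaml
name: Weekly xfails report
on:
  schedule:
    - cron: "0 6 * * 1"   # Mondays, 06:00 UTC
  workflow_dispatch: {}

jobs:
  report:
    runs-on: ubuntu-latest
    permissions:
      issues: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Generate report
        run: python scripts/xfails_tracker.py > xfails.md
      - name: Open tracking issue
        env:
          GH_TOKEN: ${{ github.token }}
        run: gh issue create --title "Weekly xfails report" --body-file xfails.md
```

One open design question is whether to open a fresh issue each week or update a single pinned issue; the referenced workflow could guide that choice.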
📌 Description
Testing item from the open November roadmap todos. Adds an xfails reporting/tracking script to keep track of any technical debt noted in testing.
🔍 Related Issues
Related to this copilot PR which was closed earlier: #1733
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
- I have installed pre-commit by running `pip install pre-commit` (or used my preferred method).
- I have installed the hooks with `pre-commit install`.
- I have run the hooks with `pre-commit run --all-files` and fixed any reported issues.

🧪 Tests
- All tests pass (unittest, etc.).

Reviewer Notes