
cicd / testing: Add xfails tracker script #2227

Merged
yzh119 merged 1 commit into flashinfer-ai:main from kahyunnam:knam/xfails_tracker
Dec 21, 2025

Conversation

@kahyunnam
Collaborator

kahyunnam commented Dec 16, 2025

📌 Description

Testing item from the open November roadmap todos. Adds an xfails reporting/tracking script to keep track of technical debt noted in testing.

🔍 Related Issues

Related to this copilot PR which was closed earlier: #1733

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • New Features
    • Added a new xfails tracker script that automatically scans your test suite, collects all xfail markers and conditions, and generates a comprehensive report summarizing skipped and expected failures.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Contributor

coderabbitai bot commented Dec 16, 2025

Walkthrough

A new Python script scans the test suite to detect and aggregate pytest xfail markers. It uses AST-based analysis to identify decorator-based xfails, parameterized xfails, and runtime xfail calls, then generates a formatted report grouped by failure reason, with metadata for each occurrence.
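The decorator-detection half of this can be sketched with a small AST visitor in the spirit of the script's XFailCollector. The helper name and structure below are illustrative, not the actual implementation:

```python
# Minimal sketch of AST-based xfail decorator detection. Assumes only the
# stdlib ast module; names here are illustrative, not from the script itself.
import ast


def find_xfail_decorators(source: str):
    """Return (test_name, line_number) pairs for @pytest.mark.xfail decorators."""
    results = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for dec in node.decorator_list:
                # Handle both @pytest.mark.xfail and @pytest.mark.xfail(...)
                target = dec.func if isinstance(dec, ast.Call) else dec
                if (isinstance(target, ast.Attribute)
                        and target.attr == "xfail"
                        and isinstance(target.value, ast.Attribute)
                        and target.value.attr == "mark"):
                    results.append((node.name, dec.lineno))
    return results
```

The real script additionally handles parameterized and runtime xfails and records reason/condition/strict metadata, but the traversal pattern is the same.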

Changes

Cohort / File(s) Summary
XFail Tracker Script
scripts/xfails_tracker.py
New script that scans Python test files via AST analysis to detect pytest xfail markers (decorators, parameters, runtime calls), aggregates findings by reason, and outputs a formatted report with test paths, line numbers, conditions, and strictness flags. Includes XFailInfo dataclass, XFailCollector AST visitor, and utility functions for file discovery, collection, grouping, and formatting.

Sequence Diagram

sequenceDiagram
    participant Main as main()
    participant Discovery as find_test_files()
    participant Parser as AST Parser
    participant Collector as XFailCollector
    participant Aggregator as collect_xfails()
    participant Grouper as group_xfails_by_reason()
    participant Formatter as format_table()
    participant Output as Output Report

    Main->>Discovery: Scan tests directory
    Discovery-->>Main: Return test file paths
    
    Main->>Aggregator: Process all test files
    
    loop For each test file
        Aggregator->>Parser: Parse Python file
        Parser->>Collector: Visit AST nodes
        
        alt Detect Decorator Xfail
            Collector->>Collector: Extract `@pytest.mark.xfail`
        else Detect Parameter Xfail
            Collector->>Collector: Extract pytest.param marks
        else Detect Runtime Xfail
            Collector->>Collector: Extract pytest.xfail() calls
        end
        
        Collector-->>Aggregator: XFailInfo instances
    end
    
    Aggregator-->>Main: List of all XFailInfo
    
    Main->>Grouper: Group by reason
    Grouper-->>Main: Grouped xfails dict
    
    Main->>Formatter: Format for output
    Formatter->>Formatter: Build summary & tables
    Formatter-->>Main: Formatted report string
    
    Main->>Output: Print report

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • AST visitor pattern implementation: Verify correctness of decorator detection (visit_FunctionDef, _check_decorator) and call node handling (visit_Call), especially edge cases with nested or chained decorators
  • Multi-type xfail detection: Review logic for distinguishing decorator-based, parameterized, and runtime xfail calls; ensure all three paths correctly extract reason, condition, and strict flags
  • String conversion robustness: Examine _ast_to_string() and _get_string_value() methods for proper AST node serialization and handling of complex expressions
  • Error handling and edge cases: Check graceful handling of malformed files, missing attributes, and syntax errors during AST traversal
  • Grouping and formatting logic: Validate data aggregation and report generation, including path normalization and output structure

Poem

🐰 Hark! I've tracked the xfails in your test—
No more guessing which ones need rest,
AST parsed and reasons found,
In a formatted report, all sound!
Your pytest suite's secrets, now on the table—
Tracked by rabbit, reliable and able! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a script for tracking xfail markers in the test suite.
Description check | ✅ Passed | The description provides context with a roadmap reference and a related issue, and includes all checklist sections with pre-commit and test items marked complete, though the reviewer notes section is empty.
Docstring Coverage | ✅ Passed | Docstring coverage is 85.00%, which is sufficient. The required threshold is 80.00%.

@gemini-code-assist
Contributor

Summary of Changes

Hello @kahyunnam, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new utility designed to improve the tracking and management of technical debt within the test suite. By adding an automated script to report pytest.mark.xfail markers, the project gains better visibility into expected test failures and their associated reasons, aligning with the November roadmap's focus on testing infrastructure improvements.

Highlights

  • New XFail Tracking Script: A new Python script, scripts/xfails_tracker.py, has been added to the repository.
  • Automated XFail Reporting: This script automatically scans the test suite to identify and report all pytest.mark.xfail markers.
  • Detailed XFail Analysis: The script collects comprehensive information for each xfail, including its file path, line number, test name, reason, condition, strictness, and type (decorator, parameter, or runtime).
  • Structured Report Generation: It generates a formatted report that summarizes xfails by their reasons and provides a detailed breakdown of each xfailed test.
  • AST-based Parsing: The implementation leverages Python's ast module for robust parsing of test files to accurately detect xfail markers.
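The three marker forms listed above can be illustrated with a small self-contained example. The sample test code and the helper below are hypothetical sketches (the helper only counts the runtime form, to show how AST shape distinguishes `pytest.xfail(...)` from `pytest.mark.xfail(...)`):

```python
# Illustrative examples of the three xfail forms the script reports, plus a
# small AST check for the runtime form. Helper name is an assumption.
import ast

EXAMPLE = '''
import pytest

@pytest.mark.xfail(reason="known bug", strict=True)      # decorator form
def test_decorated():
    assert False

@pytest.mark.parametrize("n", [
    1,
    pytest.param(2, marks=pytest.mark.xfail(reason="edge case")),  # parameter form
])
def test_param(n):
    assert n >= 1

def test_runtime():
    pytest.xfail("unsupported on this platform")         # runtime form
'''


def count_runtime_xfails(source: str) -> int:
    """Count pytest.xfail(...) call sites (func is Name 'pytest', not 'mark')."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "xfail"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "pytest"):
            count += 1
    return count
```

Requiring `node.func.value` to be the bare name `pytest` is what excludes the decorator and parameter forms, whose `xfail` attribute hangs off `pytest.mark` instead.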

@kahyunnam kahyunnam mentioned this pull request Dec 16, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a very useful script for tracking pytest.mark.xfail markers, which will be a great tool for managing technical debt in tests. The implementation using Python's ast module is solid. I've provided a few suggestions to improve robustness, maintainability, and readability of the script. The most important one is to handle aliased imports of pytest to make the script more resilient to different coding styles. Other suggestions focus on refactoring for clarity, adding type hints, and improving compatibility messages for older Python versions. Overall, great work on this utility!

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
scripts/xfails_tracker.py (1)

1-10: Document Python version requirement.

The script uses ast.unparse() (line 209) which requires Python 3.9+. Consider adding this requirement to the docstring to help users understand compatibility.

Apply this diff to document the requirement:

 #!/usr/bin/env python3
 """
 XFails Tracker - Report Generator for pytest.mark.xfail markers
 
 This script scans the test suite for xfail markers and generates a report
 showing the total number of xfails and their reasons.
 
+Requires Python 3.9+ for full functionality (falls back to ast.dump on older versions).
+
 Usage:
     python scripts/xfails_tracker.py
 """
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02b4c5a and 18623fd.

📒 Files selected for processing (1)
  • scripts/xfails_tracker.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
scripts/xfails_tracker.py

272-272: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Deploy Docs
🔇 Additional comments (10)
scripts/xfails_tracker.py (10)

20-34: LGTM!

The dataclass design is clean and captures all necessary metadata for tracking xfails. The type hint for xfail_type with inline documentation is helpful.


39-66: LGTM!

The state management for nested functions is well-implemented with proper save/restore patterns. The test function detection logic correctly handles both naming conventions and decorator-based tests.


68-111: LGTM!

The AST pattern matching for pytest calls and decorators is implemented correctly. The comment on line 78 helpfully explains why runtime xfails are checked outside test functions.


113-143: LGTM!

The decorator handling correctly distinguishes between parameterized and bare decorators. The marks argument processing properly handles both single markers and collections.


157-188: LGTM!

The xfail information extraction correctly handles pytest.mark.xfail semantics where the first positional argument is the condition. The keyword argument processing properly handles reason, strict, and **kwargs cases.
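The semantics described above can be sketched as a small extractor for a `@pytest.mark.xfail(...)` call node: the first positional argument is the condition, while `reason` and `strict` arrive as keywords. The helper name is an assumption, not the script's actual function:

```python
# Hedged sketch of pulling (condition, reason, strict) out of an xfail
# decorator Call node. Requires Python 3.9+ for ast.unparse().
import ast


def extract_xfail_args(call: ast.Call):
    """Return (condition_src, reason, strict) for a pytest.mark.xfail call."""
    # First positional argument, if any, is the condition expression.
    condition = ast.unparse(call.args[0]) if call.args else None
    reason = None
    strict = False
    for kw in call.keywords:
        if kw.arg == "reason" and isinstance(kw.value, ast.Constant):
            reason = kw.value.value
        elif kw.arg == "strict" and isinstance(kw.value, ast.Constant):
            strict = bool(kw.value.value)
    return condition, reason, strict
```

Note this is the decorator convention only; for runtime `pytest.xfail(...)` the first positional argument is the reason, as the later review comment points out.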


214-241: LGTM!

The runtime xfail extraction correctly handles the different semantics where the first positional argument is the reason (not condition as in decorator xfails). The comment explaining why condition is None is helpful.


244-253: LGTM!

The test file discovery logic correctly identifies test files by pattern and includes common helper files that may contain xfail calls. The use of set() to remove duplicates is appropriate.
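The discovery pattern described here can be sketched with `pathlib` globs plus a set for deduplication. The directory layout and helper name are assumptions for illustration:

```python
# Sketch of glob-based test file discovery: match test_*.py and *_test.py,
# deduplicate with a set, and return a stable sorted order.
from pathlib import Path


def find_test_files(tests_dir: str):
    """Return sorted paths of likely pytest files under tests_dir."""
    root = Path(tests_dir)
    files = set(root.rglob("test_*.py")) | set(root.rglob("*_test.py"))
    return sorted(files)
```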


256-275: Broad exception catching is acceptable here.

The broad Exception catch on line 272 (flagged by static analysis) is appropriate for a reporting tool that should continue processing remaining files even if one fails. Error messages are properly logged to stderr.


286-351: LGTM!

The report formatting is well-structured with both summary and detailed sections. The sorting by count, path relativization, and reason truncation all enhance readability.


354-377: LGTM!

The main function correctly determines paths, handles missing directories, and separates progress messages (stderr) from the actual report (stdout), which is good practice for scripts that may be piped.

@yzh119
Collaborator

yzh119 commented Dec 21, 2025

/bot run

@flashinfer-bot
Collaborator

GitLab MR !208 has been created, and the CI pipeline #40566329 is currently running. I'll report back once the pipeline job completes.

Collaborator

@yzh119 yzh119 left a comment


I tried the script and it looks cool. The next step could be creating a weekly job that runs this script and opens a GitHub issue.

A reference can be found at https://github.com/flashinfer-ai/flashinfer/blob/main/.github/workflows/update-codeowners.yml

@yzh119 yzh119 merged commit 4a2000d into flashinfer-ai:main Dec 21, 2025
4 checks passed