bugfix: fix claude skills by yzh119 · Pull Request #2275 · flashinfer-ai/flashinfer

yzh119 · 2025-12-31T07:26:49Z

📌 Description

The skills defined in #2240 do not take effect because of missing metadata and a wrong file name.
This PR fixes both issues.

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

I have installed pre-commit by running pip install pre-commit (or used your preferred method).
I have installed the hooks with pre-commit install.
I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

Tests have been added or updated as needed.
All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

New Features
- Scale kernel now available as a public API for end-users
- New benchmarking guide and tools for kernel performance measurement
Documentation
- Updated tutorial documentation with structured metadata
- Added comprehensive benchmarking guidance with examples
Tests
- Implemented unit tests to validate kernel functionality


coderabbitai · 2025-12-31T07:27:00Z

📝 Walkthrough


This change introduces a new CUDA scale kernel to the flashinfer public API, including tutorial documentation with benchmarking guidance, a Python API wrapper, test coverage, and a benchmark script demonstrating performance measurement across multiple sizes and data types.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Tutorial Documentation: `.claude/skills/add-cuda-kernel/SKILL.md`, `.claude/skills/benchmark-kernel/SKILL.md`, `.claude/skills/debug-cuda-crash/SKILL.md` | Added new "Step 10: Add Benchmark" tutorial section with `benchmarks/bench_scale.py` guidance; added YAML front matter metadata headers to skill files for better organization and discoverability. |
| Core API & Registration: `flashinfer/scale.py`, `flashinfer/__init__.py`, `flashinfer/aot.py` | Introduced new `scale.py` module as public Python API for the CUDA scale kernel; updated `__init__.py` to export the new API; modified `aot.py` to register AOT components. |
| Tests & Benchmarks: `tests/test_scale.py`, `benchmarks/bench_scale.py` | Added unit tests for scale kernel validation; created benchmark script measuring `flashinfer.scale` performance across multiple sizes and data types using CUPTI fallback timing. |
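The front matter change summarized above can be checked mechanically. The following is a minimal sketch, assuming each skill lives in a file named `SKILL.md` that opens with a `---`-delimited YAML block containing `name` and `description` keys. The required keys and the helpers `parse_front_matter` and `check_skill_file` are assumptions for illustration and are not taken from this PR's diff. The parser uses only the standard library and handles simple top-level `key: value` lines, not full YAML.

```python
from pathlib import Path
from typing import Dict, List

# Assumed set of required metadata keys; adjust to the actual skill format.
REQUIRED_KEYS = ("name", "description")


def parse_front_matter(text: str) -> Dict[str, str]:
    """Return top-level `key: value` pairs from a leading `---` block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # file does not start with front matter
    metadata: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter found
        key, sep, value = line.partition(":")
        # Skip indented lines (nested YAML) and lines without a colon.
        if sep and key.strip() and not line.startswith((" ", "\t")):
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as missing front matter


def check_skill_file(path: Path) -> List[str]:
    """List problems found in one skill file (empty list means OK)."""
    problems: List[str] = []
    if path.name != "SKILL.md":
        problems.append(f"{path}: expected file name SKILL.md")
    metadata = parse_front_matter(path.read_text(encoding="utf-8"))
    for key in REQUIRED_KEYS:
        if not metadata.get(key):
            problems.append(f"{path}: missing front matter key '{key}'")
    return problems
```

Calling `check_skill_file` on every markdown file under `.claude/skills/` would flag both failure modes named in the PR description: a misnamed file and a file without metadata.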

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

agent: add CLAUDE.md and claude skills #2240: Directly related. It modifies the same scale implementation and integration points (`flashinfer/scale.py`, `flashinfer/__init__.py`, `flashinfer/aot.py`) with overlapping test coverage.

Poem

🐰 A kernel scales so bright and new,
With benchmarks charting what it can do,
CUPTI whispers timings true,
From tiny ops to massive crews—
The scale kernel hops right through! ⚡

Pre-merge checks

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'bugfix: fix claude skills' is vague and generic, using non-descriptive terms that don't convey meaningful information about what specific skills are being fixed or what the actual changes accomplish. | Use a more specific title that describes the actual changes, such as 'Add YAML metadata to skill definition files' or 'Fix skill definitions with required metadata and correct filenames'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description explains the issue (missing metadata and wrong filename in skills) and references PR #2240, meeting the basic requirements of the template with a clear explanation and completed checklists. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 835a015 and 48b6e4e.

📒 Files selected for processing (3)

.claude/skills/add-cuda-kernel/SKILL.md
.claude/skills/benchmark-kernel/SKILL.md
.claude/skills/debug-cuda-crash/SKILL.md

🧰 Additional context used

🧠 Learnings (8)

📓 Common learnings

Learnt from: CR
Repo: flashinfer-ai/flashinfer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T09:34:39.900Z
Learning: Keep documentation in CLAUDE.md and `.claude/skills/` files in sync with code changes, including infrastructure changes, new patterns, deprecated approaches, and new error handling utilities

Learnt from: CR
Repo: flashinfer-ai/flashinfer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T09:34:39.900Z
Learning: Applies to include/**/*.cuh : Kernel code in `include/flashinfer/` is automatically picked up by JIT compilation on changes - no pip reinstall needed

Learnt from: CR
Repo: flashinfer-ai/flashinfer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T09:34:39.900Z
Learning: Applies to flashinfer/__init__.py : Export new operations in `flashinfer/__init__.py` to make them available as public API

Learnt from: CR
Repo: flashinfer-ai/flashinfer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T09:34:39.900Z
Learning: Use `FLASHINFER_CUDA_ARCH_LIST` environment variable to specify target GPU architectures (e.g., '8.0 9.0a') and `FLASHINFER_NVCC_THREADS` to control parallel compilation threads

Learnt from: CR
Repo: flashinfer-ai/flashinfer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T09:34:39.900Z
Learning: Applies to tests/**/*.py : Test implementations should use `flashinfer.utils` functions (`get_compute_capability`, `is_sm90a_supported`, `is_sm100a_supported`, etc.) to skip tests on unsupported GPU architectures

Learnt from: CR
Repo: flashinfer-ai/flashinfer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T09:34:39.900Z
Learning: Applies to flashinfer/**/*.py : Use `flashinfer_api` decorator for debugging API calls, enable via `FLASHINFER_LOGLEVEL` environment variable (0=off, 1=basic, 3=detailed, 5=with stats)