-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Document when a rule was added #21035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary
--
This PR adds the version a rule was added, and the file and line where it's
defined to `ViolationMetadata`. The file and line just use the standard `file!`
and `line!` macros, while the more interesting version field uses a new
`violation_metadata(version = ...)` attribute parsed by our `ViolationMetadata`
derive macro. I'm not sure that this is the best way to go, but it seemed like a
reasonable start.
I've also included the scripts I used to generate and add these in the PR for
now. I don't really think we need to keep them around permanently, but it seemed
like they could be helpful if anything looks wrong. I had Claude generate the
first iteration of `generate_rule_metadata.py`, which hopefully explains all of
the emojis, but I had to modify it pretty heavily today to get it working
correctly. I think `run_uvx_rule_check` and `get_ruff_version` are the most
interesting parts. The first because it shows ways to check rule availability
across a wide range of Ruff versions, and the latter because it filters out some
versions that uv couldn't install.
As a curiosity and a bit of a sanity check, I also plotted the rule numbers over
time:
TODO
I think this looks pretty reasonable and avoids some of the artifacts the
earlier versions of the script ran into, such as the `rule` sub-command not
being available or `--explain` requiring a file argument.
<details><summary>Script and summary data</summary>
```shell
gawk --csv '
NR > 1 {
split($2, a, ".")
major = a[1]; minor = a[2]; micro = a[3]
# sum the number of rules added per minor version
versions[minor] += 1
}
END {
tot = 0
for (i = 0; i <= 14; i++) {
tot += versions[i]
print i, tot
}
}
' ruff_rules_metadata.csv > summary.dat
```
```
0 696
1 768
2 778
3 803
4 822
5 848
6 855
7 865
8 893
9 915
10 916
11 924
12 929
13 932
14 933
```
</details>
Test Plan
--
I temporarily modified the `fix_title` of one of the rules to print the version,
file, and line number, just to make sure the macro was working. The real test
will be using this as part of our documentation generation, which I can either
add on here or in an immediate follow-up PR.
claude's version only included versions back to 0.5.0, I guess because it discarded everything with a `v` prefix, so there was a huge cluster of rules "added" in 0.5.0 this is still not perfect because the `rule` subcommand was added in 0.0.237 but it's closer than it was
f9120ee to
65048e9
Compare
|
|
That's fun, I wasn't aware Related to that, this may be a stupid question, but will the changes in this PR be reflected in |
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! That's what you were up to yesterday :)
ViolationMetadata seems the right spot for file and location but I'm less sure about the version. Mainly because the rule status is now spread over two places:
code_to_rule: Defines the status of the rule (preview, removed, stable)violation_metadata: Defines the version.
This has the downside that It's unclear what the version represents. Is it when the rule was added, deprecated or removed?
That's why I'm inclined to combine the version and the RuleStatus, similar to LintStatus in ty. Whether we simply extend the RuleStatus variants to have extra fields or move RuleStatus to ViolationMetadata (that's probably what I'd do if I would do it anew) isn't important and I'd probably go with what's easier.
@MeGaGiGaGon I think that's a great question and a good idea to add. It crossed my mind early on but then I kind of forgot about it. I think it should be straightforward to add here. @MichaReiser that makes a lot of sense. I did like how |
use a string for the group to avoid trying to import RuleGroup
65048e9 to
2e3081f
Compare
0903e76 to
1f980af
Compare
|
As Micha and I discussed on Discord, the #[derive(ViolationMetadata)]
#[violation_metadata(stable_since = "v0.0.155")]
pub(crate) struct UnusedNOQA {
pub codes: Option<UnusedCodes>,
}One interesting and somewhat intentional aspect of the attribute is that the macro will preserve the last entry, so we could theoretically stack versions if the status of a rule changes over time. For example, this would be accepted by the current macro implementation but only use the #[derive(ViolationMetadata)]
#[violation_metadata(
preview_since = "v0.0.123",
stable_since = "v0.0.155",
deprecated_since = "v1.2.3",
removed_since = "v4.5.6",
)]
pub(crate) struct UnusedNOQA {
pub codes: Option<UnusedCodes>,
}Another alternative we considered was just using the #[derive(ViolationMetadata)]
#[violation_metadata(status = RuleGroup::Stable { since: "v0.0.155" })]
pub(crate) struct UnusedNOQA {
pub codes: Option<UnusedCodes>,
}which is also pretty nice. If we ever decide to add a I also updated the JSON output format for And I rebased a bit more to keep the big commits at the end. The last commit is still |
1f980af to
44ed2c1
Compare
add group metadata to rules add metadata to test rules
44ed2c1 to
32d8f45
Compare
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for me. I think having status = might be slightly easier to discover as you can type RuleStatus:: to see all the variants that are supported. But I don't feel strongly about the syntax (having all versions stacked looks a bit nosiy and having multiple fields might be easier to parse).
|
Sounds good! I guess I'll go ahead and land this once I fix the doctests and remove the scripts and CSV since it should be easy to iterate on the syntax now that we have the infrastructure in place. Maybe this is just my editor, but I wasn't getting completions when I tried typing |
|
One last check to make sure my scripts didn't change any of the rule statuses: diff \
<(./target/debug/ruff rule --all --output-format json | jq '[ .[] | { code, status: ( .status | keys[0] ) } ]') \
<(../worktrees/ruff1/target/debug/ruff rule --all --output-format json | jq '[ .[] | { code, status} ]')passes cleanly with main checked out in my other worktree. |
|
Oh, I just realized that many of the versions are actually wrong after the |
|
All of the |
Summary
Inspired by #20859, this PR adds the version a rule was added, and the file and line where it was defined, to
ViolationMetadata. The file and line just use the standardfile!andline!macros, while the more interesting version field uses a newviolation_metadataattribute parsed by ourViolationMetadataderive macro.I moved the commit modifying all of the rule files to the end, so it should be a lot easier to review by omitting that one.
As a curiosity and a bit of a sanity check, I also plotted the rule numbers over time:
I think this looks pretty reasonable and avoids some of the artifacts the earlier versions of the script ran into, such as the
rulesub-command not being available or--explainrequiring a file argument.Script and summary data
Test Plan
I built and viewed the documentation locally, and it looks pretty good!
The spacing seems a bit awkward following the
h1at the top, so I'm wondering if this might look nicer as a footer in Ruff. The links work well too:The last one even works on
mainnow since it points to thederive(ViolationMetadata)line.In terms of binary size, this branch is a bit bigger than main with 38,654,520 bytes compared to 38,635,728 (+20 KB). I guess that's not too much of an increase, but I wanted to check since we're generating a lot more code with macros.