Apply fix availability and applicability when adding to `DiagnosticGuard` and remove `NoqaCode::rule` by ntBre · Pull Request #18834 · astral-sh/ruff

ntBre · 2025-06-20T20:26:35Z

Summary

This PR removes the last two places we were using NoqaCode::rule in linter.rs (see #18391 (comment) and #18391 (comment)) by checking whether fixes are actually desired before adding them to a DiagnosticGuard. I implemented this by storing a Violation's Rule on the DiagnosticGuard so that we could check if it was enabled in the embedded LinterSettings when trying to set a fix.

All of the corresponding set_fix methods on OldDiagnostic were now unused (except in tests where I just set .fix directly), so I moved these to the guard instead of keeping both sets.

The very last place where we were using NoqaCode::rule was in the cache. I just reverted this to parsing the Rule from the name. I had forgotten to update the comment there anyway. Hopefully this doesn't cause too much of a perf hit.

In terms of binary size, we're back down almost to where main was two days ago (#18391 (comment)):

41,559,344 bytes for main 2 days ago
41,669,840 bytes for #18391
41,653,760 bytes for main now (after #18391 merged)
41,602,224 bytes for this branch

Only 43 kb up, but that shouldn't all be me this time :)

Test Plan

Existing tests and benchmarks on this PR

I never updated the comment anyway...

from 08a227e in #18391

github-actions · 2025-06-20T20:36:00Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser · 2025-06-21T16:05:40Z

crates/ruff/src/cache.rs

            // this also serves to filter them out, but we shouldn't be caching files with syntax
            // errors anyway.
-            .filter_map(|msg| Some((msg.noqa_code().and_then(|code| code.rule())?, msg)))
+            .filter_map(|msg| Some((msg.name().parse().ok()?, msg)))


This now seems to be the only reason why we need ::strum_macros::EnumString which requires a fair amount of code just to make this work.

I think we can get something comparable without going through rule by building an ad-hoc interner by rule id:

// Create an index map that maps rule name to index let mut rules = indexmap::IndexMap::new(); // ... // In filter map, retrieve the index or insert the rule with a new index let len = rules.len(); let rule_index = rules.entry(msg.noqa_code()?).or_insert(len); Some((*rule_index, msg)) // ... // Store the rule index on `LintCachedData let rules = rules.into_keys().map(|code| code.to_string()).collect(); Self { messages, source, rules, notebook_index, }

The deserialization can then lookup the rule in rules and clone the value.

This will also allow us to remove the Serialize and Deserialize derive from Rule

(And I think we can remove Ord, PartialOrd, and CacheKey too

What is the type of the IndexMap here? I think we still need to get back to the Rule because it's needed to reconstruct the OldDiagnostic in OldDiagnostic::lint.

Another option here could be Rule::from_code(&msg.noqa_code()?.to_string()).ok()?). And then we could store the Rule as its repr(u16) in the cache. That should allow us to remove all of the same derives? (Although it does add back one derive: strum_macros::FromRepr for the conversion in the other direction)

Patch

diff --git a/crates/ruff/src/cache.rs b/crates/ruff/src/cache.rs index fdaf6b4a7..14710027a 100644 --- a/crates/ruff/src/cache.rs +++ b/crates/ruff/src/cache.rs @@ -356,7 +356,8 @@ impl FileCache { msg.parent, file.clone(), msg.noqa_offset, - msg.rule, + Rule::from_repr(msg.rule) + .expect("Expected a valid rule repr in the cache"), ) }) .collect() @@ -442,7 +443,7 @@ impl LintCacheData { // Parse the kebab-case rule name into a `Rule`. This will fail for syntax errors, so // this also serves to filter them out, but we shouldn't be caching files with syntax // errors anyway. - .filter_map(|msg| Some((msg.name().parse().ok()?, msg))) + .filter_map(|msg| Some((Rule::from_code(&msg.noqa_code()?.to_string()).ok()?, msg))) .map(|(rule, msg)| { // Make sure that all message use the same source file. assert_eq!( @@ -451,7 +452,7 @@ impl LintCacheData { "message uses a different source file" ); CacheMessage { - rule, + rule: rule as u16, body: msg.body().to_string(), suggestion: msg.suggestion().map(ToString::to_string), range: msg.range(), @@ -475,7 +476,7 @@ impl LintCacheData { pub(super) struct CacheMessage { /// The rule for the cached diagnostic. #[bincode(with_serde)] - rule: Rule, + rule: u16, /// The message body to display to the user, to explain the diagnostic. body: String, /// The message to display to the user, to explain the suggested fix. diff --git a/crates/ruff_macros/src/map_codes.rs b/crates/ruff_macros/src/map_codes.rs index 39993af6a..6f9b0d043 100644 --- a/crates/ruff_macros/src/map_codes.rs +++ b/crates/ruff_macros/src/map_codes.rs @@ -433,13 +433,8 @@ fn register_rules<'a>(input: impl Iterator<Item = &'a Rule>) -> TokenStream { Copy, Clone, Hash, - PartialOrd, - Ord, - ::ruff_macros::CacheKey, ::strum_macros::IntoStaticStr, - ::strum_macros::EnumString, - ::serde::Serialize, - ::serde::Deserialize, + ::strum_macros::FromRepr, )] #[repr(u16)] #[strum(serialize_all = "kebab-case")]

Hmm I see. It's because of the &'static str in LintId. I don't think what I suggested then makes sense. But this does feel unfortunate. It particularly annoys me that filter_map seems a bit like a footgun. It's very easy to filter out too many diagnostics (e.g. if Ruff ever starts using any other DiagnosticId other than syntax error). It even becomes more fragile when we make noqa_code (secondary code) an Option on ruff_db::Diagnostic.

I think the options are:

Make LintName use a String. Which doesn't feel great as well

implement a serialize and deserialize function (in ruff) for LintId. That's probably the way to go but I'm okay to defer this to another PR.

It's unfortunate that we can't remove all FromRepr calls or is there a way to get to the rule from the LintName? I think that would be better if possible

I think if we kept only EnumString, we could serialize the LintName as a String here and then parse back to a rule when deserializing. That should eliminate all of the derives except EnumString. Would that be any better?

Then we could also replace this filter_map with a filter(|msg| !msg.is_syntax_error()), I think. Although that still doesn't really address using other DiagnosticIds since I don't think those could be parsed back to Rules either.

But yeah, we're just using the Rule to retrieve the &'static str lint name. Otherwise we could just (de)serialize the lint name and the Option<NoqaCode> (well, probably as an Option<String>) directly.

How are we planning to handle this caching in ty? It seems like there would be issues with the &'static str there too.

I think if we kept only EnumString, we could serialize the LintName as a String here and then parse back to a rule when deserializing. That should eliminate all of the derives except EnumString. Would that be any better?

I'm not sure what you mean by EnumString.

How are we planning to handle this caching in ty? It seems like there would be issues with the &'static str there too.

Yes, I think we would have the same issue. I think going through Rules here is probably fine (that's what I would do in ty).

I mean ::strum_macros::EnumString, which I thought spawned this thread. We're currently using EnumString to parse a rule name to a Rule when serializing. But if the goal is to remove several derive macros, we could instead serialize the string and call name.parse().unwrap() when deserializing. That would remove the need for all of these macros¹:

ruff/crates/ruff_macros/src/map_codes.rs

Lines 456 to 462 in 291413b

PartialOrd,

Ord,

::ruff_macros::CacheKey,

::strum_macros::IntoStaticStr,

::strum_macros::EnumString,

::serde::Serialize,

::serde::Deserialize,

except EnumString, which was the original target. I may have lost the thread here, though, if you meant something else entirely.

Footnotes

We also still need IntoStaticStr, which we use for the Display implementation, but that's a bit beside the point. ↩

Ahh, sorry. That's on me. My main goal was to avoid needing to go back to Rule.

I don't think I feel very strongly about this. Removing EnumString would definitely be nice and I could see other ways to avoid the name.parse() overhead (... bring back my interner ;)).

Sounds good, I think I'll merge this for now then. I think there will definitely be an interner in our future! 😄

MichaReiser

Nice, I think we can remove some more code by not using Rule in the cache.

We should also double check if this exactly captures today's behavior. E.g. does the LSP not show fixes for rules with should_fix = false? Do we show the fixes in diagnostics?

MichaReiser · 2025-06-21T16:11:47Z

crates/ruff_linter/src/checkers/ast/mod.rs

+        if !self.context.rules.should_fix(self.rule) {
+            self.fix = None;
+            return;
+        }


Does this match Ruff's current behavior? Does that mean, we won't even show fixes when the fix is disabled. This seems wrong to me.

I'm not suggesting we should necessarily change this in this PR but we should at least open an issue for what I consider a bug. Instead, Ruff should keep the fix (but mark it in some way) and we then skip it when applying automatic fixes (and we don't show it in the LSP although I think even that would be fine).

Which makes me wonder. Should should_fix == False resolve to manual applicability?

Yes, this does seem to match the current behavior in both the playground and in VS Code, with the caveat that I couldn't get a ruff.toml like

[lint] unfixable = ["F401"]

to work in VS Code, only setting it in my settings.json actually applied. This might be a bug too, unless I was doing something wrong.

I think the LSP filters out fixes with display-only applicability here:

ruff/crates/ruff_server/src/lint.rs

Line 247 in b2b65e7

let fix = fix.and_then(|fix| fix.applies(Applicability::Unsafe).then_some(fix));

But if we change that to DisplayOnly and change the code here to

if !self.context.rules.should_fix(self.rule) { self.fix = Some(fix.with_applicability(Applicability::DisplayOnly)); return; }

we can show display-only fixes in the editor, even if they're disabled. That definitely makes sense to me.

Let's create a follow up issue for this. I think it requires some design work on how we want to handle them in the LSP.

* main: [ty] Fix false positives when subscripting an object inferred as having an `Intersection` type (#18920) [`flake8-use-pathlib`] Add autofix for `PTH202` (#18763) [ty] Add relative import completion tests [ty] Clarify what "cursor" means [ty] Add a cursor test builder [ty] Enforce sort order of completions (#18917) [formatter] Fix missing blank lines before decorated classes in .pyi files (#18888) Apply fix availability and applicability when adding to `DiagnosticGuard` and remove `NoqaCode::rule` (#18834) py-fuzzer: allow relative executable paths (#18915) [ty] Change `environment.root` to accept multiple paths (#18913) [ty] Rename `src.root` setting to `environment.root` (#18760) Use file path for detecting package root (#18914) Consider virtual path for various server actions (#18910) [ty] Introduce `UnionType::try_from_elements` and `UnionType::try_map` (#18911) [ty] Support narrowing on `isinstance()`/`issubclass()` if the second argument is a dynamic, intersection, union or typevar type (#18900) [ty] Add decorator check for implicit attribute assignments (#18587) [`ruff`] Trigger `RUF037` for empty string and byte strings (#18862) [ty] Avoid duplicate diagnostic in unpacking (#18897) [`pyupgrade`] Extend version detection to include `sys.version_info.major` (`UP036`) (#18633) [`ruff`] Frozen Dataclass default should be valid (`RUF009`) (#18735)

ntBre added 4 commits June 20, 2025 16:00

resolve fix applicability when adding fixes via DiagnosticGuard

a5a6bee

resolve rules marked as unfixable in DiagnosticGuard

c814520

go back to parsing the rule name in the cache

b47daff

I never updated the comment anyway...

revert RuleNamespace and NoqaCode::rule changes

b2b65e7

from 08a227e in #18391

ntBre added internal An internal refactor or improvement diagnostics Related to reporting of diagnostics. labels Jun 20, 2025

ntBre marked this pull request as ready for review June 20, 2025 20:45

ntBre requested a review from MichaReiser June 20, 2025 20:45

MichaReiser reviewed Jun 21, 2025

View reviewed changes

MichaReiser approved these changes Jun 21, 2025

View reviewed changes

ntBre mentioned this pull request Jun 23, 2025

Consider showing unfixable fixes in an editor context #18896

Open

ntBre merged commit 02ae8e1 into main Jun 24, 2025
37 checks passed

ntBre deleted the brent/fix-availability branch June 24, 2025 14:08

	PartialOrd,
	Ord,
	::ruff_macros::CacheKey,
	::strum_macros::IntoStaticStr,
	::strum_macros::EnumString,
	::serde::Serialize,
	::serde::Deserialize,

Conversation

ntBre commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Plan

Uh oh!

github-actions bot commented Jun 20, 2025

ruff-ecosystem results

Linter (stable)

Linter (preview)

Uh oh!

MichaReiser Jun 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ntBre Jun 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Footnotes

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MichaReiser left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ntBre commented Jun 20, 2025 •

edited

Loading

`ruff-ecosystem` results

MichaReiser Jun 21, 2025 •

edited

Loading

ntBre Jun 21, 2025 •

edited

Loading