Merged
2 changes: 1 addition & 1 deletion crates/ruff/src/cache.rs
@@ -442,7 +442,7 @@ impl LintCacheData {
// Parse the kebab-case rule name into a `Rule`. This will fail for syntax errors, so
// this also serves to filter them out, but we shouldn't be caching files with syntax
// errors anyway.
- .filter_map(|msg| Some((msg.noqa_code().and_then(|code| code.rule())?, msg)))
+ .filter_map(|msg| Some((msg.name().parse().ok()?, msg)))
Member
@MichaReiser, Jun 21, 2025

This now seems to be the only reason why we need ::strum_macros::EnumString which requires a fair amount of code just to make this work.

I think we can get something comparable without going through Rule, by building an ad-hoc interner keyed by rule id:

// Create an index map that maps rule name to index
let mut rules = indexmap::IndexMap::new();

// ...

// In the filter_map, retrieve the existing index or insert the rule with a new index
let len = rules.len();
let rule_index = rules.entry(msg.noqa_code()?).or_insert(len);

Some((*rule_index, msg))

// ...

// Store the rule index on `LintCacheData`
let rules = rules.into_keys().map(|code| code.to_string()).collect();

Self {
    messages,
    source,
    rules,
    notebook_index,
}

The deserialization can then look up the rule in rules and clone the value.

This will also allow us to remove the Serialize and Deserialize derives from Rule.

(And I think we can remove Ord, PartialOrd, and CacheKey too.)
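The interner idea above can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: `RuleInterner`, `intern`, `into_codes`, and `resolve` are hypothetical names, and a `HashMap` plus a `Vec` stand in for `indexmap::IndexMap`.

```rust
use std::collections::HashMap;

/// Hypothetical interner: assigns each distinct rule code a small index.
#[derive(Default)]
struct RuleInterner {
    /// Maps a code to the index it was assigned.
    indices: HashMap<String, usize>,
    /// Codes in insertion order; position equals the assigned index.
    codes: Vec<String>,
}

impl RuleInterner {
    /// Returns the existing index for `code`, or assigns the next free one.
    fn intern(&mut self, code: &str) -> usize {
        if let Some(&index) = self.indices.get(code) {
            return index;
        }
        let index = self.codes.len();
        self.indices.insert(code.to_string(), index);
        self.codes.push(code.to_string());
        index
    }

    /// Consumes the interner and yields the codes in insertion order,
    /// which is what would be stored next to the cached messages.
    fn into_codes(self) -> Vec<String> {
        self.codes
    }
}

/// Deserialization side: turn a stored index back into its code.
fn resolve(codes: &[String], index: usize) -> Option<&str> {
    codes.get(index).map(String::as_str)
}
```

Each cached message would then carry only a `usize`, and the code strings are stored once per file.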

Contributor Author
@ntBre, Jun 21, 2025

What is the type of the IndexMap here? I think we still need to get back to the Rule because it's needed to reconstruct the OldDiagnostic in OldDiagnostic::lint.

Another option here could be Rule::from_code(&msg.noqa_code()?.to_string()).ok()?. And then we could store the Rule as its repr(u16) in the cache. That should allow us to remove all of the same derives? (Although it does add back one derive: strum_macros::FromRepr for the conversion in the other direction.)

Patch
diff --git a/crates/ruff/src/cache.rs b/crates/ruff/src/cache.rs
index fdaf6b4a7..14710027a 100644
--- a/crates/ruff/src/cache.rs
+++ b/crates/ruff/src/cache.rs
@@ -356,7 +356,8 @@ impl FileCache {
                             msg.parent,
                             file.clone(),
                             msg.noqa_offset,
-                            msg.rule,
+                            Rule::from_repr(msg.rule)
+                                .expect("Expected a valid rule repr in the cache"),
                         )
                     })
                     .collect()
@@ -442,7 +443,7 @@ impl LintCacheData {
             // Parse the kebab-case rule name into a `Rule`. This will fail for syntax errors, so
             // this also serves to filter them out, but we shouldn't be caching files with syntax
             // errors anyway.
-            .filter_map(|msg| Some((msg.name().parse().ok()?, msg)))
+            .filter_map(|msg| Some((Rule::from_code(&msg.noqa_code()?.to_string()).ok()?, msg)))
             .map(|(rule, msg)| {
                 // Make sure that all message use the same source file.
                 assert_eq!(
@@ -451,7 +452,7 @@ impl LintCacheData {
                     "message uses a different source file"
                 );
                 CacheMessage {
-                    rule,
+                    rule: rule as u16,
                     body: msg.body().to_string(),
                     suggestion: msg.suggestion().map(ToString::to_string),
                     range: msg.range(),
@@ -475,7 +476,7 @@ impl LintCacheData {
 pub(super) struct CacheMessage {
     /// The rule for the cached diagnostic.
     #[bincode(with_serde)]
-    rule: Rule,
+    rule: u16,
     /// The message body to display to the user, to explain the diagnostic.
     body: String,
     /// The message to display to the user, to explain the suggested fix.
diff --git a/crates/ruff_macros/src/map_codes.rs b/crates/ruff_macros/src/map_codes.rs
index 39993af6a..6f9b0d043 100644
--- a/crates/ruff_macros/src/map_codes.rs
+++ b/crates/ruff_macros/src/map_codes.rs
@@ -433,13 +433,8 @@ fn register_rules<'a>(input: impl Iterator<Item = &'a Rule>) -> TokenStream {
             Copy,
             Clone,
             Hash,
-            PartialOrd,
-            Ord,
-            ::ruff_macros::CacheKey,
             ::strum_macros::IntoStaticStr,
-            ::strum_macros::EnumString,
-            ::serde::Serialize,
-            ::serde::Deserialize,
+            ::strum_macros::FromRepr,
         )]
         #[repr(u16)]
         #[strum(serialize_all = "kebab-case")]
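The round trip this patch relies on (cast to `u16` when writing, convert back when reading) can be sketched without any derive macro. The three-variant `Rule` below is a hypothetical stand-in for the generated enum, and `from_repr` is written by hand as an approximation of what a `FromRepr`-style derive provides. Unlike the patch, which calls `.expect(...)`, this sketch returns `Option` so that a stale cache entry is visible in the result.

```rust
/// Hypothetical stand-in for the generated `Rule` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(u16)]
enum Rule {
    UnusedImport = 0,
    LineTooLong = 1,
    MissingNewlineAtEndOfFile = 2,
}

impl Rule {
    /// Hand-written conversion from the integer representation.
    fn from_repr(repr: u16) -> Option<Self> {
        match repr {
            0 => Some(Self::UnusedImport),
            1 => Some(Self::LineTooLong),
            2 => Some(Self::MissingNewlineAtEndOfFile),
            _ => None,
        }
    }
}

/// Cached form: only the integer discriminant is stored.
struct CacheMessage {
    rule: u16,
    body: String,
}

/// Writing side: the enum is cast to its `u16` representation.
fn to_cache(rule: Rule, body: &str) -> CacheMessage {
    CacheMessage {
        rule: rule as u16,
        body: body.to_string(),
    }
}

/// Reading side: an unknown discriminant yields `None`.
fn from_cache(msg: &CacheMessage) -> Option<Rule> {
    Rule::from_repr(msg.rule)
}
```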

Member

Hmm I see. It's because of the &'static str in LintId. I don't think what I suggested makes sense then. But this does feel unfortunate. It particularly annoys me that filter_map seems a bit like a footgun: it's very easy to filter out too many diagnostics (e.g. if Ruff ever starts using any DiagnosticId other than syntax error). It becomes even more fragile when we make noqa_code (secondary code) an Option on ruff_db::Diagnostic.

I think the options are:

  • Make LintName use a String, which doesn't feel great either.
  • Implement a serialize and deserialize function (in ruff) for LintId. That's probably the way to go, but I'm okay to defer this to another PR.

It's unfortunate that we can't remove all FromRepr calls. Or is there a way to get to the rule from the LintName? I think that would be better if possible.

Contributor Author

I think if we kept only EnumString, we could serialize the LintName as a String here and then parse back to a rule when deserializing. That should eliminate all of the derives except EnumString. Would that be any better?

Then we could also replace this filter_map with a filter(|msg| !msg.is_syntax_error()), I think. Although that still doesn't really address using other DiagnosticIds since I don't think those could be parsed back to Rules either.

But yeah, we're just using the Rule to retrieve the &'static str lint name. Otherwise we could just (de)serialize the lint name and the Option<NoqaCode> (well, probably as an Option<String>) directly.

How are we planning to handle this caching in ty? It seems like there would be issues with the &'static str there too.
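The filter_map concern and the proposed explicit filter can be contrasted in a small sketch. Everything here is hypothetical: the `DiagnosticId` variants (including the imagined `Io`), `rule_name`, and both collection functions are illustration only, not Ruff's types.

```rust
/// Hypothetical diagnostic identifiers; only `Lint` maps back to a rule.
#[derive(Debug, Clone, PartialEq)]
enum DiagnosticId {
    Lint(&'static str),
    SyntaxError,
    /// Imagined future variant that is neither a lint nor a syntax error.
    Io,
}

/// Stand-in for "parse back to a rule": succeeds only for lints.
fn rule_name(id: &DiagnosticId) -> Option<&'static str> {
    match id {
        DiagnosticId::Lint(name) => Some(*name),
        _ => None,
    }
}

/// Implicit filtering: anything that fails to map is silently dropped.
fn cacheable_implicit(ids: &[DiagnosticId]) -> Vec<&'static str> {
    ids.iter().filter_map(rule_name).collect()
}

/// Explicit filtering: only syntax errors are skipped on purpose; any other
/// identifier that cannot be mapped is returned as an error.
fn cacheable_explicit(ids: &[DiagnosticId]) -> Result<Vec<&'static str>, DiagnosticId> {
    ids.iter()
        .filter(|id| **id != DiagnosticId::SyntaxError)
        .map(|id| rule_name(id).ok_or_else(|| id.clone()))
        .collect()
}
```

With the implicit version, the `Io` diagnostic disappears without a trace; with the explicit one, it shows up as an `Err` that the caller has to handle.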

Member

> I think if we kept only EnumString, we could serialize the LintName as a String here and then parse back to a rule when deserializing. That should eliminate all of the derives except EnumString. Would that be any better?

I'm not sure what you mean by EnumString.

> How are we planning to handle this caching in ty? It seems like there would be issues with the &'static str there too.

Yes, I think we would have the same issue. I think going through Rules here is probably fine (that's what I would do in ty).

Contributor Author

I mean ::strum_macros::EnumString, which I thought spawned this thread. We're currently using EnumString to parse a rule name to a Rule when serializing. But if the goal is to remove several derive macros, we could instead serialize the string and call name.parse().unwrap() when deserializing. That would remove the need for all of these macros [1]:

PartialOrd,
Ord,
::ruff_macros::CacheKey,
::strum_macros::IntoStaticStr,
::strum_macros::EnumString,
::serde::Serialize,
::serde::Deserialize,

except EnumString, which was the original target. I may have lost the thread here, though, if you meant something else entirely.

Footnotes

  1. We also still need IntoStaticStr, which we use for the Display implementation, but that's a bit beside the point.
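The alternative described in this comment, storing the rule name as a string and parsing it back on load, could look roughly like the following. The two-variant `Rule`, its kebab-case names, and the hand-written `FromStr` impl are assumptions standing in for what an `EnumString`-style derive would generate.

```rust
use std::str::FromStr;

/// Hypothetical two-variant stand-in for the generated `Rule` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rule {
    UnusedImport,
    LineTooLong,
}

impl Rule {
    /// Kebab-case name, as an `IntoStaticStr`-style derive might produce.
    fn name(self) -> &'static str {
        match self {
            Rule::UnusedImport => "unused-import",
            Rule::LineTooLong => "line-too-long",
        }
    }
}

impl FromStr for Rule {
    type Err = String;

    /// Hand-written equivalent of an `EnumString`-style derive.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "unused-import" => Ok(Rule::UnusedImport),
            "line-too-long" => Ok(Rule::LineTooLong),
            other => Err(format!("unknown rule name: {other}")),
        }
    }
}

/// Serialization stores only the name.
fn serialize(rule: Rule) -> String {
    rule.name().to_string()
}

/// Deserialization pays for one string parse per cached message.
fn deserialize(name: &str) -> Result<Rule, String> {
    name.parse()
}
```

The parse on every cached message is the overhead the next comment refers to when it mentions bringing back the interner.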

Member

Ahh, sorry. That's on me. My main goal was to avoid needing to go back to Rule.

I don't think I feel very strongly about this. Removing EnumString would definitely be nice and I could see other ways to avoid the name.parse() overhead (... bring back my interner ;)).

Contributor Author

Sounds good, I think I'll merge this for now then. I think there will definitely be an interner in our future! 😄

.map(|(rule, msg)| {
// Make sure that all message use the same source file.
assert_eq!(
53 changes: 50 additions & 3 deletions crates/ruff_linter/src/checkers/ast/mod.rs
@@ -28,7 +28,7 @@ use itertools::Itertools;
use log::debug;
use rustc_hash::{FxHashMap, FxHashSet};

- use ruff_diagnostics::IsolationLevel;
+ use ruff_diagnostics::{Applicability, Fix, IsolationLevel};
use ruff_notebook::{CellOffsets, NotebookIndex};
use ruff_python_ast::helpers::{collect_import_from_member, is_docstring_stmt, to_module_path};
use ruff_python_ast::identifier::Identifier;
@@ -3115,7 +3115,6 @@ pub(crate) struct LintContext<'a> {
diagnostics: RefCell<Vec<OldDiagnostic>>,
source_file: SourceFile,
rules: RuleTable,
#[expect(unused, reason = "TODO(brent) use this instead of Checker::settings")]
settings: &'a LinterSettings,
}

@@ -3152,6 +3151,7 @@ impl<'a> LintContext<'a> {
DiagnosticGuard {
context: self,
diagnostic: Some(OldDiagnostic::new(kind, range, &self.source_file)),
rule: T::rule(),
}
}

@@ -3165,10 +3165,12 @@ impl<'a> LintContext<'a> {
kind: T,
range: TextRange,
) -> Option<DiagnosticGuard<'chk, 'a>> {
- if self.is_rule_enabled(T::rule()) {
+ let rule = T::rule();
+ if self.is_rule_enabled(rule) {
Some(DiagnosticGuard {
context: self,
diagnostic: Some(OldDiagnostic::new(kind, range, &self.source_file)),
rule,
})
} else {
None
@@ -3220,6 +3222,7 @@ pub(crate) struct DiagnosticGuard<'a, 'b> {
///
/// This is always `Some` until the `Drop` (or `defuse`) call.
diagnostic: Option<OldDiagnostic>,
rule: Rule,
}

impl DiagnosticGuard<'_, '_> {
@@ -3232,6 +3235,50 @@ impl DiagnosticGuard<'_, '_> {
}
}

impl DiagnosticGuard<'_, '_> {
fn resolve_applicability(&self, fix: &Fix) -> Applicability {
self.context
.settings
.fix_safety
.resolve_applicability(self.rule, fix.applicability())
}

/// Set the [`Fix`] used to fix the diagnostic.
#[inline]
pub(crate) fn set_fix(&mut self, fix: Fix) {
if !self.context.rules.should_fix(self.rule) {
self.fix = None;
return;
}
Comment on lines +3249 to +3252
Member

Does this match Ruff's current behavior? Does that mean we won't even show fixes when the fix is disabled? This seems wrong to me.

I'm not suggesting we should necessarily change this in this PR but we should at least open an issue for what I consider a bug. Instead, Ruff should keep the fix (but mark it in some way) and we then skip it when applying automatic fixes (and we don't show it in the LSP although I think even that would be fine).

Which makes me wonder: should should_fix == false resolve to manual applicability?

Contributor Author

Yes, this does seem to match the current behavior in both the playground and in VS Code, with the caveat that I couldn't get a ruff.toml like

[lint]
unfixable = ["F401"]

to work in VS Code; only setting it in my settings.json actually took effect. This might be a bug too, unless I was doing something wrong.

I think the LSP filters out fixes with display-only applicability here:

let fix = fix.and_then(|fix| fix.applies(Applicability::Unsafe).then_some(fix));

But if we change that to DisplayOnly and change the code here to

        if !self.context.rules.should_fix(self.rule) {
            self.fix = Some(fix.with_applicability(Applicability::DisplayOnly));
            return;
        }

we can show display-only fixes in the editor, even if they're disabled. That definitely makes sense to me.
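The interaction between downgrading a fix and the editor-side filter can be modeled in a few lines. This is a simplified sketch: the `Applicability` ordering (display-only lowest, safe highest), the comparison inside `applies`, and the free functions `set_fix` and `visible_in_editor` are assumptions made for illustration, not the crate's actual definitions.

```rust
/// Simplified model; the ordering from least to most trusted is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Applicability {
    DisplayOnly,
    Unsafe,
    Safe,
}

#[derive(Debug, Clone, PartialEq)]
struct Fix {
    applicability: Applicability,
}

impl Fix {
    /// A fix applies if it is at least as trusted as the required level.
    fn applies(&self, required: Applicability) -> bool {
        self.applicability >= required
    }

    fn with_applicability(mut self, applicability: Applicability) -> Self {
        self.applicability = applicability;
        self
    }
}

/// Downgrade the fix instead of dropping it when the rule is unfixable.
fn set_fix(should_fix: bool, fix: Fix) -> Option<Fix> {
    if should_fix {
        Some(fix)
    } else {
        Some(fix.with_applicability(Applicability::DisplayOnly))
    }
}

/// Editor-side filter with a configurable threshold.
fn visible_in_editor(fix: Option<Fix>, threshold: Applicability) -> Option<Fix> {
    fix.and_then(|fix| fix.applies(threshold).then_some(fix))
}
```

Under this model a downgraded fix is hidden while the threshold is `Unsafe` and becomes visible once the threshold is lowered to `DisplayOnly`, which matches the two-part change proposed above.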

Member

Let's create a follow-up issue for this. I think it requires some design work on how we want to handle them in the LSP.

let applicability = self.resolve_applicability(&fix);
self.fix = Some(fix.with_applicability(applicability));
}

/// Set the [`Fix`] used to fix the diagnostic, if the provided function returns `Ok`.
/// Otherwise, log the error.
#[inline]
pub(crate) fn try_set_fix(&mut self, func: impl FnOnce() -> anyhow::Result<Fix>) {
match func() {
Ok(fix) => self.set_fix(fix),
Err(err) => log::debug!("Failed to create fix for {}: {}", self.name(), err),
}
}

/// Set the [`Fix`] used to fix the diagnostic, if the provided function returns `Ok`.
/// Otherwise, log the error.
#[inline]
pub(crate) fn try_set_optional_fix(
&mut self,
func: impl FnOnce() -> anyhow::Result<Option<Fix>>,
) {
match func() {
Ok(None) => {}
Ok(Some(fix)) => self.set_fix(fix),
Err(err) => log::debug!("Failed to create fix for {}: {}", self.name(), err),
}
}
}

impl std::ops::Deref for DiagnosticGuard<'_, '_> {
type Target = OldDiagnostic;
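The guard in this hunk holds its diagnostic in an `Option` that stays `Some` until `Drop` or `defuse`, and its `set_fix` consults the context before storing a fix. A minimal sketch of that pattern follows. `Context`, `Guard`, `Diagnostic`, and the `fixable` flag are hypothetical simplifications, and the diagnostic and fix are plain strings.

```rust
use std::cell::RefCell;

#[derive(Debug, PartialEq)]
struct Diagnostic {
    message: String,
    fix: Option<String>,
}

/// Hypothetical context that collects diagnostics behind a `RefCell`.
struct Context {
    diagnostics: RefCell<Vec<Diagnostic>>,
    /// Stand-in for a per-rule `should_fix` lookup.
    fixable: bool,
}

/// Holds a diagnostic until it goes out of scope, then reports it.
struct Guard<'a> {
    context: &'a Context,
    /// `Some` until the guard is dropped or defused.
    diagnostic: Option<Diagnostic>,
}

impl Context {
    fn report(&self, message: &str) -> Guard<'_> {
        Guard {
            context: self,
            diagnostic: Some(Diagnostic {
                message: message.to_string(),
                fix: None,
            }),
        }
    }
}

impl Guard<'_> {
    /// Stores the fix only if the context allows fixing.
    fn set_fix(&mut self, fix: &str) {
        if let Some(diagnostic) = self.diagnostic.as_mut() {
            diagnostic.fix = self.context.fixable.then(|| fix.to_string());
        }
    }

    /// Discards the diagnostic so that `Drop` reports nothing.
    fn defuse(mut self) {
        self.diagnostic = None;
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        if let Some(diagnostic) = self.diagnostic.take() {
            self.context.diagnostics.borrow_mut().push(diagnostic);
        }
    }
}
```

Because the guard already knows its context, the fix policy can be applied when the fix is set, rather than in a later pass over all collected diagnostics.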

9 changes: 5 additions & 4 deletions crates/ruff_linter/src/fix/edits.rs
@@ -739,15 +739,16 @@ x = 1 \
let diag = {
use crate::rules::pycodestyle::rules::MissingNewlineAtEndOfFile;
let mut iter = edits.into_iter();
- OldDiagnostic::new(
+ let mut diagnostic = OldDiagnostic::new(
MissingNewlineAtEndOfFile, // The choice of rule here is arbitrary.
TextRange::default(),
&SourceFileBuilder::new("<filename>", "<code>").finish(),
- )
- .with_fix(Fix::safe_edits(
+ );
+ diagnostic.fix = Some(Fix::safe_edits(
iter.next().ok_or(anyhow!("expected edits nonempty"))?,
iter,
- ))
+ ));
+ diagnostic
};
assert_eq!(apply_fixes([diag].iter(), &locator).code, expect);
Ok(())
5 changes: 3 additions & 2 deletions crates/ruff_linter/src/fix/mod.rs
@@ -186,12 +186,13 @@ mod tests {
edit.into_iter()
.map(|edit| {
// The choice of rule here is arbitrary.
- let diagnostic = OldDiagnostic::new(
+ let mut diagnostic = OldDiagnostic::new(
MissingNewlineAtEndOfFile,
edit.range(),
&SourceFileBuilder::new(filename, source).finish(),
);
- diagnostic.with_fix(Fix::safe_edit(edit))
+ diagnostic.fix = Some(Fix::safe_edit(edit));
+ diagnostic
})
.collect()
}
27 changes: 1 addition & 26 deletions crates/ruff_linter/src/linter.rs
@@ -378,32 +378,7 @@ pub fn check_path(

let (mut diagnostics, source_file) = context.into_parts();

- if parsed.has_valid_syntax() {
- // Remove fixes for any rules marked as unfixable.
- for diagnostic in &mut diagnostics {
- if diagnostic
- .noqa_code()
- .and_then(|code| code.rule())
- .is_none_or(|rule| !settings.rules.should_fix(rule))
- {
- diagnostic.fix = None;
- }
- }
-
- // Update fix applicability to account for overrides
- if !settings.fix_safety.is_empty() {
- for diagnostic in &mut diagnostics {
- if let Some(fix) = diagnostic.fix.take() {
- if let Some(rule) = diagnostic.noqa_code().and_then(|code| code.rule()) {
- let fixed_applicability = settings
- .fix_safety
- .resolve_applicability(rule, fix.applicability());
- diagnostic.set_fix(fix.with_applicability(fixed_applicability));
- }
- }
- }
- }
- } else {
+ if !parsed.has_valid_syntax() {
// Avoid fixing in case the source code contains syntax errors.
for diagnostic in &mut diagnostics {
diagnostic.fix = None;
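The applicability override that used to run in this post-processing loop, and now runs when a fix is set, can be illustrated with a small table. This is a sketch under assumptions: the `forced_safe` and `forced_unsafe` field names, the string rule codes, and the precedence (unsafe overrides win, display-only is never promoted) are not confirmed by this diff.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Applicability {
    DisplayOnly,
    Unsafe,
    Safe,
}

/// Hypothetical override table keyed by rule code.
#[derive(Default)]
struct FixSafetyTable {
    forced_safe: HashSet<&'static str>,
    forced_unsafe: HashSet<&'static str>,
}

impl FixSafetyTable {
    /// Applies user overrides to a fix's applicability.
    /// Assumed precedence: display-only is left alone, then forced-unsafe,
    /// then forced-safe, then the fix's own applicability.
    fn resolve_applicability(&self, rule: &str, applicability: Applicability) -> Applicability {
        match applicability {
            Applicability::DisplayOnly => applicability,
            Applicability::Safe | Applicability::Unsafe => {
                if self.forced_unsafe.contains(rule) {
                    Applicability::Unsafe
                } else if self.forced_safe.contains(rule) {
                    Applicability::Safe
                } else {
                    applicability
                }
            }
        }
    }
}
```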
35 changes: 0 additions & 35 deletions crates/ruff_linter/src/message/mod.rs
@@ -186,41 +186,6 @@ impl OldDiagnostic {
)
}

/// Consumes `self` and returns a new `Diagnostic` with the given `fix`.
#[inline]
#[must_use]
pub fn with_fix(mut self, fix: Fix) -> Self {
self.set_fix(fix);
self
}

/// Set the [`Fix`] used to fix the diagnostic.
#[inline]
pub fn set_fix(&mut self, fix: Fix) {
self.fix = Some(fix);
}

/// Set the [`Fix`] used to fix the diagnostic, if the provided function returns `Ok`.
/// Otherwise, log the error.
#[inline]
pub fn try_set_fix(&mut self, func: impl FnOnce() -> anyhow::Result<Fix>) {
match func() {
Ok(fix) => self.fix = Some(fix),
Err(err) => log::debug!("Failed to create fix for {}: {}", self.name(), err),
}
}

/// Set the [`Fix`] used to fix the diagnostic, if the provided function returns `Ok`.
/// Otherwise, log the error.
#[inline]
pub fn try_set_optional_fix(&mut self, func: impl FnOnce() -> anyhow::Result<Option<Fix>>) {
match func() {
Ok(None) => {}
Ok(Some(fix)) => self.fix = Some(fix),
Err(err) => log::debug!("Failed to create fix for {}: {}", self.name(), err),
}
}

/// Consumes `self` and returns a new `Diagnostic` with the given parent node.
#[inline]
#[must_use]
6 changes: 6 additions & 0 deletions crates/ruff_linter/src/registry.rs
@@ -215,6 +215,12 @@ pub enum Linter {
}

pub trait RuleNamespace: Sized {
/// Returns the prefix that every single code that ruff uses to identify
/// rules from this linter starts with. In the case that multiple
/// `#[prefix]`es are configured for the variant in the `Linter` enum
/// definition this is the empty string.
fn common_prefix(&self) -> &'static str;

/// Attempts to parse the given rule code. If the prefix is recognized
/// returns the respective variant along with the code with the common
/// prefix stripped.
1 change: 1 addition & 0 deletions crates/ruff_linter/src/rule_selector.rs
@@ -265,6 +265,7 @@ mod schema {
use strum::IntoEnumIterator;

use crate::RuleSelector;
use crate::registry::RuleNamespace;
use crate::rule_selector::{Linter, RuleCodePrefix};

impl JsonSchema for RuleSelector {
22 changes: 1 addition & 21 deletions crates/ruff_macros/src/map_codes.rs
@@ -254,11 +254,9 @@ fn generate_rule_to_code(linter_to_rules: &BTreeMap<Ident, BTreeMap<String, Rule
}

let mut rule_noqa_code_match_arms = quote!();
let mut noqa_code_rule_match_arms = quote!();
let mut rule_group_match_arms = quote!();
let mut noqa_code_consts = quote!();

- for (i, (rule, codes)) in rule_to_codes.into_iter().enumerate() {
+ for (rule, codes) in rule_to_codes {
let rule_name = rule.segments.last().unwrap();
assert_eq!(
codes.len(),
@@ -294,14 +292,6 @@ See also https://github.com/astral-sh/ruff/issues/2186.
#(#attrs)* Rule::#rule_name => NoqaCode(crate::registry::Linter::#linter.common_prefix(), #code),
});

let const_ident = quote::format_ident!("NOQA_PREFIX_{}", i);
noqa_code_consts.extend(quote! {
const #const_ident: &str = crate::registry::Linter::#linter.common_prefix();
});
noqa_code_rule_match_arms.extend(quote! {
#(#attrs)* NoqaCode(#const_ident, #code) => Some(Rule::#rule_name),
});

rule_group_match_arms.extend(quote! {
#(#attrs)* Rule::#rule_name => #group,
});
@@ -350,16 +340,6 @@ See also https://github.com/astral-sh/ruff/issues/2186.
}
}
}

impl NoqaCode {
pub fn rule(&self) -> Option<Rule> {
#noqa_code_consts
match self {
#noqa_code_rule_match_arms
_ => None
}
}
}
};
rule_to_code
}
14 changes: 4 additions & 10 deletions crates/ruff_macros/src/rule_namespace.rs
@@ -118,6 +118,10 @@ pub(crate) fn derive_impl(input: DeriveInput) -> syn::Result<proc_macro2::TokenS
None
}

fn common_prefix(&self) -> &'static str {
match self { #common_prefix_match_arms }
}

fn name(&self) -> &'static str {
match self { #name_match_arms }
}
@@ -126,16 +130,6 @@ pub(crate) fn derive_impl(input: DeriveInput) -> syn::Result<proc_macro2::TokenS
match self { #url_match_arms }
}
}

impl #ident {
/// Returns the prefix that every single code that ruff uses to identify
/// rules from this linter starts with. In the case that multiple
/// `#[prefix]`es are configured for the variant in the `Linter` enum
/// definition this is the empty string.
pub const fn common_prefix(&self) -> &'static str {
match self { #common_prefix_match_arms }
}
}
})
}

2 changes: 1 addition & 1 deletion crates/ruff_workspace/src/configuration.rs
@@ -22,7 +22,7 @@ use ruff_cache::cache_dir;
use ruff_formatter::IndentStyle;
use ruff_graph::{AnalyzeSettings, Direction};
use ruff_linter::line_width::{IndentWidth, LineLength};
- use ruff_linter::registry::{INCOMPATIBLE_CODES, Rule, RuleSet};
+ use ruff_linter::registry::{INCOMPATIBLE_CODES, Rule, RuleNamespace, RuleSet};
use ruff_linter::rule_selector::{PreviewOptions, Specificity};
use ruff_linter::rules::{flake8_import_conventions, isort, pycodestyle};
use ruff_linter::settings::fix_safety_table::FixSafetyTable;
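The extra `RuleNamespace` imports in this file and in rule_selector.rs follow from `common_prefix` moving from an inherent method to a trait method: Rust only resolves a trait method when the trait is in scope at the call site. A minimal sketch, in which the two `Linter` variants, their prefixes, and the `code_for` helper are illustrative assumptions:

```rust
mod registry {
    pub trait RuleNamespace: Sized {
        /// Prefix shared by every code of this linter, or the empty string
        /// when the linter is configured with several prefixes.
        fn common_prefix(&self) -> &'static str;
    }

    #[derive(Debug, Clone, Copy)]
    pub enum Linter {
        Pyflakes,
        Pycodestyle,
    }

    impl RuleNamespace for Linter {
        fn common_prefix(&self) -> &'static str {
            match self {
                Linter::Pyflakes => "F",
                // Assumed to have more than one prefix, so nothing is shared.
                Linter::Pycodestyle => "",
            }
        }
    }
}

// Removing `RuleNamespace` from this import makes the method call below
// fail to resolve, even though `Linter` itself is still in scope.
use registry::{Linter, RuleNamespace};

/// Hypothetical helper that joins a linter prefix with a code suffix.
fn code_for(linter: Linter, suffix: &str) -> String {
    format!("{}{}", linter.common_prefix(), suffix)
}
```

A trait method also cannot be declared `const fn` on stable Rust, which may be why the `const` prefix items in map_codes.rs are dropped in the same change.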