Apply fix availability and applicability when adding to DiagnosticGuard and remove NoqaCode::rule#18834
Apply fix availability and applicability when adding to DiagnosticGuard and remove NoqaCode::rule#18834
DiagnosticGuard and remove NoqaCode::rule#18834Conversation
I never updated the comment anyway...
|
| // this also serves to filter them out, but we shouldn't be caching files with syntax | ||
| // errors anyway. | ||
| .filter_map(|msg| Some((msg.noqa_code().and_then(|code| code.rule())?, msg))) | ||
| .filter_map(|msg| Some((msg.name().parse().ok()?, msg))) |
There was a problem hiding this comment.
This now seems to be the only reason why we need ::strum_macros::EnumString which requires a fair amount of code just to make this work.
I think we can get something comparable without going through rule by building an ad-hoc interner by rule id:
// Create an index map that maps rule name to index
let mut rules = indexmap::IndexMap::new();
// ...
// In filter map, retrieve the index or insert the rule with a new index
let len = rules.len();
let rule_index = rules.entry(msg.noqa_code()?).or_insert(len);
Some((*rule_index, msg))
// ...
// Store the rule index on `LintCachedData
let rules = rules.into_keys().map(|code| code.to_string()).collect();
Self {
messages,
source,
rules,
notebook_index,
}The deserialization can then lookup the rule in rules and clone the value.
This will also allow us to remove the Serialize and Deserialize derive from Rule
(And I think we can remove Ord, PartialOrd, and CacheKey too
There was a problem hiding this comment.
What is the type of the IndexMap here? I think we still need to get back to the Rule because it's needed to reconstruct the OldDiagnostic in OldDiagnostic::lint.
Another option here could be Rule::from_code(&msg.noqa_code()?.to_string()).ok()?). And then we could store the Rule as its repr(u16) in the cache. That should allow us to remove all of the same derives? (Although it does add back one derive: strum_macros::FromRepr for the conversion in the other direction)
Patch
diff --git a/crates/ruff/src/cache.rs b/crates/ruff/src/cache.rs
index fdaf6b4a7..14710027a 100644
--- a/crates/ruff/src/cache.rs
+++ b/crates/ruff/src/cache.rs
@@ -356,7 +356,8 @@ impl FileCache {
msg.parent,
file.clone(),
msg.noqa_offset,
- msg.rule,
+ Rule::from_repr(msg.rule)
+ .expect("Expected a valid rule repr in the cache"),
)
})
.collect()
@@ -442,7 +443,7 @@ impl LintCacheData {
// Parse the kebab-case rule name into a `Rule`. This will fail for syntax errors, so
// this also serves to filter them out, but we shouldn't be caching files with syntax
// errors anyway.
- .filter_map(|msg| Some((msg.name().parse().ok()?, msg)))
+ .filter_map(|msg| Some((Rule::from_code(&msg.noqa_code()?.to_string()).ok()?, msg)))
.map(|(rule, msg)| {
// Make sure that all message use the same source file.
assert_eq!(
@@ -451,7 +452,7 @@ impl LintCacheData {
"message uses a different source file"
);
CacheMessage {
- rule,
+ rule: rule as u16,
body: msg.body().to_string(),
suggestion: msg.suggestion().map(ToString::to_string),
range: msg.range(),
@@ -475,7 +476,7 @@ impl LintCacheData {
pub(super) struct CacheMessage {
/// The rule for the cached diagnostic.
#[bincode(with_serde)]
- rule: Rule,
+ rule: u16,
/// The message body to display to the user, to explain the diagnostic.
body: String,
/// The message to display to the user, to explain the suggested fix.
diff --git a/crates/ruff_macros/src/map_codes.rs b/crates/ruff_macros/src/map_codes.rs
index 39993af6a..6f9b0d043 100644
--- a/crates/ruff_macros/src/map_codes.rs
+++ b/crates/ruff_macros/src/map_codes.rs
@@ -433,13 +433,8 @@ fn register_rules<'a>(input: impl Iterator<Item = &'a Rule>) -> TokenStream {
Copy,
Clone,
Hash,
- PartialOrd,
- Ord,
- ::ruff_macros::CacheKey,
::strum_macros::IntoStaticStr,
- ::strum_macros::EnumString,
- ::serde::Serialize,
- ::serde::Deserialize,
+ ::strum_macros::FromRepr,
)]
#[repr(u16)]
#[strum(serialize_all = "kebab-case")]There was a problem hiding this comment.
Hmm I see. It's because of the &'static str in LintId. I don't think what I suggested then makes sense. But this does feel unfortunate. It particularly annoys me that filter_map seems a bit like a footgun. It's very easy to filter out too many diagnostics (e.g. if Ruff ever starts using any other DiagnosticId other than syntax error). It even becomes more fragile when we make noqa_code (secondary code) an Option on ruff_db::Diagnostic.
I think the options are:
- Make
LintNameuse aString. Which doesn't feel great as well - implement a serialize and deserialize function (in ruff) for
LintId. That's probably the way to go but I'm okay to defer this to another PR.
It's unfortunate that we can't remove all FromRepr calls or is there a way to get to the rule from the LintName? I think that would be better if possible
There was a problem hiding this comment.
I think if we kept only EnumString, we could serialize the LintName as a String here and then parse back to a rule when deserializing. That should eliminate all of the derives except EnumString. Would that be any better?
Then we could also replace this filter_map with a filter(|msg| !msg.is_syntax_error()), I think. Although that still doesn't really address using other DiagnosticIds since I don't think those could be parsed back to Rules either.
But yeah, we're just using the Rule to retrieve the &'static str lint name. Otherwise we could just (de)serialize the lint name and the Option<NoqaCode> (well, probably as an Option<String>) directly.
How are we planning to handle this caching in ty? It seems like there would be issues with the &'static str there too.
There was a problem hiding this comment.
I think if we kept only EnumString, we could serialize the LintName as a String here and then parse back to a rule when deserializing. That should eliminate all of the derives except EnumString. Would that be any better?
I'm not sure what you mean by EnumString.
How are we planning to handle this caching in ty? It seems like there would be issues with the &'static str there too.
Yes, I think we would have the same issue. I think going through Rules here is probably fine (that's what I would do in ty).
There was a problem hiding this comment.
I mean ::strum_macros::EnumString, which I thought spawned this thread. We're currently using EnumString to parse a rule name to a Rule when serializing. But if the goal is to remove several derive macros, we could instead serialize the string and call name.parse().unwrap() when deserializing. That would remove the need for all of these macros1:
ruff/crates/ruff_macros/src/map_codes.rs
Lines 456 to 462 in 291413b
except EnumString, which was the original target. I may have lost the thread here, though, if you meant something else entirely.
Footnotes
-
We also still need
IntoStaticStr, which we use for theDisplayimplementation, but that's a bit beside the point. ↩
There was a problem hiding this comment.
Ahh, sorry. That's on me. My main goal was to avoid needing to go back to Rule.
I don't think I feel very strongly about this. Removing EnumString would definitely be nice and I could see other ways to avoid the name.parse() overhead (... bring back my interner ;)).
There was a problem hiding this comment.
Sounds good, I think I'll merge this for now then. I think there will definitely be an interner in our future! 😄
MichaReiser
left a comment
There was a problem hiding this comment.
Nice, I think we can remove some more code by not using Rule in the cache.
We should also double check if this exactly captures today's behavior. E.g. does the LSP not show fixes for rules with should_fix = false? Do we show the fixes in diagnostics?
| if !self.context.rules.should_fix(self.rule) { | ||
| self.fix = None; | ||
| return; | ||
| } |
There was a problem hiding this comment.
Does this match Ruff's current behavior? Does that mean, we won't even show fixes when the fix is disabled. This seems wrong to me.
I'm not suggesting we should necessarily change this in this PR but we should at least open an issue for what I consider a bug. Instead, Ruff should keep the fix (but mark it in some way) and we then skip it when applying automatic fixes (and we don't show it in the LSP although I think even that would be fine).
Which makes me wonder. Should should_fix == False resolve to manual applicability?
There was a problem hiding this comment.
Yes, this does seem to match the current behavior in both the playground and in VS Code, with the caveat that I couldn't get a ruff.toml like
[lint]
unfixable = ["F401"]to work in VS Code, only setting it in my settings.json actually applied. This might be a bug too, unless I was doing something wrong.
I think the LSP filters out fixes with display-only applicability here:
ruff/crates/ruff_server/src/lint.rs
Line 247 in b2b65e7
But if we change that to DisplayOnly and change the code here to
if !self.context.rules.should_fix(self.rule) {
self.fix = Some(fix.with_applicability(Applicability::DisplayOnly));
return;
}we can show display-only fixes in the editor, even if they're disabled. That definitely makes sense to me.
There was a problem hiding this comment.
Let's create a follow up issue for this. I think it requires some design work on how we want to handle them in the LSP.
* main: [ty] Fix false positives when subscripting an object inferred as having an `Intersection` type (#18920) [`flake8-use-pathlib`] Add autofix for `PTH202` (#18763) [ty] Add relative import completion tests [ty] Clarify what "cursor" means [ty] Add a cursor test builder [ty] Enforce sort order of completions (#18917) [formatter] Fix missing blank lines before decorated classes in .pyi files (#18888) Apply fix availability and applicability when adding to `DiagnosticGuard` and remove `NoqaCode::rule` (#18834) py-fuzzer: allow relative executable paths (#18915) [ty] Change `environment.root` to accept multiple paths (#18913) [ty] Rename `src.root` setting to `environment.root` (#18760) Use file path for detecting package root (#18914) Consider virtual path for various server actions (#18910) [ty] Introduce `UnionType::try_from_elements` and `UnionType::try_map` (#18911) [ty] Support narrowing on `isinstance()`/`issubclass()` if the second argument is a dynamic, intersection, union or typevar type (#18900) [ty] Add decorator check for implicit attribute assignments (#18587) [`ruff`] Trigger `RUF037` for empty string and byte strings (#18862) [ty] Avoid duplicate diagnostic in unpacking (#18897) [`pyupgrade`] Extend version detection to include `sys.version_info.major` (`UP036`) (#18633) [`ruff`] Frozen Dataclass default should be valid (`RUF009`) (#18735)
Summary
This PR removes the last two places we were using
NoqaCode::ruleinlinter.rs(see #18391 (comment) and #18391 (comment)) by checking whether fixes are actually desired before adding them to aDiagnosticGuard. I implemented this by storing aViolation'sRuleon theDiagnosticGuardso that we could check if it was enabled in the embeddedLinterSettingswhen trying to set a fix.All of the corresponding
set_fixmethods onOldDiagnosticwere now unused (except in tests where I just set.fixdirectly), so I moved these to the guard instead of keeping both sets.The very last place where we were using
NoqaCode::rulewas in the cache. I just reverted this to parsing theRulefrom the name. I had forgotten to update the comment there anyway. Hopefully this doesn't cause too much of a perf hit.In terms of binary size, we're back down almost to where
mainwas two days ago (#18391 (comment)):Only 43 kb up, but that shouldn't all be me this time :)
Test Plan
Existing tests and benchmarks on this PR