[pyflakes] Handle some common submodule import situations for unused-import (F401)#20200
[pyflakes] Handle some common submodule import situations for unused-import (F401)#20200dylwil3 merged 33 commits intoastral-sh:mainfrom
pyflakes] Handle some common submodule import situations for unused-import (F401)#20200Conversation
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| F401 | 245 | 245 | 0 | 0 | 0 |
f955acb to
42cb8c8
Compare
There was a problem hiding this comment.
Note that we have restricted our analyses to the present scope, so this snapshot is possibly surprising. The main reason to restrict to one scope at a time is that the unused import rule already acts on the scope level. So the current behavior is already to emit two diagnostics even for something like:
import a
def foo():
import a
MichaReiser
left a comment
There was a problem hiding this comment.
This is a masive improvement with very low overhead. Well done.
I've a couple of nit comments but I also found a correctness bug that we need to look into.
I haven't reviewed all tests yet.
42cb8c8 to
56a2843
Compare
8f7b7df to
be2edc1
Compare
MichaReiser
left a comment
There was a problem hiding this comment.
This is great and now with a 0 perf regression. Well done!
There was a problem hiding this comment.
There's a hidden unwrap call when you access segments[0]. There's an implicit assumption that segments is never empty.
binding
.as_any_import()
.expect("binding to be import binding since current function called after restricting to these in `unused_imports_in_scope`")
.qualified_name()
.segments()[0];
I haven't checked all code paths but I think the assumption is true for all QualifiedName. If so, then having a method on QualifiedName that returns the first segment (and documents why doing so without checking the length of segments is okay) seems generally useful (e.g. first_segment, root_name, what not :))
Summary
The PR under review attempts to make progress towards the age-old problem of submodule imports, specifically with regards to their treatment by the rule
unused-import(F401).Some related issues:
Prior art:
One distinguishing feature of this PR is that we have attempted to be less ambitious in the hopes that this will improve the chances of landing nonzero progress into
main.What the People Want
Users expect to see the following behavior in these common situations:
The following situations are unintuitive to users, but are valid Python and we are probably forced to proceed as indicated:
The goal of this PR is to modify the implementation of
unused-importto match this expectation.Modeling the Python in Users' Brains
Given the expected behavior above, how might a typical user be modeling Python in their minds? One possibility is as follows. Upon seeing the statement:
the user thinks:
This happens to be incorrect. Rather than restrict what is available to access on
a, the opposite happens:Similarly, when this user sees:
they think:
when really the opposite is true:
These views are actually incompatible, so it is not possible to model the user's mind without running afoul of Examples 4 and 5 above.
The Worst of Both Worlds
To model both the behavior that the user expects and the unintuitive behavior that Python allows, we adopt an approach I'll call "Thanks I Hate It (TIHI)". It is the worst of both worlds, and it alters how we think about accessing members of a module. Again we will consider the example:
In general the TIHI model will resolve an attribute load of the form
a.a1...an(for a modulea) by:import statements
import a.b1...bm,of these.
What we actually do
Rather than force Ruff's semantic model to be incorrect by actually implementing TIHI, we hack TIHI into the logic of
unused-importitself. Doing this is somewhat awkward due to the nature of the implementation ofunused-import.Let me briefly review the current implementation and then explain what we do differently.
Current implementation
After having the semantic model visit the entire module, we look at the current live import bindings that have no references. There are various other filters and logic then applied, but our essential starting bucket for this rule consists of that: bindings to import statements that are live at the end of the file and have no references.
In particular, if the module consists solely of:
then our starting "bucket" where we might emit a lint does not include
import asince the symbolais shadowed by the statementimport a.b.This PR
From now on we will refer to the current behavior as the "stable" behavior. In this PR we introduce preview behavior under very specific assumptions designed to limit the scope of this PR to handle the most common cases where users get tripped up. If any of these assumptions are not met then we revert to the stable behavior.
To begin, instead of only considering import bindings that are live at the end of the module, we also consider all bindings shadowed by these. We assume that all of these bindings are also import bindings - specifically simple imports and submodule imports without aliases, not
fromimports. If not - revert to stable behavior. Next, we have to decide which of these statements have been "used" in the sense of the "TIHI" model above. So we iterate through the references to the last binding and decide which import statement is intuitively being referenced, according to the matching logic we described earlier.As a technical note here, a call like
a.b.foo()will create a reference to the symbola- so we will not see the full attribute access. We must therefore crawl up the AST and collect the qualified name of the attribute access. We assume that all references arise from aNamenode or from appearance in a binding to__all__- if we find some other kind of reference, we revert to stable behavior.Having marked each import statement as used or unused, we collect the unused ones and then proceed as in the stable behavior.
Questions
We already don't model side-effects, even for simple module imports like
import a. So the user would be required to suppress the lint for such an anti-pattern.Maybe! The benchmarks did not show a regression, but if we want to pre-emptively do that I'm happy to. Some places where it may make sense are:
SmallVecwhen collecting candidate unused import bindings shadowing a given one, since there will often not be very manyVec<bool>otherwise