-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RemoveUnusedImports] Support string type annotations #353
Conversation
This PR adds support for detecting imports being used by string type annotations, as well as imports suppressed by comments. It breaks up the existing visitor into multiple smaller, single-purpose visitors, and composes them together.
|
||
|
||
DEFAULT_SUPPRESS_COMMENT_REGEX = ( | ||
r".*\W(lint-ignore: ?unused-import|noqa|lint-ignore: ?F401)(\W.*)?$" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest this one instead:
r".*\W(noqa[^:]|noqa: ?F401|lint-ignore: ?unused-import|lint-ignore: ?F401)(\W.*)?$"
So it ignores things like # noqa
, # noqa: F401
, but not # noqa: IG61
self.context.scratch.get( | ||
self.SUPPRESS_COMMENT_REGEX_CONTEXT_KEY, DEFAULT_SUPPRESS_COMMENT_REGEX, | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing with suppressing comments at GatherUnusedImportsVisitor
level is that in some use cases (linter rules in frameworks that already silent lines with such comments) this ends up in doing more work than needed.
Maybe the suppression of comments could be moved to some other visitor which adds lines to be excluded, so one could choose whether to have them collected or not by using that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you can set the regex to something that will never match (like ^$
), but we could add an explicit bool option to disable it. Would that help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, faster code is code never executed. Best thing would be if we could put these in a whole different, also reusable, place so they're never executed when they're not needed. Something like GatherIgnoredLines
or something like that. But that's only a suggestion for those use cases where this isn't really needed.
- move lint suppression out from visitor into codemod - change suppression regex
) | ||
|
||
METADATA_DEPENDENCIES = ( | ||
PositionProvider, | ||
*GatherCommentsVisitor.METADATA_DEPENDENCIES, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only need to list the direct metadata dependency of RemoveUnusedImportsCommand
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimmylai, so the line *GatherCommentsVisitor.METADATA_DEPENDENCIES,
should be removed, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Overall looks good and the test coverage is high. I'll let @Kronuz to do another review and accept to see if he need more changes for reusability. |
DEFAULT_SUPPRESS_COMMENT_REGEX = ( | ||
r".*\W(noqa|lint-ignore: ?unused-import|lint-ignore: ?F401)(\W.*)?$" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, don't forget about this one as well, I think it's better than the current one:
DEFAULT_SUPPRESS_COMMENT_REGEX = (
r".*\W(noqa[^:]|noqa: ?F401|lint-ignore: ?unused-import|lint-ignore: ?F401)(\W.*)?$"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't match # noqa
though, which is the most common case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right!
What do you think of:
r".*\W(noqa|lint-ignore)(: ?(F401|unused-import)|(?= )|$)(\W.*)?$"
) | ||
|
||
METADATA_DEPENDENCIES = ( | ||
PositionProvider, | ||
*GatherCommentsVisitor.METADATA_DEPENDENCIES, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimmylai, so the line *GatherCommentsVisitor.METADATA_DEPENDENCIES,
should be removed, right?
|
||
def visit_Module(self, node: cst.Module) -> bool: | ||
export_collector = GatherExportsVisitor(self.context) | ||
node.visit(export_collector) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimmylai, just wondering... won't calling node.visit(export_collector)
and node.visit(annotation_visitor)
here traverses the tree multiple times? wouldn't this be inefficient?
I was thinking maybe using multiple inheritance instead of this, so we make it that it only traverses the tree once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it will traverse the tree multiple times. I don't think that's a big deal for a codemod like this. Running the codemod on a bunch of large files is still not visibly slower than before. I don't have benchmarks for you though :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use this in a linter, which will run over and over, while the user types, we want this as fast as possible, otherwise linting times will hit the experience. Although I’m not sure how expensive traversing the tree is, processing time would be exponential, potentially noticeable on big files. What do you think @jimmylai ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance is a concern for lint but not for codemod. In lint, we try to avoid creating a visitor and calling node.visit(visitor)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The @m.visit
used in this codemod is not supported in the lint framework since we worried about the the pattern can introduce some expensive checks. So we only use m.matches()
in lint.
return True | ||
|
||
@m.visit( | ||
m.Import() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could we add other exceptions to this reusable class, like if we wanted to ignore import __strict__
, for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's best to apply filtering on the results of this visitor (just like how you pointed out that the suppression filtering belongs outside of this class). So I would filter for special module names after running GatherUnusedImportsVisitor
, and keep the exceptions here to a minimum (i.e. the ones implied by Python itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make GatherCommentsVisitor
, GatherNamesFromStringAnnotationsVisitor
and GatherUnusedImportsVisitor
reusable in lint framework and address the concerns on performance.
My proposal is to not use @m.visit
and @m.call_if_inside
in the implementation (just like GatherImportsVisitor
and GatherExportsVisitor
). Then we can configure to run those helper visitors in lint framework before running lint rules. So the computed data will be accessible in each lint rule. What do you think? @Kronuz @zsol
I'm ok with that but won't have time to implement it this week. How about we land this PR and open a new one for these proposed refactors? |
Summary
This PR adds support for detecting imports being used by string type
annotations, as well as imports suppressed by comments.
It breaks up the existing visitor into multiple smaller, single-purpose
visitors, and composes them together.
TODO:
RemoveImportsVisitor
Test Plan
existing and new tests