[pyflakes] Fix check of unused imports#13965
[pyflakes] Fix check of unused imports#13965gpilikin wants to merge 7 commits intoastral-sh:mainfrom
Conversation
5d3aaba to
1dea0e6
Compare
|
Wow, nice fix! This is one of the oldest issues that is still open! Unfortunately, I'm not a good fit to review this PR because I have little knowledge of our semantic model. @charliermarsh is probably the best person to review this contribution, but he's currently very busy. That's why it might take a while before we get to it. |
|
This is awesome! If you resolve the merge conflicts then the ecosystem check will run and we can get a sense if there are any unintended consequences of the change in the semantic model (while we await the review from on high). |
939492a to
60281d3
Compare
CodSpeed Performance ReportMerging #13965 will degrade performances by 24.48%Comparing Summary
Benchmarks breakdown
|
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| F401 | 155 | 155 | 0 | 0 | 0 |
| F823 | 132 | 132 | 0 | 0 | 0 |
| TC004 | 9 | 9 | 0 | 0 | 0 |
| RUF100 | 2 | 2 | 0 | 0 | 0 |
| TC002 | 1 | 1 | 0 | 0 | 0 |
| TC003 | 1 | 1 | 0 | 0 | 0 |
| F811 | 1 | 0 | 1 | 0 | 0 |
Linter (preview)
ℹ️ ecosystem check detected linter changes. (+300 -1 violations, +0 -0 fixes in 28 projects; 27 projects unchanged)
DisnakeDev/disnake (+1 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ disnake/voice_client.py:48:12: F401 `nacl.utils` imported but unused; consider using `importlib.util.find_spec` to test for availability
RasaHQ/rasa (+67 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ rasa/cli/data.py:22:8: F401 [*] `rasa.utils.common` imported but unused + rasa/cli/data.py:6:8: F401 [*] `rasa.shared.core.domain` imported but unused + rasa/cli/export.py:11:8: F401 [*] `rasa.utils.common` imported but unused + rasa/cli/interactive.py:22:8: F401 [*] `rasa.utils.common` imported but unused + rasa/cli/telemetry.py:7:8: F401 [*] `rasa.cli.utils` imported but unused + rasa/cli/test.py:28:8: F401 [*] `rasa.utils.common` imported but unused + rasa/cli/test.py:8:8: F401 [*] `rasa.shared.data` imported but unused + rasa/cli/train.py:11:8: F401 [*] `rasa.utils.common` imported but unused + rasa/cli/utils.py:107:12: F401 [*] `rasa.utils.io` imported but unused + rasa/core/actions/action.py:17:8: F401 [*] `rasa.core` imported but unused + rasa/core/actions/action.py:84:8: F401 [*] `rasa.shared.utils.io` imported but unused ... 56 additional changes omitted for project
Snowflake-Labs/snowcli (+2 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ src/snowflake/cli/_plugins/nativeapp/codegen/setup/native_app_setup_processor.py:19:8: F401 [*] `os.path` imported but unused + tests/test_utils.py:22:8: F401 [*] `snowflake.cli._plugins.snowpark.models` imported but unused
apache/airflow (+29 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL
+ airflow/dag_processing/manager.py:47:8: F401 [*] `airflow.models` imported but unused + airflow/models/taskinstance.py:3722:18: F823 Local variable `operator` referenced before assignment + airflow/task/standard_task_runner.py:125:39: F823 Local variable `subprocess` referenced before assignment + airflow/utils/helpers.py:295:16: F823 Local variable `jinja2` referenced before assignment + airflow/utils/helpers.py:36:12: TC004 Move import `jinja2` out of type-checking block. Import is used for more than type hinting. + dev/breeze/src/airflow_breeze/utils/recording.py:54:5: F823 Local variable `click` referenced before assignment + providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:37:8: TC002 Move third-party import `botocore.waiter` into a type-checking block + providers/src/airflow/providers/amazon/aws/hooks/batch_waiters.py:35:8: F401 [*] `botocore.client` imported but unused + providers/src/airflow/providers/celery/executors/celery_executor_utils.py:112:12: F401 [*] `airflow.jobs.local_task_job_runner` imported but unused + providers/src/airflow/providers/celery/executors/celery_executor_utils.py:113:12: F401 [*] `airflow.macros` imported but unused ... 19 additional changes omitted for project
apache/superset (+2 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL
+ superset/utils/logging_configurator.py:21:8: F401 [*] `flask.app` imported but unused + tests/integration_tests/cli_tests.py:28:8: F401 [*] `superset.cli.importexport` imported but unused
binary-husky/gpt_academic (+1 -1 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
- request_llms/com_sparkapi.py:12:22: F811 Redefinition of unused `datetime` from line 2 + toolbox.py:983:21: F823 Local variable `importlib` referenced before assignment
bokeh/bokeh (+38 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL
+ tests/support/defaults.py:87:12: F401 [*] `bokeh.models` imported but unused + tests/test_cross.py:39:8: F401 [*] `_pytest.mark` imported but unused + tests/unit/bokeh/server/test_server__server.py:246:16: F823 Local variable `server` referenced before assignment + tests/unit/bokeh/server/test_server__server.py:273:5: F823 Local variable `server` referenced before assignment + tests/unit/bokeh/server/test_server__server.py:299:35: F823 Local variable `server` referenced before assignment + tests/unit/bokeh/server/test_server__server.py:309:20: F823 Local variable `server` referenced before assignment + tests/unit/bokeh/server/test_server__server.py:322:20: F823 Local variable `server` referenced before assignment + tests/unit/bokeh/server/test_server__server.py:335:20: F823 Local variable `server` referenced before assignment ... 31 additional changes omitted for rule F823 ... 30 additional changes omitted for project
docker/docker-py (+3 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ tests/unit/api_test.py:643:16: F823 Local variable `socket` referenced before assignment + tests/unit/api_test.py:653:16: F823 Local variable `socket` referenced before assignment + tests/unit/api_test.py:663:16: F823 Local variable `socket` referenced before assignment
freedomofpress/securedrop (+2 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ securedrop/tests/test_source_user.py:33:16: F823 Local variable `source_user` referenced before assignment + securedrop/tests/test_source_user.py:91:51: F823 Local variable `source_user` referenced before assignment
ibis-project/ibis (+23 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ ibis/backends/__init__.py:222:21: F823 Local variable `pa` referenced before assignment + ibis/backends/__init__.py:30:23: TC004 Move import `pyarrow` out of type-checking block. Import is used for more than type hinting. + ibis/backends/bigquery/__init__.py:45:23: TC004 Move import `pyarrow` out of type-checking block. Import is used for more than type hinting. + ibis/backends/bigquery/__init__.py:805:16: F823 Local variable `pa` referenced before assignment + ibis/backends/datafusion/__init__.py:171:38: F823 Local variable `df` referenced before assignment + ibis/backends/datafusion/__init__.py:536:16: F823 Local variable `pa` referenced before assignment + ibis/backends/impala/__init__.py:1370:16: F823 Local variable `pa` referenced before assignment + ibis/backends/impala/__init__.py:18:8: F401 `ibis.config` imported but unused; consider removing, adding to `__all__`, or using a redundant alias + ibis/backends/impala/__init__.py:39:23: TC004 Move import `pyarrow` out of type-checking block. Import is used for more than type hinting. + ibis/backends/postgres/tests/test_functions.py:989:17: F823 Local variable `dt` referenced before assignment ... 11 additional changes omitted for rule F823 ... 13 additional changes omitted for project
langchain-ai/langchain (+11 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ libs/community/langchain_community/callbacks/arthur_callback.py:123:20: F823 Local variable `arthurai` referenced before assignment + libs/community/langchain_community/callbacks/arthur_callback.py:63:17: F823 Local variable `arthurai` referenced before assignment + libs/community/langchain_community/callbacks/clearml_callback.py:396:33: F823 Local variable `pd` referenced before assignment + libs/community/langchain_community/callbacks/clearml_callback.py:466:48: F823 Local variable `pd` referenced before assignment + libs/community/langchain_community/callbacks/flyte_callback.py:135:21: F823 Local variable `flytekit` referenced before assignment + libs/community/langchain_community/document_loaders/mastodon.py:59:20: F823 Local variable `mastodon` referenced before assignment + libs/community/langchain_community/document_loaders/reddit.py:68:18: F823 Local variable `praw` referenced before assignment + libs/community/langchain_community/document_loaders/twitter.py:100:16: F823 Local variable `tweepy` referenced before assignment + libs/community/langchain_community/document_loaders/twitter.py:48:15: F823 Local variable `tweepy` referenced before assignment + libs/community/langchain_community/document_loaders/twitter.py:81:16: F823 Local variable `tweepy` referenced before assignment ... 1 additional changes omitted for project
mlflow/mlflow (+41 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ dev/update_ml_package_versions.py:16:8: F401 [*] `urllib.error` imported but unused + examples/hyperparam/search_hyperopt.py:75:20: F401 [*] `mlflow.tracking` imported but unused + mlflow/exceptions.py:123:22: F823 Local variable `json` referenced before assignment + mlflow/fastai/__init__.py:26:8: F401 `mlflow.tracking` imported but unused; consider removing, adding to `__all__`, or using a redundant alias + mlflow/fastai/callback.py:10:8: F401 [*] `mlflow.tracking` imported but unused + mlflow/langchain/utils/__init__.py:279:12: F401 `langchain.agents.agent` imported but unused + mlflow/langchain/utils/__init__.py:280:12: F401 `langchain.chains.base` imported but unused ... 25 additional changes omitted for rule F401 + mlflow/lightgbm/__init__.py:914:5: F823 Local variable `mlflow` referenced before assignment + mlflow/tracking/client.py:2183:36: F823 Local variable `matplotlib` referenced before assignment + mlflow/tracking/client.py:2372:38: F823 Local variable `PIL` referenced before assignment ... 31 additional changes omitted for project
pandas-dev/pandas (+13 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview
+ pandas/_libs/__init__.py:16:8: F401 [*] `pandas._libs.pandas_parser` imported but unused; consider removing, adding to `__all__`, or using a redundant alias + pandas/io/parquet.py:164:16: F401 [*] `pyarrow.parquet` imported but unused + pandas/tests/indexes/datetimes/methods/test_to_pydatetime.py:7:8: F401 [*] `dateutil.tz` imported but unused + pandas/tests/io/test_parquet.py:1178:18: F823 Local variable `pyarrow` referenced before assignment + pandas/tests/io/test_sql.py:56:12: TC004 Move import `sqlalchemy` out of type-checking block. Import is used for more than type hinting. + pandas/tests/io/test_sql.py:607:14: F823 Local variable `sqlalchemy` referenced before assignment + pandas/tests/io/test_sql.py:655:14: F823 Local variable `sqlalchemy` referenced before assignment + pandas/tests/io/test_sql.py:760:14: F823 Local variable `sqlalchemy` referenced before assignment + pandas/tests/io/test_sql.py:778:14: F823 Local variable `sqlalchemy` referenced before assignment + pandas/tests/io/test_sql.py:801:14: F823 Local variable `sqlalchemy` referenced before assignment ... 4 additional changes omitted for rule F823 ... 3 additional changes omitted for project
... Truncated remaining completed project reports due to GitHub comment length restrictions
Changes by rule (7 rules affected)
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| F401 | 155 | 155 | 0 | 0 | 0 |
| F823 | 132 | 132 | 0 | 0 | 0 |
| TC004 | 9 | 9 | 0 | 0 | 0 |
| RUF100 | 2 | 2 | 0 | 0 | 0 |
| TC002 | 1 | 1 | 0 | 0 | 0 |
| TC003 | 1 | 1 | 0 | 0 | 0 |
| F811 | 1 | 0 | 1 | 0 | 0 |
Thanks, I'll fix the formatter and clippy and come back. |
It's not great but it requires understanding if the regression is due to Ruff doing more (solving a more complicated problem or if the regression isn't inherent to the problem itself, and instead related to how the code is written |
Got it, thanks, I'll see what I can optimize then and maybe come back with questions. |
| let mut full_name = format!("{}", attribute.attr.id); | ||
| let mut current_expr = &*attribute.value; | ||
| let mut result = None; | ||
| let mut is_name_exist = false; | ||
| let mut already_checked_imports: HashSet<String> = HashSet::new(); | ||
|
|
||
| while let Expr::Attribute(expr_attr) = ¤t_expr { | ||
| full_name = format!("{}.{}", expr_attr.attr.id, full_name); | ||
| current_expr = &*expr_attr.value; | ||
| } | ||
|
|
||
| if let Expr::Name(ref expr_name) = current_expr { | ||
| full_name = format!("{}.{}", expr_name.id, full_name); | ||
| name_expr = Some(expr_name); | ||
| } else { | ||
| return ReadResult::NotFound; | ||
| } | ||
|
|
||
| if name_expr.is_none() { | ||
| return ReadResult::NotFound; | ||
| } | ||
|
|
||
| full_name = full_name.trim_end_matches('.').to_string(); |
There was a problem hiding this comment.
I suspect that building the full_name here is part of the reason why there's a 20% perf regression.
There was a problem hiding this comment.
Quite possibly, I tried to implement it on vectors.
It's also probably worth building a full_name when creating attributes, but I haven't figured out how to do that yet.
This comment was marked as off-topic.
This comment was marked as off-topic.
51fea5e to
1c443cb
Compare
dylwil3
left a comment
There was a problem hiding this comment.
Can you see if changing the HashSet to have &str entries helps the performance regression?
| if already_checked_imports.contains(name) { | ||
| continue; | ||
| } | ||
| { | ||
| already_checked_imports.insert(name.to_string()); | ||
| } |
There was a problem hiding this comment.
| if already_checked_imports.contains(name) { | |
| continue; | |
| } | |
| { | |
| already_checked_imports.insert(name.to_string()); | |
| } | |
| if already_checked_imports.contains(&name) { | |
| continue; | |
| } | |
| { | |
| already_checked_imports.insert(name); | |
| } |
| // Collect all unused imports by statement. | ||
| let mut unused: BTreeMap<(NodeId, Exceptions), Vec<ImportBinding>> = BTreeMap::default(); | ||
| let mut ignored: BTreeMap<(NodeId, Exceptions), Vec<ImportBinding>> = BTreeMap::default(); | ||
| let mut already_checked_imports: HashSet<String> = HashSet::new(); |
There was a problem hiding this comment.
| let mut already_checked_imports: HashSet<String> = HashSet::new(); | |
| let mut already_checked_imports: HashSet<&str> = HashSet::new(); |
dylwil3
left a comment
There was a problem hiding this comment.
Thanks for the changes! I think there's room for some more performance improvements. I've pointed out a few but I can try to take a closer look later.
| let mut current_expr = &*attribute.value; | ||
| let mut result = None; | ||
| let mut is_name_exist = false; | ||
| let mut already_checked_imports: HashSet<String> = HashSet::new(); |
There was a problem hiding this comment.
Does this also need to be a String?
| return ReadResult::NotFound; | ||
| }; | ||
|
|
||
| if name_expr.is_none() { |
There was a problem hiding this comment.
isn't this already handled right above?
| for (binding_id, scope_id) in &binding_ids { | ||
| if let BindingKind::SubmoduleImport(binding_kind) = &self.binding(*binding_id).kind { | ||
| if binding_kind.qualified_name.to_string() == full_name { | ||
| if let Some(result) = | ||
| self.resolve_binding(*binding_id, name_expr.unwrap(), *scope_id) | ||
| { | ||
| return result; | ||
| } | ||
| } | ||
| } | ||
| if let BindingKind::Import(_) = &self.binding(*binding_id).kind { | ||
| is_name_exist = true; | ||
| } | ||
| } | ||
|
|
||
| // TODO: need to move the block implementation to resolve_load, but carefully | ||
| // start check module import | ||
| for (binding_id, scope_id) in &binding_ids { |
There was a problem hiding this comment.
Do we really need to loop over this twice?
There was a problem hiding this comment.
Ideally, this should of course be in the method resolve_load.
As well as mark all imports with submodules if the root module is used.
| let ancestor_scope_ids: Vec<_> = self.scopes.ancestor_ids(self.scope_id).collect(); | ||
| let mut binding_ids: Vec<(BindingId, ScopeId)> = vec![]; | ||
|
|
||
| for scope_id in ancestor_scope_ids { | ||
| for binding_id in self.scopes[scope_id].get_all(name_expr.unwrap().id.as_str()) { | ||
| binding_ids.push((binding_id, scope_id)); | ||
| } | ||
| } |
There was a problem hiding this comment.
Do we need to collect this into a vector if we're just gonna iterate over it anyway?
There was a problem hiding this comment.
Even better, if you're only doing this to get binding_ids, which again we just iterate over, why not use iterator constructions directly?
There was a problem hiding this comment.
Thanks for review.
Okay, let's try to implement it in a different way.
I agree that it is possible not to store a vector with ancestor_ids.
There was a problem hiding this comment.
I'm trying to fix this in https://github.com/gpilikin/ruff/blob/fix/unused_imports_v2/crates/ruff_python_semantic/src/model.rs#L571
But I'm not sure if that's it at all.
I've tested the cases you mentioned in PR #16186 locally - they all pass. Can you take a look at gpilikin#3 |
Fix error when importing modules without submodules. Add attribute check for similar imports with submodule.
2f238e1 to
8e09a96
Compare
|
hey guys! thanks for your work on that very old issue! are you still looking into it to get it landed? |
Thank you. Yes, I come back to this problem from time to time, since I don't have much free time. I am glad to hear any suggestions. |
…ed-import` (`F401`) (#20200) # Summary The PR under review attempts to make progress towards the age-old problem of submodule imports, specifically with regards to their treatment by the rule [`unused-import` (`F401`)](https://docs.astral.sh/ruff/rules/unused-import/). Some related issues: - #60 - #4656 Prior art: - #13965 - #5010 - #5011 - #666 See the PR summary for a detailed description.
|
@dylwil3 should we close this PR now that your F401 improvement landed under preview? |
It looks like @dylwil3's improvement was tailored to F401 and thus didn't improve the situation with |
|
After taking a closer look I can confirm that @dylwil3's improvement can't really easily be used for the Even though I can see the issues with the TIHI model for import bindings, it would be a lot easier to perform this sort of analysis, if it were represented in the semantic model itself, rather than having to transform the model for multiple rules. But maybe we can just do something like with annotations where there's a cache of the TIHI representation of bindings that can be created and accessed by any rule that needs it. It definitely seems faster to do the transformation in a second pass, since overlapping imports are rare, but attribute accesses aren't, so you definitely don't want to pay an additional cost for every attribute access, if you can instead just check the overlapping imports and then revisit all their references and re-assign them. |
Summary
Fixed incorrect detection of unused imports with submodules. They were always marked as used even though they are not.
For example:
The solution:
Test Plan
cargo test
Closes: #60