[ty] Rework module resolution to be breadth-first instead of depth-first #22449

Merged
BurntSushi merged 7 commits into main from gankra/namespace-auto
Mar 4, 2026

@Gankra
Contributor

@Gankra Gankra commented Jan 8, 2026

Summary

By making the algorithm breadth-first/incremental, the logic has a coherent translation to computing all_modules. This is therefore groundwork for adding various missing import semantics to auto-complete and auto-import (extremely optimistically, we could actually make it share the same code!).

In addition, this fixes two corner-case issues with our module resolution:

  • A regular package (or module) in a later search-path now properly shadows a namespace package in an earlier one, which matches runtime behaviour aiui.
  • A regular package in a later search-path now properly(?) shadows a module in an earlier one. Previously we treated foo.py similarly to a legacy namespace package: import foo would find it, but import foo.bar would ignore it (and find any regular-package or namespace-package foo on subsequent search-paths).
  • We now consider all stub-packages to have higher priority than non-stub-packages, independent of search-path ordering. In most cases this means our behaviour will now be "search all paths for stubs, then search all paths for implementations" -- this isn't strictly true in this current implementation because two namespace packages could exist where one specifies a.pyi and the other specifies a.py and we still respect search-path order in that case (I think the impl is actually really close to fixing that but I finally got the tests passing and I didn't want to poke the bear).
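For readers following along, here is a minimal sketch of the breadth-first idea and the namespace-shadowing rule from the first bullet. It is a toy model over an in-memory set of paths, not ty's resolver: the `resolve` function, its `frontier` list, and the `/ns` and `/reg` search paths are all hypothetical, and stubs and `py.typed` are ignored entirely.

```python
from pathlib import PurePosixPath


def resolve(files, search_paths, dotted_name):
    """Resolve `dotted_name` one component at a time across all search paths.

    `files` is a set of POSIX-style paths standing in for the file system.
    Returns ("regular" | "module", path), ("namespace", [dirs]), or None.
    """
    # Every ancestor of a known file counts as an existing directory.
    dirs = {str(parent) for f in files for parent in PurePosixPath(f).parents}
    frontier = list(search_paths)  # directories to search for the next component
    found = None
    for part in dotted_name.split("."):
        namespace_dirs = []
        found = None
        for base in frontier:
            package_dir = f"{base}/{part}"
            if f"{package_dir}/__init__.py" in files:
                # A regular package ends the search and shadows namespace portions.
                found = ("regular", f"{package_dir}/__init__.py")
                frontier = [package_dir]
                break
            if f"{base}/{part}.py" in files:
                # A plain module also ends the search; it has no submodules.
                found = ("module", f"{base}/{part}.py")
                frontier = []
                break
            if package_dir in dirs:
                # Namespace portion: remember it, but keep looking.
                namespace_dirs.append(package_dir)
        if found is None:
            if not namespace_dirs:
                return None
            found = ("namespace", namespace_dirs)
            frontier = namespace_dirs
    return found


files = {
    "/ns/mymod/submod.py",
    "/reg/mymod/__init__.py",
    "/reg/mymod/submod.py",
}
# The namespace portion in /ns comes first but is shadowed by the regular package in /reg.
print(resolve(files, ["/ns", "/reg"], "mymod.submod"))  # ('module', '/reg/mymod/submod.py')
```

The point of carrying a list of directories between components is that several namespace portions can stay live at once, which is the part that should translate to listing all modules.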

Test Plan

I need to write more tests to cover interesting cases I've thought of. For now I'm content with the existing tests passing (a few have been changed for now; I think all those changes are acceptable, but ymmv).

@Gankra Gankra added the ty (Multi-file analysis & type inference) and ecosystem-analyzer labels Jan 8, 2026
@astral-sh-bot

astral-sh-bot bot commented Jan 8, 2026

Typing conformance results

No changes detected ✅

@astral-sh-bot

astral-sh-bot bot commented Jan 8, 2026

mypy_primer results

Changes were detected when running on open source projects
pytest (https://github.com/pytest-dev/pytest)
+ testing/_py/test_local.py:12:6: error[unresolved-import] Cannot resolve imported module `py.path`
+ testing/_py/test_local.py:928:18: error[unresolved-import] Cannot resolve imported module `py._process.cmdexec`
- testing/_py/test_local.py:851:16: error[unresolved-attribute] Attribute `check` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:854:41: error[unresolved-attribute] Attribute `dirpath` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:905:16: error[unresolved-attribute] Attribute `check` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:907:16: error[unresolved-attribute] Attribute `check` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:916:16: error[unresolved-attribute] Attribute `basename` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:917:16: error[unresolved-attribute] Attribute `dirpath` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:922:15: error[unresolved-attribute] Attribute `sysexec` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:930:31: error[invalid-assignment] Object of type `<class 'RuntimeError'>` is not assignable to `<class 'ExecutionFailed'>`
- testing/_py/test_local.py:933:13: error[unresolved-attribute] Attribute `sysexec` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:1169:29: error[unresolved-import] Module `py.path` has no member `isimportable`
+ testing/_py/test_local.py:1169:14: error[unresolved-import] Cannot resolve imported module `py.path`
+ testing/_py/test_local.py:1171:14: error[unresolved-import] Cannot resolve imported module `py._path.local`
- testing/_py/test_local.py:1186:12: error[unresolved-attribute] Class `local` has no attribute `_gethomedir`
- testing/_py/test_local.py:1192:15: error[unresolved-attribute] Class `local` has no attribute `_gethomedir`
- testing/_py/test_local.py:1274:16: error[unresolved-attribute] Attribute `new` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:1276:31: error[unresolved-attribute] Attribute `relto` is not defined on `None` in union `local | None`
- testing/_py/test_local.py:1277:20: error[unresolved-attribute] Attribute `check` is not defined on `None` in union `local | None`
- Found 390 diagnostics
+ Found 379 diagnostics

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
- src/scikit_build_core/build/wheel.py:99:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 58 diagnostics
+ Found 57 diagnostics

@astral-sh-bot

astral-sh-bot bot commented Jan 8, 2026

ecosystem-analyzer results

| Lint rule | Added | Removed | Changed |
|---|---|---|---|
| invalid-await | 0 | 40 | 0 |
| unresolved-attribute | 0 | 13 | 0 |
| unresolved-import | 3 | 0 | 1 |
| invalid-assignment | 0 | 1 | 0 |
| invalid-return-type | 0 | 1 | 0 |
| Total | 3 | 55 | 1 |

Full report with detailed diff (timing results)

@Gankra
Contributor Author

Gankra commented Jan 8, 2026

https://github.com/DataDog/dd-trace-py/blob/fd559433b6cc12ad3205de1a16915a7ea124b277/ddtrace/sourcecode/setuptools_auto.py#L5-L9

DEV: We have to import setuptools first who will make sure distutils is available for us.

Hey what the fuck? What on earth is happening here that we ever actually handled this properly.

@charliermarsh
Member

Hahaha

@Gankra
Contributor Author

Gankra commented Jan 9, 2026

I've dropped the reordering of -stubs packages because I couldn't figure out how to make it coherent (notably: setuptools-types introduces distutils-stubs, but distutils-stubs.core doesn't exist, so hoisting it over the stdlib is wrong, but trying to treat typeshed as a -stubs to fix that ordering means we break "first party packages can shadow typeshed").

@Gankra
Contributor Author

Gankra commented Jan 9, 2026

I've tentatively introduced a rule that foo/__init__.py should shadow foo.py regardless of search-path order in an attempt to handle a situation that occurs in pytest's own tests:

py.py is an evil polyfill that we cannot understand the submodules of, so if we ever resolve the module py as py.py we will fail to resolve py.path. py/__init__.pyi and py/path.pyi are totally comprehensible and we can easily resolve py.path as either an import or an attribute access.

Previously this worked because when resolving import py.path the search algorithm would try to find a package py and actually ignore py.py in the top-level, concluding its search-path has nothing as if it was a namespace package.
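To make the old behaviour concrete, here is a rough sketch contrasting the two lookups for the leading `py` component of `import py.path`. It is a toy model, not the actual ty code: both function names and the `/src` and `/site-packages` search paths are hypothetical stand-ins for "first-party code" and "installed packages".

```python
def find_package_only(files, search_paths, name):
    """Old-style lookup for a non-final component: only packages count."""
    for base in search_paths:
        for init in ("__init__.pyi", "__init__.py"):
            if f"{base}/{name}/{init}" in files:
                return f"{base}/{name}"
    return None


def find_first_match(files, search_paths, name):
    """Order-respecting lookup: the first search path with any match wins."""
    for base in search_paths:
        for init in ("__init__.pyi", "__init__.py"):
            if f"{base}/{name}/{init}" in files:
                return f"{base}/{name}"
        if f"{base}/{name}.py" in files:
            return f"{base}/{name}.py"
    return None


files = {
    "/src/py.py",  # the polyfill, on the higher-priority search path
    "/site-packages/py/__init__.py",
    "/site-packages/py/__init__.pyi",
    "/site-packages/py/path.pyi",
}
search_paths = ["/src", "/site-packages"]

print(find_package_only(files, search_paths, "py"))  # /site-packages/py
print(find_first_match(files, search_paths, "py"))   # /src/py.py
```

With the package-only lookup, `py.path` can then be found under `/site-packages/py`; with the order-respecting lookup, resolution stops at `py.py`, which has no submodules.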

@Gankra
Contributor Author

Gankra commented Jan 9, 2026

I need to confirm if this reflects runtime behaviour.

Also this is a bit sad because it now means finding functools.pyi in typeshed should not be considered the end of the story, and we need to keep checking search-paths for functools/__init__.py(i). I've adjusted the invalidation test to use functools/__init__.pyi in typeshed as that still short-circuits and is reasonably common.

(unshadowable builtin modules also de facto short-circuit by filtering out all non-stdlib search-paths)

@astral-sh-bot

astral-sh-bot bot commented Feb 20, 2026

Memory usage report

Summary

| Project | Old | New | Diff | Outcome |
|---|---|---|---|---|
| flake8 | 48.10MB | 48.09MB | -0.01% (7.38kB) | ⬇️ |
| sphinx | 266.92MB | 266.91MB | -0.00% (10.27kB) | ⬇️ |
| trio | 118.51MB | 118.49MB | -0.02% (20.19kB) | ⬇️ |
| prefect | 694.49MB | 694.45MB | -0.01% (38.40kB) | ⬇️ |

Significant changes

flake8

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| File | 275.23kB | 266.96kB | -3.00% (8.27kB) | ⬇️ |
| resolve_module_query | 36.53kB | 37.42kB | +2.44% (912.00B) | ⬇️ |

sphinx

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| File | 1.01MB | 1013.48kB | -2.43% (25.20kB) | ⬇️ |
| resolve_module_query | 236.53kB | 251.46kB | +6.31% (14.93kB) | ⬇️ |

trio

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| File | 1.00MB | 1013.03kB | -1.18% (12.14kB) | ⬇️ |
| resolve_module_query | 163.00kB | 152.55kB | -6.41% (10.45kB) | ⬇️ |
| parsed_module | 27.07MB | 27.08MB | +0.01% (2.17kB) | ⬇️ |
| source_text | 3.75MB | 3.75MB | +0.01% (243.00B) | ⬇️ |

prefect

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| File | 2.66MB | 2.60MB | -2.28% (62.11kB) | ⬇️ |
| resolve_module_query | 538.18kB | 561.89kB | +4.41% (23.71kB) | ⬇️ |

@BurntSushi BurntSushi self-assigned this Feb 24, 2026
from foo.bar.both import Both
from foo.bar.impl import Impl
from foo.bar.fake import Fake # error: "Cannot resolve"
from foo.bar.impl import Impl # error: [unresolved-import]
Member

@Gankra Was this change intended? I think this test is trying to assert that foo.bar.impl is found because there is a partial py.typed in foo.bar overriding the full py.typed at the top-level of foo.

@BurntSushi
Member

I've tentatively introduced a rule that foo/__init__.py should shadow foo.py regardless of search-path order in an attempt to handle a situation that occurs in pytest's own tests:

* [pytest defines `src/py.py` as a polyfill for `py` (aka pylib) not being installed](https://github.com/pytest-dev/pytest/blob/main/src/py.py), with a comment claiming `py/__init__.py` will shadow it if it exists.

* mypy-primer ensures `py` is installed, which provides: `py/__init__.py`, `py/__init__.pyi`, and `py/path.pyi`. However this is almost assuredly on a lower-priority search-path since pytest is presumably first-party code here.

py.py is an evil polyfill that we cannot understand the submodules of, so if we ever resolve the module py as py.py we will fail to resolve py.path. py/__init__.pyi and py/path.pyi are totally comprehensible and we can easily resolve py.path as either an import or an attribute access.

Previously this worked because when resolving import py.path the search algorithm would try to find a package py and actually ignore py.py in the top-level, concluding its search-path has nothing as if it was a namespace package.

From what I can tell testing this out, this isn't how Python imports actually work unfortunately. It looks like as soon as you can find a module, it stops. Such that there isn't a search path independent preference for foo/__init__.py over foo.py.

Here is my test setup:

$ tree -a
.
├── main.py
├── searchpath-bar
│   └── mymod
│       └── __init__.py
└── searchpath-foo
    └── mymod.py

4 directories, 3 files
$ cat main.py
import mymod
$ cat searchpath-bar/mymod/__init__.py
print('mymod module package')
$ cat searchpath-foo/mymod.py
print('mymod.py module file')

Now test that you get the print statement you expect when you only set one of the directories as a search path:

$ PYTHONPATH=./searchpath-bar python main.py
mymod module package
$ PYTHONPATH=./searchpath-foo python main.py
mymod.py module file

And now set both directories:

$ PYTHONPATH=./searchpath-bar:./searchpath-foo python main.py
mymod module package
$ PYTHONPATH=./searchpath-foo:./searchpath-bar python main.py
mymod.py module file

I believe that if mymod/__init__.py had search path independent priority, then both commands above would output mymod module package.
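The same check can be scripted without spawning interpreters. This is a minimal sketch, assuming CPython's default path-based finder: `importlib.machinery.PathFinder.find_spec` accepts an explicit list of search paths and reports what it would load without importing anything. The helper names `build_layout` and `origin_name` are made up for this example.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def build_layout(root: Path) -> tuple[str, str]:
    """Create searchpath-bar/mymod/__init__.py and searchpath-foo/mymod.py."""
    bar = root / "searchpath-bar"
    foo = root / "searchpath-foo"
    (bar / "mymod").mkdir(parents=True)
    (bar / "mymod" / "__init__.py").write_text("")
    foo.mkdir()
    (foo / "mymod.py").write_text("")
    return str(bar), str(foo)


def origin_name(search_paths: list[str]) -> str:
    """Return the name of the file that `import mymod` would load for these paths."""
    importlib.invalidate_caches()
    spec = PathFinder.find_spec("mymod", search_paths)
    return Path(spec.origin).name


with tempfile.TemporaryDirectory() as tmp:
    bar, foo = build_layout(Path(tmp))
    print(origin_name([bar, foo]))  # __init__.py (package directory listed first)
    print(origin_name([foo, bar]))  # mymod.py (module file listed first)
```

The result flips with the order of the search paths, which is what you would expect if there is no search-path-independent preference for `mymod/__init__.py`.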

@BurntSushi
Member

Also, if I do touch searchpath-bar/mymod/submod.py and echo 'import mymod.submod' > main.py, then:

$ PYTHONPATH=./searchpath-bar python main.py
mymod module package
$ PYTHONPATH=./searchpath-foo python main.py
mymod.py module file
Traceback (most recent call last):
  File "/home/andrew/astral/ruff/playground/learning/python-search-path-precedence/main.py", line 1, in <module>
    import mymod.submod
ModuleNotFoundError: No module named 'mymod.submod'; 'mymod' is not a package

So far so good. When we combine the search paths:

$ PYTHONPATH=./searchpath-bar:./searchpath-foo python main.py
mymod module package
$ PYTHONPATH=./searchpath-foo:./searchpath-bar python main.py
mymod.py module file
Traceback (most recent call last):
  File "/home/andrew/astral/ruff/playground/learning/python-search-path-precedence/main.py", line 1, in <module>
    import mymod.submod
ModuleNotFoundError: No module named 'mymod.submod'; 'mymod' is not a package

So basically as soon as mymod is found on a search path, the import system sticks to it. Even if it could continue to find a lower priority package with a submodule.

@BurntSushi
Member

  • A regular package (or module) in a later search-path now properly shadows a namespace package in an earlier one, which matches runtime behaviour aiui.

Testing this out this does indeed seem correct! Here's my test setup:

$ tree -a
.
├── main.py
├── searchpath-namespace
│   └── mymod
│       └── submod.py
└── searchpath-regular
    └── mymod
        ├── __init__.py
        └── submod.py

5 directories, 4 files
$ cat main.py
import mymod.submod
$ cat searchpath-namespace/mymod/submod.py
print("namespace")
$ cat searchpath-regular/mymod/submod.py
print("regular")

Then I can verify I get the expected behavior when I use each of the search path directories on their own:

$ PYTHONPATH=./searchpath-namespace python main.py
namespace
$ PYTHONPATH=./searchpath-regular python main.py
regular

So far so good. And now when I combine them in any order:

$ PYTHONPATH=./searchpath-regular:./searchpath-namespace python main.py
regular
$ PYTHONPATH=./searchpath-namespace:./searchpath-regular python main.py
regular

If search path order allowed for imports on namespace packages to have higher precedence than regular packages, then we'd expect the second command to print namespace. But the regular package is found in both instances.
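For anyone who wants to re-run this without setting `PYTHONPATH` by hand, here is a small sketch of the same experiment using `importlib.machinery.PathFinder.find_spec`. It assumes CPython's default finder, where a namespace package spec has `origin` set to `None` and a regular package spec points at its `__init__.py`. The helpers `build_layout` and `package_kind` are hypothetical names for this example.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def build_layout(root: Path) -> tuple[str, str]:
    """Create a namespace `mymod` and a regular `mymod` on separate search paths."""
    namespace = root / "searchpath-namespace"
    regular = root / "searchpath-regular"
    (namespace / "mymod").mkdir(parents=True)
    (namespace / "mymod" / "submod.py").write_text("")
    (regular / "mymod").mkdir(parents=True)
    (regular / "mymod" / "__init__.py").write_text("")
    (regular / "mymod" / "submod.py").write_text("")
    return str(namespace), str(regular)


def package_kind(search_paths: list[str]) -> str:
    """Report which kind of `mymod` the finder picks for these search paths."""
    importlib.invalidate_caches()
    spec = PathFinder.find_spec("mymod", search_paths)
    # Namespace package specs have no origin file.
    return "namespace" if spec.origin is None else "regular"


with tempfile.TemporaryDirectory() as tmp:
    namespace, regular = build_layout(Path(tmp))
    print(package_kind([namespace]))           # namespace
    print(package_kind([regular, namespace]))  # regular
    print(package_kind([namespace, regular]))  # regular
```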

.with_python_version(PythonVersion::PY38)
.build();

let existing_modules = create_module_names(&["asyncio", "functools", "xml.etree"]);
Member

Hmmm, I think this PR can no longer resolve xml.etree since we call it a namespace package and put it in stdlib. And module resolution (as it kind of did before) specifically declines to support namespace packages in the stdlib since we know there aren't any in there. So that in turn I think makes this change (along with the others I think) probably fine?

@BurntSushi
Member

All righty, I've updated this PR and I think it's ready for review. In particular, I believe this will fix astral-sh/ty#1749 (as @Gankra mentioned in the OP, but this PR now has a regression test for it) but it should also now fix astral-sh/ty#1967 (along with a regression test). I also unwound a couple of things in this PR that I think we don't want. See the commits following the first.

@Gankra Would definitely appreciate a quick review on the commits I added here. I believe I left your initial commit alone, so you should be able to see the crimes I committed pretty easily by just looking at the subsequent commits.

@AlexWaygood Would also appreciate review. My plan is to port the ideas in this PR (in a follow-up change) over to the all_modules implementation. The ultimate goal here is to allow us to support namespace packages in various places (like auto-import and type hierarchy sub-types).

@carljm carljm removed their request for review February 27, 2026 19:58
@BurntSushi BurntSushi force-pushed the gankra/namespace-auto branch from 5f2b2d5 to f174673 Compare February 27, 2026 20:00
BurntSushi added a commit that referenced this pull request Mar 4, 2026
In #22449, I added a check to our "did open" handler to effectively
ignore notifications for text documents that we were sure weren't
Python. This was meant to fix a case where we could return diagnostics
for non-Python files, which was undesirable.

However, it seems like that might have been too big of a hammer. It
seems like we might still want to track non-Python text files in our
index but not our project. Otherwise subsequent requests regarding that
non-Python file result in log messages saying that ty doesn't know about
the file. i.e., a state synchronization issue.

Addresses #23121 (comment)
Member

@AlexWaygood AlexWaygood left a comment

The tests here LGTM! One pathological edge case you could also add would be something like this:

```toml
extra-paths = ["/path-one", "/path-two"]
```

`/path-one/shapes/foo.py`:

```py
X = 42
```

`/path-two/shapes-stubs/bar.pyi`:

```pyi
```

`/path-two/shapes-stubs/py.typed`:

```
partial = true
```

`main.py`:

```py
from shapes.foo import X

reveal_type(X)  # revealed: Literal[42]
```

Because shapes-stubs is a stubs package, it must take priority over shapes in the first search path even though shapes-stubs appears in the second search path. But because shapes-stubs is a partial = true namespace package, when we fail to find the foo submodule in shapes-stubs, we must fall back to shapes/foo.py when resolving the module.
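A rough sketch of the lookup order this edge case implies, assuming a toy in-memory model rather than ty's resolver. The function `resolve_submodule` and the "py.typed text contains `partial`" check are simplifications invented for this example:

```python
def resolve_submodule(files, search_paths, package, submodule):
    """Toy lookup of `package.submodule`; `files` maps path -> text contents."""
    # Pass 1: stub packages on every search path, regardless of ordering.
    for base in search_paths:
        stub_dir = f"{base}/{package}-stubs"
        if not any(path.startswith(stub_dir + "/") for path in files):
            continue  # no stub package on this search path
        candidate = f"{stub_dir}/{submodule}.pyi"
        if candidate in files:
            return candidate
        if "partial" not in files.get(f"{stub_dir}/py.typed", ""):
            # Complete stubs: a missing submodule means the import is unresolved.
            return None
    # Pass 2: partial (or absent) stubs fall back to the non-stub package.
    for base in search_paths:
        for suffix in (".pyi", ".py"):
            candidate = f"{base}/{package}/{submodule}{suffix}"
            if candidate in files:
                return candidate
    return None


files = {
    "/path-one/shapes/foo.py": "X = 42\n",
    "/path-two/shapes-stubs/bar.pyi": "",
    "/path-two/shapes-stubs/py.typed": "partial\n",
}
search_paths = ["/path-one", "/path-two"]

print(resolve_submodule(files, search_paths, "shapes", "bar"))  # /path-two/shapes-stubs/bar.pyi
print(resolve_submodule(files, search_paths, "shapes", "foo"))  # /path-one/shapes/foo.py
```

If the `py.typed` marker did not declare the stubs partial, the second call would return `None` in this model instead of falling back.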

Comment on lines +318 to +319
According to [import resolution ordering], a `foo-stubs` stub package should have priority over a
`foo` package regardless of search path ordering.
Member

this makes sense to me, but FWIW I think the spec is a bit vague here. This principle can be inferred from the language in the spec, but I don't think the "regardless of search path ordering" point is specifically addressed anywhere.

Member

Ah I had interpreted (4) as being about "regardless of search path ordering":

Stub packages - these packages SHOULD supersede any installed inline package. They can be found in directories named foopkg-stubs for package foopkg.

But fair that this is an interpretation. I'll loosen the wording a bit here.

Gankra and others added 7 commits March 4, 2026 13:04
…ull py.typed

I *think* changing this test was wrong, because it seems intentional
that we implement a basic inheritance scheme?
…rch path ordering

This includes a regression test that fails on current `main`.

This also simplifies the "candidate retain" logic a bit.

While I've been trying not to also make changes to the "list all
modules" implementation (saving that for later), it was hard to avoid
here because of our consistency check ensuring that listing modules and
resolving a module always return the same thing. I tried to make the
simplest change I could there.

Fixes astral-sh/ty#1967
It exists at the intersection of namespace packages, partially typed
packages and stubs.
@BurntSushi BurntSushi force-pushed the gankra/namespace-auto branch from 2d72a11 to 56c38ab Compare March 4, 2026 18:04
@BurntSushi BurntSushi merged commit 149c578 into main Mar 4, 2026
50 checks passed
@BurntSushi BurntSushi deleted the gankra/namespace-auto branch March 4, 2026 18:10
carljm added a commit that referenced this pull request Mar 16, 2026
* main:
  [ty] Split up `types/class.rs` (#23714)
  [ty] Rework module resolution to be breadth-first instead of depth-first (#22449)
  [ty] Move tests and type-alias-related code out of `types.rs` (#23711)
  [ty] Move TypeVar-related code to a `types::typevar` submodule (#23710)
  Fail CI on new linter ecosystem panics (#23597)
  [ty] Fix handling of non-Python text documents
  [ty] Move `CallableType`, and related methods/types, to a new `types::callable` submodule (#23707)
  [ty] Avoid stack overflow with recursive typevar (#23652)
  [ty] Add a diagnostic for an unused awaitable (#23650)
  [ty] Fix union `*args` binding for optional positional parameters (#23124)
  [`refurb`] Fix `FURB101` and `FURB103` false positives when I/O variable is used later (#23542)
  [ty] Fix type checking for multi-member enums within in a function block (#23683)
  [ty] Improve folding for decorators (#23543)
  [`airflow`] Extract common utilities for use in new rules (#23630)