How should package collection work? #7777
Comments
I'd like to have a more detailed discussion about structuring the collection tree. We have accrued a mess that requires some decomposition to begin with.
Thanks @bluetech for putting up this detailed description. Indeed the Package node was not thought through before being included, and I'm at fault for not giving the attention it deserved at the time.
I think packages should be nested, just like the other nodes. I'm surprised that no one noticed this inconsistency before (myself included). Package-scoped fixtures would then follow the same rules as other fixtures. The question of namespace packages is unfortunately not that simple. Besides being a pain to detect at runtime (at least as I recall from the discussions on that subject), we need to consider how they would be represented, given that sub-packages of the same namespace package might be located in completely different directories, while the collection tree is directory-based. In other words, in your example if we collect I think however we can postpone what to do with namespace packages to some other time, and focus on natural packages (those with
Good idea @RonnyPfannschmidt. You want to do that here or somewhere else? I think here is fine IMHO. |
Found this issue while trying to figure out why my nested test packages all report as top-level packages. For one, if I went to the trouble of nesting them, I want them to be reported as nested. Is there a way to force that right now as a workaround for my situation? I've tried so many things that I'm at a loss.
Ok, so I wrote a plugin as a workaround for what I needed since I couldn't find anything else. Code is MIT license. Hopefully it's useful to other people! |
Possible Plan
Looked at this again. The main trouble here stems from the fact that we have two recursive collectors -- I'm increasingly convinced that we shouldn't have any recursive collector. Huh? Well, if you think about it, we already have a concept to handle the recursion for us -- the collection tree. The collectors are already recursively expanded on their own, that is, a What does it mean in practice? At a high level:
This looks like this:
Some points:
I have a very simple POC implementation, but there are a bunch of details to figure out. Interested to hear opinions. I vaguely remember @RonnyPfannschmidt mentioned a |
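The idea of dropping recursive collectors can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node` and `expand` are hypothetical names, not pytest classes or the POC mentioned above. Each node returns its direct children only, and a generic driver performs the recursion by expanding the tree.

```python
from __future__ import annotations

from pathlib import Path


class Node:
    """Hypothetical stand-in for a collector; not pytest's own class."""

    def __init__(self, path: Path, parent: Node | None = None) -> None:
        self.path = path
        self.parent = parent

    def collect(self) -> list[Node]:
        """Return direct children only; never descend any further."""
        if not self.path.is_dir():
            return []
        return [Node(child, parent=self) for child in sorted(self.path.iterdir())]


def expand(node: Node, depth: int = 0) -> list[str]:
    """Generic driver: the recursion lives here, not in any collector."""
    lines = ["  " * depth + node.path.name]
    for child in node.collect():
        lines.extend(expand(child, depth + 1))
    return lines
```

Because every child is created by its immediate parent, the nesting in the resulting tree follows the directory nesting without any special-casing.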
It's been more than a decade since that directory node existed. It was undone for good reasons, but I think it is reasonable to reintroduce it.
Do you remember what they were? I do suspect there is some issue here; the Directory solution is pretty natural and harmonious with the entire collection tree concept, that I'm sure it's been tried. If there's something fundamental I'm sure I'll run into it eventually but sooner is better than later :) |
Before the refactoring back then, pytest had no concept of separating test collection from test execution. Tests were executed as found. While changing the details, it was simpler to implement collection if there were only files and the session.
Suggestion regarding the design above: Directory nodes wrap Packages that are not inside other Packages, such as:
I feel like this (1) makes it clear that a session can be multiple directories (since it can already be multiple "top level" packages, this is roughly equivalent), and (2) Should(tm) make it easier to track file system scoping because it Should(tm) only need to find the closest parent Directory node. NOTE: The suggestion implies that we would SKIP Directory nodes for sub-packages and modules BUT include Directory nodes for "grafted sub-packages". In other words, (I think) only add a Directory node if the directory lacks an __init__.py (grafted case) OR does not have a parent package (top-level packages).
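The rule in the NOTE above can be written down as a small predicate. This is a minimal sketch of one reading of the suggestion, not an agreed design; `is_package` and `needs_directory_node` are hypothetical helper names.

```python
from pathlib import Path


def is_package(directory: Path) -> bool:
    """A natural package is a directory that contains an __init__.py."""
    return (directory / "__init__.py").is_file()


def needs_directory_node(directory: Path) -> bool:
    """Assumed rule: wrap the directory in a Directory node if it lacks an
    __init__.py, or if its parent directory is not a package."""
    return not is_package(directory) or not is_package(directory.parent)
```

Under this reading, a package nested directly inside another package gets no Directory node, while a top-level package or a package sitting under a plain directory does.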
@MarximusMaximus right. I'm not entirely sure what it would look like exactly, but that's the basic idea - |
In case I get hit by a bus, my WIP branch is here: https://github.com/bluetech/pytest/commits/pkg-collect It seems viable for sure, but very delicate. |
Amusingly I just realized that |
After a lot of tinkering with obscure pytest behaviors ( |
Abstract
|
Order of files vs. directories in collection
Previously, files in a directory were collected before sub-directories in the directory. That is, given a filesystem tree
would collect as
After my changes, the order that naturally flows from the code is that files and sub-directories are ordered jointly, that is, the tree is
For now I am keeping the joint ordering, but let me know if you think ordering files before subdirs is better. |
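The difference between the two orderings can be shown with a toy listing. This sketch assumes plain sorting by name and uses a hypothetical directory listing; it does not reproduce the actual collection code.

```python
def files_first(entries):
    """Previous order: all files (sorted), then all sub-directories (sorted)."""
    files = sorted(name for name, is_dir in entries if not is_dir)
    dirs = sorted(name for name, is_dir in entries if is_dir)
    return files + dirs


def joint(entries):
    """New order: files and sub-directories sorted together by name."""
    return sorted(name for name, _is_dir in entries)


# Hypothetical directory listing as (name, is_directory) pairs.
entries = [("test_z.py", False), ("test_m", True), ("test_a.py", False)]
print(files_first(entries))  # ['test_a.py', 'test_z.py', 'test_m']
print(joint(entries))        # ['test_a.py', 'test_m', 'test_z.py']
```

With joint ordering the sub-directory `test_m` is collected between the two files instead of after both.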
How about leaving the recursive FS walk to a single utility that collects directory/package/file nodes and provides the correct parents? The collect function of each node could then return the items/definitions within.
This would restrict Directory collectors to only handle their files. Why not give them control over the sub-directories? What's the advantage?
No, I'd actually like to go further and leave the scanning of the file tree in the hands of pytest. I'd strictly want to avoid a situation where multiple collectors have to cooperate recursively to scan all the files. Walking the file tree and mapping directories and files to nodes ought to be handled in a single place.
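One way to picture the "single place" approach is a lone function that walks the tree once and records the parent of every directory and file. This is only a sketch of the idea under discussion; `walk_once` is a hypothetical name and no such pytest utility is implied.

```python
from __future__ import annotations

import os
from pathlib import Path


def walk_once(root: Path) -> dict[Path, Path | None]:
    """Map every directory and file under root to its parent, in one pass."""
    parents: dict[Path, Path | None] = {root: None}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes os.walk descend in a stable order
        for name in dirnames + sorted(filenames):
            parents[Path(dirpath) / name] = Path(dirpath)
    return parents
```

Nodes could then be built from this mapping, so that no collector has to scan the filesystem on its own.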
I understand, but why do you want it to be handled in one place? My idea is to give control to plugins (i.e. custom nodes, including our own, e.g. python, which currently requires hardcoding in |
Any extension that implements file collection will have to share the implementation of selection with pytest, which implies exposing the APIs for handling it correctly, and trusting plugins to use them correctly. Based on how people have implemented things, that doesn't seem safe.
I agree with this concern. I somewhat procrastinated on this comment, but I tried various things, and in the end I don't have a solution that alleviates this concern without making the hook/custom directory collector mostly useless. So instead of holding up pytest 8 for even longer, I decided to submit what I have (PR #11646), which I think is pretty good, and hope that plugins do the right thing. I tried to document the expectations in the |
The pytest collection mechanism has been overhauled in pytest 8.0.0, resulting in a different node tree when collecting the tests. Ensure the paths / names we're using that are derived from the node tree are consistent across different pytest versions. Particularly, this has affected the convenience symlink name (which is supposed to be in the form of e.g. dns64_sh_dns64 for the dns64 module and tests_sh_dns64.py module) and the test name that's logged at the start of the test, which is supposed to include the system test directory relative to the root system test directory as well as the module name (e.g. dns64/tests_sh_dns64.py). Related pytest-dev/pytest#7777
The pytest collection mechanism has been overhauled in pytest 8.0.0, resulting in a different node tree when collecting the tests. Ensure the paths / names we're using that are derived from the node tree are consistent across different pytest versions. Particularly, this has affected the convenience symlink name (which is supposed to be in the form of e.g. dns64_sh_dns64 for the dns64 module and tests_sh_dns64.py module) and the test name that's logged at the start of the test, which is supposed to include the system test directory relative to the root system test directory as well as the module name (e.g. dns64/tests_sh_dns64.py). Related pytest-dev/pytest#7777 (cherry picked from commit 7118cbe)
I am working on cleaning up our collection code, but the current behavior seems odd and incidental, therefore I'd first like to discuss how it should behave.
pytest and operating system versions: pytest 6.0.2/master on Linux.
Current behavior
Consider the following filesystem tree:
This has several nested packages, but note that the c level doesn't have an __init__.py (just to make it more interesting).
This results in the following collection tree:
The Packages are all flat, not nested. Namespace packages are not considered Packages. Files not inside packages are collected as standalone Modules.
This however does not reveal the real story. See what happens when --keep-duplicates is used:
This has all of the previous collectors, but also duplicates which are nested under each Package.
Technical details
For those interested, the code details are as follows:
pytest has two recursive filesystem collectors, Session and Package.
Session.collect() walks the entire trees (of the given command line arguments) recursively in BFS order and creates collectors for Packages and Modules. It has various obscure code to special-case Packages and exclude files belonging directly to the package. The Collectors it creates always have the Session as parent (i.e. flat).
Note that the Collectors themselves are only recursively expanded to Items after the above is finished. (This step is done by Session.genitems(), which calls each Collector's own collect() method.)
Package.collect() also walks the package's directory recursively. It has some code to try to avoid collecting Modules belonging to sub-packages, but otherwise creates Modules and Packages with itself as parent.
Since the stuff collected by Packages was already collected by the Session (with the exception of a package's own direct files), they are mostly discarded as duplicates unless --keep-duplicates is used.
Question
My question is, what do we want to happen?
Evidently the nesting is not super important, otherwise we would have heard loud complaints by now (though there are some issues about this). But the nesting does have an effect on how package-scope fixtures are applied - reuse from super-package or not?
The duplication seems definitely broken.
And there is a question of how PEP 420 namespace packages fit into this (if at all).
Would be happy for any thoughts!
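The double walk described under Technical details can be mimicked with a toy model. This is a simplified sketch, not pytest's code: all function names are hypothetical, `os.walk` does not reproduce the BFS order mentioned above, and test files are assumed to match `test_*.py`.

```python
from __future__ import annotations

import os
from pathlib import Path


def is_package(d: Path) -> bool:
    return (d / "__init__.py").is_file()


def is_test_file(name: str) -> bool:
    return name.startswith("test_") and name.endswith(".py")


def session_collect(root: Path) -> list[tuple[str, Path]]:
    """Flat pass: every collector found here has the session as parent."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        d = Path(dirpath)
        if is_package(d):
            found.append(("Package", d))  # its direct files are left to the Package
        else:
            found += [("Module", d / f) for f in sorted(filenames) if is_test_file(f)]
    return found


def package_collect(pkg: Path) -> list[tuple[str, Path]]:
    """Second recursive pass, this time with the package as parent."""
    found = []
    for dirpath, dirnames, filenames in os.walk(pkg):
        dirnames.sort()
        d = Path(dirpath)
        if d != pkg and is_package(d):
            found.append(("Package", d))
            dirnames.clear()  # do not descend: those modules belong to the sub-package
            continue
        found += [("Module", d / f) for f in sorted(filenames) if is_test_file(f)]
    return found


def duplicates(root: Path) -> list[Path]:
    """Paths the package pass finds again after the session pass."""
    flat = session_collect(root)
    seen = {path for _kind, path in flat}
    dups = []
    for kind, path in flat:
        if kind == "Package":
            dups += [p for _k, p in package_collect(path) if p in seen]
    return dups
```

In this model a package's own direct test files are the only entries that are not found twice, which matches the exception noted in the description.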