Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixtures: register finalizers with all previous fixtures in the stack (take 3) #7511
fixtures: register finalizers with all previous fixtures in the stack (take 3) #7511
Changes from 2 commits
2e48455
5cf32ec
b222988
2f70561
6943c4f
e59e383
009e4a3
06b2f59
cbcea9e
b095542
bb69a50
d331a57
0aa66f8
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this using
getattr
instead of direct attribute access because ofPsuedoFixtureDef
? If so, it would be better to add_finalizers
toPsuedoFixtureDef
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was the reason. But I checked it out and it doesn't look like a
PseudoFixtureDef
could even get there, so I'm now just tapped into the attribute directly.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relying on the finalizers being
functools.partial
s here seems fragile to me. Would prefer if some other more direct mechanism for this was used, like using a proper type instead offunctools.partial
, or storing tuples in_finalizers
, etc.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this isn't a super clean approach. But I think getting a proper type/mechanism in place could make this a much larger change. I'm not sure how you wanna proceed with it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend using
"""
strings withtextwrap.dedent
here, then the\n
s can be removed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right, why is
c_fix
(class-scoped, without dependencies, being requested bytest_a
andtest_b
directly), being torn down at this point?It should be torn down at the end of the session only, regardless of the involvement of other autouse fixtures, or am I missing something? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix basically makes it so that the fixture order is a stack, and a fixture cannot be torn down unless all the fixtures that were setup after it have also been torn down. In this case, I believe
m_fix
is trying to tear down to go fromm1
tom2
, but to do that, bothc_fix
andanother_c_fix
need to be torn down first.This way, there's no surprises related to side effects.
Without this (if
c_fix
would otherwise tear down only at the end of the session), ifc_fix
was unwittingly affected by somethingm_fix
did when its param wasm1
, then whenm2
comes around, ifc_fix
wasn't torn down, it would still be influenced bym1
, whilem_fix
kept doing doings thatc_fix
would no longer be affected by. That would mean that if we ran that class directly with pytest,test_auto.py::TestFixtures::test_a[p1-m2-f1]
would potentially be doing something different than had we rantest_auto.py::TestFixtures::test_a[p1-m2-f1]
directly.For example, if
m_fix
was parameterized so that it was setting up different versioned installations of chrome and chromedriver, andc_fix
was launching a chrome driver session (not that I recommend handling drivers this way haha),c_fix
would be executed once and stay as the first driver session made which is running off the first driver/browser version installed. So both tests would be using the same driver session, whilem_fix
was reinstalling different versions that aren't used by anything. And iftest_auto.py::TestFixtures::test_a[p1-m2-f1]
was run directly,c_fix
would be executed while a different version of chrome and chromedriver were installed. I'm guessing running the class with pytest (rather than the function directly) would actually throw an error since the driver executable would still be running whenm_fix
tried to uninstall it, but that's the general idea. This fix prevents that sort of thing from happening.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment nailed the change for me. I agree with your argument, that the new behavior is the correct one. This is indeed a subtle but significant change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the explanation @SalmonMode, and sorry about the delay, it helped understand the actual behavior behind the change!
Again, really appreciate the patience on this, given that this is taking months now to review, but is a very sensitive and complex part of the codebase.
The remark about it being now a stack helps a lot.
I understand the picture you are painting here, the problem I see here is "unwittingly", which is a synonym for implicitly... if
c_fix
is affected bym_fix
, thenc_fix
should explicitly depend onm_fix
, I think.When I see this code:
In my understanding on how things should work,
c_fix
should only be torn down after all tests inTestFixtures
have finished, regardless of what other autouse fixtures are in effect in this session, becausec_fix
doesn't depend on anything else.If
c_fix
has an implicit dependency onm_fix
, it should explicitly declare that dependency.The same argument can also be made that this is a bug: if
c_fix
does something costly but completely unrelated to any other fixture, why is it being torn down just becausem_fix
needs to be recreated with a different argument? I understand the new behavior is safer, but also I believe someone might point out this is a regression for the use case above.A stripped down example:
Given that
p
doesn't depend on anything, I expect it to be setup/torndown exactly once in this session, and this is the behavior onmaster
. If I changep()
top(s)
then it is correctly setup and torn down twice, to account fors[1]
ands[2]
.The change here is implicitly adding a dependency to lower scoped fixtures to all their upper scoped fixtures, which I'm not sure it is correct/desirable.
What do you folks think?
Also I need to mention the sheer quality of the PR is remarkable! The documentation improvements are an amazing addition by themselves. 😍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi folks,
Gentle ping here in case this got lost through the cracks. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a bit vague, but let me word that differently: autouse fixtures are automatically applied to functions, not to other fixtures. When I mentioned that I was talking specifically about autouse fixtures, which I belive is the main point we are discussing.
Sorry, I don't see any inconsistency. 😕
In that example, pytest needs to execute
test()
. In the same scope we have two autouse fixtures,s
andp
, so the test is considered to have been written as if the user explicitly wrotetest(s, p)
. Given thats
is parametrized, pytest executes two tests,test(s[1], p)
andtest(s[2], p)
. Becausep
doesn't depend on anything else, there's no reason for pytest to recreatep
.I disagree...
p
doesn't really depend on anything. Just because a fixture was created before it, doesn't imply a dependency IMHO.Thanks for the examples, but all the output shown after them (with setup/teardown order of the fixtures involved) makes sense to me, so it is hard for me to understand what is the argument behind them, sorry.
I believe the main point of disagreement between us is that you believe that fixtures should explicitly depend on any autouse fixture acting on it, and I don't think they should.
To exemplify:
What happens today, is that pytest executes as if the code was written like this:
What you argue is that pytest should be doing this:
Is my undertanding correct? Please correct me if I'm wrong.
If my understanding is right, then I argue that the current behavior is correct, and it will definitely break existing suites that are working, and working not by accident, but by design (I say that from our codebase at work).
It is common for an autouse fixture to execute something which is system-wide, but not necessarily affects other fixtures in the system.
I will give an example derived from real code from work:
In the current behavior,
set_system_locale
doesn't affectinitialize_petsc
(and it should not, petsc doesn't care about the locale at all). We tests withset_system_locale[br]
andset_system_locale[en]
, andinitialize_petsc
is executed once. Under your proposal,initialize_pestc
would execute twice, which is wrong and probably would crash the system.I'm not familiar with the issue with pytest-django you mention... perhaps you can exemplify it to clarify the point better?
Again sorry for all the discussion; I'm not trying to be difficult, I'm really making an effort to make sure I'm not missing the point, but if I'm understanding the proposal correctly, I don't think it is correct.
Perhaps a new example of how things work now, and how you belive they should work, but with "real" examples, might help me understand what you believe the problem is (in case I'm still missing your point, as I tried to exemplify above)? It might be I'm missing something due to the fact that we are using abstract examples (like
s
,p
and so on)? In that case the pytest-django case might help.Again thanks for the patience!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, I'm not saying that the autouse fixture has to be dependent on the higher level fixture. I'm saying that if there's a need for (in the example),
s[1]
to happen beforep
, thenp
must be dependent ons[1]
, but if it isn't, then it can happen afterp
, and if it can happen afterp
, then it can be restructured slightly so thats
happens afterp
and each param fors
could apply to a separate version of that test withoutp
having to be run again. This is where my second example output came in, i.e.:The goal with my change is to get fixtures working like a stack so it's easier to manage/debug fixtures. I'm not saying there is always a dependent implied, but if the setup/teardown plan is jumping around, that suggests to me that things are being tested in contexts that have more going on than is necessary, or because scopes are being leveraged in a certain way to control ordering of behavior (and I believe that way could be made more effective/convenient/easy to read).
In regards to your work example, things get tricky, because I don't know where that first fixture is defined. If it's in your top-level contest, for example, then I would expect that second fixture to run twice, and if that's not desired, then I wouldn't have that first fixture in the top level conftest.
That said, I've been thinking about this for a while because a
session
level fixture isn't really asession
level fixture depending on where it's defined, and I think we could solve a _lot_of problems/confusion by leaning into that behavior, and having scopes work like relative scopes when they can't really be applying to the full scope the claim to be. For example, if I define a fixture inside a class, and give it a scope ofsession
, it can't really be anything more thanclass
, but I could see it being used that way to make sure it happens beforeclass
scoped fixtures for tests in that class. In line with this, in the setup plan output, these could be indented according to the scope their effectively limited to in order to make a cleaner, easier to follow setup plan.Regarding the pytest-django stuff, this is the PR I made to solve it in response to the merging if the original take of this PR:
pytest-dev/pytest-django#807
I'm on mobile, so hopefully the description there explains it well.
All that said, I find myself losing the forest for the trees a bit, so I'm definitely gonna have to sleep on this, and come back with some fresh eyes and re-read everything haha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so I took a look back, and read through things a bit more carefully (although I still need a bit more caffeine haha). That said, I did spot an issue with how things currently work in regards to your work example, and this bit in the docs:
Given these fixtures:
One would expect
c
to only happen once and only once per class because it's autouse. The problem comes in when parameterization is involved (at least for a higher scope fixture).Given these test classes, though:
Because pytest has to execute every test once for the first param set of
s
before it can do it again for the second param set, that means it'll have to setupc
twice for each class because pytest had to tear it down after each class for the first param set and those tests are still dependent onc
because of autouse. In total,c
would be executed 4 times.Here's the setup plan (from pytest 6.1.2)
This means that your
package
scoped fixture for work could easily be run more than once per package if more than one package depends on it.Unfortunately, because of parameterization, there's no way to guarantee an autouse fixture only happens once per its defined scope.
NOW, I have an alternate proposal that I think might suit everyone's needs.
In @bluetech's example:
setup plan
the
m
fixture, to me, happens out of place, because it doesn't respect the ordering of scopes, until teardown. This is also a problem for your scenario, where you definitely only want a fixture running once ever.Ideally, I would like this to be the setup plan for that example:
But what if only
test_f1
is to be run? Then I would want this to be the setup plan:Since the
m
fixture wasn't needed by anything, it could be cut out.To me, this is much easier to not just reason about, but also to orchestrate.
This has pretty large implications, and could be incredibly challenging to figure out how to do this not just effectively, but in a way that folks can leverage in a predictable way for more advanced usage. But, I have an idea, and I think you'll get quite a kick out of it.
Fixture Resolution Order (FRO)
It works almost exactly like MRO (so it's a familiar concept), except the "child" we're trying to figure out the FRO of, is the master FRO. After the first pass, it can be sorted by scope, and this can be used to ensure stack-like behavior.
Keep in mind, that entries in the FRO aren't required to be in the stack. It'd be more of a reference for knowing what fixtures should be executed after others, and how to teardown to appropriate common points. It can also look at the tests that would actually run (so not just the ones that were collected) and considered their actual dependencies to find out the actual scopes that running certain fixtures would be necessary for, even if the first test in those scopes isn't actually dependent on them, so they can be preemptively run for those scopes while also not running for scopes that don't need them.
We can also apply some tweaks to it to make things operate more efficiently. For example, if autouse fixtures are shifted left in the FRO, then parameterized fixtures can be shifted as far right as can be (so as to affect as few other fixtures as possible), and if they're autouse parameterized fixtures, they can be shifted as right as possible among the autouse fixtures of their scope.
For your use case, since you only want
initialize_petsc
to run once ever, you would just need to change its scope tosession
, and sinceset_system_locale
is parameterized, it would happen afterinitialize_petsc
.Nothing would ever get executed that didn't need to be, parameterized fixtures would have as reduced a footprint as they reasonably could, stack-like behavior would be achieved, reasoning about and planning setup plans would be easier (IMO), and I'm not sure, but I think this could also offer some opportunities for optimization in pytest's code.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC what you mean, then I agree and it is what I've been trying to convey: if a certain fixture requires the work from another fixture to happen before it, then it should explicitly declare a dependency on it. If
p
needss
in some way (due to a side effect like DB initialization, or because it needs the actual value produced bys
) thenp
should depend ons
(IOW, declares
as a paramater).I took a look at pytest-dev/pytest-django#807 (thanks for the link), and it seems what happened there was exactly that: the
django_db_blocker
implicitly depended on the database (meaning: it needed the database to be setup at that point, but did not received any other fixture as parameter), which broke once a bug in pytest was fixed. The solution to the problem was to explicitly makedjango_db_blocker
depend ondjango_db_setup
, which is also what I advocate here is correct and how things should work.After reading back a few messages, I think an interpretation (which is also yours) is a bit incorrect and leads to some conclusions which are not true:
When in fact what it means in pytest (currently implemented) is:
"be requested" here also imply being requested by other fixtures, not only tests.
You are right; once parametrization plays in, the weak guarantee that a fixture at a high-level scope is executed once per scope breaks.
Indeed, it follows the rule I described above... but given
m
andc
don't depend on each other, I actually don't think it is a problem, becausem
apparently doesn't care about which order it will be torn down related toc
. A bit weird I agree, but I don't thinkm
should be setup beforetest_f2
, as it is only requested then.Yeah I think you are onto something, indeed a clear view or different approach on how pytest deals with fixtures is something we have been wanting to tackle in a while: #4871.
Currently the fixture mechanism is implemented as:
There's definitely room for improvement there, however I do think we should keep 1) working as is.
Perhaps we should move the discussion there and close this PR, or do you think we should analyse this some more? Also I didn't hear from others in while here (understandable, keeping up with the discussion here takes a lot of time!).
Again, thanks for the rich discussion and patience here @SalmonMode!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha yes, I agree. Getting here using the GitHub mobile app is also pricing challenging (I wanted to give it a try, but it definitely needs some work).
And I agree. Given the nuances of this, the needs of pytest itself, its needs going forward, and backwards compatibility, I don't believe this PR is a sufficient fix. I've also started trying to implement the FRO approach to see how it could be done, and I can tell there's a lot of things that will need to be discussed. I'll move my proposal over to #4871, and this can be closed.