-
Notifications
You must be signed in to change notification settings - Fork 587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warnings for side effects during import #3837
Conversation
Hm, possibly we could hook into f.x. I don't have any experience with pytest plugins, any advice @Zac-HD ? We'd want to catch things done by the plugins themselves, not by the user in conftest.py etc. |
Hmm, if we're just issuing warnings I think it's reasonable to warn about stuff in conftest.py too? That also adds latency in time-to-starting-first-test, which is equally annoying for feedback loops no matter where it comes from. I'd actually taken a stab at this myself too, master...Zac-HD:hypothesis:plugin-checks; feel free to crib from that if you want. I do prefer your warnings approach and think deprecations are probably a mistake though; I'd be OK with an eventual error for plugins but not conftest stuff etc. |
Right! It looks like we're following similar paths, with the major differences being
I'll gladly continue this branch with improvements from yours, or vice versa. For clarity: By "prefer your warnings approach" you're thinking of the warn-at-callsite part of it, not the registration part? And I'm afraid I don't fully comprehend the implication of "deprecations are probably a mistake", either, sorry ;-) |
3e3cf7d
to
6289589
Compare
FYI: This is mostly feature complete. I'll have to leave it for a little while now, the remaining tasks:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll gladly continue this branch with improvements from yours, or vice versa. For clarity: By "prefer your warnings approach" you're thinking of the warn-at-callsite part of it, not the registration part? And I'm afraid I don't fully comprehend the implication of "deprecations are probably a mistake", either, sorry ;-)
- I'd suggest continuing with your branch, and stealing any inspiration you want from mine
- I think try/except checking for lazy strategies is better than the way I implemented it; the direct check AFAICT currently works but it's more fragile
- I'd avoid bringing our pytest plugin into it; people also use Hypothesis with
unittest
and there's nothing pytest-specific about these checks. - I'm actually not too worried about regression tests for this; the environment management is enough of a pain that (at least until we have a regression) I'm OK with leaning more on manual testing before we merge. Just write up a description of what you manually tested + how so we can do it again next time!
if os.environ.get("HYPOTHESIS_WARN_SIDEEFFECT"): | ||
|
||
def sideeffect_should_warn(): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we want to always warn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a bit of a long story, so I'll explain. Pardon the verbosity! I agree there's nothing pytest-specific about the check for plugins that we load ourselves, but pytest plugins (and conftest) are an instance of a wider check: Side effects during initialization, where initialization is not just the initial import.
We can't warn directly about side effects during post-import initialization, but we can detect them. Even at no additional cost, since we can check whether the patching has been done - which happens on the first side-effect-inducing operation after import. And then the pytest plugin (or other test runners) can quite easily check if this is the case at start of session.
Now, it can warn that it has happened, but it can't see who did it or where, at least not easily [*]. That's where this setting comes in:
"initialization, possibly causing slowdown or creation of files. To pinpoint and explain "
"the problem, execute with environment 'PYTHONWARNINGS=error HYPOTHESIS_WARN_SIDEEFFECT=1'",
...which will cause the warning to happen, even after import is finished, and show the relevant stack trace.
[*] Yeah, we could record the stack of the first such call, but that's more effort than it's worth IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[*] Yeah, we could record the stack of the first such call, but that's more effort than it's worth IMO
Hm, or maybe. I guess it's easy enough to stash the stacktrace and message. Would simplify the wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, there are basically two ways to go about this:
- Warn at the end of initialization, if we can see that a side-effectful operation happened.
- Warn on a side-effectful operation, if we're not yet finished initializing.
I generally prefer (2), because the stack trace from -Werror
makes it really easy to tell when this is happening.
I guess the problem is how to tell whether we're in post-import initialization, but I think there we could set a global from the earliest possible pytest setup hook, and then unset it at the end? We might still miss a few cases due to the ordering of hooks between our and other pytest plugins, but I think that applies equally to (1) for side effects triggered by pytest plugins.
At that point I suppose there's still a case for (1) to warn if we detect that there was a side-effect between finishing import hypothesis
and delivery of the first pytest hook... but that seems concerningly prone to false alarms (c.f. testdir.runpytest_inprocess()
test changes!). Let's just skip detection in this case then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree (2) would be best, and we could do that as soon as possible (I was thinking even plugin-module execution, not wait for the import hook). I haven't done this because I am concerned that there is a significant chance of something happening during that hole in coverage. But hey, maybe it's not that important compared to having a simpler best-effort variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... if there is a hole, we must check and deal with that anyway to revert the patching, so we may as well warn in that case, too. I'll follow this path then.
Thanks for the review and suggestions @Zac-HD ! I'll clean up and complete, and ping you when ready. Note: I'd like to keep the pytest stuff in, it makes it at least partially testable because we can then check detection of side-effects after import which is fairly simple infrastructure-wise. |
4321469
to
2c6461a
Compare
2c6461a
to
c3c77e6
Compare
Right, ready for next round now @Zac-HD . I'm reasonably happy with how it ended up, if you feel it's too complicated it would be possible to simplify a bit at the price of reduced coverage. I did take the liberty of creating a new toplevel module for globals, since I didn't find any existing appropriate module. Test coverage is actually fairly complete, yay! The earlier "false positives" were a bug, I didn't realize that plugin hooks would run multiple times for All tests should pass soon, with the exception of the windows memory error, which might be related to #3820 (which I recall saw and provisionally fixed the same failure). It is triggered by generating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great to me - thanks so much for your work here Joachim!
I think all we have left to do is the two small comments below, and the mysterious-but-repeated failure on 32-bit Windows (and Python 3.11, but I bet it's the first bit causing trouble). Also suspicious that it's a MemoryError...
Co-authored-by: Zac Hatfield-Dodds <[email protected]>
I didn't really follow #3820, but I did check this memory error. It was caused by It does however point out that the repr is potentially expensive, so maybe we should not construct it unless the warning is actually emitted. I.e., input format string and arguments separately, logger-style. |
"to get a traceback and show which plugin is responsible." + extra, | ||
HypothesisSideeffectWarning, | ||
stacklevel=3, | ||
) | ||
else: | ||
_first_postinit_what = what | ||
_first_postinit_what = (what, fmt_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to stored the formatted what
? Slightly concerned about holding onto a potentially mutable object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH we know exactly what comes in, no user will call this function. Thanks for getting to all green 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to help!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @jobh 🥰
pytest-dev/pytest#11825 (review) - looks like a false alarm inside I also noticed that the warning unconditionally mentions "import time", we should be specific about import time / between import and loading the pytest plugin / during pytest initialization. |
That's weird. It's only reproducible when running the full test suite instead of just the python test folder, and by tracing all increments/decrements I can see that |
Seriously... I renamed the global to ensure all access is via logged accessors, and see the following sequence:
What could possibly cause this? |
Right. I suspect what happens is that the final plugin decrement happens within a somehow auto-mocked pytester environment, so it is magically reverted afterwards. The minimal test sequence to reproduce is
This means we can't balance our incs/decs, so the only solution is to accept imbalance like the suggested fail-safe solution for thread safety above. |
I submitted a hotfix PR to fix it. It passes pytest's tests, if it passes ours as well then it should be good. Btw: Impressive early spotting of this problem @Zac-HD ! |
Not that it matters now, but checking this with a clearer head, for the benefit of better understanding:
the real cause for the mystery was much less complicated, it was just the logging output being hidden by |
Warn if the storage directory is accessed, or if a lazy strategy is materialized, during import. This is intended to target plugins that are loaded during import of hypothesis through importlib entrypoints, but the scope is expanded to the full duration of the import (because why not).
For pytest plugins, it's not clear that it is possible since we don't control the plugin loading order. We could add side-effect detection at the time of loading the hypothesis pytest plugin, to detect if those running before have done anything unwanted, but I don't see how we can do anything about those loaded after us. Not sure if this kind of flaky detection is better than none at all.
Closes #3836.
I don't know how to test the during-import behaviour beyond superficial "mock to pretend we're still importing"