Conflict with pytest-django and pytest-xdist #277

Closed
raphaelm opened this issue Oct 25, 2024 · 1 comment

@raphaelm

We just spent a lot of time debugging a really weird issue, and we still haven't quite understood it. However, it occurs only with the combination of

  • pytest-django
  • pytest-rerunfailures
  • pytest-xdist
  • GitHub Actions

Two months ago, a flaky test made it into our codebase. Since then, a good 75% of our GitHub Actions runs have failed, but only on the test matrix entries that run against PostgreSQL. Each of these failing runs contained dozens, if not hundreds, of error messages like this:

| __________________ ERROR at teardown of test_position_queries __________________
| [gw2] linux -- Python 3.9.20 /opt/hostedtoolcache/Python/3.9.20/x64/bin/python
| 
| self = <DatabaseWrapper vendor='postgresql' alias='default'>, name = None
| 
|     def _cursor(self, name=None):
|         self.close_if_health_check_failed()
|         self.ensure_connection()
|         with self.wrap_database_errors:
| >           return self._prepare_cursor(self.create_cursor(name))
| 
| /opt/hostedtoolcache/Python/3.9.20/x64/lib/python3.9/site-packages/django/db/backends/base/base.py:308: 
| _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
| /opt/hostedtoolcache/Python/3.9.20/x64/lib/python3.9/site-packages/django/utils/asyncio.py:26: in inner
|     return func(*args, **kwargs)
| _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
| 
| self = <DatabaseWrapper vendor='postgresql' alias='default'>, name = None
| 
|     @async_unsafe
|     def create_cursor(self, name=None):
|         if name:
|             # In autocommit mode, the cursor will be used outside of a
|             # transaction, hence use a holdable cursor.
|             cursor = self.connection.cursor(
|                 name, scrollable=False, withhold=self.connection.autocommit
|             )
|         else:
| >           cursor = self.connection.cursor()
| E           psycopg2.InterfaceError: connection already closed
...

A sample run with full log can be found e.g. here:
https://github.com/pretix/pretix/actions/runs/11322864056/job/31484383103

Which is a run of the repository at this commit:
https://github.com/pretix/pretix/tree/40c8d014dfba6e97af0ad40d0b7f4abfd087082a

As you can see, the test run ends with a summary of

= 5365 passed, 17 skipped, 2 xfailed, 26 errors, 41 rerun in 772.74s (0:12:52) =
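
For anyone who wants to scan many CI logs for the same pattern, here is a minimal sketch that parses a summary line like the one above. The helper names `parse_summary` and `looks_like_cascade` are hypothetical, and the heuristic (reruns and errors present, but no reported failures) is only an assumption based on this single run, not a confirmed signature of the problem.

```python
import re


def parse_summary(line):
    """Turn a pytest summary line into a dict such as {'passed': 5365, ...}."""
    # Each outcome appears as "<count> <label>", e.g. "26 errors".
    return {label: int(count) for count, label in re.findall(r"(\d+) ([a-z]+)", line)}


def looks_like_cascade(counts):
    """Assumed heuristic: reruns and errors occurred, yet nothing is reported as failed."""
    return (
        counts.get("rerun", 0) > 0
        and counts.get("errors", 0) > 0
        and counts.get("failed", 0) == 0
    )


summary = "= 5365 passed, 17 skipped, 2 xfailed, 26 errors, 41 rerun in 772.74s (0:12:52) ="
counts = parse_summary(summary)
print(counts)
print(looks_like_cascade(counts))
```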

This is the environment running in there:

platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0
django: version: 4.2.16, settings: tests.settings (from ini)
rootdir: /home/runner/work/pretix/pretix/src
configfile: setup.cfg
plugins: django-4.9.0, asyncio-0.24.0, rerunfailures-14.0, xdist-3.6.1, cov-5.0.0, mock-3.14.0
asyncio: mode=strict, default_loop_scope=None
created: 3/3 workers
3 workers [5397 items]

However, in reality, only one test should be failing, and this faulty test is not even among the failures listed in the output, probably because it was retried successfully and therefore not reported as a failure.

All of the listed failures are from the same pytest-xdist worker. It looks like the failing test is somehow leaving the database connection in a broken state, and all tests subsequently run on the same worker are failing.
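
The observation that every error comes from one worker can be checked by tallying the `[gwN]` markers that pytest-xdist prints under each error header. This is a rough sketch that assumes the log format shown in the traceback above; `errors_by_worker` is a hypothetical helper, and `test_other` in the sample is a made-up test name used only for illustration.

```python
import re
from collections import Counter

# Section header such as "____ ERROR at teardown of test_position_queries ____"
ERROR_HEADER = re.compile(r"_+ ERROR at (?:setup|teardown) of (\S+) _+")
# Worker marker on the following line, e.g. "[gw2] linux -- Python 3.9.20 ..."
WORKER_LINE = re.compile(r"\[(gw\d+)\]")


def errors_by_worker(log_text):
    """Count ERROR sections per xdist worker id (for example 'gw2')."""
    counts = Counter()
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        # Only look at the line directly below an ERROR header.
        if ERROR_HEADER.search(line) and i + 1 < len(lines):
            match = WORKER_LINE.search(lines[i + 1])
            if match:
                counts[match.group(1)] += 1
    return counts


sample = """\
__________________ ERROR at teardown of test_position_queries __________________
[gw2] linux -- Python 3.9.20 /opt/hostedtoolcache/Python/3.9.20/x64/bin/python
E           psycopg2.InterfaceError: connection already closed
__________________ ERROR at setup of test_other __________________
[gw2] linux -- Python 3.9.20 /opt/hostedtoolcache/Python/3.9.20/x64/bin/python
"""
print(errors_by_worker(sample))
```

If the result contains a single worker id, the errors are confined to one worker process, which is what we saw in our runs.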

Now, after days of searching, we have found the faulty test, and once we fixed it, all of the other failures vanished as well.

We have since investigated further and found that if we roll back our fix and then set --reruns 0 or uninstall pytest-rerunfailures, only the faulty test fails and no others, as expected.

In conclusion, I believe that when pytest-rerunfailures causes a test to be retried, not all necessary setup/teardown logic is called. (This could of course just as well be a bug in pytest-django; I could not find a way to determine that.)

Has anyone experienced something similar before?

@asottile-sentry
Contributor

dupe here: #267 (comment)

@icemac closed this as not planned on Oct 30, 2024
@icemac added the invalid label on Oct 30, 2024