Improve reliability of certain managed python tests on Windows CI by EliteTK · Pull Request #17177 · astral-sh/uv

EliteTK · 2025-12-18T15:44:34Z

Summary

Results:

Test	Time before	Time after	Change
install_lower_patch_automatically	79.317s	20.093s	-59.200s
install_multiple_patches	70.086s	20.126s	-50.000s
install_no_transparent_upgrade_with_venv_patch_specification	47.971s	9.607s	-38.400s
install_transparent_patch_upgrade_uv_venv	74.886s	12.600s	-62.300s
install_transparent_patch_upgrade_venv_module	57.888s	10.854s	-47.000s
python_find_prerelease	78.424s	18.302s	-60.100s
python_install	47.827s	4.229s	-43.600s
python_install_automatic	71.138s	11.836s	-59.300s
python_install_build_version	35.983s	3.214s	-32.800s
python_install_build_version_pypy	80.750s	17.774s	-63.000s
python_install_cached	6.752s	2.247s	-04.500s
python_install_default	3.748s	4.202s	+00.454s
python_install_default_from_env	2.829s	2.190s	-00.639s
python_install_default_prerelease	1.136s	0.911s	-00.225s
python_install_default_preview	5.709s	3.601s	-02.110s
python_install_emulated_windows_x86_on_x64	90.025s	35.652s	-54.400s
python_install_force	1.523s	1.365s	-00.158s
python_install_freethreaded	36.136s	4.793s	-31.300s
python_install_invalid_request	0.299s	0.269s	-00.030s
python_install_minor	1.496s	1.309s	-00.187s
python_install_multiple_patch	3.033s	1.415s	-01.620s
python_install_no_cache	2.853s	2.311s	-00.542s
python_install_prerelease	2.018s	2.006s	-00.012s
python_install_preview	40.594s	6.870s	-33.700s
python_install_preview_no_bin	1.080s	1.177s	+00.097s
python_install_preview_upgrade	6.609s	6.295s	-00.314s
python_install_unknown	0.211s	0.206s	-00.005s
python_install_upgrade	9.153s	10.648s	+01.490s
python_install_upgrade_version_file	55.920s	12.943s	-43.000s
python_reinstall	4.939s	6.225s	+01.290s
python_reinstall_patch	2.716s	2.899s	+00.183s
python_upgrade_not_allowed	0.215s	0.215s	+00.000s
regression_cpython	30.779s	4.403s	-26.400s
uninstall_highest_patch	40.716s	3.790s	-36.900s
uninstall_last_patch	19.093s	2.965s	-16.100s

Comparing https://github.com/astral-sh/uv/actions/runs/20342580132/job/58446553156?pr=17177 against https://github.com/astral-sh/uv/actions/runs/20338690535/job/58431797575.

Overall test time went down from 279.163s to 258.427s, though that is presumably unrelated, since this change should make the run marginally slower. Here are some local runs (on Linux):

just python_install:
- current config:
  Summary [ 34.452s] 45 tests run: 44 passed (12 slow), 1 failed, 3314 skipped
- threads-required = "num-test-threads":
  Summary [ 117.991s] 45 tests run: 44 passed (1 slow), 1 failed, 3314 skipped
- threads-required = 4:
  Summary [ 48.793s] 45 tests run: 44 passed (4 slow), 1 failed, 3314 skipped
all tests:
- current config:
  Summary [ 371.911s] 3357 tests run: 3356 passed (49 slow), 1 failed, 2 skipped
- threads-required = "num-test-threads":
  Summary [ 451.323s] 3357 tests run: 3356 passed (26 slow), 1 failed, 2 skipped
- threads-required = 4:
  Summary [ 386.213s] 3357 tests run: 3356 passed (31 slow), 1 failed, 2 skipped

(Failed test is not related)

Conclusion

I think this is a good way to gain more reliability for these tests on all platforms, but I am not certain that we should be setting this override in the config. I think realistically you want something more like num-test-threads / 3 rather than just 4.
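To make the trade-off between a fixed `threads-required = 4` and something like `num-test-threads / 3` concrete, here is a minimal sketch in Python. It is a toy slot-counting model, not nextest's scheduler: the function names are hypothetical, and clamping an oversized requirement to the pool size is an assumption of the model.

```python
def concurrent_heavy_tests(num_test_threads: int, threads_required: int) -> int:
    """Count how many heavy tests fit at once if each reserves `threads_required` slots.

    Assumption of this toy model: a requirement larger than the pool
    is clamped to the pool size, so at least one test can always run.
    """
    required = min(threads_required, num_test_threads)
    return num_test_threads // required


def proportional_requirement(num_test_threads: int, divisor: int = 3) -> int:
    """Hypothetical helper: derive a requirement of num-test-threads / divisor (at least 1)."""
    return max(1, num_test_threads // divisor)


if __name__ == "__main__":
    for threads in (4, 12, 24):
        fixed = concurrent_heavy_tests(threads, 4)
        scaled = concurrent_heavy_tests(threads, proportional_requirement(threads))
        print(f"{threads} threads: fixed=4 allows {fixed}, proportional allows {scaled}")
```

In this model a fixed value of 4 serialises the heavy tests on a 4-thread runner but allows six at once on a 24-thread machine, whereas the proportional value keeps the concurrency at three for thread counts divisible by 3.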

.config/nextest.toml

EliteTK · 2025-12-18T17:36:08Z

Looks like it's only half working now...?

(after the change to using max-threads)

EliteTK · 2025-12-18T18:17:53Z

Okay, it looks like max-threads = N limits the number of concurrent runs of tests in that group, but the limit applies only to that group. So while N tests from the group are running, nextest fills the remaining num-test-threads - N slots with other work, which isn't exactly what we want here...

Whereas, if you have a bunch of tests with threads-required = num-test-threads / N, they will prevent anything else from running until nextest reaches almost the end of that block of tests. (which is actually not exactly what we want either, but it's closer)

It would be good if nextest had a way of running these completely separately... I think for now maybe we can just use threads-required = 4...
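The difference between the two settings can be sketched with a small slot-counting model in Python. This only mirrors the behaviour described in the comments above, under the assumptions that the heavy tests are scheduled first, that plenty of other single-slot tests are queued, and that the thread count divides evenly. The function names are hypothetical and this is not how nextest is implemented.

```python
def mix_with_max_threads(num_test_threads: int, max_threads: int) -> tuple[int, int]:
    """Return (heavy_running, others_running) when the heavy group has max-threads set.

    Each heavy test occupies one slot; the cap only limits the group,
    so the leftover slots are filled with other tests.
    """
    heavy = min(max_threads, num_test_threads)
    others = num_test_threads - heavy
    return heavy, others


def mix_with_threads_required(num_test_threads: int, threads_required: int) -> tuple[int, int]:
    """Return (heavy_running, others_running) when each heavy test reserves several slots.

    Heavy tests are packed first; other tests only get the slots left over.
    """
    required = min(threads_required, num_test_threads)
    heavy = num_test_threads // required
    others = num_test_threads - heavy * required
    return heavy, others


if __name__ == "__main__":
    # 12 test threads, aiming for 3 heavy tests at a time.
    print(mix_with_max_threads(12, 3))        # 3 heavy tests plus 9 other tests
    print(mix_with_threads_required(12, 4))   # 3 heavy tests and nothing else
```

With 12 test threads, both settings allow three heavy tests at once, but only the `threads-required` variant keeps the other nine slots idle, which is the "closer to what we want" behaviour described above.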

EliteTK · 2025-12-18T19:05:05Z

Original results with extra columns for max-threads and for serialising all the python_install tests:

Test	Current	Heavy (s)	Change vs. current	max-threads (s)	Change vs. current	Serial (s)	Change vs. current
`install_lower_patch_automatically`	79.317s	20.093	-74.7%	37.393	-52.9%	40.937	-48.4%
`install_multiple_patches`	70.086s	20.126	-71.3%	61.599	-12.1%	12.926	-81.6%
`install_no_transparent_...`	47.971s	9.607	-80.0%	53.257	11.0%	5.193	-89.2%
`install_transparent_..._uv_venv`	74.886s	12.600	-83.2%	20.155	-73.1%	8.228	-89.0%
`install_transparent_..._venv_module`	57.888s	10.854	-81.2%	16.659	-71.2%	18.331	-68.3%
`python_find_prerelease`	78.424s	18.302	-76.7%	21.267	-72.9%	14.981	-80.9%
`python_install`	47.827s	4.229	-91.2%	6.091	-87.3%	3.238	-93.2%
`python_install_automatic`	71.138s	11.836	-83.4%	19.308	-72.9%	4.717	-93.4%
`python_install_build_version`	35.983s	3.214	-91.1%	6.757	-81.2%	1.910	-94.7%
`python_install_build_version_pypy`	80.750s	17.774	-78.0%	22.177	-72.5%	1.736	-97.9%
`python_install_cached`	6.752s	2.247	-66.7%	3.543	-47.5%	1.960	-71.0%
`python_install_default`	3.748s	4.202	12.1%	3.846	2.6%	2.599	-30.7%
`python_install_default_from_env`	2.829s	2.190	-22.6%	3.402	20.3%	1.860	-34.3%
`python_install_default_prerelease`	1.136s	0.911	-19.8%	1.799	58.4%	0.840	-26.1%
`python_install_default_preview`	5.709s	3.601	-36.9%	5.788	1.4%	3.421	-40.1%
`python_install_emulated_windows_x86_on_x64`	90.025s	35.652	-60.4%	48.517	-46.1%	3.824	-95.8%
`python_install_force`	1.523s	1.365	-10.4%	2.037	33.7%	1.041	-31.6%
`python_install_freethreaded`	36.136s	4.793	-86.7%	10.325	-71.4%	2.656	-92.6%
`python_install_invalid_request`	0.299s	0.269	-10.0%	0.802	168.2%	0.253	-15.4%
`python_install_minor`	1.496s	1.309	-12.5%	2.391	59.8%	1.255	-16.1%
`python_install_multiple_patch`	3.033s	1.415	-53.3%	3.268	7.7%	1.421	-53.1%
`python_install_no_cache`	2.853s	2.311	-19.0%	3.893	36.5%	1.865	-34.6%
`python_install_prerelease`	2.018s	2.006	-0.6%	3.546	75.7%	1.657	-17.9%
`python_install_preview`	40.594s	6.870	-83.1%	32.439	-20.1%	4.958	-87.8%
`python_install_preview_no_bin`	1.080s	1.177	9.0%	1.479	36.9%	0.881	-18.4%
`python_install_preview_upgrade`	6.609s	6.295	-4.8%	7.962	20.5%	3.882	-41.3%
`python_install_unknown`	0.211s	0.206	-2.4%	0.285	35.1%	0.197	-6.6%
`python_install_upgrade`	9.153s	10.648	16.3%	13.040	42.5%	7.188	-21.5%
`python_install_upgrade_version_file`	55.920s	12.943	-76.9%	33.699	-39.7%	2.494	-95.5%
`python_reinstall`	4.939s	6.225	26.0%	7.859	59.1%	3.988	-19.3%
`python_reinstall_patch`	2.716s	2.899	6.7%	4.737	74.4%	2.568	-5.4%
`python_upgrade_not_allowed`	0.215s	0.215	0.0%	0.281	30.7%	0.198	-7.9%
`regression_cpython`	30.779s	4.403	-85.7%	14.943	-51.5%	1.106	-96.4%
`uninstall_highest_patch`	40.716s	3.790	-90.7%	17.303	-57.5%	2.263	-94.4%
`uninstall_last_patch`	19.093s	2.965	-84.5%	4.293	-77.5%	1.674	-91.2%

Some of this could be noise, but some tests seem to be slow when run alongside a bunch of other tests, and some seem to be very slow only when run alongside other python_install tests. Most seem to fall into the latter group.

As a bonus, however, in a full run with no filters, there seems to be almost no overhead to running these all as serialised.

(Updated with change %)
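For reference, the change percentages in the table appear to be the relative difference against the "Current" column, rounded to one decimal place. A minimal sketch of that arithmetic in Python (the function name is hypothetical, and the formula is inferred from the table values rather than stated in the PR):

```python
def percent_change(before_s: float, after_s: float) -> float:
    """Relative change from `before_s` to `after_s`, in percent, rounded to 0.1."""
    return round((after_s - before_s) / before_s * 100, 1)


if __name__ == "__main__":
    # Values taken from the table rows above.
    print(percent_change(79.317, 20.093))  # install_lower_patch_automatically, Heavy
    print(percent_change(3.748, 4.202))    # python_install_default, Heavy
    print(percent_change(0.215, 0.215))    # python_upgrade_not_allowed, Heavy
```

A negative value means the test got faster than in the current configuration, and a positive value means it got slower.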

zanieb · 2025-12-22T21:41:34Z

Unfortunately this will miss, e.g., your new test case in #17218

We should think about how we can do better in the future. Maybe we should even just write a tool that generates the nextest config from feature flagged tests, if we can?

EliteTK added the no-build Disable building binaries in CI label Dec 18, 2025

EliteTK force-pushed the tk/remedy_windows_ci_timeouts branch from 155f3fd to efcbc01 Compare December 18, 2025 17:07

EliteTK changed the title ~~Attempt to improve reliability of python_install tests~~ Improve python_install test reliability on Windows Dec 18, 2025

EliteTK added internal A refactor or improvement that is not user-facing and removed no-build Disable building binaries in CI labels Dec 18, 2025

EliteTK temporarily deployed to uv-test-registries December 18, 2025 17:10 — with GitHub Actions Inactive

EliteTK marked this pull request as ready for review December 18, 2025 17:24

EliteTK requested a review from zanieb December 18, 2025 17:24

zanieb reviewed Dec 18, 2025

EliteTK marked this pull request as draft December 18, 2025 18:31

EliteTK temporarily deployed to uv-test-registries December 18, 2025 18:34 — with GitHub Actions Inactive

EliteTK added the windows Specific to the Windows platform label Dec 18, 2025

EliteTK force-pushed the tk/remedy_windows_ci_timeouts branch from b2db2ad to e724c2c Compare December 22, 2025 20:50

EliteTK changed the base branch from main to tk/ci-profiles December 22, 2025 20:51

EliteTK temporarily deployed to uv-test-registries December 22, 2025 20:52 — with GitHub Actions Inactive

EliteTK temporarily deployed to uv-test-publish December 22, 2025 20:53 — with GitHub Actions Inactive

EliteTK force-pushed the tk/remedy_windows_ci_timeouts branch from e724c2c to 405b315 Compare December 22, 2025 20:58

EliteTK had a problem deploying to uv-test-registries December 22, 2025 21:01 — with GitHub Actions Error

EliteTK mentioned this pull request Dec 22, 2025

Use nextest profiles to configure CI #17220

Merged

EliteTK force-pushed the tk/remedy_windows_ci_timeouts branch from 405b315 to 965d844 Compare December 22, 2025 21:02

EliteTK changed the title ~~Improve python_install test reliability on Windows~~ Improve reliability of certain managed python tests on Windows CI Dec 22, 2025

EliteTK temporarily deployed to uv-test-registries December 22, 2025 21:04 — with GitHub Actions Inactive

EliteTK force-pushed the tk/remedy_windows_ci_timeouts branch from 965d844 to a8d2eb2 Compare December 22, 2025 21:20

EliteTK temporarily deployed to uv-test-registries December 22, 2025 21:23 — with GitHub Actions Inactive

EliteTK marked this pull request as ready for review December 22, 2025 21:39

EliteTK requested a review from zanieb December 22, 2025 21:39

zanieb approved these changes Dec 22, 2025

EliteTK mentioned this pull request Dec 22, 2025

Find some way to give nextest a list of tests which are io bound from more sophisticated heuristics than the name #17223

Open

Base automatically changed from tk/ci-profiles to main December 29, 2025 13:13

Make disk intensive managed python tests run serially

4b9c43c

EliteTK force-pushed the tk/remedy_windows_ci_timeouts branch from a8d2eb2 to 4b9c43c Compare December 29, 2025 13:15

EliteTK temporarily deployed to uv-test-registries December 29, 2025 13:17 — with GitHub Actions Inactive

EliteTK enabled auto-merge (squash) December 29, 2025 13:19

EliteTK merged commit 936da00 into main Dec 29, 2025
101 checks passed

EliteTK deleted the tk/remedy_windows_ci_timeouts branch December 29, 2025 13:24
