Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix collection of short paths on Windows #11936

Merged
merged 5 commits into from
Feb 23, 2024

Conversation

nicoddemus
Copy link
Member

@nicoddemus nicoddemus commented Feb 5, 2024

Passing a short path in the command line was causing the matchparts check to fail, because Path(short_path) != Path(long_path).

Using os.path.samefile as fallback ensures the comparison works on Windows when comparing short/long paths.

Fix #11895

@nicoddemus nicoddemus force-pushed the 11895-short-path-win branch 3 times, most recently from 1024598 to f52d637 Compare February 5, 2024 21:47
@nicoddemus nicoddemus changed the title Add test for #11895 Fix collection of short paths on Windows Feb 5, 2024
@nicoddemus nicoddemus marked this pull request as ready for review February 5, 2024 22:18
@nicoddemus
Copy link
Member Author

The initial commit was used just to demonstrate the failing test on CI, however we should squash/merge this in the end.

@bluetech
Copy link
Member

bluetech commented Feb 6, 2024

Each call to samefile adds 2 stats, and stats are generally reputed to cause slowness. I wonder then, is it possible in Python to "expand" a short path to a long path, then we can run it on the inputs and proceed as usual, instead of using samefile?

@nicoddemus
Copy link
Member Author

I wonder then, is it possible in Python to "expand" a short path to a long path, then we can run it on the inputs and proceed as usual, instead of using samefile?

Should be possible, I will look into it, however I think it will change the resulting node ids.

@nicoddemus nicoddemus force-pushed the 11895-short-path-win branch from 1186137 to 7ed44fb Compare February 9, 2024 21:08
@nicoddemus
Copy link
Member Author

Done, it was a bit more code but definitely won't impact performance.

@nicoddemus
Copy link
Member Author

@bluetech just stumbled into this:

# Implement a special _is_same function on Windows which returns True if the two filenames
# compare equal, to circumvent os.path.samefile returning False for mounts in UNC (#7678).
if sys.platform.startswith("win"):
def _is_same(f1: str, f2: str) -> bool:
return Path(f1) == Path(f2) or os.path.samefile(f1, f2)
else:
def _is_same(f1: str, f2: str) -> bool:
return os.path.samefile(f1, f2)

This suggests we might be not considering other cases if we try to just expand the short names during collection. Perhaps we should bite the bullet and just go with the stat solution? Less code and will be the safest one, while I understand there's the performance concern.

@nicoddemus
Copy link
Member Author

@bluetech what do you think regarding my last point?

@bluetech
Copy link
Member

@nicoddemus I'll try to make an informed review, but if I take too long (like before 8.0.2) then please don't wait for me, I don't mean to block this.

@nicoddemus
Copy link
Member Author

nicoddemus commented Feb 17, 2024

I used https://github.com/nicoddemus/collection-test-bed to measure the performance so we can have some numbers to decide.

Using that configuration we have 10.5k tests 4 levels deep, with test files containing simple parametrized tests, here are the numbers in my Windows using pytest --co:

branch timing
main 8.71s
stat solution 9.23s
convert to long solution 8.70s

So indeed the stat call has some performance penalty, as expected. Keep in mind however that this is for a test suite which does not import anything else, it only imports pytest itself.

Testing in a more realist test suite from a project I have around with 1.3k tests, I couldn't measure any difference between the stat aproach and main, both clocking around the same 6.6s for the full collection.

So while we can see some difference in a synthetic test suite, with 10k tests but without any other imports, using a realistic test suite, with 1.3k tests and which imports application code, the difference is not perceptible at all.

I'm leaning towards the stat solution as it is simpler and also more correct in my opinion.

Passing a short path in the command line was causing the matchparts check to fail, because ``Path(short_path) != Path(long_path)``.

Using ``os.path.samefile`` as fallback ensures the comparsion works on Windows when comparing short/long paths.

Fix pytest-dev#11895
@nicoddemus nicoddemus force-pushed the 11895-short-path-win branch 4 times, most recently from 7b04075 to abb43a6 Compare February 17, 2024 21:17
Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and I also favor the stat solution - correctness is definitely worth it IMO.

@nicoddemus
Copy link
Member Author

Going ahead and merge this then, we can revisit later if needed. 👍

@nicoddemus nicoddemus merged commit 8d9b95d into pytest-dev:main Feb 23, 2024
24 checks passed
@nicoddemus nicoddemus deleted the 11895-short-path-win branch February 23, 2024 10:51
mrbean-bremen added a commit to mrbean-bremen/pytest that referenced this pull request Mar 3, 2024
The check for short paths under Windows via os.path.samefile, introduced in pytest-dev#11936, also found similar tests in symlinked tests in the GH Actions CI.
This checks additionally that one of the files is not a symlink.

Fixes pytest-dev#12039.
mrbean-bremen added a commit to mrbean-bremen/pytest that referenced this pull request Mar 3, 2024
The check for short paths under Windows via os.path.samefile, introduced in pytest-dev#11936, also found similar tests in symlinked tests in the GH Actions CI.
This checks additionally that one of the files is not a symlink.

Fixes pytest-dev#12039.
mrbean-bremen added a commit to mrbean-bremen/pytest that referenced this pull request Mar 3, 2024
The check for short paths under Windows via os.path.samefile, introduced in pytest-dev#11936, also found similar tests in symlinked tests in the GH Actions CI.
This checks additionally that one of the files is not a symlink.

Fixes pytest-dev#12039.
mrbean-bremen added a commit to mrbean-bremen/pytest that referenced this pull request Mar 3, 2024
The check for short paths under Windows via os.path.samefile, introduced in pytest-dev#11936, also found similar tests in symlinked tests in the GH Actions CI.
This checks additionally that one of the files is not a symlink.

Fixes pytest-dev#12039.
nicoddemus added a commit that referenced this pull request Mar 3, 2024
The check for short paths under Windows via os.path.samefile, introduced in #11936, also found similar tests in symlinked tests in the GH Actions CI.

Fixes #12039.

Co-authored-by: Bruno Oliveira <[email protected]>
flying-sheep pushed a commit to flying-sheep/pytest that referenced this pull request Apr 9, 2024
Passing a short path in the command line was causing the matchparts check to fail, because ``Path(short_path) != Path(long_path)``.

Using ``os.path.samefile`` as fallback ensures the comparsion works on Windows when comparing short/long paths.

Fix pytest-dev#11895
flying-sheep pushed a commit to flying-sheep/pytest that referenced this pull request Apr 9, 2024
The check for short paths under Windows via os.path.samefile, introduced in pytest-dev#11936, also found similar tests in symlinked tests in the GH Actions CI.

Fixes pytest-dev#12039.

Co-authored-by: Bruno Oliveira <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Short paths in Windows seem to fail to be collected
3 participants