fix `OsStr` conversion for non-utf8 strings on windows #5444

bschoenmaeckers · 2025-09-17T10:39:30Z

While testing OsStr conversions with ill-formed UTF-16 strings on windows, I found out that the current implementation is wrong.

When wstr is NULL, instead return the size that would be required to store all of unicode including a terminating null.

Note that the resulting wchar_t* string may or may not be null-terminated. It is the responsibility of the caller to make sure that the wchar_t* string is null-terminated in case this is required by the application. Also, note that the wchar_t* string might contain null characters, which would cause the string to be truncated when used with most C functions.

PyUnicode_AsWideChar returns the size including the null byte, but does not copy the null byte. This fails the assert in this function. Furthermore we want the byte slice without the null byte.

ref #5368

davidhewitt · 2025-09-18T11:04:54Z

newsfragments/5444.fixed.md

I'm trying to assess if this is better to class as a "fix" or a breaking change. At the moment I guess the result is that we always include a trailing NUL on the extracted string, only on Windows?

So this is a good consistency fix, to match the other platforms, but it also might cause issues in the worst case? 🤔

The current case is that it can only succeed when the string is valid utf8. Because it will take the shortcut using utf8 conversion. When the string is WTF16 it will continue and use PyUnicode_AsWideChar. But this currently never succeeds because the assertion assert_eq!(bytes_read, size); always fails (bytes_read does not contain the null byte but size does). So this is not a breaking change.

I guess no one is hitting this assert because valid utf8 is the common case.

davidhewitt

Makes sense, thanks!

bschoenmaeckers force-pushed the os-str-conversions branch from 05cf958 to 272e08b Compare September 17, 2025 10:41

fix OsStr conversion for non-utf8 strings on windows

04a8dc9

bschoenmaeckers force-pushed the os-str-conversions branch from 272e08b to 04a8dc9 Compare September 17, 2025 10:50

davidhewitt reviewed Sep 18, 2025

View reviewed changes

davidhewitt approved these changes Sep 18, 2025

View reviewed changes

davidhewitt added this pull request to the merge queue Sep 18, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 18, 2025

bschoenmaeckers added this pull request to the merge queue Sep 18, 2025

bschoenmaeckers removed this pull request from the merge queue due to a manual request Sep 18, 2025

davidhewitt added this pull request to the merge queue Sep 19, 2025

Merged via the queue into PyO3:main with commit 5e10ea1 Sep 19, 2025
45 checks passed

bschoenmaeckers deleted the os-str-conversions branch September 20, 2025 16:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix `OsStr` conversion for non-utf8 strings on windows #5444

fix `OsStr` conversion for non-utf8 strings on windows #5444

Uh oh!

bschoenmaeckers commented Sep 17, 2025 •

edited

Loading

Uh oh!

davidhewitt Sep 18, 2025

Uh oh!

bschoenmaeckers Sep 18, 2025

Uh oh!

bschoenmaeckers Sep 18, 2025

Uh oh!

davidhewitt left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix OsStr conversion for non-utf8 strings on windows #5444

fix OsStr conversion for non-utf8 strings on windows #5444

Uh oh!

Conversation

bschoenmaeckers commented Sep 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davidhewitt Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

bschoenmaeckers Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

bschoenmaeckers Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

davidhewitt left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix `OsStr` conversion for non-utf8 strings on windows #5444

fix `OsStr` conversion for non-utf8 strings on windows #5444

bschoenmaeckers commented Sep 17, 2025 •

edited

Loading