Windows device path and long-path meta issue. #9770

Open
ehuss opened this issue Aug 6, 2021 · 17 comments
Labels
A-filesystem Area: issues with filesystems E-hard Experience: Hard O-windows OS: Windows S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@ehuss
Contributor

ehuss commented Aug 6, 2021

This is a meta issue to coordinate the different issues related to handling device paths and long paths on Windows (such as \\?\ or \\.\). There are several places where Cargo does not handle these well, but it is not clear exactly how they all should be approached. Changes for these require careful consideration, and it's not clear what a general good approach would look like. Some rough thoughts to consider:

  • Where exactly are the problems? (Making a very clear overview would be an extremely helpful way to help here!)
  • To what degree should we strive for long-path support? Having the target directory exceed MAX_PATH seems like it would be quite difficult due to issues like rust#86406 ("Windows: Allow running processes whose path exceeds the legacy MAX_PATH"). Manifests require a registry setting that is off by default.
    • If we don't or can't support paths longer than MAX_PATH, does it make sense to ever use device paths? Can they just be converted to normal paths so that Windows handles its regular normalization?
    • Is supporting long paths feasible without a manifest?
  • Should the fixes be primarily done to the standard library?
  • Should Cargo use an external library (like dunce), or should it all be internal? What should be done with normalize_path?
  • How to approach normalization/canonicalization? There are classic issues like whether to follow symlinks, but also the troubles of using Rust's canonicalize function on Windows.
  • Should Cargo try to avoid device (aka verbatim) paths as much as possible?
  • Are there other ways to lean on Win32 normalization (like \\.\ or GetFullPathNameW)?
  • Should it try to use \\?\GLOBALROOT\ style paths (see rust#86447, "Windows: Fix fs::canonicalize to work with legacy drivers")?
  • Is it feasible to just translate \\?\ paths to \\.\, and rely on the Win32 API to do normalization? This would not support long-paths, but there are many other problems with long paths. (Probably not, just tossing out the idea.)

Linking issues and PRs:

@ehuss ehuss added O-windows OS: Windows A-filesystem Area: issues with filesystems labels Aug 6, 2021
@ChrisDenton
Member

ChrisDenton commented Aug 16, 2021

Since my PR was linked here, I would add that I'd really love to fix these issues in the standard library so everyone can benefit by default. My thinking at the moment is that Rust should auto-convert to \\?\ style paths whenever a filesystem function is called. Also, Display should always use the more natural C:\ style paths (where possible) for showing paths to users.
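To illustrate the kind of conversion described here, below is a minimal string-level sketch in Rust. It is not what the standard library does. The function name `to_verbatim` is hypothetical, and the sketch assumes the input is already absolute and normalized (for example by something like GetFullPathNameW), since \\?\ paths are not normalized any further by Windows.

```rust
/// Hypothetical helper: turn an absolute, already-normalized Win32 path
/// into its `\\?\` form. Returns `None` for anything it cannot safely convert.
fn to_verbatim(path: &str) -> Option<String> {
    // Already in `\\?\` form: nothing to do.
    if path.starts_with(r"\\?\") {
        return Some(path.to_string());
    }
    // Device paths (`\\.\`) are left alone in this sketch.
    if path.starts_with(r"\\.\") {
        return None;
    }
    // `/`, `.` and `..` would be taken literally in a `\\?\` path,
    // so refuse input that still needs normalization.
    if path.contains('/') || path.split('\\').any(|c| c == "." || c == "..") {
        return None;
    }
    // UNC path: `\\server\share` becomes `\\?\UNC\server\share`.
    if let Some(unc) = path.strip_prefix(r"\\") {
        return Some(format!(r"\\?\UNC\{}", unc));
    }
    // Drive path: `C:\...` becomes `\\?\C:\...`.
    let b = path.as_bytes();
    if b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'\\' {
        return Some(format!(r"\\?\{}", path));
    }
    // Relative paths and anything else are not handled.
    None
}

fn main() {
    for p in [r"C:\path\to\file", r"\\server\share\dir", r"relative\path"] {
        println!("{:?} -> {:?}", p, to_verbatim(p));
    }
}
```

Working on strings keeps the sketch runnable on any platform; a real implementation would more likely operate on OsStr or UTF-16 directly.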

GetFullPathNameW will be very useful here but it'd also be good to be able to work directly with WTF-8 paths so there's not a need to convert to UTF-16 and back when that's not necessary.

Btw, in case it helps someone, I've started writing about Windows paths in a document that attempts to go into some detail. It's still a work in progress, so sorry if there are any mistakes or anything is unclear.

@ehuss
Contributor Author

ehuss commented Aug 16, 2021

Thanks for posting your writeup! I think it would be great to have a resource like that. Microsoft's own documentation is a little scattered and lacking, and having one place clearly describing things would be great. Let me know if you ever want feedback on it.

@ChrisDenton
Member

Sure, I'd very much welcome feedback! I admit I mostly wrote it for myself, which is why it's currently a "secret" gist, so I'd appreciate any help in making it more useful for others.

@dylni

dylni commented Aug 23, 2021

@ehuss Is this just waiting on a decision from the Cargo team? I originally created normpath to fix these issues, but the team appears to be less certain now about how these issues should be addressed.

@ghost

ghost commented Sep 3, 2021

@ChrisDenton Instead of fiddling around with NT paths and unnecessarily making life harder, why not use embedded manifests per executable? Embedded manifests were designed for a reason.

@ChrisDenton
Member

@mshaikhcool Enabling the manifest option for long paths would be great! However, it has several limitations which means it doesn't solve all problems. It only works in Windows 10 version 1607 and newer. It requires the user to have admin rights and to change a registry entry. It doesn't fix the issue with broken drivers if you actually do want to resolve symlinks or get an absolute path.

@ghost

ghost commented Sep 4, 2021

It requires the user to have admin rights and to change a registry entry.

The person installing Rust in the first place will most likely be a programmer, so this point is moot.
There's a good reason why longPathAware is not enabled by default and why explorer.exe doesn't embed longPathAware in its manifest. Don't expect it to be enabled by default in the near future.

It only works in Windows 10 version 1607 and newer.

Ah, classic. Symlinks were introduced in Vista. By that logic, we should also not use symlinks because XP didn't support them.

The point here is: why even bother supporting an OS that is out of extended support (like Windows 7) in the first place?
Add a check for the Windows version and for whether the registry key is enabled. If it is off, e.g. on Windows 7, long paths should simply not work.

It doesn't fix the issue with broken drivers if you actually do want to resolve symlinks or get an absolute path.

NT UNC(\\?\) is designed to be used by Subsystems themselves and Drivers, not user mode programs. longPathAware, in both embedded and side-by-side manifests, is designed to be used by the Win32 subsystem's user-mode programs, not drivers.

Both have different purposes; Rust should implement both as a systems programming language.
Rust should not call undocumented APIs and should adhere to strict, clean programming principles.

@ChrisDenton
Member

As I said, using the longPathAware manifest option is great. It helps with a lot of problems and should be done when possible.

However, it alone does not fix all the issues here nor in all circumstances. So other solutions need to be explored as well. Nobody is talking about using undocumented APIs. The use of \\?\ style paths is documented for every function that accepts them (e.g. CreateFileW).

@ghost

ghost commented Sep 4, 2021

I think there's a fundamental misunderstanding here. Use NT UNC (\\?\) for drivers, and longPathAware for the Win32 subsystem's user-mode programs.

Cargo and rustc are themselves user-mode programs. They should implement longPathAware for themselves in either an embedded or a side-by-side manifest, and should support NT UNC (\\?\) for building drivers.

I hope it's clear now.

@ChrisDenton
Member

I would suggest reading Win32 File Namespaces because it feels like we're talking about different things.

@ghost

ghost commented Sep 4, 2021

Can you point me to the exact Win32 APIs/scenarios where a manifest file doesn't work but other approaches do?

@ChrisDenton
Member

The manifest option is not sufficient to solve all issues listed here. For example:

  • It does not enable long paths if the user cannot or will not enable long path awareness in the registry (e.g. OS too old, IT policies, etc).
  • Even with the manifest option, fs::canonicalize returns \\?\ prefixed paths so Rust needs to be able to understand them and, ideally, have a way to convert them to more normal looking paths for display or other uses.
  • Ditto for user supplied paths which can be in any form (even explorer accepts \\?\ style paths).
  • fs::canonicalize can also completely fail with certain RAM drives. The manifest is no help here because that's a different problem.

@ghost

ghost commented Sep 4, 2021

It does not enable long paths if the user cannot or will not enable long path awareness in the registry (e.g. OS too old, IT policies, etc).

This point is already moot because of Rust's potential use cases; I'm not sure why it keeps being brought up. Note that Windows will always be a backward-compatible OS by default. Users must make changes themselves to make it forward-compatible.

To use long paths, users must use Windows 10 and must enable the group policy or registry setting. Note: this is the Microsoft-recommended way.
Going against MS's own recommendation does indeed sound like a desperate excuse to say "won't do".

Even with the manifest option, fs::canonicalize returns \\?\ prefixed paths

This is an absolutely horrible implementation. The amount of existing software that breaks because of UNC, including Microsoft's own, is a big enough reason to abandon UNC and "the excuse" presented above.

There are already proposals for abandoning that in favor of returning Win32 absolute paths.

It doesn't fix the issue with broken drivers if you actually do want to resolve symlinks or get an absolute path.

Ah, now I realize where the confusion is. The claim that this doesn't fix the issue is based on the horrible UNC-returning implementation above, which is itself wrong to begin with.

Ditto for user supplied paths which can be in any form

The manifest just removes the hard-coded static buffer size from the *W functions. The rest of the behavior is unchanged.

(even explorer accepts \\?\ style paths).

Explorer doesn't accept \\?\ paths; it simply ignores the supplied \\?\ prefix.
Explorer simply converts \\?\C:\VeryLong255CharPath\VeryLong255Foo\ to the 8.3 path format c:\VERYLO~1\VERYLO~2 in order to access it.

fs::canonicalize can also completely fail with certain RAM drives.

I guess you meant RAM disk by that. Creating a RAM disk requires a KMDF device driver. Device drivers use Win32 device paths (\\.\) and then make a symlink to a Win32 file path (\\?\) for user-mode applications to access.
Manifest files of both types, embedded/Fusion or side-by-side (Foo.exe.manifest file), should work as expected for RAM disks too.

"Certain RAM disks" sounds like a bug in the KMDF driver of the software in question. We shouldn't have to cripple Rust for that.

The manifest is no help here because that's a different problem.

The manifest files have to do with Win32 file paths (\\?\) and have nothing to do with Win32 device paths (\\.\).
Most Windows APIs don't take \\.\ device paths as parameters, as they can already access devices through symlinked \\?\ paths.

Now, with manifest support, one doesn't even need to attach the \\?\ prefix or deal with UNC path handling complexities anymore.

@jessesna

jessesna commented Dec 13, 2021

NT UNC(\\?\) is designed to be used by Subsystems themselves and Drivers, not user mode programs.

Hi. Can someone point me to the origin of this statement? Are there any MS docs which can be linked?

Explorer doesn't accept \\?\ paths; it simply ignores the supplied \\?\ prefix. Explorer simply converts \\?\C:\VeryLong255CharPath\VeryLong255Foo\ to the 8.3 path format c:\VERYLO~1\VERYLO~2 in order to access it.

Maybe I'm doing it wrong, or has this changed in newer Windows versions?

image

@ChrisDenton
Member

Here's a brief guide to Windows paths, some of the issues involved, and what the standard library has done and is doing to address them. None of this is Cargo-specific but I hope it helps nonetheless. I'll try to keep this short but I fear I might fail.

Terminology cheat sheet

Path                                    Term
C:\path\to\file                         Drive path
\\server\share                          UNC path
\\.\PIPE\name                           Device path (used for pipes, printers, etc)
\\?\C:\path\to\file                     Verbatim path
\\?\UNC\server\share                    Verbatim path
\\?\PIPE\name                           Verbatim path
\??\C:\path\to\file                     NT kernel path (not used in Win32 APIs)
\Device\HarddiskVolume2\path\to\file    NT kernel path (not used in Win32 APIs)

NT kernel paths are what both verbatim and the non-verbatim paths end up as but aren't otherwise usable in most user space APIs. So when I say "non-verbatim paths" I mean the first three paths in the table and not including kernel paths.
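As a rough illustration of the terminology, here is a small Rust sketch that classifies a path string purely by its prefix. The names `PathKind` and `classify` are made up for this example, and it only recognizes the backslash forms shown in the table, so treat it as a reading aid rather than a faithful parser.

```rust
/// Path categories from the terminology table (names are illustrative only).
#[derive(Debug, PartialEq, Eq)]
enum PathKind {
    Drive,
    Unc,
    Device,
    Verbatim,
    NtKernel,
    Other,
}

/// Classify a path by its prefix. Order matters: `\\?\` and `\\.\`
/// must be checked before the generic `\\` UNC prefix.
fn classify(path: &str) -> PathKind {
    let b = path.as_bytes();
    if path.starts_with(r"\\?\") {
        PathKind::Verbatim
    } else if path.starts_with(r"\\.\") {
        PathKind::Device
    } else if path.starts_with(r"\??\") || path.starts_with(r"\Device\") {
        PathKind::NtKernel
    } else if path.starts_with(r"\\") {
        PathKind::Unc
    } else if b.len() >= 3
        && b[0].is_ascii_alphabetic()
        && b[1] == b':'
        && (b[2] == b'\\' || b[2] == b'/')
    {
        PathKind::Drive
    } else {
        PathKind::Other
    }
}

fn main() {
    for p in [
        r"C:\path\to\file",
        r"\\server\share",
        r"\\.\PIPE\name",
        r"\\?\UNC\server\share",
        r"\??\C:\path\to\file",
    ] {
        println!("{:?} -> {:?}", p, classify(p));
    }
}
```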

Verbatim paths

Verbatim paths are passed almost directly to the kernel (except \\?\ is changed to \??\). These are always absolute and should not contain . or .. components, because those will simply be treated as normal components (e.g. . is a perfectly valid file or directory name according to the kernel, though most filesystem drivers will probably reject it). Also, / is not a path separator; in fact, nothing except \ is special in any way.

The term "verbatim" is not official terminology but it's the one used by the Rust standard library for lack of an official name.

Non-verbatim paths

Unlike a verbatim path, other paths are subject to limits (such as MAX_PATH, unless a manifest is used) and are parsed in more complex ways. Parsing of non-verbatim paths includes (but is not limited to):

  • If a drive path ends with a special DOS device name, it will be turned into a device path. E.g. C:\path\to\aux.txt will become the NT kernel path \??\aux.
  • If the path includes any . or .. components, these will be resolved lexically. Or to put it another way, resolving .. doesn't read a symlink. It simply does the equivalent of PathBuf::pop().
  • It will trim any trailing . and space characters from the path.
  • Any / will be converted to \ and consecutive \'s will be collapsed into one \.
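The lexical steps in this list (separator conversion, collapsing, resolving . and .., trimming trailing dots and spaces) can be sketched in Rust as below. This is a simplified illustration for drive paths only, with a hypothetical function name; it does not handle DOS device names and is not claimed to match the Win32 behavior in every edge case.

```rust
/// Simplified sketch of the lexical normalization applied to a drive path
/// such as `C:/a//b/../c`. No filesystem access, so symlinks are never read.
fn normalize_drive_path(path: &str) -> String {
    // Treat `/` as a separator, as described for non-verbatim paths.
    let unified = path.replace('/', "\\");
    let mut parts = unified.split('\\');
    // The first piece is the drive prefix (e.g. `C:`); keep it as is.
    let drive = parts.next().unwrap_or("");

    let mut stack: Vec<&str> = Vec::new();
    for part in parts {
        match part {
            // Empty pieces come from consecutive separators; `.` is a no-op.
            "" | "." => {}
            // `..` pops the previous component, like `PathBuf::pop()`.
            ".." => {
                stack.pop();
            }
            other => stack.push(other),
        }
    }

    let mut out = String::from(drive);
    if stack.is_empty() {
        out.push('\\');
    }
    for part in &stack {
        out.push('\\');
        out.push_str(part);
    }
    // Trim trailing dots and spaces from the end of the path.
    out.trim_end_matches(|c: char| c == '.' || c == ' ').to_string()
}

fn main() {
    println!("{}", normalize_drive_path("C:/a//b/./c/../d. "));
}
```

Because .. is resolved on the string alone, the result can differ from what following symlinks on disk would give, which is the point made in the second bullet.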

Path Issues

  • Non-verbatim paths are subject to the legacy maximum path limit (usually 260 UTF-16 code units but sometimes 248).
  • DOS device file names will be converted to a device path when used. So you may end up opening, say, a console handle instead of a file. This can be a particular problem when moving paths from a *nix system.
  • std::fs::canonicalize returns verbatim paths which path handling routines can sometimes struggle to deal with. As discussed above, . and .. components should not appear in verbatim paths and / won't be automatically converted to \.
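For the DOS device name issue, a check along the following lines can flag file names that Windows may reinterpret. It is a sketch under assumptions: the function name is hypothetical, the list covers only the commonly cited names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9), and the exact matching rules vary between Windows versions.

```rust
/// Returns true if `file_name` (a single path component, not a full path)
/// looks like a legacy DOS device name, ignoring case and any extension.
fn is_dos_device_name(file_name: &str) -> bool {
    // Only the part before the first `.` matters: `aux.txt` behaves like `aux`.
    // Trailing spaces are ignored as well.
    let stem = file_name
        .split('.')
        .next()
        .unwrap_or("")
        .trim_end_matches(' ');
    let upper = stem.to_ascii_uppercase();

    match upper.as_str() {
        "CON" | "PRN" | "AUX" | "NUL" => true,
        // COM1..COM9 and LPT1..LPT9: three letters plus one digit from 1 to 9.
        s if s.len() == 4 && (s.starts_with("COM") || s.starts_with("LPT")) => {
            matches!(s.as_bytes()[3], b'1'..=b'9')
        }
        _ => false,
    }
}

fn main() {
    for name in ["aux.txt", "NUL", "com1", "COM10", "auxiliary.txt"] {
        println!("{:?} -> {}", name, is_dos_device_name(name));
    }
}
```

A tool that receives paths created on a *nix system could run such a check on each component before touching the filesystem.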

Filesystem issue

std::fs::canonicalize can fail if the root drive's driver does not implement a necessary kernel interface. This is normally not an issue but there is at least one popular RAM drive software that uses such a broken driver and is a reliable source of bug reports (not just for Rust applications).

Rust standard library

Rust's standard library is addressing these issues in a number of ways:

Outstanding issues

The standard library does not provide a public API to convert between verbatim and non-verbatim paths. Currently the best option would be to use a third party crate for this.
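To give an idea of what such a conversion involves, here is a minimal string-level sketch of going from a verbatim path back to a non-verbatim one. The function name `strip_verbatim` is hypothetical and this is not the API of any particular crate. A careful implementation would also check path length, DOS device names, and trailing dots or spaces before deciding the conversion is safe.

```rust
/// Hypothetical helper: convert `\\?\C:\...` or `\\?\UNC\server\share\...`
/// to the equivalent non-verbatim path. Returns `None` when the input is not
/// verbatim or when the conversion could change the meaning of the path.
fn strip_verbatim(path: &str) -> Option<String> {
    let rest = path.strip_prefix(r"\\?\")?;

    // In a verbatim path `/`, `.` and `..` are ordinary names. A non-verbatim
    // path would reinterpret them, so refuse to convert in that case.
    if rest.contains('/') || rest.split('\\').any(|c| c == "." || c == "..") {
        return None;
    }

    // `\\?\UNC\server\share` maps to `\\server\share`.
    if let Some(unc) = rest.strip_prefix(r"UNC\") {
        return Some(format!(r"\\{}", unc));
    }

    // `\\?\C:\...` maps to `C:\...`.
    let b = rest.as_bytes();
    if b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'\\' {
        return Some(rest.to_string());
    }

    // Other verbatim forms (e.g. `\\?\PIPE\name`) are left alone here.
    None
}

fn main() {
    for p in [r"\\?\C:\path\to\file", r"\\?\UNC\server\share", r"\\?\PIPE\name"] {
        println!("{:?} -> {:?}", p, strip_verbatim(p));
    }
}
```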

The current directory is always limited by MAX_PATH unless a manifest file is used and the user opts in to enabling long path support. This cannot be fixed by the standard library itself because verbatim paths do not work for the get/set current directory APIs (or rather, they technically work but other Windows APIs will get very confused by it).
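Since the MAX_PATH limit is counted in UTF-16 code units rather than bytes or characters, a quick length check could look like the sketch below. The helper names are hypothetical, and the check assumes the common 260 limit including the terminating NUL; as noted earlier, some APIs use a lower limit.

```rust
/// Legacy Win32 path limit in UTF-16 code units, including the trailing NUL.
const MAX_PATH: usize = 260;

/// Length of `path` in UTF-16 code units, which is how Windows counts it.
fn utf16_len(path: &str) -> usize {
    path.encode_utf16().count()
}

/// True if `path` plus its terminating NUL would not fit in MAX_PATH units.
fn exceeds_legacy_limit(path: &str) -> bool {
    utf16_len(path) + 1 > MAX_PATH
}

fn main() {
    let short = r"C:\project\target";
    let long = format!(r"C:\{}", "a".repeat(300));
    println!("{} units, too long: {}", utf16_len(short), exceeds_legacy_limit(short));
    println!("{} units, too long: {}", utf16_len(&long), exceeds_legacy_limit(&long));
}
```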

@bjorn3
Member

bjorn3 commented Jan 10, 2022

It does not enable long paths if the user cannot or will not enable long path awareness in the registry (e.g. OS too old, IT policies, etc).

This point is already moot because of Rust's potential use cases; I'm not sure why it keeps being brought up. Note that Windows will always be a backward-compatible OS by default. Users must make changes themselves to make it forward-compatible.

How is enabling long path awareness when the application manifest enables it but the registry doesn't not backwards-compatible? If the application itself opts in, why is there an additional system-wide opt-in necessary for backwards compatibility?

@ehuss
Contributor Author

ehuss commented Jan 10, 2022

If the application itself opts in, why is there an additional system-wide opt-in necessary for backwards compatibility?

I think we can only guess, I haven't seen any explanation from Microsoft. I suspect it is because other programs may fail to access those paths. For example, I believe when it was first added, Explorer couldn't handle those long paths. It introduces an environment where various programs would suddenly start breaking in unpleasant ways when interacting with programs that are long-path aware.

It could also be a security issue similar to how symbolic links are restricted.

ricochet added a commit to wasmCloud/wasmCloud that referenced this issue Feb 9, 2024
Verbatim paths on Windows are not well supported,
e.g. "\\\\?\\C:\\Users..." while technically valid, causes some fs api's like `exists` to fail errantly.

The fix is to use a third party lib normpath to normalize the path to the wasm
binary.

Related issue: rust-lang/cargo#9770

Signed-off-by: Bailey Hayes <[email protected]>