A '@' character in the host part of file URLs #805
Comments
It seems reasonable to allow, but I wonder if it would be possible for Chromium to determine the complete set of changes needed for it to not have platform-divergent behavior. At least I suspect that making them all at once would allow for an easier rollout. |
A single Let's take any other URL scheme, e.g. HTTP:
Which is clearly not what the reporter wants to happen. What's more, this has been the accepted interpretation for at least the last 30 years (going back to RFC-1738). I doubt many URL parsers are going to interpret I think the actual problem is that hostnames in file URLs are not able to contain percent-encoding. I looked in to this in depth a while back, and found that:
See #599 |
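To make the point about other schemes concrete, here is a minimal sketch using the WHATWG `URL` API as exposed in browsers and Node.js. It reuses the host from the original report with an `http:` scheme; the variable name `httpUrl` is only illustrative.

```javascript
// In a special scheme such as http:, "@" separates userinfo from the host,
// so everything before it is treated as credentials, not as part of the host.
const httpUrl = new URL("http://webdavserver.net@ssl/a.pdf");

console.log(httpUrl.username); // "webdavserver.net"
console.log(httpUrl.hostname); // "ssl"
console.log(httpUrl.pathname); // "/a.pdf"
```

This is the userinfo interpretation that RFC 1738 describes; reading the same text as a single hostname would conflict with it.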
Please remove @ from the forbidden host code point list! Given rfc1738 it was probably a mistake that it was there in the first place. Just on this:
Firefox, Safari, Windows Explorer, and Linux terminals all handle the URL fine. In fact, it's only Chromium-based browsers that have the issue, because they want to use the URL standard as their only authority rather than use multiple path standards... |
Safari also rejects this URL, and the only reason it works in Firefox is that we currently ignore everything in the hostname part of a file URL (tracked in 1507354 - URL parser discards host for file URLs Allowing |
@valenting yes, I think the problem is calling them file URLs. They are URL-like, but ultimately the OP (file://webdavserver.net@ssl/a.pdf) is a UNC file path, and currently it's just a coincidence that Chromium works for most of them... |
@hayatoito any thoughts on @karwa's comment? It seems very plausible that the solution here is solving #599. |
Ah, my previous comment was premature. I'll retract it. Do you mean the solution provided at #805 (comment) (allowing percent encoding in file: URL host) is sufficient, and we don't need an opaque host in file: URLs? (#599) I don't have a strong opinion, but to my knowledge, unescaped spaces are still used in file: URLs on Windows. So I'm wondering how we should handle unescaped spaces. |
Oh I see, that is indeed confusing. I was thinking that we probably want to switch Now for non-percent-encoded spaces do they need to remain non-percent-encoded or would it be okay when we create the buffer which we pass to the host parser to percent-encode at that point? That way opaque hosts can still ban spaces, but they end up working for |
Thanks for the explanation. Let me confirm the intended visible behaviors of a proposal here. The question is A and (B or C), right?
A. Don't percent-decode any characters in an opaque host.
const url = new URL("file://opaque%2ahost/");
console.log(url.hostname); // "opaque%2ahost"
console.log(url.href); // "file://opaque%2ahost/"
B. Don't percent-encode any characters in an opaque host.
const url = new URL("file://opaque host/");
console.log(url.hostname); // "opaque host"
console.log(url.href); // "file://opaque host/"
C. Percent-encode spaces (or other chars?) in an opaque host.
const url = new URL("file://opaque host/");
console.log(url.hostname); // "opaque%20host"
console.log(url.href); // "file://opaque%20host/"
I'm afraid I might misunderstand what |
B is the only option that will actually end up solving the original WebDAV issue |
Right, opaque hosts don't do percent-decoding and don't have IDNA either. So A is what I would expect if we make that change. And if we don't want them to contain literal spaces we'd have to percent-encode those as in C. (B doesn't seem great, but in theory we could have a file opaque host or expand the value space of opaque hosts I suppose.) They could still be percent-decoded down the line of course depending on the protocol in use. @catmanjan Why can WebDAV deal with %40 but not %20? |
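As a rough illustration of the difference described above, the sketch below compares how hosts are handled today for a non-special scheme (which gets an opaque host) and a special one. The scheme `foo:` and the variable names are made up for the example, and it assumes a parser that follows the URL Standard; it does not show what `file:` would do under any of the proposals.

```javascript
// Hypothetical non-special scheme "foo:": the host is opaque, so the
// percent-encoded sequence is kept as-is and no IDNA processing is applied.
const opaqueHostUrl = new URL("foo://opaque%2ahost/");
console.log(opaqueHostUrl.hostname); // "opaque%2ahost"

// Special scheme "http:": the host is percent-decoded before domain processing.
const specialHostUrl = new URL("http://ex%61mple.com/");
console.log(specialHostUrl.hostname); // "example.com"
```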
@annevk I don't think WebDAV can deal with either, when I say WebDAV I mean Microsoft's WebDAV mini-redirector software. In the original post the file URL is resolved on the client via the WebDAV mini-redirector, it requires the unencoded @ symbol to determine whether to use HTTPS. |
Percent-encoding is used to escape opaque strings, allowing them to contain any character (even those which conflict with URL syntax characters - those characters get escaped). In B, even if we can allow some dodgy things like unescaped spaces, we still can't allow every character, so actually C aligns more with "opaque". All software needs to unescape the URL component to read the actual content stored inside (percent encoding has no meaning at the application level). If the Microsoft software is not doing that, I'm inclined to say it's an application bug, albeit one that the standard has no solution for. |
Here is Microsoft's view on it: https://learn.microsoft.com/en-us/troubleshoot/windows-client/networking/url-encoding-unc-paths-not-url-decoded With option C, implementers (Chromium people) will just have to percent-decode the host themselves before they start trying to use it as a UNC |
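For what it's worth, the decoding step under option C might look roughly like the sketch below. `toUncPath` is a hypothetical helper, not anything from Chromium or the URL Standard, and it assumes the host arrives percent-encoded (for example `%40` for `@`). It deliberately leaves the path undecoded, since how UNC paths should be decoded is a separate question.

```javascript
// Hypothetical helper: build a UNC path from the host and path of a file: URL.
// Only the host is percent-decoded here; the path just has its slashes flipped.
function toUncPath(hostname, pathname) {
  const host = decodeURIComponent(hostname);
  const path = pathname.replace(/\//g, "\\");
  return "\\\\" + host + path;
}

// Prints \\webdavserver.net@ssl\a.pdf
console.log(toUncPath("webdavserver.net%40ssl", "/a.pdf"));
```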
@catmanjan that document is about the path, not the host. |
@annevk ok |
@karwa Could I confirm one more thing to ensure I understand completely? Non-special opaque path URLs don't percent-encode spaces:
const url = new URL("git:opaque path");
console.log(url.href); // "git:opaque path"
I assume the opaque host and opaque path should be handled differently, in terms of percent encoding, correct? We percent-encode spaces in file: URLs' hosts, but not in non-special opaque paths. |
Yeah, opaque paths (which you can only get for non-special URLs, so "non-special" is kinda redundant, FWIW) are their own thing. Not percent-encoding the spaces there has been a problem (#784), but it's probably not a compatible change to start encoding them. |
(Reported in https://crbug.com/1502849)
It appears that Windows uses file URLs with '@' (U+0040) characters in their host parts, such as file://webdavserver.net@ssl/a.pdf.
However, according to my understanding, file://webdavserver.net@ssl/a.pdf is an invalid URL in the URL Standard because '@' is considered a forbidden host code point.
To ensure compatibility with Windows file URLs, should we consider allowing the '@' character in the host part of file URLs?
I'd appreciate hearing opinions of the URL Standard folks on this matter.
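A quick way to check how a given runtime treats these URLs is sketched below. `parses` is a hypothetical helper written for this example. A parser that follows the URL Standard is expected to reject the second URL because of the '@' in the host, but the result may differ between implementations and platforms, so no output is asserted here.

```javascript
// Hypothetical helper: report whether the runtime's URL parser accepts an input.
function parses(input) {
  try {
    new URL(input);
    return true;
  } catch {
    return false;
  }
}

// A file: URL with an ordinary host, for comparison.
console.log(parses("file://webdavserver.net/a.pdf"));
// The Windows-style URL from this report, with '@' in the host part.
console.log(parses("file://webdavserver.net@ssl/a.pdf"));
```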