navigate() should not use parse a URL #9708

annevk · 2023-09-08T13:05:43Z

I think we should make new APIs use UTF-8 exclusively.

(But maybe don't fix it right away as I'll refactor parse a URL first.)

domenic · 2023-09-08T13:43:56Z

I think I support this, but I would feel better if this were not literally the first such API. I think we have plenty of other precedents, right? Like fetch(), maybe?

I wonder if implementations are generally sloppy about this... might be worth writing tests for a lot of newer APIs.

annevk · 2023-09-08T14:14:50Z

Yeah, everything in Fetch should be good, but it's probably buggy. I suspect we can still get some of that changed. web-platform-tests/wpt#4934 (comment) has my progress on the general topic (with help from you, @zcorpan, and others for review and such).

annevk · 2023-09-09T11:31:10Z

Seems to be mostly good: web-platform-tests/wpt#41894.

domenic · 2023-09-10T07:32:14Z

Tests for some of the others in https://dontcallmedom.github.io/webdex/u.html#URL%20parser%40%40url%25%25dfn , e.g. new WebSocket(), would probably be worthwhile too.

I was thinking as part of this change we should add a note to "parse a URL" and "parse and serialize a URL". Something like:

These algorithms are to be used when deriving URLs from HTML content, or for legacy JavaScript APIs. HTTP-level URL processing, or URL processing done by newer JavaScript APIs, uses the URL parser and URL serializer directly, so as to avoid changing results depending on the Document's encoding.
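
To illustrate the encoding dependence the proposed note refers to, here is a minimal sketch. It uses Python's standard `urllib.parse.quote` as a stand-in for the query percent-encoding step; it is not the URL Standard's parser, and the helper name `percent_encode_query` and the sample value are assumptions made up for this illustration.

```python
from urllib.parse import quote


def percent_encode_query(value: str, encoding: str) -> str:
    """Percent-encode a query value using the given character encoding.

    Hypothetical helper for illustration only; a real URL parser does
    considerably more than this.
    """
    # safe="" so that no reserved character is exempt from percent-encoding.
    return quote(value, safe="", encoding=encoding)


query = "café"

# Legacy-style behavior: the bytes depend on the document's encoding.
legacy = percent_encode_query(query, "windows-1252")

# UTF-8-only behavior, as proposed for newer APIs.
modern = percent_encode_query(query, "utf-8")

print(legacy)  # caf%E9
print(modern)  # caf%C3%A9
```

The same input produces two different URLs depending on the encoding, which is the kind of result change the note is meant to rule out for newer APIs.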

annevk · 2023-09-10T07:43:12Z

My thinking is that the next step is renaming these algorithms to "legacy" and adding new algorithms that take a document/environment but don't take the encoding into account, as it's much nicer to use an algorithm like this than to grab the base URL from somewhere. (Exporting the base URL directly was probably a mistake?)

And I wonder if we can even refactor further so that an environment never has an encoding, as workers are supposed to always be UTF-8.

annevk · 2023-09-10T07:48:59Z

We have coverage for WebSocket already, see https://wpt.fyi/results/html/infrastructure/urls/resolving-urls/query-encoding. Perhaps it should be moved out of that bigger test, but that's a separate task.

domenic · 2023-09-10T07:49:37Z

Sounds reasonable, although I don't think we should necessarily make new HTML attributes always use UTF-8, so "legacy" might be a bit strong.

annevk · 2023-09-10T07:59:13Z

Fair, maybe "parse a URL" and "HTML-parse a URL", although I also would not necessarily mind moving away from it more. Mostly you have to parse eagerly anyway, and after that it no longer matters, so it's quite easy to switch for new things.

Tests: ... Fixes #9708.

annevk added the topic: navigation label Sep 8, 2023

annevk mentioned this issue Sep 8, 2023

/html/infrastructure/urls/resolving-urls/query-encoding/* are disabled in Mozilla and Chromium web-platform-tests/wpt#4934

Open

annevk added a commit that referenced this issue Sep 19, 2023

Make navigate() use UTF-8 in the URL parser

4e5aafd

Tests: ... Fixes #9708.

annevk mentioned this issue Sep 19, 2023

Make navigate() use UTF-8 in the URL parser #9756

Merged

4 tasks

domenic closed this as completed in #9756 Jul 25, 2024

domenic closed this as completed in 4f3ac96 Jul 25, 2024
