Add URL Parser for RFC 3986 #33639
Labels:
- in: web (Issues in web modules: web, webmvc, webflux, websocket)
- type: enhancement (A general enhancement)
Comments
rstoyanchev added the "in: web" and "type: enhancement" labels on Oct 3, 2024
rstoyanchev added a commit that referenced this issue on Oct 7, 2024:
An example of this can be found in RFC 2732, but it is obsoleted by RFC 3986 whose syntax for IPv6address does not allow dots. Also, Appendix D of RFC 3986: "As [RFC2732] defers to [RFC3513] for definition of an IPv6 literal address, which, unfortunately, lacks an ABNF description of IPv6address, we created a new ABNF rule for IPv6address that matches the text representations defined by Section 2.2 of [RFC3513]." See gh-33639
rstoyanchev added a commit that referenced this issue on Oct 7, 2024:
`isUnreserved` and `isSubDelimiter` are usually checked together. It helps to have a shortcut with an efficient lookup. See gh-33639
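The combined check described in this commit message could look roughly like the sketch below. It is not the code from the commit: the class name `Rfc3986Chars` and the method name `isUnreservedOrSubDelimiter` are hypothetical, and the character sets are taken from RFC 3986, Section 2.2 (`sub-delims`) and Section 2.3 (`unreserved`). A precomputed table turns two range-and-switch checks into a single array lookup.

```java
// Hypothetical helper for illustration only; not Spring's actual implementation.
final class Rfc3986Chars {

    // One flag per ASCII code point: true if the character is in
    // "unreserved" or "sub-delims" as defined by RFC 3986.
    private static final boolean[] UNRESERVED_OR_SUB_DELIM = new boolean[128];

    static {
        // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
        for (char c = 'a'; c <= 'z'; c++) {
            UNRESERVED_OR_SUB_DELIM[c] = true;
        }
        for (char c = 'A'; c <= 'Z'; c++) {
            UNRESERVED_OR_SUB_DELIM[c] = true;
        }
        for (char c = '0'; c <= '9'; c++) {
            UNRESERVED_OR_SUB_DELIM[c] = true;
        }
        for (char c : "-._~".toCharArray()) {
            UNRESERVED_OR_SUB_DELIM[c] = true;
        }
        // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
        for (char c : "!$&'()*+,;=".toCharArray()) {
            UNRESERVED_OR_SUB_DELIM[c] = true;
        }
    }

    // Single bounds check plus array lookup instead of two separate predicates.
    static boolean isUnreservedOrSubDelimiter(int c) {
        return c >= 0 && c < UNRESERVED_OR_SUB_DELIM.length && UNRESERVED_OR_SUB_DELIM[c];
    }

    public static void main(String[] args) {
        System.out.println(isUnreservedOrSubDelimiter('~')); // unreserved
        System.out.println(isUnreservedOrSubDelimiter('=')); // sub-delim
        System.out.println(isUnreservedOrSubDelimiter('{')); // neither
    }
}
```

Non-ASCII code points fall outside the table and are reported as not allowed, which matches the RFC grammar, where such characters must be percent-encoded.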
rstoyanchev
added a commit
that referenced
this issue
Oct 7, 2024
rstoyanchev added six more commits that referenced this issue on Oct 7, 2024
spencergibb added a commit to spring-cloud/spring-cloud-gateway that referenced this issue on Oct 11, 2024
ryanjbaxter added a commit to spring-cloud/spring-cloud-config that referenced this issue on Oct 15, 2024
Before 6.2, `UriComponentsBuilder` used regular expressions. Generally, they split on the main component delimiters, `":"`, `"/"`, `"?"`, and `"#"`, but did not reject characters outside the set allowed in each component. The resulting `UriComponents` can then encode any non-conforming characters.

Regular expressions are convenient, but provide limited control and visibility. This is why in #32513 we added an implementation of the URL parsing algorithm from the WHATWG URL Living Standard that browsers use to align on how to handle a wide range of cases leniently. While this provides more robust parsing than before, arguably on a server we can expect URLs that don't deviate from the RFC quite as far as what browsers need to be able to handle.
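To make the regex-based behavior concrete, here is a minimal sketch that splits a URI with the generic pattern from RFC 3986, Appendix B. This is an assumption for illustration: it is not the pattern `UriComponentsBuilder` used, and the class name `RegexUriSplitter` is hypothetical. It shows the property described above: the delimiters are honored, but spaces and curly braces inside a component pass through unchecked.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern is the one given in RFC 3986, Appendix B,
// not the expression Spring used before 6.2.
final class RegexUriSplitter {

    private static final Pattern URI_PATTERN = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns { scheme, authority, path, query, fragment }; absent parts are null.
    static String[] split(String uri) {
        Matcher m = URI_PATTERN.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a URI reference: " + uri);
        }
        return new String[] {m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)};
    }

    public static void main(String[] args) {
        // The space and the {id} template variable are accepted as-is.
        for (String part : split("http://example.com/a b/{id}?q=x y#frag")) {
            System.out.println(part);
        }
    }
}
```

Because every component is matched with a negated character class, nothing here can report which character is illegal or where, which is the limited control and visibility mentioned above.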
We can add a new parser that follows RFC syntax along the lines of the `java.net.URI` or Jetty's `HttpURI` parsers. The new parser should respect the main component delimiters, but otherwise leave some room for leniency within each component to allow some characters like spaces or curly braces (URI variables), similar to what the regular expressions did. `UriComponents` can then encode any non-conforming characters that remain after URI variables are expanded.

It should be possible to choose which parser to use, RFC or WHATWG, when more leniency or alignment with browsers is needed.
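As a point of comparison, the sketch below probes how strictly `java.net.URI`, one of the reference parsers named above, treats the characters the proposed parser would tolerate. The class name `StrictUriCheck` and the sample URLs are made up for illustration. The `URI(String)` constructor throws `URISyntaxException` for a raw space or for curly braces, which is why a parser meant to accept URI templates needs some leniency within components.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class: checks whether java.net.URI accepts the input as-is.
final class StrictUriCheck {

    static boolean parsesStrictly(String input) {
        try {
            new URI(input); // validates characters per component while parsing
            return true;
        }
        catch (URISyntaxException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesStrictly("https://example.com/hotels/42?q=a"));
        System.out.println(parsesStrictly("https://example.com/hotels/{id}"));
        System.out.println(parsesStrictly("https://example.com/hotel list"));
    }
}
```

Only the first URL parses; the template variable and the space are both rejected before any variable expansion or encoding can take place.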
The topic of RFC vs WHATWG parsing was first brought up by @joakime in #33542. For broader context, and a possible future effort to standardize lenient parsing of user-provided URLs, see https://lists.w3.org/Archives/Public/ietf-http-wg/2024JulSep/0281.html.