Review case of MimeType.Type charsets #11741

Open
gregw opened this issue May 3, 2024 · 2 comments · Fixed by #12347

Comments

@gregw
Contributor

gregw commented May 3, 2024

Jetty version(s)
12

Description

We generate charsets in lower case (e.g. text/html;charset=iso-8859-1), but when parsed by HttpParser they end up in upper case: text/html;charset=ISO-8859-1

We should be consistent about case.
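
To illustrate why the mismatch matters, here is a minimal sketch using only the JDK (it does not use any Jetty classes, and `CharsetCaseDemo` is a hypothetical name). It shows that the two spellings name the same charset, yet a plain string comparison treats them as different, which is the kind of inconsistency that can defeat caches or equality checks on header values.

```java
import java.nio.charset.Charset;

public class CharsetCaseDemo {
    // Plain, case-sensitive string comparison of two charset names.
    public static boolean sameString(String a, String b) {
        return a.equals(b);
    }

    // Charset.forName treats charset names as case-insensitive,
    // so both spellings resolve to the same Charset.
    public static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        System.out.println(sameString("iso-8859-1", "ISO-8859-1"));  // false
        System.out.println(sameCharset("iso-8859-1", "ISO-8859-1")); // true
    }
}
```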

@gregw gregw added Enhancement Bug For general bugs on Jetty side labels May 3, 2024
@joakime
Contributor

joakime commented Jul 11, 2024

Go with all lowercase (this should apply to MIME types too).

From https://datatracker.ietf.org/doc/html/rfc9110#section-8.3.1

For example, the following media types are equivalent in describing
HTML text data encoded in the UTF-8 character encoding scheme, but
the first is preferred for consistency (the "charset" parameter value
is defined as being case-insensitive in [RFC2046], Section 4.1.2):

  text/html;charset=utf-8
  Text/HTML;Charset="utf-8"
  text/html; charset="utf-8"
  text/html;charset=UTF-8
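
As a rough sketch of what "go with all lowercase" could look like, the following reduces each of the four equivalent forms above to the preferred first form. `ContentTypeNormalizer` and `normalize` are hypothetical names for illustration, not Jetty API, and the parsing is deliberately naive: it assumes no `;` appears inside a quoted parameter value.

```java
import java.util.Locale;

public class ContentTypeNormalizer {
    // Hypothetical helper: lowercases the type/subtype, the parameter names,
    // and the charset value, and drops optional quotes and whitespace.
    public static String normalize(String contentType) {
        String[] parts = contentType.split(";");
        StringBuilder out = new StringBuilder(parts[0].trim().toLowerCase(Locale.ROOT));
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                // Not a name=value pair: keep it as-is if non-empty.
                if (!param.isEmpty())
                    out.append(';').append(param);
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = param.substring(eq + 1).trim();
            if ("charset".equals(name)) {
                // Strip surrounding double quotes, then lowercase the charset name.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\""))
                    value = value.substring(1, value.length() - 1);
                value = value.toLowerCase(Locale.ROOT);
            }
            out.append(';').append(name).append('=').append(value);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "text/html;charset=utf-8",
            "Text/HTML;Charset=\"utf-8\"",
            "text/html; charset=\"utf-8\"",
            "text/html;charset=UTF-8"
        };
        // Each line prints: text/html;charset=utf-8
        for (String input : inputs)
            System.out.println(normalize(input));
    }
}
```

Only the charset value is lowercased here; other parameter values are left untouched because they may be case-sensitive.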

@gregw gregw self-assigned this Oct 2, 2024
@gregw
Contributor Author

gregw commented Oct 3, 2024

@joakime I think this is another WHATWG-inspired cluster-stuffup:

So, rather than enforce standards, the WHATWG once again just normalises the behaviour of those who cannot follow simple specifications, and expects the world to follow!

But as you say, RFC 9110 recommends that we prefer lowercase, so I guess we follow!

gregw added a commit that referenced this issue Oct 3, 2024
Fix #11741 as per the WhatTFWG recommendations, use lower case for charset names.
Took the opportunity for some minor optimizations:
 + use the already made HttpField instance in MimeTypes.Type rather than create a new one in the HttpParser.CACHE
 + keep the MimeTypes.Type associated with the pre-encoded Content-Type fields
@gregw gregw linked a pull request Oct 3, 2024 that will close this issue
@gregw gregw moved this to ✅ Done in Jetty 12.1.0 Oct 31, 2024