-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Move lower casing of headers from parser to valve #893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 9.0.x
Are you sure you want to change the base?
Move lower casing of headers from parser to valve #893
Conversation
-1 (veto) to the proposed solution. If header lookup was case-sensitive, this solution would break case insensitive header lookup from the point the header is parsed until processing reaches the new Valve. There are a lot of header look ups in that time and this proposal would break those look ups. However...
Hmm. That too goes all the way back to the original Tomcat 4.1.x code. The proposed solution is still the wrong solution but mostly because it is based on incorrect information. Getting back to the original problem - preserving the case of the HTTP request header names as provided by the client - it looks removing the conversion to lowercase is the only change that is required. The header names are only returned in two places in Tomcat. The first is in response to a call to The second is in the validation of an HTTP/2 request. Removing the forcing to lowercase from HTTP/1.1 processing would not affect HTTP/2. I would be against implementing any change simply to benefit broken clients/libraries. However, looking at this issue has shown that there is potential for some performance optimisations. And those are always of interest. There appears to be two different optimisations that could be applied. Option A looks to be the simpler solution to implement but only removes a single mixed-case to lowercase conversion. Option B may provide more opportunity for performance gains. There is a greater risk of regressions with option B since all response header names would need to be forced to lowercase. Again, any issues with the case of the HTTP header names would be with the client/library not being HTTP/1.1 compliant. My current thinking is that Tomcat 9.0.x to 11.0.x could implement option A whereas option B could be explored for Tomact 12 onwards. I'd like to have an idea of the performance improvements for both before any implementation changes though. I'll do some experimenting. |
I'm also -1 for this approach. A more efficient equalsIgnoreCase could certainly be implemented, though there a bunch of gotchas with regard to i18n which make offloading to String.toLowerCase attractive. Creating two new String objects every time a comparison is done is certainly wasteful. I believe even String.toLowerCase applies all those complicated rules. Looking at MimeHeaders and MessageBytes, I don't see where String.toLowerCase is being called. Should I be looking elsewhere? The only place I see forcing lower-case is in Http11InputBuffer, and that's done once (during parsing) using as efficient an algorithm I can think of. Back to the original request: preserving the original letter-casing of the HTTP headers... I fail to see the use-case. I'm happy to hear what yours @maxfortun actually is, but your original post is very vague. The only way to preserve the original case of the headers and improve comparison speed would be to store the string in two different formats: the I agree that removing the (single) to-lower-case operation will achieve the goals of this PR / feature request. But I really want to understand the use-case. I haven't seen a modern web server that doesn't force the header names to lower case. I also think that, should we continue to force everything to lower case, we should take advantage of the minor speedup we'll get from performing .equals instead of .equalsIgnoreCase. I tend to think that doing lower case is the right move, here. I'm sorry @maxfortun but you'll need to provide a much better justification for supporting case-preservation if you want us to consider this. |
It is all |
@markt-asf , @ChristopherSchultz thank you both for in-depth responses. And I completely agree that simply removing forced lower case is sufficient for addressing original case preservation issue. The valve proposal was for those cases where one might actually want to force lower case. My use case is an Apache ActiveMQ broker backing a pipeline with many existing Apache Camel apps publishing messages with headers to and subscribing from the broker. These Camel apps are mainly a routing plane that internally posts the actual messages to micro-services backed by Apache Tomcat, and other frameworks, like ExpressJS and what not. The way such pipelines usually work, as the content traverses the pipeline the headers between requests are preserved and passed from one pipeline stage to the next. Apache Camel, Apache ActiveMQ, and all the other components in the pipeline preserve the headers just fine, so an X-Request-Id header remains an X-Request-Id throughout the pipeline. Currently we are trying to add a legacy Tomcat backed service into the middle of the pipeline, and while we can lookup X-Request-Id if the servlet needs it, we can't do a pass-through for all headers, as that would mangle X-Request-Id into x-request-id and all existing downstream stages of the pipeline will no longer recognize the header, since downstreams treat headers in case-sensitive manner. Changing hundreds of applications to switch to lower case headers is not really feasible, so the cheapest thing for us is to patch http1 parser in tomcat for now. Again, thinking of Tomcat in context of vast eco-systems where Tomcat integrates with many other apps should be a good enough motivation for high fidelity pass-through processing without mangling inputs and thus losing that fidelity, thus losing the ability to play nice with other apps. |
I saw those. My comment was that no new String objects are created. Instead, the MessageBytes class implements that method in a way very similar to String.equalsIgnoreCase when only latin1 characters are being used. That is, it's relatively quick and doesn't generate any garbage. |
@maxfortun You need to start reporting bugs to the components of that pipeline that don't treat HTTP headers as case insensitive. At some point you or someone else is going to need to use HTTP/2 and that will force all HTTP headers to lower case. |
@markt-asf thank you for considering this. While we do intend to switch to http2 at some point, it is a long time in the future for us. We have thousands of apps, so there is no quick way about it for now. |
I took a look at moving all of Tomcat's internals to lowercase for HTTP headers. My expectation was that there would be either no impact or a small performance improvement. The actual results were a small performance degradation. I'm not entirely sure why - it merits further investigation at some point. Regarding the removal of the current forcing to lower case for request headers, that should be minimal impact but I want to test that too before making any changes. |
There is a very old Bugzilla issue for this already 58464.
If you think of tomcat as an app by itself, and not as a part of a larger ecosystem of apps, it may be ok to claim that tomcat is opinionated on the subject of keeping all the headers lower cased. However, in enterprise environments, like my own, there are hundreds of apps all talking to one another, especially if async pipelines, like the ones powered by activemq or kafka, are involved. In this case preserving access to the original header spelling may be important, as there are maybe hundreds of legacy applications that are not ready to be refreshed to fully support lower cased headers yet. Yes, it is 2025, and yes, it is still relevant. Legacy is legacy. It would be nice if tomcat offered access to the original spelling of the headers. While http1 spec does require case agnostic treatment, it doesn't require losing original spelling of the http headers. I am proposing to move case processing from the Http11InputBuffer into its own valve, and have it included into the default configuration. This should retain full backward compatibility to current deployments and actually allow tomcat users that do require original spelling to access it via something like this: