Access to the HTTP trailer #34
Comments
So trailer headers are atomic just like normal headers are atomic? How are streams affected? Also, I thought that semantically trailer headers are not distinct from normal headers, though it's not entirely clear to me how that's supposed to work in a streaming situation. |
Yes, we need to discuss that point. /cc @domenic Like the headers being atomic, my gut feeling is that the trailers should also be handled without backpressure once the body has been consumed. So, you may need to wait for completion of body consumption by the ReadableStream due to backpressure, or it's ok that a UA fulfills the trailer promise without waiting for completion of consumption if possible.
https://tools.ietf.org/html/rfc7230#section-4.1.2 says
Not fully sure, but I interpret this as it doesn't forbid distinction between headers and trailers at the application level. |
Given section 4.1.2, it seems we also need to define what happens when a server does generate those forbidden trailer headers. Will we just pass them through or discard them? And if we discard them, what is our whitelist/blacklist? I think the UA should fulfill the trailer header promise once the network layer has consumed all trailer header bytes and parsed them successfully. I don't think it needs to depend on content consuming a stream, but I might be missing a subtlety. |
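To make the "discard" option concrete, here is a minimal sketch of a deny-list filter. The list is illustrative and non-exhaustive, loosely following the categories named in RFC 7230 section 4.1.2 (framing, routing, authentication, payload processing). The helper name `filterTrailerFields` is hypothetical, and the list a UA would actually use is exactly the open question above.

```typescript
// Illustrative, non-exhaustive deny list (an assumption, not a specified list).
const FORBIDDEN_TRAILER_FIELDS: ReadonlySet<string> = new Set([
  "transfer-encoding", // message framing
  "content-length",    // message framing
  "host",              // routing
  "authorization",     // authentication
  "set-cookie",        // authentication / state
  "content-encoding",  // payload processing
  "content-type",      // payload processing
  "content-range",     // payload processing
  "trailer",           // payload processing
]);

type Field = [string, string];

// Hypothetical helper: drop forbidden fields rather than passing them through.
// Field names are compared case-insensitively.
function filterTrailerFields(fields: Field[]): Field[] {
  return fields.filter(([name]) => !FORBIDDEN_TRAILER_FIELDS.has(name.toLowerCase()));
}
```

Passing forbidden fields through unchanged would instead be the identity function, which is why the choice matters for what script can observe.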
You should also consider how you want to allow client-generated requests to produce trailers. Chunked encoding must be used in this case, and for HTTP/1.1 this would have to be specified before the body is produced. https://tools.ietf.org/html/rfc7230#section-4.1 Separately, the client needs to indicate whether it is willing to accept trailers in the response using "TE: trailers"; specifying this via the normal request headers mechanism seems sufficient. https://tools.ietf.org/html/rfc7230#section-4.1.2 Given that trailers are strictly limited to occurring after the last transfer chunk, and that the content of the trailers may vary based on the application behavior right up until writing the last chunked byte, it seems the API should not require the trailer values prior to receiving the last byte on the stream. A decision also needs to be made about how to interpret the "Trailer" header itself, though given that it's a SHOULD in the spec I suspect it should just be propagated with the initial header values rather than used to implement strict presence checks for the subsequent trailers. |
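For readers less familiar with the wire format, the following is a rough sketch of how an HTTP/1.1 chunked body and its trailer section are separated. It assumes the whole message is already in memory as an ASCII string and skips chunk extensions; `parseChunked` and `DechunkedMessage` are names made up for this illustration, not part of any library.

```typescript
interface DechunkedMessage {
  body: string;
  trailers: Array<[string, string]>;
}

// Minimal sketch: whole message in memory, ASCII only, no size limits.
function parseChunked(raw: string): DechunkedMessage {
  let pos = 0;
  let body = "";
  for (;;) {
    const lineEnd = raw.indexOf("\r\n", pos);
    if (lineEnd === -1) throw new Error("truncated chunk-size line");
    // Chunk size is hexadecimal; anything after ';' is a chunk extension.
    const sizeLine = raw.slice(pos, lineEnd);
    const semi = sizeLine.indexOf(";");
    const sizeText = (semi === -1 ? sizeLine : sizeLine.slice(0, semi)).trim();
    const size = parseInt(sizeText, 16);
    if (Number.isNaN(size)) throw new Error("bad chunk size");
    pos = lineEnd + 2;
    if (size === 0) break; // last-chunk reached: the trailer section follows
    if (pos + size + 2 > raw.length) throw new Error("truncated chunk data");
    body += raw.slice(pos, pos + size);
    pos += size + 2; // skip the chunk data and its trailing CRLF
  }
  const trailers: Array<[string, string]> = [];
  for (;;) {
    const lineEnd = raw.indexOf("\r\n", pos);
    if (lineEnd === -1) throw new Error("truncated trailer section");
    const line = raw.slice(pos, lineEnd);
    pos = lineEnd + 2;
    if (line === "") break; // an empty line terminates the message
    const colon = line.indexOf(":");
    if (colon === -1) throw new Error("malformed trailer field");
    trailers.push([line.slice(0, colon), line.slice(colon + 1).trim()]);
  }
  return { body, trailers };
}
```

The trailer fields only become visible after the zero-length last chunk, which is why an API cannot sensibly require their values before the final body byte.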
I think streaming should not be too affected. It makes sense to fulfill the trailers promise as soon as they are available---which might be never if the stream consumer exerts backpressure and stops the flow, or might be before the stream consumer fully reads all chunks if the stream is doing internal buffering. And assuming they are used for similar metadata to headers I don't see a need to stream the trailers; we can use the same model as headers and get them all at once. |
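One way to picture this model is a response shape with atomic headers, a streamed body, and a promise for the trailers. Everything below is a hypothetical sketch for discussion, not the Fetch API's actual `Response` interface; `ResponseWithTrailer` and `simulateResponse` are invented names. This simulation models the backpressure case: the trailer promise is fulfilled only once the consumer has pulled the last chunk.

```typescript
type FieldMap = Map<string, string>;

// Hypothetical shape, for illustration only.
interface ResponseWithTrailer {
  headers: FieldMap;           // atomic, available up front
  body: AsyncIterable<string>; // streamed chunks
  trailer: Promise<FieldMap>;  // atomic, fulfilled once trailers are parsed
}

function simulateResponse(chunks: string[], trailerFields: FieldMap): ResponseWithTrailer {
  let resolveTrailer!: (fields: FieldMap) => void;
  const trailer = new Promise<FieldMap>((resolve) => {
    resolveTrailer = resolve;
  });

  async function* body(): AsyncGenerator<string> {
    for (const chunk of chunks) yield chunk;
    // The simulated network layer reaches the trailers only after the last
    // chunk has been pulled, so a stalled consumer never sees them.
    resolveTrailer(trailerFields);
  }

  return { headers: new Map([["trailer", "x-status"]]), body: body(), trailer };
}

async function demo(): Promise<string> {
  const res = simulateResponse(["a", "b"], new Map([["x-status", "ok"]]));
  let text = "";
  for await (const chunk of res.body) text += chunk;
  const trailer = await res.trailer; // delivered all at once, like headers
  return `${text}:${trailer.get("x-status")}`;
}
```

A UA that buffers internally could fulfill `trailer` earlier than this simulation does; the promise shape allows either behavior.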
@louiscryan I think what @tyoshino suggested for requests makes sense. You supply a promise at the point you initiate the request. That indicates ahead of time you might supply trailers. |
Agreed. Promises seem like a good way to model the intent even if the value |
Another thing we need to define is the relationship between a response having a trailer and the |
It's forbidden to set the |
@tyoshino ah yes, for the response we would need that. Perhaps requiring either |
Note that trailers are very poorly supported in HTTP implementations. Microsoft have explicitly said that they ignore them; others do similar things. I think that the right thing to do now is to give up on trailers. |
Not sure I follow your reasoning. Trailers are part of the spec and the historically poor support in server implementations is in no small part because there was no browser API which this proposal would address. Trailers do have support in a non-trivial number of the more popular server libraries and proxies. HTTP2 does a reasonable job addressing the encoding issues that have also dogged trailers. Is the functional utility of trailers if they were well supported in question? E.g. MACs, trailing status & debug info. |
If the use cases for trailers were actually compelling, then we would have seen client implementations by now. And it's not just browsers that matter here, you have to convince developers of major client libraries (I'm thinking Windows, iOS and Android here) that it's worthwhile too. "Build it and they will come" is a posture that ignores the fact that this feature has gone 15 years without any real traction. |
Last time I looked, library support was actually pretty decent. .NET: https://msdn.microsoft.com/en-us/library/system.net.http.headers.httprequestheaders.trailer(v=vs.118).aspx I'll let others weigh in on iOS. |
Trailers are used extensively inside of back-end networks and CDNs. Server-side folks generally are very keen to get them supported because they help deal with pain around buffering, etc. E.g., Varnish would LOVE to be able to send ETag in trailers. The problem has always been browser support. The use cases might not be compelling to browsers, and browsers might have legitimate concerns about buffering, code complexity, etc., but saying that there aren't any compelling use cases because browsers haven't done it yet is pretty presumptuous. |
+1 to Mark's comments. I hope that HTTP/2 is an opportunity for us to rally around enabling proper trailer support in clients, not the other way around. The "lack of real traction in past X years" is a circular argument: use cases are blocked on support, support wants to see "real-world" use cases. FWIW, Server Timing (http://w3c.github.io/server-timing) would benefit a great deal from trailers. |
While Martin phrased it as "actually compelling", I would like to put out the other argument - experience is that there are profound security implications regarding how you operate on data and if and whether transformative operations are allowed to be sent in trailers that cause the content to be reinterpreted. For example, a big concern I spelled out in tc39/proposal-cancelable-promises#4 was precisely regarding header truncation, which trailers offer (both implicitly - through poor implementations - and explicitly - via cancelable promises). I have little faith that trailers have received proper security analysis, especially given the past two decades of security research, with my gut (and my bias) being that they represent yet another attack vector for folks like @sirdarckcat in browsers being too clever for their own good. If there were demonstrably compelling reasons, then I'd hope it was first explored outside the context of any programmatic API, so that proper security review could be done and experience with it gained. But I'm really not keen to even see this implemented in the browser - the gain vs risk is too disproportionately weighted toward the latter. |
@sleevi fair points, but I'm reading this as "we don't know what we don't know so we shouldn't do it", which seems like the wrong place to start. We do know that there are concrete use cases that developers would benefit from (e.g. trailer ETags, timing metrics, and so on) and I think we should pursue this with appropriate security reviews and guidance. |
@igrigorik You're reading it wrong. I'm explicitly opposed to exposing it in |
@sleevi fair enough, I agree that we should (carefully) work through the security implications. I just want to make sure that we don't end up burying this prematurely (once again :)). |
Am I right in assuming that the objections are limited to the additional risk that interpreting truncated trailers presents, beyond the risks already present in truncating HTTP/1.1 chunked transfers / HTTP2 DATA frame sequences? Similarly, we're only concerned with exposure/interpretation of these headers in the API, as distinct from whether receipt of trailers from the wire impacts whether a response is considered valid or not. AFAIK most XHR implementations simply drop received trailers on the floor but do not consider their presence an error at the API level. I don't claim to know the intricate details of how browsers interpret headers and which headers would present a greater risk when truncated after some payload has already been received. You mention 'transformative operations' earlier; could you clarify which headers you consider to present a risk in that class? |
@louiscryan No, that's not correct - that is, the objections are not limited to that. It's about a scope of authority separation that trailers introduce. Consider, for example, RFC 6797 or RFC 7469. Both of these explicitly restrict processing to the first instance of the header field (e.g. http://tools.ietf.org/html/rfc6797#section-8.1 http://tools.ietf.org/html/rfc7469#section-2.3.1 ). This wasn't accidental, but predicated by a concern about the separation between the capabilities being expressed by someone who controls a given resource and those who control the domain/server. In many (most) systems, the headers are handled by a 'trusted' system (such as the web server), which handles access controls about what can be set where and by whom, and then shunts processing of the body off to other, less-trusted systems (think CGI scripts, although I'm sure you can insert your favourite framework here). The trusted process handles the preamble generation, and then everything else is handed to the untrusted process and no further parsing/inspection is needed by the server. To put this more concretely, consider a server that wished to delegate the production of content bodies to some 'untrusted' process, which may include the production of chunked bodies. In order to safely sanitize headers (especially security-relevant headers), the server implementation would need to inspect/process all chunks and filter them along the way, versus the current behaviour of filtering them up at the forefront. This is just one example where the state of non-implementation in browsers is a security benefit to server operators. While it may be perfectly fine when all elements of the data production are trusted (such as the cases @mnot raises), I don't believe that the threat models are the same (nor do I think Mark would suggest they are), thus we can't reach a conclusion that what is good for the goose is good for the gander. 
This sort of system, as a concept, is a common source of attacks and vulnerabilities. For example, consider certificates that contain multiple common names. Some CAs would only validate the first instance as a domain name, and let the remaining instances be requestor-controlled. However, some applications would check the first CN, others would check the last CN, and there would be confusion such that those who looked at the last CN would be seeing attacker-controlled data. Or, to use another certificate-based example, Moxie Marlinspike's null termination attack (which related to ambiguity whether systems checked the name from the beginning up to the null, or checked the name from the end). Even within HTTP, you've had situations like Request/Response smuggling vectors leading to confusion. I hope you can see that there is a whole host of systemic errors that emerge when there are multiple ways to represent something, when implementations may support one, the other, or both, and when there is additional filtering overhead beyond "out of sight, out of mind". This is why we (intentionally) have not supported trailing headers and are loath to do so, and why it's such an exceptionally high bar outside of niche use cases that may not have the same security concerns. |
I agree we should just give up on trailers in the browser context at this point. There's first the huge mass of interop and security pitfalls. Browsers process existing headers everywhere. When I think about where we process them and what the headers are, they span a spectrum from:
Not to mention the unbounded use of headers in web content. We can't make XHRs suddenly see trailers mixed into headers without warning. Given that all existing code and headers have been designed without trailers in mind, I think the only answer is that we MUST break the correspondence between header and trailer semantics. Trailers and headers must be considered completely unrelated animals. Unless a field is explicitly specified as a trailer, it will be treated as any unknown field in a trailer. [Edit: Fixed some confusing wording here.] Anything else welcomes unexpected security problems, bizarre behaviors, and surprise interop failures when two implementations disagree on which headers would be too hard to process streaming. Even after accepting that, there are significant costs. The current core data model for a network request is very simple: you have an atomic set of request headers, sometimes a body stream, and the net stack gives you back an atomic set of response headers followed by a body stream. There's side fluff around redirects and auth hooks and MIME sniffing and such, but that's the core abstraction. It's very clean. We can build a clear division of responsibility between the atomic response headers and streaming body. This is how navigation works, this is how some servers work (as Ryan noted), etc. Changing that core data model is not a small change. This affects not just fetch, but everything in a browser's network stack from the low-level HTTP implementation to the disk cache to any hooks (extensions, etc.) to web content to all the layers in between. I think the use cases need to be exceedingly compelling to entertain this. |
@sleevi I take your point about the separation of trust between header production & 'everything afterwards' in some systems, but I don't think anyone should be relying on browsers as the last line of defense against these issues. Caching / de-chunking proxies are free to promote trailers into headers & HTTP/1.1 keep-alive requires servers to observe a chunked response for termination. I note that a lot of the discussion here is focused on how the browser should interpret and enforce policy around trailers when things like Alt-Svc, Set-Cookie, or HPKP are present in trailers. With the possible exception of ETag / Content-MD5 (and @mnot may have some others), my assumption was that the browser would continue to ignore trailers from an internal processing perspective just as they do today. That is to say that trailers have no semantic meaning to the browser and they are simply made available for programmatic use by developers in the fetch API. @davidben - I don't think anyone is suggesting mixing trailers into headers in XHR. That's a de facto standard and changing its contract would be a very bad idea. The proposal is to allow for them in fetch, possibly as an explicitly separate feature from headers, as it's an entirely new interface and so would not disrupt existing usages. |
I certainly agree that a precondition here is that all existing headers (including ETag and friends; allowing that is considerable complexity to a disk cache implementation) must be ignored in trailers. Though, even that might be troublesome. The client advertisement is simply "trailers", not a list of headers that may be put there. I don't know how enthusiastic existing servers will get if you start advertising support. Will ETags suddenly move to the trailer? What will happen if some browsers do support ETags in the trailer and some don't? Even if that turns out fine (the install-base of servers probably does not include that many trailer users?), this is still a massive change to the core abstraction of the whole browser. The benefits have to outweigh the cost. If the use case is purely programmatic use by developers, it seems this is not the right time to propose the feature. The core capability (a stream of bytes) is already being provided by Fetch. In the spirit of the extensible web manifesto, JS code is free to parse out whatever additional structure it wishes. While we don't let you extend the HTTP parser, the difference between that and parsing out the body is minimal. If it turns out everyone is effectively reimplementing HTTP trailers (unlikely), then it makes sense to move a construct of the sort into browsers. The reason one might wish to short-circuit this process is if some processing in the browser needs to interpret this. But we seem to both agree that the opposite is desirable. (Nit: It's not accurate to say XHR lacking trailers is a de facto standard. This is de jure at this point. The specification is Fetch and friends. They say, on the web, HTTP trailers do not exist. This ticket exists specifically to propose changing that.) |
I wouldn't say that the parsing separation is minimal overhead in the case of handling stream termination. JS code will rely on the HTTP parser for non-streaming status handling and stream initialization handling but will have to switch to payload parsing for stream termination status. A payload, and therefore its MIME type, will have to describe how terminal statuses are encoded and make them reliable for the range of possible termination causes. Consider an API like the Twitter streaming API, which returns a homogeneous sequence of types in a stream https://dev.twitter.com/streaming/reference/get/statuses/sample If they wanted to convey a termination status after a period of time like 'Credential Expired' or 'Budget Spent' to a browser, they are forced to change their schema. Today they just hard drop connections https://dev.twitter.com/streaming/overview/connecting which is not exactly pretty. Trailers solve this problem in a way that is also not subject to streamed message truncation causing the developer's payload parser to get into an unrecoverable state which prevents it from parsing the special 'error' entry in the stream. It also wouldn't require new MIME types for commonly streamable formats like audio & video that want better termination handling, though these are so valuable on the web that side-polling is used to bridge the gap when 'Rental Period Expired' kicks in :) On the subject of compelling use cases, what kind of quantification would be persuasive? |
I'm not sure I follow. Trailers don't have any magic as far as termination is concerned. If the connection to the server is shut off, we're not going to get the trailers either. HTTP/2, for all the bells and whistles, is still sent over a byte stream. Changing the schema for a termination signal sounds about right? You're sending a sequence of objects terminated by another object. Trailers just make this more confusing. Now, as a client, I have to care about what it means if the peer sends half a status and then stops with a trailer. In the case of extra formats, this proposal isn't going to help the video tag anyway, no? It's only if you do a low-level It seems, even if If there's no fundamental new capability, these kinds of polyfills are really the right way to get this kind of APIs into a browser. They are already necessary for deployment and can evolve much faster into the right solution for some use case, rather than the first one we thought of at the time. Successful JS experiments demonstrate need and use-cases. Those beget candidates for browser APIs. (Only candidates, not guarantees; again, changing the core data model like this is not a small change. This has one of the highest bars a network feature could have.) Remember, HTTP trailers do not exist as far as browsers are concerned. You have to imagine this as if someone's proposing adding a brand new unheard of HTTP semantic to browsers. That they're already spec'd is, if anything, a drawback. That means the latent threat of compatibility problems is worse because the install-base may already try to speak it. |
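The "sequence of objects terminated by another object" approach can be sketched in script without any browser support. The convention below is an assumption made up for this example: newline-delimited JSON in which a terminal record carries a reserved `$status` key. `parseStream` and `StreamResult` are hypothetical names.

```typescript
interface StreamResult<T> {
  items: T[];
  // null means the stream ended without a terminal record (e.g. truncation).
  status: string | null;
}

// Assumed convention: one JSON object per line; the terminal record has "$status".
function parseStream<T>(ndjson: string): StreamResult<T> {
  const items: T[] = [];
  let status: string | null = null;
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue;
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      break; // a truncated or corrupt line: stop and leave status as null
    }
    if (typeof record === "object" && record !== null && "$status" in record) {
      status = String((record as { $status: unknown }).$status);
      break; // nothing is expected after the terminal record
    }
    items.push(record as T);
  }
  return { items, status };
}
```

In this sketch a dropped connection and a missing terminal record look the same to the client (`status` is `null`), which mirrors the point that neither in-band records nor trailers survive a hard disconnect.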
The point was that it's preferable to not have to change the schema and to unambiguously use trailers to convey terminal status. This allows for graceful upgrade of existing APIs that currently can't represent a meaningful terminal status without breaking backward compatibility. The story for older browsers using the Twitter API is unchanged. Using a MIME multipart of application/http can and does work, I've done such things before ... |
@sleevi - I'm reading in a hurry in an airport lounge, but with that caveat, it feels like you're arguing against something that is not this proposal. AIUI this bug is not proposing that trailers be folded into existing headers willy-nilly (which indeed would be insane); rather, it's making them available to applications that specify the use of trailers, and (presumably) understand the various security risks. Now, one might argue that trailers are Just Too Dangerous to expose to any application, even with full knowledge. However, I'd find this a mighty curious place to draw that line, given where we're at. To give an example, I received an e-mail from one of your Googly brethren just yesterday asking about how to put a digest into trailers for integrity checking purposes. I'd design that by defining a header that communicates the algorithm, and a trailer that carries the actual digest. Are there security and interoperability issues in that use case? Certainly, but it's not being designed without trailers in mind. |
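As a rough illustration of that digest design, the sketch below hashes body chunks incrementally and compares the result with a value carried in a trailer. The field names `x-digest-algorithm` and `x-digest` are invented for this example (no such fields are defined in the thread), the function name is hypothetical, and it assumes a Node.js environment for `node:crypto`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical field names: the header announces the algorithm up front,
// the trailer carries the hex digest once the whole body has been sent.
function verifyBodyDigest(
  headers: Map<string, string>,
  chunks: Uint8Array[],
  trailers: Map<string, string>,
): boolean {
  const algorithm = headers.get("x-digest-algorithm");
  const expected = trailers.get("x-digest");
  // Only one algorithm is accepted in this sketch; anything else fails closed.
  if (algorithm !== "sha256" || expected === undefined) return false;
  const hash = createHash(algorithm);
  for (const chunk of chunks) hash.update(chunk); // incremental, as chunks arrive
  return hash.digest("hex") === expected.toLowerCase();
}
```

Announcing the algorithm in a header lets the receiver start hashing with the first chunk, while the sender does not need to buffer the body to compute the digest before sending.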
@mnot no, we don't want to combine semantics at all. We just want a distinct Headers object. A response would have "headers" and it would have "trailer headers". The user agent would only ever look at "headers". Script can look at "trailer headers" too. |
+1 |
+1, largely because we're talking about the fetch API here. Speaking as someone who serves a LOT of ETags based on buffering content, I would like to see UA support for that in trailers too; I'm just not sure this is the right place to discuss that (is there a place to discuss that??) |
@louiscryan Fetch is the interface between user agents and HTTP (and other things), it's not just an API. Having said that, any standards forum is typically a bad place to convince user agents to implement a feature they don't want to implement. |
We discussed this at the HTTP workshop:
|
Clarification on 1: for now we'll add it to responses. Since adding it on requests would also require chunked encoding, which browsers don't do at the moment, we'll leave that until request streams are sorted. |
@yutakahirano do you have suggestions for how to modify step 14 of https://fetch.spec.whatwg.org/#concept-http-network-fetch to account for bytes that are part of the response body and bytes that end up forming the trailer? I guess we should just tweak the language a bit and maybe add a clarifying note that the framing in the protocol takes care of the distinction... |
Currently the description expects that Transfer-Encoding is processed at a lower layer. Can we assume that such a lower layer decouples the message body part from the trailer? 14.2: Whenever one or more bytes for message body are transmitted, ... |
I used payload body in #344, but I think you're correct I should update some instances to say message body instead. I don't explicitly set response's trailer since we don't explicitly set the other fields of a response either (except body...). |
WRT |
Additional trailer support we can add in the future if this minimal viable solution (that is nonetheless complicated) is a success: * Support for synthetic responses. * Support for (synthetic) requests. * Trailer headers that have semantics. * Support for trailer headers in CORS (only for responses I think). Fixes #34 and fixes #343.
Everyone subscribed to this issue might be interested in #473. I wrote a test for |
The trailer semantics in RFC 7230 (present since RFC 2616) provide the ability to send and receive metadata outside of, and after, the body.
HTTP2 also supports trailers, though it does not use the chunked transfer encoding.
gRPC (https://github.com/grpc/grpc) utilizes it and other applications may also benefit from it.
For XHR2, there was some discussion about trailer support in 2010: https://lists.w3.org/Archives/Public/public-webapps/2010OctDec/thread.html#msg163. @mnot described it well there, together with observations such as the increasing use of trailers.
Anne declined it at that time: https://lists.w3.org/Archives/Public/public-webapps/2010OctDec/0233.html. But maybe it's good to revisit this need in the context of the Fetch API standardization. Introducing interfaces to access the trailer may affect the streams integration design, so if we are going to support it, we should start thinking about what it would look like as soon as possible.