Skip to content
This repository has been archived by the owner on Jun 9, 2023. It is now read-only.

Handling mid-stream errors #1

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft

Handling mid-stream errors #1

wants to merge 3 commits into from

Conversation

tigt
Copy link

@tigt tigt commented Apr 8, 2020

No description provided.


In the past, browsers other than Internet Explorer accepted `multipart/x-mixed-replace` responses to be loaded in `target="_top"`, and that could potentially be used to swap to a a new copy of the response-in-progress but with error-indicating headers, or potentially even a replacement error page. However, modern browsers have quietly dropped support for `multipart/x-mixed-replace` in the top browsing context.

It _might_ be possible to write `<meta http-equiv="refresh" content="0;url=${errorPageLocation}">`. It’s not _supposed_ to be outside the `<head>`, but I still like it better than how Rails creates a mid-stream error redirect with an inline `<script>`. ([some context on how search engines could better understand this is a temporary redirect, not a permanent one](https://twitter.com/JohnMu/status/969486943351394304))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW this is what I've seen in several applications. Curious if this is one of those "kinda not supposed to work but does" things.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It definitely works for browsers and probably search engines, but caches still probably need the explicit indicator that the HTTP response was incomplete.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing some of the prior art has handled partially is what happens if the stream error occurs while the HTML has already opened a tag, attribute value, etc. Rails prepends "> and I’ve also seen </script>

GitHub explored this problem even more thoroughly for Dangling Markup Busting in their Post-CSP journey

Copy link
Author

@tigt tigt Apr 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spitballing here:

out.streamErrorRedirect = function (url) {
  out.shouldMarkAsIncomplete = true;
  out.write(`
    <!--'"--><!--]]></style></script></textarea></select></template>-->
    <meta http-equiv=refresh content="0;url=${url}">
    <script${out.global.cspNonce ? ' nonce='+out.global.cspNonce : ''}>location=${JSON.stringify(url)}</script>
    <a href=${url}>Follow this link if you aren’t redirected</a>
  `);
};
  • Should there be a default error page URL taken from out.global.error500Url, like out.global.cspNonce?

  • How much should Marko protect against dangling markup? Rails and other existing solutions don’t seem to care much.

  • Writing further error content should still happen with whatever is inside <@catch>, in case both the inline <script> and meta[http-equiv=refresh] fail (not likely, but certainly not impossible.) Should it go before or after the stream error redirect markup? My vote is for before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tigt I might've missed this in another part of this discussion, but is there value in writing both the meta and script tags here? In which case would one work but not the other? I feel like meta is more practically reliable and works with js disabled which is a bonus.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also add that having dangling markup is probably very rare since for the most part things are escaped. To have dangling markup it would mean that you would have gotten your application into a state where you are passing invalid markup to an unescaped interpolation which is an issue in and of itself.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know of a specific browser where it happens, but my intuition makes me worry about relying solely on the option that is technically invalid HTML — all it would take is one high-profile unescaped markup attack to make some security-conscious browser ignore meta refreshes in the <body>. (Maybe Tor Browser or Brave already do.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blast from the future: turns out there might be value in writing both meta[http-equiv="refresh"] and window.location=. WebKit changeset 280,870:

Meta HTTP refresh should not navigate if document has sandboxed automatic features browsing context flag set:

Firefox and Chrome already behave this way.

I haven’t managed to figure out what the spec means by that yet, but putting it here for later.


- Express and other Node HTTP frameworks
- https://expressjs.com/en/guide/error-handling.html#the-default-error-handler
- https://github.com/expressjs/express/issues/2700 (!!!)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. If express's next/error handler does support this properly then perhaps using the marko/express plugin should be good enough. We actually forward Marko's errors to express in that case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only uncaught errors though. The case we need to think about is when the error was caught and so we were able to render a page for the user, but we want to signal that content is missing to any proxies/bots/etc. that might otherwise cache/index a page that didn't render all its content.

If we did want to rely on this for those cases, we'd probably have to hold off on firing the event until just before Marko ends the stream, otherwise express would close the connection before we were done rendering. We'd also probably want to use an event other than error since I know there's already apps out there listening for that event and doing a script/meta redirect.

@mlrawlings mlrawlings marked this pull request as draft April 8, 2020 23:58
@tigt
Copy link
Author

tigt commented Apr 12, 2020 via email

@tigt
Copy link
Author

tigt commented Apr 15, 2020

This is how I’m leaning after thinking for a few days:

I don’t want to encourage mid-page redirects

  • Changing the URL is user-hostile:
    • It breaks the Refresh button, which is how 99% of users will successfully fix transient streaming errors
    • It makes it hard to bookmark the page and come back later to see if the error is gone
    • Real HTTP errors don’t change the URL, so redirecting violates the principle of least surprise
    • The user may still value the rest of the page — imagine they visit the homepage to use the navigation, but a stream error keeps kicking them to a navigationless error page instead
  • There isn’t a 100% vetted way to implement the redirect in a layer low enough I’m comfortable with on the framework layer:
    • HTML validation may be important to some authors, so meta[http-equiv=refresh] in the body has the drawback of being invalid. (I’m also not certain it works everywhere, such as more exotic user agents.)
    • Sure, only ~0.8% of folks intentionally turn off JavaScript, but some authors today still find value in not relying on JS — such as NearlyFreeSpeech.NET’s site panel or WordPress’s admin area.
    • The dangling markup defenses seem to hit a dangerous combination of:
      • Anticipating everything a web author could possibly do, including new elements and syntax we can’t know of
      • Potential security issues
      • Happens rarely enough that the above problems won’t receive the scrutiny they need

If the intent of existing implementations (e.g. Rails) is to have a fallback generic error handler/message for the user, we can achieve that with some API for uncaught <await>s.

Handling mid-stream errors is something Marko should have docs for

The fact that I had to do quite a bit of research for a problem that’s existed since 1997 suggests that this is information Marko authors could certainly use.

  • Describe deciding importance for a given part of the page, and what authors can do:
    • Very important parts of the page should show an error message, much like a 5XX handler
    • Somewhat important parts could fall back to an “unenhanced” version, maybe linking to a dedicated endpoint for that content (especially useful for <await> with timeout)
    • Trivial parts can have empty <@catch>es
  • Explain pitfalls with the mid-page redirect pattern
  • Note challenges of getting outer layers to correctly handle the streaming error signals Marko emits (NGiNX, etc.)
  • Explain/link to the specific, standardized HTTP ways of signaling mid-stream errors, since it’s not obvious where to find the specs for this

@tigt
Copy link
Author

tigt commented Apr 15, 2020

It also seems like Marko could know if an underlying stream is for Node’s http or http2 modules, and emit error signals appropriately:

  • http should close the TCP connection
  • http2 should send RST_STREAM
  • Anything else would do nothing other than the error event Marko already fires

It seems like out instanceof require('http').ServerResponse is the official way to do it. But since it’s very common to create generic streams that eventually get piped somewhere else, there should also be a way for authors to tell Marko it’s destined for some kind of HTTP stream eventually — probably the same per-request API that lets authors turn off this error signaling for misbehaving middle/endboxen.

@tigt tigt changed the title WIP: initial thoughts on stream error handling Handling mid-stream errors Jul 24, 2020
@tigt
Copy link
Author

tigt commented Jul 24, 2020

Show-stopping errors

It is highly recommended not to begin streaming a Marko response (or any HTTP response) for crucial page content, such as:

  • If a product URL addresses a product you have (200 OK) or something completely bogus (404 Not Found)
  • If a user-specific area has valid authentication (200 OK) or needs to log in first (403 Forbidden)
  • If a time-limited promotion has ended (401 Gone) or is still ongoing (200 OK)

In these cases, you should wait to begin the HTTP stream until you know which status code to send.

If you want to begin streaming ASAP for performance reasons, consider the following:

Trivial errors

Do nothing different. If you don’t care that the “recommended for you” section renders or not, <@catch> the error and swallow it silently.

Something in-between

This is where advice for listening to the Marko stream’s error event and sending either invalid Transfer-Encoding: chunked bytes or RST_STREAM frames would go.

Responsibilities that Marko can help developers with:

  • Showing a human-facing error message, perhaps a default one with <button formaction="" formmethod="post">reload this page</button>
  • Finishing the streamed response, but with proper signaling to indicate the HTTP response is damaged and should not be indexed/cached
  • Clear expectations of browser behavior and how to test if the used approach is propagating between middleboxen to the end-user

@taylor-hunt-kr
Copy link

Related: I found an old Apache bug ticket on correctly forwarding malformed CTE streams: https://bz.apache.org/bugzilla/show_bug.cgi?id=55475

Lots of relevant discussion, URLs, and good examples of running code in there.

@tigt
Copy link
Author

tigt commented Jan 13, 2022

I pinged the HTTP working group at httpwg/http-core#895, and that thread contains a few links and a suggestion to bring this problem area to the httpwg mailing list

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants