Should we remove load-balancing and failover/retry logic from document-centered reporting? #196

clelland · 2020-01-13T16:40:05Z

It's been discussed previously that no other web API handles retries of failed requests or load-balancing between different endpoints in the way that the Reporting spec currently does, and that maybe we should remove that complexity from the base spec.

It's clear (at least to me) that the primary motivation for it is Network Error Logging, where the very conditions that make reporting necessary also mean that one-shot delivery to a single endpoint is inherently unreliable. So we definitely need those constructs there. But are they useful for CSP / Policy / Deprecation / Crash reports as well?

If we remove all of that, then an endpoint definition becomes essentially just a URL. No priority, weight, failures, retry-after or pending. With even load balancing gone, there doesn't seem to be a case for multiple endpoints in a group. Presumably, all load balancing could be done at the endpoint using DNS or routing techniques.
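To make the contrast concrete, here is a minimal Python sketch of the two endpoint shapes being compared. The field names follow the list above; the class names, types, defaults, and the example URL are assumptions for illustration, not the spec's definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FullEndpoint:
    """Endpoint carrying client-side load-balancing and retry state (illustrative)."""
    url: str
    priority: int = 1                    # assumed: lower values are tried first
    weight: int = 1                      # relative share of traffic within one priority
    failures: int = 0                    # consecutive failed delivery attempts
    retry_after: Optional[float] = None  # earliest time to try again, if backing off
    pending: bool = False                # True while a delivery is in flight


@dataclass
class SimpleEndpoint:
    """Endpoint once all of that is removed: essentially just a URL."""
    url: str


# Hypothetical URL, used only to show the two shapes side by side.
full = FullEndpoint(url="https://reports.example/upload", priority=1, weight=2)
simple = SimpleEndpoint(url=full.url)
```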

If client-side load balancing is useful, then we should keep the endpoint group concept, as well as priority and weight.

If failover / retry is also still useful, then we should keep everything, either in this spec, or perhaps figure out how to move it into fetch as a general mechanism.
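For readers weighing what would be kept or dropped, the following is a rough Python sketch of client-side selection using priority and weight, with failover to the next priority tier. The `Endpoint` class, the `failed_urls` set, and the function name are hypothetical, and the "lowest priority value wins, then weighted random" rule is an assumption about how such selection typically works rather than a quote from the spec.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Endpoint:
    url: str
    priority: int  # assumed: lower value means tried first
    weight: int    # relative share of traffic within one priority tier


def choose_endpoint(
    group: List[Endpoint], failed_urls: Set[str], rng: random.Random
) -> Optional[Endpoint]:
    """Pick one usable endpoint: best priority tier first, then weighted random."""
    # Failover: skip endpoints currently marked as failed.
    candidates = [e for e in group if e.url not in failed_urls and e.weight > 0]
    if not candidates:
        return None
    best = min(e.priority for e in candidates)
    tier = [e for e in candidates if e.priority == best]
    # Load balancing: weighted random choice within the selected tier.
    return rng.choices(tier, weights=[e.weight for e in tier], k=1)[0]
```

With a single endpoint per group, all of this collapses to "use the one URL", which is the simplification under discussion.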

clelland · 2020-01-13T19:32:56Z

@dcreager, @igrigorik: does either of you have an opinion on this?

dcreager · 2020-01-27T13:30:13Z

NEL definitely needs load balancing, failover, and retries to all be implemented client-side, but I agree that all of the per-document reporting use cases can get away with a single upload URL and all of that stuff handled server-side. So I'm 👍 to removing load balancing and failover from the per-document Reporting spec.

Retries I could see going either way. If we're still going to keep batching of reports in the per-document Reporting spec, then retrying those batched uploads up to X times if they fail doesn't add that much complexity. (For instance, we'd already have to work out hard caps on how long we queue up a batch for, so we could use that same hard cap for how long to retry a batch upload for.)
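As a sketch of why this need not add much complexity, here is one way a bounded retry of a batched upload could look in Python. Everything here is hypothetical: `send` stands in for the actual upload, `max_attempts` plays the role of "X", and `deadline` is the same hard cap assumed to exist for queueing a batch.

```python
import time
from typing import Callable


def upload_with_retries(
    send: Callable[[], bool],
    max_attempts: int,
    deadline: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    delay: float = 1.0,
) -> bool:
    """Call `send` until it succeeds, attempts run out, or the deadline passes."""
    for attempt in range(max_attempts):
        if clock() >= deadline:
            break  # the batch has been held past the hard cap; give up
        if send():
            return True
        if attempt + 1 < max_attempts:
            sleep(delay)  # fixed pause between attempts; backoff is out of scope
    return False
```

The `clock` and `sleep` parameters exist only so the function can be exercised without real waiting.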

I also suggested over in #191 (comment) that if NEL is really the only spec that needs the client-side complexity, there's no need to have it be in a separate spec — let's just put that directly in NEL itself.

clelland · 2020-01-28T15:56:32Z

So, without this, it sounds like there is no more need for endpoint groups in per-document reporting to contain more than a single endpoint.

If we simplify it to that extent, the Report-To header becomes essentially just a mapping of names to URLs, providing those names to other mechanisms that want to direct reports.
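In code terms, that simplified model is little more than a dictionary lookup. The Python sketch below uses made-up endpoint names and URLs and a hypothetical `resolve_endpoint` helper; it says nothing about the header's actual syntax.

```python
from typing import Dict, Optional

# Hypothetical names and URLs, as a document might declare them.
declared_endpoints: Dict[str, str] = {
    "csp": "https://reports.example/csp",
    "default": "https://reports.example/default",
}


def resolve_endpoint(name: str, declared: Dict[str, str]) -> Optional[str]:
    """Return the URL for a named endpoint, or None if the name was never declared."""
    return declared.get(name)
```

A mechanism such as CSP would then only need to carry the name, for example `resolve_endpoint("csp", declared_endpoints)`.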

I'll update the PR accordingly.

clelland · 2020-02-05T15:56:48Z

Updated now. The basic reporting spec no longer has any concept of 'groups'; that is specified entirely in network reporting. Similarly, retries and failover have been removed.

clelland mentioned this issue Jan 13, 2020

Feedback from Mozilla #158

Closed

clelland closed this as completed Mar 3, 2020
