Collector exporter retries and backoff #1233
The spec says retry should live in the exporter. I think that approach has some downsides: we would have to implement more or less the same logic for every exporter, and it still might not be enough. The API already gives us the possibility of delegating this to a processor. That would let us implement retry once and then use it with any exporter, without per-exporter retry rules. Since the API allows it, I don't really see a reason why we couldn't add a new processor with different retry rules. The retry rules of a given processor might not match everyone's needs, but if you can create your own processor, that gives endless possibilities without changing the exporters. A custom processor can also be much more flexible: you have full power there to decide on all the retry rules, which is not the case with an exporter.
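The "implement retry once, reuse with any exporter" idea above could be sketched roughly as a wrapper layer. This is a minimal, hypothetical sketch in TypeScript; the `Exporter` interface below is a simplified stand-in, not the real OpenTelemetry JS `SpanExporter` API:

```typescript
// Simplified stand-in for an exporter contract (not the real OTel JS API).
type ExportResult = { success: boolean };

interface Exporter {
  export(spans: unknown[]): Promise<ExportResult>;
}

// Generic wrapper that adds retry logic around any exporter, so individual
// exporters need no retry code of their own. Retry rules live here; swapping
// in a different wrapper changes the policy without touching exporters.
class RetryingExporter implements Exporter {
  constructor(
    private inner: Exporter,
    private maxAttempts = 3,
  ) {}

  async export(spans: unknown[]): Promise<ExportResult> {
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      const result = await this.inner.export(spans);
      if (result.success) return result;
      // A real implementation would also wait between attempts.
    }
    return { success: false };
  }
}
```

The same delegation could equally be hosted in a span processor; the point is only that the retry loop is written once and composed with any exporter.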
I can tackle this issue.
As discussed in this week's SIG meeting, I promised to share some context around other SIGs' retry policies, plus some ongoing discussions at the spec level. cc @svetlanabrennan @dyladan

Which status codes are retryable?

OTLP/gRPC: The specification defines the retryable gRPC status codes here.

OTLP/HTTP: Things remain a little more ambiguous. The specification does mention 429 and 503, but an open spec issue proposes to also include 502 and 504. The experimental implementation for opentelemetry-java retries 429, 502, 503, and 504. @mwear shared Ruby's implementation. If I'm reading my Rubies right, it retries 408, 429, 502, 503, and 504, as well as a number of internal errors.

Additional retry policy parameters

An open spec issue proposes to define a reasonable default retry policy. An attempt was made to define some options and their defaults, but given varying opinions it seems likely this will be left to the individual SIGs to define. Retry policy parameters may include things like:

- maximum number of attempts
- initial backoff
- maximum backoff
- backoff multiplier
- retryable status codes

These are inspired by gRPC's notion of a retry policy, but could apply equally to the OTLP/HTTP exporter. The opentelemetry-java project has implemented these configuration parameters and supplied defaults (also inspired by gRPC's defaults), but for now they are not publicly accessible. These parameters are used by Java's gRPC and HTTP implementations.
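A gRPC-inspired retry policy of this shape can be made concrete in a short sketch. This is illustrative TypeScript only, not an actual OTel JS API; the field names mirror gRPC's `retryPolicy` configuration, and the default values below are merely examples:

```typescript
// Hypothetical, gRPC-inspired retry policy. Field names mirror gRPC's
// retryPolicy service config; the defaults are illustrative examples.
interface RetryPolicy {
  maxAttempts: number;
  initialBackoffMs: number;
  maxBackoffMs: number;
  backoffMultiplier: number;
  retryableStatusCodes: number[]; // e.g. HTTP 429, 502, 503, 504
}

const defaultPolicy: RetryPolicy = {
  maxAttempts: 5,
  initialBackoffMs: 1000,
  maxBackoffMs: 5000,
  backoffMultiplier: 1.5,
  retryableStatusCodes: [429, 502, 503, 504],
};

// Backoff before the nth retry (n starts at 1), capped at maxBackoffMs.
// Real implementations usually also multiply by a random jitter factor.
function backoffMs(policy: RetryPolicy, retry: number): number {
  const raw =
    policy.initialBackoffMs * Math.pow(policy.backoffMultiplier, retry - 1);
  return Math.min(raw, policy.maxBackoffMs);
}

function isRetryable(policy: RetryPolicy, statusCode: number): boolean {
  return policy.retryableStatusCodes.includes(statusCode);
}
```

With these example defaults, successive retries would wait 1000 ms, 1500 ms, 2250 ms, and so on, until the 5000 ms cap is reached.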
Is your feature request related to a problem? Please describe.
The collector exporter doesn't currently retry sending spans. We should consider adding retry logic with a reasonable backoff strategy.
Describe the solution you'd like
The spec mentions that exporters are responsible for implementing retry strategies and that these will differ between protocols and tracing backends. The OTLP exporter for the collector implements retries using exponential backoff. See: https://github.com/open-telemetry/opentelemetry-collector/blob/1b6e88b9995f0559b0f25b4f7c9f2e95e92965bb/exporter/otlpexporter/exporter.go#L94-L98. It would be ideal if we could do something similar for OTel JS.
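The requested behavior could look something like the following. This is a minimal sketch, assuming a hypothetical `send` callback that resolves `true` on success; none of the names or parameters come from the actual exporter, and it only mirrors the spirit of the Go collector's exponential backoff:

```typescript
// Hypothetical sketch: retry a send function with exponential backoff,
// loosely modeled on the Go collector's OTLP exporter behavior.
async function exportWithBackoff(
  send: () => Promise<boolean>, // resolves true on success
  maxRetries = 4,
  initialDelayMs = 100,
): Promise<boolean> {
  let delay = initialDelayMs;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (await send()) return true;
    if (attempt === maxRetries) break; // out of retries
    await new Promise((resolve) => setTimeout(resolve, delay));
    delay *= 2; // exponential backoff; jitter could be added here
  }
  return false; // caller decides whether to drop or report the batch
}
```

A real implementation would also need to decide which failures are retryable at all (see the status-code discussion above) and how to bound total retry time.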