Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(otlp-exporter-http): add retries #2717

Closed

Conversation

svetlanabrennan
Copy link
Contributor

Which problem is this PR solving?

Trying to implement retry logic in the export method inside the trace-otlp-http exporter. I'm opening this draft PR to get some feedback on this solution. Any feedback would be greatly appreciated!

The exporter spec is asking the retry logic to include an exponential back-off with jitter. Currently this solution doesn't include the "jitter" portion.

Changes made:

  • Updated the OTLPExporterBase class in exporter-trace-otlp-http to include retry logic
  • Moved timer logic around property exportTimeoutMillis from the BatchSpanProcessorBase class in the sdk-trace-base to OTLPExporterBase class in exporter-trace-otlp-http
  • Updated SpanExporter interface in sdk-trace-base to include optional parameters that are being used by the export method in the OTLPExporterBase class

Fixes # 1233

Short description of the changes

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Tests not written yet. I tested this retry logic locally on an instrumented application.

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added
  • Documentation has been updated

@svetlanabrennan svetlanabrennan changed the title Retry logic http feat(otlp-exporter-http): add retries Jan 14, 2022
@codecov
Copy link

codecov bot commented Jan 14, 2022

Codecov Report

Merging #2717 (dbce62e) into main (2da6754) will decrease coverage by 0.91%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #2717      +/-   ##
==========================================
- Coverage   93.43%   92.51%   -0.92%     
==========================================
  Files         159       75      -84     
  Lines        5452     2819    -2633     
  Branches     1145      541     -604     
==========================================
- Hits         5094     2608    -2486     
+ Misses        358      211     -147     
Impacted Files Coverage Δ
...s/exporter-trace-otlp-http/src/OTLPExporterBase.ts
...entelemetry-propagator-b3/src/B3MultiPropagator.ts
...ckages/opentelemetry-core/src/utils/environment.ts
packages/opentelemetry-resources/src/Resource.ts
packages/opentelemetry-core/src/baggage/utils.ts
...lemetry-resources/src/detectors/ProcessDetector.ts
...es/opentelemetry-core/src/propagation/composite.ts
...elemetry-sdk-trace-base/src/BasicTracerProvider.ts
...ackages/opentelemetry-exporter-zipkin/src/utils.ts
...ontext-async-hooks/src/AsyncHooksContextManager.ts
... and 74 more

*/
export(
spans: ReadableSpan[],
resultCallback: (result: ExportResult) => void
resultCallback: (result: ExportResult) => void,
exportTimeoutMillis?: number,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't add optional parameter in the middle of the arguments because that would be a breaking change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Thanks for your feedback.

@@ -143,10 +143,6 @@ export abstract class BatchSpanProcessorBase<T extends BufferConfig> implements
return Promise.resolve();
}
return new Promise((resolve, reject) => {
const timer = setTimeout(() => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like from the spec that both the BSP and the OTLP exporter have different timeout (30s for BSP and 10s pour otlp). I think we are meant to keep a global timeout at the BSP level and add another one in the OTLP exporter. WDYT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree. I just learned about that and solving issue #2706 first (adding and using the timeout in the OTLP exporter) is going to help me with this retry logic and I can avoid touching the BSP level timeout.

@github-actions
Copy link

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Apr 18, 2022
@dyladan dyladan removed the stale label Apr 18, 2022
@dyladan
Copy link
Member

dyladan commented Apr 18, 2022

Not stale merely waiting on other things I think?

@svetlanabrennan
Copy link
Contributor Author

Yes it's not stale. Waiting on timeout pr which will be ready for final review soon now that we're back to a single lerna monorepo.

@github-actions
Copy link

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Jun 20, 2022
@svetlanabrennan
Copy link
Contributor Author

Not stale. Will update this pr soon now that timeout pr was approved/merged.

@github-actions github-actions bot removed the stale label Jul 18, 2022
@svetlanabrennan
Copy link
Contributor Author

Closing this draft pr in favor of this one: #3207

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants