-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OTLP/GRPC] Ensure OTLP receiver handles consume errors correctly #8080
Conversation
dc526f3
to
1a31b8b
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #8080 +/- ##
==========================================
+ Coverage 90.74% 90.76% +0.01%
==========================================
Files 341 341
Lines 18344 18389 +45
==========================================
+ Hits 16647 16691 +44
- Misses 1360 1361 +1
Partials 337 337 ☔ View full report in Codecov by Sentry. |
759c6c5
to
1a31b8b
Compare
@bogdandrutu can you have a quick look at this, if it's fine. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Co-authored-by: Evan Bradley <[email protected]>
Co-authored-by: Evan Bradley <[email protected]>
@bogdandrutu can you a quick look if it's fine by you? |
Bogdan will be out for a few more weeks. @open-telemetry/collector-approvers could one of you please take a look? |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
0b9635f
to
0226831
Compare
@bogdandrutu can we get this merged? #8676 seems to waiting on it. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@open-telemetry/collector-approvers I will merge this in 5 working days unless someone blocks it
s, ok := status.FromError(err) | ||
if !ok { | ||
code := codes.Unavailable | ||
if consumererror.IsPermanent(err) { | ||
code = codes.InvalidArgument | ||
} | ||
s = status.New(code, err.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this error handling is the same for all signals I do wonder if it could be extracted and re-used, but that is an optimization we could handle later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #9300 for this
The otlp receiver was recently updated via #8080 to properly propagate consumer errors back to clients as either permanent or retriable. The code we're using to indicate a non-retriable error is `codes.InvalidArgument`, which is the equivalent of `400` in HTTP. While 100% correct according to the [OTLP specification](https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md#failures) to indicate a non-retriable error, I think `codes.Internal` (which is equivalent to HTTP `500`), better conveys the actual state of the collector in these situations. Related to #9357 (comment) --------- Co-authored-by: Evan Bradley <[email protected]>
…rrors (#9357) **Description:** Updates the receiver's http response to return a proper http status based on whether or not the pipeline returned a retryable error. Builds upon the work done in #8080 and #9307 **Link to tracking Issue:** Closes #9337 Closes #8132 Closes #9636 Closes #6725 **Testing:** Updated lots of unit tests
Description: Follow the receiver contract and return Unavailable for non-permanent and InvalidArgument for permanent errors for OTLP/gRPC receiver.
Leave the "Retry-After" field blank and let the client implement an exponential backoff strategy.
Link to tracking Issue: #4335
Testing: Added relevant test cases.