Commit dbea54d

Add support for partial success in an OTLP export response [2] (#2696)
1 parent 1ef7871 commit dbea54d

2 files changed

+104
-26
lines changed

CHANGELOG.md

+3
@@ -66,6 +66,9 @@ release.
6666

6767
### OpenTelemetry Protocol
6868

69+
- Add support for partial success in an OTLP export response
70+
([#2696](https://github.com/open-telemetry/opentelemetry-specification/pull/2696)).
71+
6972
### SDK Configuration
7073

7174
- Mark `OTEL_METRIC_EXPORT_INTERVAL`, `OTEL_METRIC_EXPORT_TIMEOUT`

specification/protocol/otlp.md

+101-26
@@ -16,14 +16,18 @@ nodes such as collectors and telemetry backends.
1616
* [OTLP/gRPC](#otlpgrpc)
1717
+ [OTLP/gRPC Concurrent Requests](#otlpgrpc-concurrent-requests)
1818
+ [OTLP/gRPC Response](#otlpgrpc-response)
19+
- [Full Success](#full-success)
20+
- [Partial Success](#partial-success)
21+
- [Failures](#failures)
1922
+ [OTLP/gRPC Throttling](#otlpgrpc-throttling)
2023
+ [OTLP/gRPC Service and Protobuf Definitions](#otlpgrpc-service-and-protobuf-definitions)
2124
+ [OTLP/gRPC Default Port](#otlpgrpc-default-port)
2225
* [OTLP/HTTP](#otlphttp)
2326
+ [OTLP/HTTP Request](#otlphttp-request)
2427
+ [OTLP/HTTP Response](#otlphttp-response)
25-
- [Success](#success)
26-
- [Failures](#failures)
28+
- [Full Success](#full-success-1)
29+
- [Partial Success](#partial-success-1)
30+
- [Failures](#failures-1)
2731
- [Bad Data](#bad-data)
2832
- [OTLP/HTTP Throttling](#otlphttp-throttling)
2933
- [All Other Responses](#all-other-responses)
@@ -35,7 +39,6 @@ nodes such as collectors and telemetry backends.
3539
- [Known Limitations](#known-limitations)
3640
* [Request Acknowledgements](#request-acknowledgements)
3741
+ [Duplicate Data](#duplicate-data)
38-
* [Partial Success](#partial-success)
3942
- [Future Versions and Interoperability](#future-versions-and-interoperability)
4043
- [Glossary](#glossary)
4144
- [References](#references)
@@ -145,16 +148,57 @@ was not delivered.
145148

146149
#### OTLP/gRPC Response
147150

148-
The server may respond with either a success or an error to the requests.
151+
The response MUST be the appropriate message (see below for
152+
the specific message to use in the [Full Success](#full-success),
153+
[Partial Success](#partial-success) and [Failure](#failures) cases).
154+
155+
##### Full Success
149156

150-
The success response indicates telemetry data is successfully processed by the
151-
server. If the server receives an empty request (a request that does not carry
157+
The success response indicates telemetry data is successfully accepted by the
158+
server.
159+
160+
If the server receives an empty request (a request that does not carry
152161
any telemetry data) the server SHOULD respond with success.
153162

154-
Success response is returned via
155-
[Export*ServiceResponse](https://github.com/open-telemetry/opentelemetry-proto)
156-
message (`ExportTraceServiceResponse` for traces, `ExportMetricsServiceResponse`
157-
for metrics, `ExportLogsServiceResponse` for logs).
163+
On success, the server response MUST be a
164+
[Export<signal>ServiceResponse](https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto/collector)
165+
message (`ExportTraceServiceResponse` for traces,
166+
`ExportMetricsServiceResponse` for metrics and
167+
`ExportLogsServiceResponse` for logs).
168+
169+
The server MUST leave the `partial_success` field unset
170+
in case of a successful response.
171+
172+
##### Partial Success
173+
174+
If the request is only partially accepted
175+
(i.e. when the server accepts only parts of the data and rejects the rest), the
176+
server response MUST be the same
177+
[Export<signal>ServiceResponse](https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto/collector)
178+
message as in the [Full Success](#full-success) case.
179+
180+
Additionally, the server MUST initialize the `partial_success` field
181+
(`ExportTracePartialSuccess` message for traces,
182+
`ExportMetricsPartialSuccess` message for metrics and
183+
`ExportLogsPartialSuccess` message for logs), and it MUST set the respective
184+
`rejected_spans`, `rejected_data_points` or `rejected_log_records` field with
185+
the number of spans/data points/log records it rejected.
186+
187+
The server SHOULD populate the `error_message` field with a human-readable
188+
error message in English. The message should explain why the
189+
server rejected parts of the data, and might offer guidance on how users
190+
can address the issues.
191+
The protocol does not attempt to define the structure of the error message.
192+
193+
Servers MAY also make use of the `partial_success` field to convey
194+
warnings/suggestions to clients even when the request was fully accepted.
195+
In such cases, the `rejected_<signal>` field MUST have a value of `0` and
196+
the `error_message` field MUST be non-empty.
197+
198+
The client MUST NOT retry the request when it receives a partial success
199+
response where the `partial_success` is populated.
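
The three success-side outcomes described above can be told apart on the client roughly as follows. This is a minimal sketch, not generated protobuf code: `PartialSuccess`, `ExportResponse`, `Outcome` and `classify` are hypothetical stand-ins, `rejected_count` stands for whichever `rejected_<signal>` field applies, and `None` models an unset `partial_success` field. Treating an initialized but entirely empty `partial_success` as a full success is an assumption of this sketch, not something the text above states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


@dataclass
class PartialSuccess:
    """Hypothetical stand-in for Export<signal>PartialSuccess."""
    rejected_count: int = 0   # rejected_spans / rejected_data_points / rejected_log_records
    error_message: str = ""


@dataclass
class ExportResponse:
    """Hypothetical stand-in for Export<signal>ServiceResponse."""
    partial_success: Optional[PartialSuccess] = None  # None models "unset"


class Outcome(Enum):
    FULL_SUCCESS = "full_success"
    WARNING = "warning"
    PARTIAL_SUCCESS = "partial_success"


def classify(response: ExportResponse) -> Outcome:
    """Classify a successful export response; no outcome here triggers a retry."""
    ps = response.partial_success
    if ps is None:
        # Field left unset: everything was accepted.
        return Outcome.FULL_SUCCESS
    if ps.rejected_count > 0:
        # Some items were rejected; the client must not resend the request.
        return Outcome.PARTIAL_SUCCESS
    # Nothing rejected: a non-empty message is a warning or suggestion.
    # Assumption: an empty message with zero rejections counts as full success.
    return Outcome.WARNING if ps.error_message else Outcome.FULL_SUCCESS
```

A client would typically log `error_message` for the `WARNING` and `PARTIAL_SUCCESS` outcomes and then drop the batch rather than queue it for retry.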
200+
201+
##### Failures
158202

159203
When an error is returned by the server it falls into 2 broad categories:
160204
retryable and not-retryable:
@@ -382,8 +426,9 @@ numbers or strings are accepted when decoding.
382426

383427
#### OTLP/HTTP Response
384428

385-
Response body MUST be the appropriate serialized Protobuf message (see below for
386-
the specific message to use in the Success and Failure cases).
429+
The response body MUST be the appropriate serialized Protobuf message (see below for
430+
the specific message to use in the [Full Success](#full-success-1),
431+
[Partial Success](#partial-success-1) and [Failure](#failures-1) cases).
387432

388433
The server MUST set "Content-Type: application/x-protobuf" header if the
389434
response body is binary-encoded Protobuf payload. The server MUST set
@@ -395,15 +440,52 @@ If the request header "Accept-Encoding: gzip" is present in the request the
395440
server MAY gzip-encode the response and set "Content-Encoding: gzip" response
396441
header.
397442

398-
##### Success
443+
##### Full Success
399444

400-
On success the server MUST respond with `HTTP 200 OK`. Response body MUST be
401-
Protobuf-encoded `ExportTraceServiceResponse` message for traces,
402-
`ExportMetricsServiceResponse` message for metrics and
403-
`ExportLogsServiceResponse` message for logs.
445+
The success response indicates telemetry data is successfully accepted by the
446+
server.
404447

405-
The server SHOULD respond with success no sooner than after successfully
406-
decoding and validating the request.
448+
If the server receives an empty request (a request that does not carry
449+
any telemetry data) the server SHOULD respond with success.
450+
451+
On success, the server MUST respond with `HTTP 200 OK`. The response body MUST be
452+
a Protobuf-encoded
453+
[Export<signal>ServiceResponse](https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto/collector)
454+
message (`ExportTraceServiceResponse` for traces,
455+
`ExportMetricsServiceResponse` for metrics and
456+
`ExportLogsServiceResponse` for logs).
457+
458+
The server MUST leave the `partial_success` field unset
459+
in case of a successful response.
460+
461+
##### Partial Success
462+
463+
If the request is only partially accepted
464+
(i.e. when the server accepts only parts of the data and rejects the rest), the
465+
server MUST respond with `HTTP 200 OK`. The response body MUST be the same
466+
[Export<signal>ServiceResponse](https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto/collector)
467+
message as in the [Full Success](#full-success-1) case.
468+
469+
Additionally, the server MUST initialize the `partial_success` field
470+
(`ExportTracePartialSuccess` message for traces,
471+
`ExportMetricsPartialSuccess` message for metrics and
472+
`ExportLogsPartialSuccess` message for logs), and it MUST set the respective
473+
`rejected_spans`, `rejected_data_points` or `rejected_log_records` field with
474+
the number of spans/data points/log records it rejected.
475+
476+
The server SHOULD populate the `error_message` field with a human-readable
477+
error message in English. The message should explain why the
478+
server rejected parts of the data, and might offer guidance on how users
479+
can address the issues.
480+
The protocol does not attempt to define the structure of the error message.
481+
482+
Servers MAY also make use of the `partial_success` field to convey
483+
warnings/suggestions to clients even when the request was fully accepted.
484+
In such cases, the `rejected_<signal>` field MUST have a value of `0` and
485+
the `error_message` field MUST be non-empty.
486+
487+
The client MUST NOT retry the request when it receives a partial success
488+
response where the `partial_success` is populated.
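
On the server side, the rules above for an OTLP/HTTP logs endpoint might be sketched as below. The types and the `build_response` helper are hypothetical illustrations, not part of any OpenTelemetry library; a real server would serialize the generated `ExportLogsServiceResponse` Protobuf message instead of these dataclasses.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PartialSuccess:
    """Hypothetical stand-in for ExportLogsPartialSuccess."""
    rejected_log_records: int = 0
    error_message: str = ""


@dataclass
class ExportLogsResponse:
    """Hypothetical stand-in for ExportLogsServiceResponse."""
    partial_success: Optional[PartialSuccess] = None  # None models "unset"


def build_response(rejected: int, reason: str = "") -> Tuple[int, ExportLogsResponse]:
    """Return (HTTP status, body) for a fully or partially accepted request."""
    if rejected < 0:
        raise ValueError("rejected must be non-negative")
    if rejected == 0 and not reason:
        # Full success: HTTP 200 with partial_success left unset.
        return 200, ExportLogsResponse()
    # Partial success (rejected > 0) or a warning (rejected == 0, reason set):
    # still HTTP 200, with partial_success initialized.
    return 200, ExportLogsResponse(PartialSuccess(rejected, reason))
```

For example, `build_response(2, "timestamp out of range")` yields status 200 with `rejected_log_records` set to 2, while `build_response(0)` yields status 200 with no `partial_success` at all.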
407489

408490
##### Failures
409491

@@ -520,13 +602,6 @@ received yet. The client will typically choose to re-send such data to guarantee
520602
delivery, which may result in duplicate data on the server side. This is a
521603
deliberate choice and is considered to be the right tradeoff for telemetry data.
522604

523-
### Partial Success
524-
525-
The protocol does not attempt to communicate partial reception success from the
526-
server to the client (i.e. when part of the data can be received by the server
527-
and part of it cannot). Attempting to do so would complicate the protocol and
528-
implementations significantly and is left out as a possible future area of work.
529-
530605
## Future Versions and Interoperability
531606

532607
OTLP will evolve and change over time. Future versions of OTLP must be designed
