Add OpenTelemetry protocol design goals and requirements #193

Merged

Conversation

tigrannajaryan
Member

Moving the design goals and requirements docs from opentelemetry-proto
to this repository.

The design goals and requirements will help us design the right wire protocol.

The design goals part has already been approved and merged in
opentelemetry-proto; the requirements part was under review here:
open-telemetry/opentelemetry-proto#22

Member

@songy23 songy23 left a comment


#### Compression

The protocol must achieve high compression ratios for telemetry data. The protocol design must consider batching of telemetry data and grouping of similar data (both can help to achieve better compression using common compression algorithms).
Member

Without sacrificing the goal of being "Load Balancer Friendly".

Member Author

Yes, we have a number of requirements that may actually be at odds, and there may be tradeoffs involved. I think this is implicitly understood (and not just for Compression vs. Load Balancing).
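
To illustrate why the Compression requirement mentions batching and grouping, here is a minimal sketch using Python's standard `zlib` module. The record shape, the field names, and the helper functions are all hypothetical and exist only for this illustration; they are not part of any OpenTelemetry format.

```python
import json
import random
import zlib


def make_records(n, seed=0):
    """Build n synthetic span-like records (hypothetical shape, illustration only)."""
    rng = random.Random(seed)
    services = ["checkout", "cart", "search", "auth"]
    return [
        {
            "service": rng.choice(services),
            "name": "GET /api/items",
            "duration_us": rng.randint(100, 5000),
        }
        for _ in range(n)
    ]


def compressed_size_individually(records):
    """Total bytes when every record is serialized and compressed on its own."""
    return sum(len(zlib.compress(json.dumps(r).encode())) for r in records)


def compressed_size_batched(records):
    """Bytes when all records are serialized as one batch and compressed once."""
    return len(zlib.compress(json.dumps(records).encode()))


def compressed_size_grouped(records):
    """Bytes when records are grouped by service, so the shared field appears once per group."""
    groups = {}
    for r in records:
        rest = {k: v for k, v in r.items() if k != "service"}
        groups.setdefault(r["service"], []).append(rest)
    return len(zlib.compress(json.dumps(groups).encode()))


if __name__ == "__main__":
    recs = make_records(1000)
    print("individual:", compressed_size_individually(recs))
    print("batched:   ", compressed_size_batched(recs))
    print("grouped:   ", compressed_size_grouped(recs))
```

Compressing one large batch lets the compressor reuse repeated keys and values across records, which per-record compression cannot do. Grouping usually helps further, but the gain depends on the data, and larger batches are exactly what may conflict with the "Load Balancer Friendly" goal raised above.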


#### Compression

The protocol must achieve high compression ratios for telemetry data. The protocol design must consider batching of telemetry data and grouping of similar data (both can help to achieve better compression using common compression algorithms).
Member

For batching, I wonder if the protocol should support streaming - the receiver should be able to start processing the 1st item upon arrival without having to wait for the entire batch to finish transmission.

Member Author

That would be somewhat difficult to achieve. I think this requirement belongs to special exporters and special protocols that aim to minimize overall latency. Streaming SDK implementations were discussed a few times and I think they are somewhat niche, so I'd leave that out of the general requirements.
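
For readers unfamiliar with the streaming idea raised in this thread, the following is a minimal sketch of length-prefixed framing, one common way to let a receiver handle the first item before the whole batch has arrived. It is not part of the proposed requirements (the author explicitly leaves streaming out), and the frame layout and function names are assumptions made for this example.

```python
import io
import struct


def write_frames(stream, items):
    """Write each item as a 4-byte big-endian length followed by the payload."""
    for item in items:
        stream.write(struct.pack(">I", len(item)))
        stream.write(item)


def read_frames(stream):
    """Yield each item as soon as its bytes have been read, one frame at a time."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream (or a truncated header)
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated frame")
        yield payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_frames(buf, [b"span-1", b"span-2"])
    buf.seek(0)
    for item in read_frames(buf):
        print(item)  # each item can be processed before the next one is read
```

Per-item framing like this sits in tension with compressing a whole batch as a single unit, which is one reason the difficulty mentioned in the reply is plausible.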

Contributor

@jmacd jmacd left a comment

I agree with these requirements, and they're well stated. Thanks!

@tedsuo
Contributor

tedsuo commented Jul 30, 2019

Do we expect the exchange protocol to be the same across all transports? HTTP/1.1, gRPC, and UDP all have different qualities, and can support different features, such as backpressure. Alternatively, are these protocol design goals part of how we pick which transports we want to support?

In our data formats discussions, we initially proposed to have "exchange protocol" and "transport" be split into separate layers, but I'm starting to wonder if that is realistic. Should we instead be defining an exchange protocol and a transport together as a single choice, and only add support for a new transport when we decide we want to offer a different protocol with different trade offs?

@tigrannajaryan
Member Author

@jmacd

Do we expect the exchange protocol to be the same across all transports? HTTP/1.1, gRPC, and UDP all have different qualities, and can support different features, such as backpressure. Alternatively, are these protocol design goals part of how we pick which transports we want to support?

Yes, I am aiming to choose the one that shows the best overall performance based on prototyping and measurements.

I do not see the need to support multiple transports.

In our data formats discussions, we initially proposed to have "exchange protocol" and "transport" be split into separate layers, but I'm starting to wonder if that is realistic. Should we instead be defining an exchange protocol and a transport together as a single choice, and only add support for a new transport when we decide we want to offer a different protocol with different trade offs?

The separation is to aid understanding, not because we think they must be exposed as separate components to the user. You are right that we should define it as a single choice.

There are plenty of alternative telemetry protocols and transports if users want something different (e.g. OpenCensus, Jaeger, etc). These are well supported by their vendors and also by OpenTelemetry Service. It is not difficult for vendors to implement OpenTelemetry library exporters for their protocol, and users will not have to use the OpenTelemetry protocol if they don't want to.

@yurishkuro
Member

My 2c:

  • exchange format and transport protocol should be separate. The goals should be split into data representation concerns (CPU and mem pressure for marshalling, enrichments, incremental reporting, etc) and delivery (retries, back pressure, acks).
  • the implementation should allow those two parts to be treated independently. If someone wants to choose UDP or Kafka for delivery, they should still be able to use the encoding format
  • I don't know if we want to state that explicitly, but inventing a new encoding mechanism (e.g. "like protobuf but better") would be a huge turn-off for me. I would not want to use a library that depends on some bespoke, custom serialization that needs to be maintained across all languages and compete for resources with existing projects in this space
  • same argument against custom transport (don't want another tchannel situation)
  • schema backwards compatibility isn't mentioned anywhere, but will be part of the encoding mechanism (existing formats already made those decisions for us, eg protobuf vs avro have different approaches to compatibility, because of different design goals)

@tigrannajaryan
Member Author

@yurishkuro

exchange format and transport protocol should be separate. The goals should be split into data representation concerns (CPU and mem pressure for marshalling, enrichments, incremental reporting, etc) and delivery (retries, back pressure, acks).
the implementation should allow those two parts to be treated independently. If someone wants to choose UDP or Kafka for delivery, they should still be able to use the encoding format

I agree that the implementation should clearly delineate these two.

I am not sure whether you are suggesting that end users should be able to swap in a different transport, or that OpenTelemetry maintainers should be able to change the transport in the future or provide multiple transports for end users to choose from. Can you clarify this?

@tigrannajaryan
Member Author

@yurishkuro

I don't know if we want to state that explicitly, but inventing a new encoding mechanism (e.g. "like protobuf but better") would be a huge turn-off for me. I would not want to use a library that depends on some bespoke, custom serialization that needs to be maintained across all languages and compete for resources with existing projects in this space
same argument against custom transport (don't want another tchannel situation)

I agree. Definitely not something I would want to do either. It has to be a well-known, stable, battle-tested encoding mechanism with ubiquitous availability of implementations in a wide selection of languages that are supported by OpenTelemetry.

@tigrannajaryan
Member Author

@yurishkuro

schema backwards compatibility isn't mentioned anywhere, but will be part of the encoding mechanism (existing formats already made those decisions for us, eg protobuf vs avro have different approaches to compatibility, because of different design goals)

This is what I have in the doc, do you think it is not enough?

#### Backwards Compatibility

The protocol should be possible to evolve over time. It should be possible for nodes that implement different versions of OpenTelemetry protocol to interoperate (while possibly regressing to the lowest common denominator from functional perspective).
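
One way to picture "regressing to the lowest common denominator" is a simple capability negotiation between two nodes. The sketch below is purely illustrative: the `NodeInfo` type, the integer version numbers, and the feature names are hypothetical, and it assumes that a node supporting version N also understands every earlier version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical description of what a node supports."""
    protocol_version: int
    features: frozenset


def negotiate(sender, receiver):
    """Pick the highest version both sides understand and the features both support.

    Assumes each node also understands all versions below its own.
    """
    version = min(sender.protocol_version, receiver.protocol_version)
    features = sender.features & receiver.features
    return version, features


if __name__ == "__main__":
    old = NodeInfo(1, frozenset({"traces"}))
    new = NodeInfo(2, frozenset({"traces", "metrics"}))
    print(negotiate(new, old))
```

How versions are actually detected and which behaviors degrade is left to the protocol specification; the requirement only states that interoperation must be possible.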

@yurishkuro
Member

I am not sure whether you are suggesting that end users should be able to swap in a different transport, or that OpenTelemetry maintainers should be able to change the transport in the future or provide multiple transports for end users to choose from. Can you clarify this?

Both, I think. The main thing is for the SDK to treat encoding and transport as independently pluggable

@yurishkuro
Member

It has to be a well-known, stable, battle-tested encoding mechanism with ubiquitous availability of implementations in a wide selection of languages that are supported by OpenTelemetry.

maybe this should be called out in the requirements

@yurishkuro
Member

The protocol should be possible to evolve over time. It should be possible for nodes that implement different versions of OpenTelemetry protocol to interoperate (while possibly regressing to the lowest common denominator from functional perspective).

That's probably enough for now.

@tigrannajaryan
Member Author

Both, I think. The main thing is for the SDK to treat encoding and transport as independently pluggable

@yurishkuro The SDK will treat the protocol as pluggable. The protocol has to implement the Exporter interface.

The encoding and transport are internal implementation details of the Exporter which the SDK is not concerned with.

Whether we want to clearly expose the OpenTelemetry encoding and transport as separate components to the end user can be decided later. I think it can be a detail of a particular implementation of the OpenTelemetry protocol specification. I will aim to write the specification itself in a way that makes it clear what is the encoding and what is the transport.
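
The two positions in this exchange can be reconciled in code, as the rough sketch below suggests: the SDK depends only on an exporter interface, while the exporter keeps encoding and transport as separate internal parts. Every name here (`Exporter`, `ProtocolExporter`, `json_encode`, `InMemoryTransport`) is hypothetical and not the actual OpenTelemetry SDK API; JSON stands in for whatever encoding the protocol ends up using.

```python
import json
from typing import Callable, Sequence


class Exporter:
    """Hypothetical minimal exporter interface; the only thing the SDK sees."""

    def export(self, batch: Sequence[dict]) -> None:
        raise NotImplementedError


class ProtocolExporter(Exporter):
    """Keeps encoding and transport as separate, replaceable internal parts."""

    def __init__(
        self,
        encode: Callable[[Sequence[dict]], bytes],
        send: Callable[[bytes], None],
    ) -> None:
        self._encode = encode
        self._send = send

    def export(self, batch: Sequence[dict]) -> None:
        # Encoding produces bytes; the transport only ever sees bytes.
        self._send(self._encode(batch))


def json_encode(batch: Sequence[dict]) -> bytes:
    """Stand-in encoding, used only for illustration."""
    return json.dumps(list(batch)).encode("utf-8")


class InMemoryTransport:
    """Stand-in transport that records what would have been sent."""

    def __init__(self) -> None:
        self.sent = []

    def __call__(self, payload: bytes) -> None:
        self.sent.append(payload)


if __name__ == "__main__":
    transport = InMemoryTransport()
    exporter = ProtocolExporter(json_encode, transport)
    exporter.export([{"name": "span-1"}])
    print(transport.sent)
```

Whether the two constructor arguments are exposed to end users or fixed inside a single shipped exporter is the open question in this thread; the sketch works either way.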

@tigrannajaryan tigrannajaryan force-pushed the feature/tigran/newprotocol branch from e4a8c80 to 2a77ccf Compare July 30, 2019 16:54
@tigrannajaryan
Member Author

It has to be a well-known, stable, battle-tested encoding mechanism with ubiquitous availability of implementations in a wide selection of languages that are supported by OpenTelemetry.

maybe this should be called out in the requirements

@yurishkuro I added this as the last requirement to the doc.

@tigrannajaryan
Member Author

Reviewers, I need more comments or one more approval to merge this.

@tigrannajaryan tigrannajaryan force-pushed the feature/tigran/newprotocol branch from 2a77ccf to fd5730d Compare July 31, 2019 20:48
@yurishkuro
Member

Since this is going directly into Specification, I still find it problematic that we're conflating encoding and transport under the single "wire protocol" term. When I first saw the PR title I thought this would be about data model and encoding, not the transport concerns like reliable delivery. I think it's reasonable to converge on the data model/encoding, but the transport is very dependent on specific user requirements

@bogdandrutu
Member

@tigrannajaryan can you respond to @yurishkuro's concerns? Thanks.

@tigrannajaryan
Member Author

Since this is going directly into Specification, I still find it problematic that we're conflating encoding and transport under the single "wire protocol" term. When I first saw the PR title I thought this would be about data model and encoding, not the transport concerns like reliable delivery. I think it's reasonable to converge on the data model/encoding, but the transport is very dependent on specific user requirements

@yurishkuro sorry for the late response. I eliminated the term "wire protocol" in favour of simply "protocol"; I agree it was not adding any value.

I think it is important to define how the data will be sent over a particular transport and make that part of the specification. It does not preclude specifying other transports that carry the same telemetry data in the future as extensions to the protocol. However, I feel that not having any transport defined at all in the protocol RFC would reduce the value of such an RFC significantly, because it would be essentially non-implementable in code.

I am aiming to clearly define at least one transport in the RFC and specifically mention how it can be extended to other transports in the future.

@tigrannajaryan
Member Author

@yurishkuro if you are OK with the current wording, let's merge this PR.

@yurishkuro yurishkuro merged commit af7f161 into open-telemetry:master Aug 16, 2019
@tigrannajaryan
Member Author

Thanks.

@tigrannajaryan tigrannajaryan deleted the feature/tigran/newprotocol branch August 16, 2019 15:58
SergeyKanzhelev pushed a commit to SergeyKanzhelev/opentelemetry-specification that referenced this pull request Feb 18, 2020
TuckTuckFloof pushed a commit to TuckTuckFloof/opentelemetry-specification that referenced this pull request Oct 15, 2020
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024

7 participants