-
Notifications
You must be signed in to change notification settings - Fork 843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch Jaeger remote sampler to use grpc lite #4043
Switch Jaeger remote sampler to use grpc lite #4043
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this, looks great! Hope you could learn something about gRPC and enjoy it while writing the PR. If not, then good job anyways :)
...ler/src/main/java/io/opentelemetry/sdk/extension/trace/jaeger/sampler/OkHttpGrpcService.java
Outdated
Show resolved
Hide resolved
@@ -3,6 +3,8 @@ plugins { | |||
id("otel.publish-conventions") | |||
|
|||
id("otel.animalsniffer-conventions") | |||
|
|||
id("com.squareup.wire") | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need as many as the OTLP tests but can we add one test suite with the grpc dependencies used?
It's ok to just copy the test code instead of abstract
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added support only for "grpc light" approach via okhttp.
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
7aa6250
to
4442800
Compare
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #4043 +/- ##
============================================
+ Coverage 90.12% 90.13% +0.01%
- Complexity 4374 4486 +112
============================================
Files 518 530 +12
Lines 13303 13788 +485
Branches 1276 1321 +45
============================================
+ Hits 11989 12428 +439
- Misses 909 926 +17
- Partials 405 434 +29
Continue to review full report at Codecov.
|
@anuraaga the coverage is low, but it seems like the "grpc lite" classes in |
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pavolloffay - I have added back code from my commit. I had to revert the change of the method signature to void
. There is some more cleanup that can be done but that can happen later
Do not merge yet, the tests are flaky. |
I am going to look at fixing them. regarding the return type of the service, it would be appropriate propagate the result of the call to the caller. |
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
@anuraaga I could not reproduce the failing test (
Now it seems that it is passing on CI. |
Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
The second to last CI job failed on the flaky test. @anuraaga any ideas why the logs are not being captured?
|
…fay/openconsensus into jaeger-remote-sampler-grpc-lite
@pavolloffay I added another commit 0bf83a1 to workaround issue with logunit. We've generally been able to avoid problems by using |
Hmm had confidence in my workaround but somehow it's not working, need to look more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of additional comments. Looking pretty good though 👍
...s/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/ExporterBuilderUtil.java
Show resolved
Hide resolved
requireNonNull(retryPolicy, "retryPolicy"); | ||
this.retryPolicy = retryPolicy; | ||
return this; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RetryPolicy
implementation is experimental and is part of the OTLP spec. I think you should remove the methods which are not exposed in JaegerRemoteSamplerBuilder
from io.opentelemetry.sdk.extension.trace.jaeger.sampler.GrpcServiceBuilder
:
addRetryPolicy
addHeader
setCompression
setTimeout
Also consider removing setChannel(ManagedChannel)
if you don't anticipate people using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in a soon followup we can add all except addRetryPolicy
to the builder. I don't think it hurts too much keeping them.
setChannel
we hope is not used but since it was public let's stick with having it for now and deprecating all of our setChannel
methods in one PR
.../main/java/io/opentelemetry/sdk/extension/trace/jaeger/sampler/SamplingStrategyResponse.java
Outdated
Show resolved
Hide resolved
...e-sampler/src/main/java/io/opentelemetry/sdk/extension/trace/jaeger/sampler/UnMarshaler.java
Outdated
Show resolved
Hide resolved
@pavolloffay FYI I added some commits, please don't lose them :) For the record, I will be on the hook if the build is flaky after merging this. I am hopeful for d5a38b5 my rough hypothesis is that due to HTTP/2 details, it depends on the state of the connection if there is an empty body sent and the trailer includes the status, or the headers include the status and there's no body and body read fails, which previously skipped our status handling logic. The reason I had this idea is because I saw the "could not consume server response." being logged even though none of our test cases are meant to bail on that line. |
Thanks @pavolloffay for helping with this tedious PR! |
it was fun to look at the proto and grcp! Thanks for guidance as well :P |
Resolves #4014
Notable changes:
CodedInpotStream
to support additional data types