Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop support for jaeger-thrift #2859

Closed
breedx-splk opened this issue Jun 30, 2022 · 31 comments · Fixed by #2858
Closed

Drop support for jaeger-thrift #2859

breedx-splk opened this issue Jun 30, 2022 · 31 comments · Fixed by #2858
Assignees

Comments

@breedx-splk
Copy link
Contributor

I'd like to propose that we drop support for the jaeger-thrift exporter. The upstream repo has been marked deprecated and they're actively encouraging folks to use OpenTelemetry instead. They're also refusing security patches.

Furthermore, the docs indicate

Since v1.35, the Jaeger backend can receive trace data from the OpenTelemetry 
SDKs in their native OpenTelemetry Protocol (OTLP)].

Overall, I think this just makes sense and reduces an overwhelming number of choices for end users.
Also curious what @pavolloffay thinks.

@breedx-splk
Copy link
Contributor Author

@jkwatson
Copy link
Contributor

We're currently using this at Verta, so I would prefer that we don't drop support for it quite yet.

@breedx-splk
Copy link
Contributor Author

Yeah I'm proposing that we keep it in the splunk distro until we deprecate the Smart Agent, which won't happen until December, but I figured upstream could get ahead of that (or that we'd need this very discussion 😁 )

@jkwatson
Copy link
Contributor

jkwatson commented Jul 1, 2022

Yeah I'm proposing that we keep it in the splunk distro until we deprecate the Smart Agent, which won't happen until December, but I figured upstream could get ahead of that (or that we'd need this very discussion 😁 )

👍🏽 It won't be hard for us to switch over to using grpc, rather than thrift for our jaeger exporter; I just need some lead time to make sure it gets on the schedule.

@pavolloffay
Copy link
Member

No objections from the Jaeger community.

We recommend people to use Jaeger gRPC and/or switch to OTLP.

@anuraaga
Copy link
Contributor

anuraaga commented Jul 4, 2022

@pavolloffay I have heard from some users that while gRPC is the standard for Jaeger collector, UDP+Thrift is still idiomatic for Jaeger agent so they try to use it. I guess the jaeger-agent at least doesn't document supporting gRPC

https://www.jaegertracing.io/docs/1.35/deployment/#agent

Is it safe to say that we don't support the Jaeger agent with OTel SDKs and they must switch to using the OTel collector as an agent?

(Note for context, our README mentions "Jaeger over Thrift Over HTTP", but while it may not have been intentional, we support UDP by accepting ThriftSender)

@pavolloffay
Copy link
Member

I guess the jaeger-agent at least doesn't document supporting gRPC

Correct, Jaeger agent does not support gRPC nor thrift over HTTP (the thrift exporter in this repo https://github.com/open-telemetry/opentelemetry-java/tree/main/exporters/jaeger-thrift)

I didn't know a sender can be configured https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/jaeger-thrift/src/main/java/io/opentelemetry/exporter/jaeger/thrift/JaegerThriftSpanExporterBuilder.java#L26
I don't know if people are using this API.

@breedx-splk what is the motivation to remove jaeger-thrift? Is there any technical reason or just drop it because Jaeger clients are deprecated?

@anuraaga
Copy link
Contributor

anuraaga commented Jul 4, 2022

It may not have been SDK usage, I think I was remembering this ask for support in the javaagent

open-telemetry/opentelemetry-java-instrumentation#5821

There does seem to be one OSS usage

https://github.com/xkazxx/jaeger_demo/blob/e4e04be299e923beeb0434a5989775fa264ec4b7/src/main/java/com/xkazxx/jaegerdemo/config/JaegerDemoWebConfigure.java#L27

In general, even though the use of the deprecated dependency isn't ideal, it looks like the final release was made 3 days ago, so it's not an ancient library yet. Without a known security issue, I'd still err towards keeping what we have for now without a more proactive push for removal. A proactive push for removal would probably mean

  1. Removing Jaeger Thrift from spec compliance table https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md#exporters
  2. Documenting that OpenTelemetry doesn't officially support Jaeger thrift in the jaeger exporter spec, or at least that it is fine for SDKs to ignore it. https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk_exporters/jaeger.md
  3. (Maybe) Jaeger deprecating jaeger-agent officially, or otherwise adding text to their document suggesting that OpenTelemetry users should be using the OTel collector, not the jaeger-agent https://www.jaegertracing.io/docs/1.35/deployment/#agent

I'd be supportive of that happening if someone wanted to drive it through ;)

@pavolloffay
Copy link
Member

(Maybe) Jaeger deprecating jaeger-agent officially

There is no plan to deprecate jaeger-agent. At least I haven't heard anything about it in the Jaeger community.

The jaeger-java client was released recently to fix some CVEs.

@breedx-splk
Copy link
Contributor Author

breedx-splk commented Jul 7, 2022

Correct, Jaeger agent does not support gRPC nor thrift over HTTP (the thrift exporter in this repo https://github.com/open-telemetry/opentelemetry-java/tree/main/exporters/jaeger-thrift)

Right, and that repo depends on jaeger-client which is now formally deprecated and the repo has been archived.

@breedx-splk what is the motivation to remove jaeger-thrift? Is there any technical reason or just drop it because Jaeger clients are deprecated?

Yeah, the main thing is that it's a liability. If something comes up with jaeger-client (like a bug or a security fix or whatever) we (opentelemetry) will have no recourse other than forking the archived repo. It also creates a confusing story for the user -- users who land on the jaeger page that says to use OTLP might get confused by the fact that there's a jaeger exporter and configure that into their app. If the exporter doesn't exist any more, it reduces the chances of that happening and improves the OpenTelemetry as the standard.

The jaeger-java client was released recently to fix some CVEs.

I know, one of those was mine...and it was literally the day before the repo was archived.

@anuraaga
Copy link
Contributor

anuraaga commented Jul 8, 2022

it reduces the chances of that happening and improves the OpenTelemetry as the standard.

I agree with this :) So listed out steps that we should probably take to remove it from OpenTelemetry completely. I think we just need a volunteer to drive that (hint hint :P)

@yurishkuro
Copy link
Member

Any objections if I transfer this issue into OTEL Spec repo?

@jack-berg
Copy link
Member

Nope!

@yurishkuro yurishkuro transferred this issue from open-telemetry/opentelemetry-java Oct 10, 2022
@yurishkuro
Copy link
Member

I'm not opposed to this overall. But to add to what @anuraaga said (#2859 (comment)):

  • Should there be some public announcement that this deprecation is happening?
  • I think there needs to be a waiting period after the deprecation announcement before the functionality can be removed from the SDKs, to give organizations time to migrate (changing an exporter in 1000s of services is not a one-day affair)
  • it is not clear to me that simply dropping the UDP exporter is the right move. UDP export mode has its own problems, but also has some benefits (e.g. the SDK does not spam your logs in dev simply because no receiver is running). Overall I think UDP mode has more problems than benefits, but it would be good to do some survey of the end users on how they feel about not having the UDP-to-local-agent export path.

@breedx-splk
Copy link
Contributor Author

@yurishkuro thanks for chiming in on this and being open about it. 👍🏻

  • Should there be some public announcement that this deprecation is happening?

Probably? Did you have something specific in mind and/or can we ask the comms SIG for help?

  • I think there needs to be a waiting period after the deprecation announcement before the functionality can be removed from the SDKs, to give organizations time to migrate (changing an exporter in 1000s of services is not a one-day affair)

Yeah, totally agree. I guess another thing to keep in mind is that users would only be facing this exporter change if/when they were upgrading an sdk or agent, which is maybe also not a 1 day affair. Point being, they're already in there making changes.

I guess the next question is then how long? 6 months? 12 months? Something else?

  • it is not clear to me that simply dropping the UDP exporter is the right move. UDP export mode has its own problems, but also has some benefits (e.g. the SDK does not spam your logs in dev simply because no receiver is running). Overall I think UDP mode has more problems than benefits, but it would be good to do some survey of the end users on how they feel about not having the UDP-to-local-agent export path.

Interesting...I hadn't anticipated log spamming to be a consideration here! 🙃 I wonder what a good mechanism to surveying users could be. It looks like the slack channel #jaeger has a lot of users. As with any sizable install base, there are likely to be folks who are reluctant to change for various reasons and we'll have to balance that.

@yurishkuro
Copy link
Member

I would go with a tweet from OTEL account pointing to this issue and asking for votes / comments. Then we can decide about the waiting period. Changing from sending data to a local host agent to exporting to a remote gRPC collector could be a significant architectural change that requires establishing some discovery/lb mechanism that the orgs may not have needed before.

@breedx-splk
Copy link
Contributor Author

breedx-splk commented Oct 20, 2022

@yurishkuro I have met with the comms sig and now I am working on a blog post and will engage in #jaeger channel in CNCF slack as well.

One minor clarification: This issue was initially scoped to only the thrift based jaeger exporters, but #2858 is really about deprecating all flavors. Since that has a larger scope, I will point users there for comments/feedback.

@anuraaga
Copy link
Contributor

significant architectural change that requires establishing some discovery/lb mechanism that the orgs may not have needed before.

Happen to still be seeing this issue so wanted to add one small point - the idea was to replace a jaeger agent with a sidecar OTel collector I think, not a remote one. Switching to remote would be a huge architectural change and probably blocked in a lot of cases, but I think we were hoping that swapping the local host agent binary would not have anywhere near as much friction. Just in case this helps with the comms @breedx-splk

@breedx-splk
Copy link
Contributor Author

significant architectural change that requires establishing some discovery/lb mechanism that the orgs may not have needed before.

Happen to still be seeing this issue so wanted to add one small point - the idea was to replace a jaeger agent with a sidecar OTel collector I think, not a remote one. Switching to remote would be a huge architectural change and probably blocked in a lot of cases, but I think we were hoping that swapping the local host agent binary would not have anywhere near as much friction. Just in case this helps with the comms @breedx-splk

Thanks @anuraaga, I appreciate that. I don't have a strong sense of how widespread the implementation of the two architectural models are (jaeger agent vs. "direct" ingest). If users currently leverage a Jaeger agent, I suppose they will probably need to switch to a collector....that is, unless the agent also supports OTLP (it sounds like it doesn't?)

@yurishkuro
Copy link
Member

This issue was initially scoped to only the thrift based jaeger exporters, but #2858 is really about deprecating all flavors.

@breedx-splk I still find it acceptable to deprecate all flavors, we just need to give people more time to migrate away.

@breedx-splk
Copy link
Contributor Author

@yurishkuro On the otel blog post PR, @austinlparker brought up a good question about cross-posting with the Jaeger blog. Any chance you can assist with that? 🙏🏻

@yurishkuro
Copy link
Member

@breedx-splk any responses to the survey?

@pavolloffay
Copy link
Member

What exactly is this issue deprecating? Is the intent to deprecate all APIs using jaeger.thrift?

There are a couple of APIs using jaeger.thrift:

  • collector 14268/http
  • agent 6832/udp
  • agent 6832/udp

If that is the case I would maybe prefer to move these APIs to a separate receiver.

@yurishkuro
Copy link
Member

I believe the intent is to only deprecate support in the SDK, not in the backend.

@breedx-splk
Copy link
Contributor Author

What exactly is this issue deprecating? Is the intent to deprecate all APIs using jaeger.thrift?
@pavolloffay As @yurishkuro already covered, the intent is to deprecate it in the SDKs. I expect we'd want to leave it in the collector for a while to ease migration but also to support some specific jager-agent use cases (eg. aggregation, sampling strategy).

@breedx-splk any responses to the survey?

(Sorry, was out last week).

So here are the raw results. We had 15 responses between Nov. 2nd and Nov. 21st.

I'll paste screenshots as of today for each question:

image

image

On this scale, 1 is `Quick and easy` and 10 is `Slow and labor-intensive`

image

image

So then the challenge is to turn these results into a deprecation period. About 1/4 of respondents checked "Longer than 12 months", and all of those rated the difficulty as an 8 out of 10.

My brain says 3 months (to mitigate impending issues with the deprecated dependencies) but my gut says 6 months. If we take "zero" to mean 0.5 months and we take 12+ months to mean 12 months, the average across all responses is 5.6 months. (I added this to the sheet).

Looking forward to what other folks think and how they interpret the results.

@yurishkuro
Copy link
Member

What is the downside of keeping support through 2023 and perhaps move to contrib in 2024? Are the implementations of Jaeger exporters cloned into OTEL repos or use original Jaeger repos as dependencies?

@breedx-splk
Copy link
Contributor Author

What is the downside of keeping support through 2023 and perhaps move to contrib in 2024? Are the implementations of Jaeger exporters cloned into OTEL repos or use original Jaeger repos as dependencies?

I can't speak for every language, but at least Java uses the original jaeger client repos as a dependency. I think that's true of at least a couple other languages too. The downside then is that when a vuln is discovered, the instrumentation will have no simple remediation and will need to reimplement the client functionality.

This is a major motivation for the deprecation in the first place.

@breedx-splk
Copy link
Contributor Author

Would love to get consensus on a number of months so that we can take #2858 out of draft. What do folks think?

I'm still leaning toward 6 months deprecation, with spec removal in July 2023.

Seems like a reasonable amount of time...and even after that , teams can choose not to upgrade.

@breedx-splk
Copy link
Contributor Author

Ok, #2858 is ready for review.

brettmc added a commit to open-telemetry/opentelemetry-php that referenced this issue Dec 9, 2022
Jaeger has dropped support for the jaeger exporter in favour of opentelemetry exporter. The upstream repos are now archived.
Support has been dropped from the opentelemetry collector, and other SIGs should follow soon.
ref: open-telemetry/opentelemetry-specification#2858 and open-telemetry/opentelemetry-specification#2859
otel-php-bot pushed a commit to opentelemetry-php/sdk that referenced this issue Dec 9, 2022
Jaeger has dropped support for the jaeger exporter in favour of opentelemetry exporter. The upstream repos are now archived.
Support has been dropped from the opentelemetry collector, and other SIGs should follow soon.
ref: open-telemetry/opentelemetry-specification#2858 and open-telemetry/opentelemetry-specification#2859
otel-php-bot pushed a commit to opentelemetry-php/sdk-contrib that referenced this issue Dec 9, 2022
Jaeger has dropped support for the jaeger exporter in favour of opentelemetry exporter. The upstream repos are now archived.
Support has been dropped from the opentelemetry collector, and other SIGs should follow soon.
ref: open-telemetry/opentelemetry-specification#2858 and open-telemetry/opentelemetry-specification#2859
@arminru arminru linked a pull request Dec 13, 2022 that will close this issue
@EricBuist
Copy link

This is a bad idea as Jaeger all in one doesn't accept GRPC at all. I'm searching since more than half an hour, cannot find any accurate information about the GRPC port. Seems for GRPC, we have to deploy the full Jaeger stack as separate services (collector, reporter, and so on), but there is no simple examples of this. Everything has to be read through large documents and guessed through. Yes, removing support for Thrift in favor of GRPC is a good idea, but Jaeger should do the same before.

@yurishkuro
Copy link
Member

@EricBuist yes, all-in-one accepts OTLP, and the docs show how to start it to do that:

https://www.jaegertracing.io/docs/1.42/getting-started/#all-in-one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants