Need guidance for creating message-based tracing system using Apache Pulsar #1945

Closed
devinbost opened this issue Nov 27, 2019 · 5 comments

@devinbost

Requirement - Integrating Jaeger into Apache Pulsar for message-based tracing

Apache Pulsar is like a next-generation Kafka that supports functions.
Tracing integrations have already been built for Kafka. They have not yet been built for Pulsar.
Moreover, we wish to build a message-based implementation, which is a little different from other architectures that we've seen so far.

Problem - how to create message-based Spans

For example, let's say that we receive several different messages that all share a commonID and represent different parts of a sequence of operations.
e.g.
message1 -> function1 -> message2 -> function2 -> message3 -> function3

In this case, we can't make code changes to the functions, but we can access the messages in a different way. Each message contains the same commonID, which we can use to associate the messages with one another. The question is whether we can use this commonID to link the messages into a single span.

Question details

Do we need to have access to all of the messages simultaneously to put them into the same trace? That would require us to join the messages and then manually construct the spans.
It would be ideal if we could instead create the spans as the messages arrive in a way that would include them all in the same span.

The concern

My concern is that we would apparently need access to the messages' span contexts in order to link them as parts of the same span.
In the book Mastering Distributed Tracing (which is a great book, @yurishkuro), it wasn't clear to me how the inject and extract methods work, or whether I can use them on messages that arrive in sequence (i.e. whether the Jaeger collector is somehow able to put the spans together) rather than needing to join the messages first to create a span that includes all of them.

Is this explanation of the problem sufficiently clear?

@objectiser
Contributor

@devinbost Sorry, it is not clear to me.

e.g. message1 -> function1 -> message2 -> function2 -> message3 -> function3

This seems to imply that the output of function1 is message2 which is then input into function2, etc. Is that correct?

If so, then why would you want a single span to represent all three messages, rather than having a span per function call?

@devinbost
Author

Sorry, I misspoke. We want a single trace to represent all three messages. We want a span for each function call. In the diagram below, in the Jaeger Sink, it may be more correct to represent the parent span as a trace.

[Image: diagram not preserved in this copy]

Since we're dealing with a distributed messaging system, we don't necessarily have the ability to make code changes to the functions in the system. (e.g. Imagine that you were a service provider like AWS and needed to support tracing for customers who upload Lambda functions, without requiring them to make code changes.)
It seems that the key is for us to be able to control how we construct the UUIDs that are consumed by the Jaeger collectors. Since there's no wire protocol for OpenTracing, we're wondering if there's another way that we can construct the spans.
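
To make the "control the IDs" idea concrete: Jaeger groups spans into a trace by their trace ID, so if every message with the same commonID were mapped to the same trace ID, spans could be reported one at a time as messages arrive, with no join. Below is a minimal Python sketch of that idea only. It assumes a 128-bit hex trace ID and a 64-bit hex span ID; the function names and the dict-shaped span are hypothetical and are not a Jaeger client API.

```python
import hashlib
import os


def trace_id_from_common_id(common_id: str) -> str:
    """Derive a stable 128-bit trace ID (32 hex chars) from a commonID.

    Hypothetical helper: hashing means every message carrying the same
    commonID maps to the same trace ID, without shared state.
    """
    digest = hashlib.sha256(common_id.encode("utf-8")).digest()
    return digest[:16].hex()


def new_span_id() -> str:
    """Generate a random 64-bit span ID (16 hex chars)."""
    return os.urandom(8).hex()


def span_for_message(common_id: str, operation: str) -> dict:
    """Build a span record for one message as it arrives.

    The dict layout is an assumption for illustration, not a wire format.
    """
    return {
        "trace_id": trace_id_from_common_id(common_id),
        "span_id": new_span_id(),
        "operation": operation,
    }


# Three messages with the same commonID, seen at different times.
spans = [span_for_message("order-123", f"function{i}") for i in (1, 2, 3)]
```

Note that this only places the spans in the same trace; it does not by itself record which span is the parent of which.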

Does that answer your question?

@devinbost
Author

I think this is actually related to this issue: opentracing/specification#81

@objectiser
Contributor

@devinbost Can't the tap function extract the span context, create the span and then inject the new context back into the message?
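
A rough sketch of that extract / create span / inject flow, in plain Python with no tracing library. It assumes the span context travels in the message's properties as a string in the `trace-id:span-id:parent-span-id:flags` layout used by Jaeger's `uber-trace-id` header, and that those properties survive from one message to the next, which may not hold when the functions cannot be modified. `SpanContext`, `extract`, `inject`, and `tap` are hypothetical names for illustration.

```python
import os
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

TRACE_KEY = "uber-trace-id"  # property name assumed for this sketch


@dataclass
class SpanContext:
    trace_id: str
    span_id: str
    parent_id: str = "0"  # "0" means no parent (root span)
    flags: str = "1"      # "1" means sampled


def extract(properties: Dict[str, str]) -> Optional[SpanContext]:
    """Read a span context from message properties, if one is present."""
    raw = properties.get(TRACE_KEY)
    if raw is None:
        return None
    parts = raw.split(":")
    if len(parts) != 4:
        return None  # malformed value: treat as no context
    return SpanContext(*parts)


def inject(ctx: SpanContext, properties: Dict[str, str]) -> None:
    """Write a span context into message properties."""
    properties[TRACE_KEY] = (
        f"{ctx.trace_id}:{ctx.span_id}:{ctx.parent_id}:{ctx.flags}"
    )


def tap(
    properties: Dict[str, str],
    operation: str,
    reported: List[Tuple[str, SpanContext]],
) -> Dict[str, str]:
    """Extract the incoming context, create a child span, inject it back."""
    parent = extract(properties)
    if parent is None:
        # No incoming context: start a new trace.
        ctx = SpanContext(os.urandom(16).hex(), os.urandom(8).hex())
    else:
        # Same trace, new span, parent set to the incoming span.
        ctx = SpanContext(
            parent.trace_id, os.urandom(8).hex(), parent.span_id, parent.flags
        )
    reported.append((operation, ctx))  # stand-in for reporting to a collector
    outgoing = dict(properties)
    inject(ctx, outgoing)
    return outgoing


reported: List[Tuple[str, SpanContext]] = []
message2_props = tap({}, "function1", reported)
message3_props = tap(message2_props, "function2", reported)
```

Here each span is created as its message arrives, and the parent link comes from the context carried in the previous message rather than from joining the messages.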

@devinbost
Author

We decided to do something similar. We ended up using Flink to join the messages.
