New opentelemetry source and sink #1444

Open
4 of 6 tasks
binarylogic opened this issue Dec 27, 2019 · 61 comments
Labels: domain: sinks, domain: sources, have: must, type: feature

Comments

@binarylogic
Contributor

binarylogic commented Dec 27, 2019

OpenTelemetry is a specification for collecting observability data.

Their collector and libraries are of questionable quality. We'd like Vector to support OT through their various protobufs and become the best OT collector.

We should break this down into smaller tasks, likely around their various data types (logs, metrics, and traces). I'd like to start with tracing, if possible, to introduce that type into our data model.

@binarylogic binarylogic added type: new feature needs: approval Needs review & approval before work can begin. needs: requirements Needs a a list of requirements before work can be begin labels Dec 27, 2019
@loony-bean
Contributor

Looks relevant #576

@binarylogic binarylogic changed the title New opencensus source and sink New telemetry source and sink Dec 28, 2019
@binarylogic binarylogic changed the title New telemetry source and sink New opentelemetry source and sink Dec 28, 2019
@binarylogic binarylogic added domain: sources Anything related to the Vector's sources domain: sinks Anything related to the Vector's sinks and removed domain: marketing labels Mar 25, 2020
@szibis
Contributor

szibis commented Apr 20, 2020

Since OpenTracing and OpenCensus merged into the single OpenTelemetry project, is supporting it as both an input and an output being considered?
This is important for making Vector a replacement for Datadog at the tracing layer (https://www.datadoghq.com/blog/opentelemetry-instrumentation/), and it may also help build a single layer for logs, metrics, and traces.
It may also help build an architecture that is not vendor-locked and allows switching to other providers easily.

@binarylogic
Contributor Author

Thanks @szibis. Agreed, that's the idea with the OpenTelemetry components. We also want them to enforce our tracing data model when we start to implement it.

> It may also help build an architecture that is not vendor-locked and allows switching to other providers easily.

Agreed! That's the primary idea behind Vector, although Vector also wants to acknowledge the current state of things and help users migrate towards open standards.

@binarylogic binarylogic added the have: should We should have this feature, but is not required. It is medium priority. label Apr 20, 2020
@binarylogic binarylogic added type: feature A value-adding code addition that introduce new functionality. and removed type: new feature labels Jun 16, 2020
@binarylogic binarylogic added have: must We must have this feature, it is critical to the project's success. It is high priority. and removed have: should We should have this feature, but is not required. It is medium priority. labels Aug 7, 2020
@LukeMathWalker

Hi!
We have been using Vector for a while as a log exporter (stdout -> AWS Kinesis -> ElasticSearch) while we have a separate pipeline for OpenTelemetry traces (application pushes to a Jaeger collector).
We were considering the option of moving to Honeycomb for our observability needs and I noticed Vector provides a honeycomb sink, but it does not provide any OpenTelemetry source. Would we therefore be losing information by using stdout as source and honeycomb as sink compared to pushing our OpenTelemetry data directly into the OpenTelemetry collector and using that to push the data into Honeycomb?

I'd prefer to have a single agent for all our telemetry needs, but it'd be interesting to understand better 👀

@kaarolch

kaarolch commented Aug 23, 2021

Any update? I see that tasks related to OpenTelemetry were removed from several milestones.

@spencergilbert
Contributor

Hi @KFearsoff 👋

I do have adding metrics support to the existing opentelemetry source as a work item later this quarter, but otherwise we don't have any existing PRs to add functionality. Was there a particular area of functionality you were looking for first?

@KFearsoff

> I do have adding metrics support to the existing opentelemetry source as a work item later this quarter

This thread consists of "next quarter". I don't mean any offense to you or the Vector team, but I'd like to take the initiative given the opportunity 😄

> Was there a particular area of functionality you were looking for first?

I'm particularly interested in tracing source and sink. I don't really care about metrics or logs, but I'm guessing there will be some overlap (like trace -> log transformation). I'd like to get the sources and sinks done before worrying about transforms, though.

> otherwise we don't have any existing PRs to add functionality

I don't mind opening ones 😉
Will be referring to the merged OTel parts, Datadog Agent's code and the development docs from this repo. Hopefully I'll be able to come up with something that gets the job done.

@spencergilbert
Contributor

> This thread consists of "next quarter". I don't mean any offense to you or the Vector team, but I'd like to take the initiative given the opportunity 😄

No offense taken! I've been disappointed personally with not being able to work on it, but that's just how prioritization goes sometimes 🙂

> I'm particularly interested in tracing source and sink. I don't really care about metrics or logs, but I'm guessing there will be some overlap (like trace -> log transformation). I'd like to get the sources and sinks done before worrying about transforms, though.

It looks like we have a few issues open for transforms:

I'd agree that we need more integrations for receiving and sending trace events before the demand for transforms hits a breaking point. I'm not even sure all of the existing transforms have been updated to support receiving traces.

> I don't mind opening ones 😉 Will be referring to the merged OTel parts, Datadog Agent's code and the development docs from this repo. Hopefully I'll be able to come up with something that gets the job done.

👍 sounds good! I think adding trace support to the existing opentelemetry source would make the most sense, and be an easier undertaking than adding a completely new sink. If you'd like to collaborate/discuss I'll open a specific issue for enhancing the source or we can chat in our Discord server.

I suspect much of the code for trace support should be straightforward: currently the trace data model is identical to our log data model (type-wise), so I don't expect there will be too many surprises from the existing log implementation.

@ericsampson

Godspeed @KFearsoff 🫡

@KFearsoff

> 👍 sounds good! I think adding trace support to the existing opentelemetry source would make the most sense, and be an easier undertaking than adding a completely new sink. If you'd like to collaborate/discuss I'll open a specific issue for enhancing the source or we can chat in our Discord server.
>
> I suspect much of the code for trace support should be straightforward: currently the trace data model is identical to our log data model (type-wise), so I don't expect there will be too many surprises from the existing log implementation.

I'm mostly unsure if it would be best to create a dummy source that would take in traces in the current trace data model, or if it's best to start with changing the trace data model (as described in RFC about OTLP traces). I've considered starting with the trace data model, because I think traces and logs are quite different, but I've looked over it and I think starting with that might be a little too ambitious for me 😅

Not sure if treating traces like logs is fine, though (especially because it would make it quite hard to create a sink). What do you think?

I'll join Discord a little bit later, eager to start!

@spencergilbert
Contributor

> Not sure if treating traces like logs is fine, though (especially because it would make it quite hard to create a sink). What do you think?

I think it should be fine, given that's how we're handling Datadog traces today. I'd agree that updating the data model is a much bigger task and would require a lot more consensus and coordination. I'll check what the rest of the team thinks and get back to you.

@spencergilbert
Contributor

I'm going to switch over to #17307 for further discussion specific to adding trace support here 👍

@fzyzcjy

fzyzcjy commented Jul 18, 2023

Hi, are there any updates? Thanks!

@benjaminhuo

benjaminhuo commented Jul 20, 2023

@spencergilbert I'm curious: does Vector's OpenTelemetry source only ingest logs for now, with metrics and traces not yet supported?

@spencergilbert
Contributor

> @spencergilbert I'm curious: does Vector's OpenTelemetry source only ingest logs for now, with metrics and traces not yet supported?

Hi @benjaminhuo - metrics and traces are not yet supported, but we plan to support metrics in the near future.

@srstrickland
Contributor

srstrickland commented Sep 2, 2023

Any updates or timeline here? Looking to build out an observability pipeline, and I was excited to use Vector for everything until I noticed the minimal support for OTel. An option we'll be evaluating now is building the entire observability pipeline with OTel collectors & exporters, since in theory they can handle logs, metrics, and traces (though the support for logs via OTel is still very young). My experimentation with vector for logs has been great so far, and it was a huge disappointment to find that we'll likely need another solution for traces (and maybe metrics). Here's the landscape as I see it:

For traces, there's exactly one source (datadog) and one sink (datadog).

For metrics, I am primarily interested in capturing (code-instrumented) application metrics. Ignoring the datadog libs/agent for a moment, it seems like my options for getting these into vector are:

  1. StatsD (ugh)
  2. Prometheus / scraping (not great for batch jobs)
  3. Prometheus / remote write
  4. Logs to metrics (feels like this should be used for edge cases, not the primary mechanism)

The options for publishing Vector-shipped metrics to an endpoint are also a bit limited, but I could probably get by with what's available. The transformations for metrics seem very useful. In particular, the tag cardinality transform seems critical for maintaining downstream health (especially for certain DBs). I have yet to find a similar OTel processor, so that's a big selling point for Vector metrics, but I'm not sold on the mechanisms for getting metrics into Vector. I would love to use OTel.
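
For readers unfamiliar with the idea, here is a minimal Python sketch of what a tag cardinality limit does in general. It is not Vector's implementation: the class name `TagCardinalityLimiter`, the `value_limit` parameter, and the choice to drop the offending tag (rather than the whole metric) are all assumptions made for illustration.

```python
from collections import defaultdict


class TagCardinalityLimiter:
    """Hypothetical helper: caps the number of distinct values accepted per tag key."""

    def __init__(self, value_limit):
        self.value_limit = value_limit
        # tag key -> set of distinct values accepted so far
        self._seen = defaultdict(set)

    def filter_tags(self, tags):
        """Return a copy of `tags` without values that would exceed the limit."""
        kept = {}
        for key, value in tags.items():
            seen = self._seen[key]
            if value in seen:
                kept[key] = value
            elif len(seen) < self.value_limit:
                seen.add(value)
                kept[key] = value
            # Otherwise the tag is dropped, so the downstream store never
            # sees more than `value_limit` distinct values for this key.
        return kept


# Illustrative input: a high-cardinality `user_id` tag with a limit of 2.
limiter = TagCardinalityLimiter(value_limit=2)
results = [
    limiter.filter_tags({"service": "api", "user_id": uid})
    for uid in ["u1", "u2", "u3", "u1"]
]
```

In this sketch the third event loses its `user_id` tag because two distinct values were already accepted, while the fourth keeps it because `u1` was seen before. A real limiter would likely track values probabilistically to bound memory, which this sketch does not attempt.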

Am I missing something? Has anyone faced a similar situation? My options seem to be: vector for logs (and maybe metrics), OTel for tracing (and maybe metrics)... Or OTel for everything. (I am not sure about the OTel mechanism & data model for logs, though).

Reading through the history on this thread, it's hard for me to discount the likelihood that this work will never receive internal prioritization because it's counter to datadog's business model. I get it. Datadog needs to sell datadog. Maybe the answer here is for community contribution, as some folks here have already discussed. It would be great to get an idea on where things stand, so we're not just waiting for Godot.

@cboudereau

cboudereau commented Sep 3, 2023

Hi @srstrickland,

I had nearly the same feelings about Vector vs. OTel. Here is my comparison, to help the Vector team enhance OTel support.

You will find a comparison and a working demo (end to end, from signal to Grafana dashboard, for services and agents: https://github.com/cboudereau/consul-demo/blob/main/service-discovery/up.sh). I made this demo during a Datadog evaluation, where I showed the Datadog engineer the whole solution, from parsing to agent observability for monitoring the Vector and otelcol-contrib agents:
https://github.com/cboudereau/consul-demo/tree/main/service-discovery/agents

In our case we chose both, but we would like to reduce our complexity.

hdost added a commit to hdost/vector that referenced this issue Nov 17, 2023
This will also make sure that all the necessary files exist for the rest of the signals.

Relates vectordotdev#1444
Signed-off-by: Harold Dost <[email protected]>
hdost added a commit to hdost/vector that referenced this issue Nov 22, 2023
This will also make sure that all the necessary files exist for the rest of the signals.

Relates vectordotdev#1444
Signed-off-by: Harold Dost <[email protected]>
@gaby

gaby commented Nov 23, 2023

I also ran into the same issue. All this time I thought Vector supported OTel, only to find out it supports just part of it.

github-merge-queue bot pushed a commit that referenced this issue Nov 23, 2023
This will also make sure that all the necessary files exist for the rest of the signals.

Relates #1444
Signed-off-by: Harold Dost <[email protected]>
hdost added a commit to hdost/vector that referenced this issue Mar 25, 2024
Provides a path forward for integrating `sources` and `sinks` for Tracing.

Relates vectordotdev#1444, vectordotdev#17307, vectordotdev#17308
@fpytloun
Contributor

fpytloun commented Apr 9, 2024

Any plan to increase priority for implementing full OTEL support?

AndrooTheChen pushed a commit to discord/vector that referenced this issue Sep 23, 2024
…dev#19188)

This will also make sure that all the necessary files exist for the rest of the signals.

Relates vectordotdev#1444
Signed-off-by: Harold Dost <[email protected]>
@hobbyhorse

This is the most depressing thing I have seen on github in a while.

@pront pront self-assigned this Nov 6, 2024
@pront
Member

pront commented Nov 27, 2024

Basic support for ingesting logs/metrics/traces was added here. Future work involves gRPC support and native event conversion.

Leaving this ticket open until we add support for metrics and traces to the OpenTelemetry source. Then we can close this and the community can start creating more targeted issues for bugs/enhancements.

@fpytloun
Contributor

> Basic support for ingesting logs/metrics/traces was added here. Future work involves gRPC support and native event conversion.
>
> Leaving this ticket open until we add support for metrics and traces to the OpenTelemetry source. Then we can close this and the community can start creating more targeted issues for bugs/enhancements.

Nice. That's a good start. I was thinking more about having a source than a sink, so that Vector can act as an aggregator.

@pront
Member

pront commented Nov 27, 2024

> I was thinking more about having a source than a sink, so that Vector can act as an aggregator.

This should work for logs and traces. Did you try it? Not sure how well tested it is. Feel free to open a discussion or a new issue.

Again, it would be easier if we stopped using this issue as an OpenTelemetry/Vector catch-all.

hdost added a commit to hdost/vector that referenced this issue Dec 30, 2024
Provides a path forward for integrating `sources` and `sinks` for Tracing.

Relates vectordotdev#1444, vectordotdev#17307, vectordotdev#17308