InfluxDB as trace storage backend #272

yurishkuro · 2017-07-15T18:42:15Z

Meta-issue on storage backends: #638

There is some work happening here openzipkin/zipkin#1628

My interest at this time is in what features such an implementation could provide, i.e.

- what would be write throughput per node with RF=2
- could the backend support indexing of arbitrary tags / log fields, or do they need to be pre-defined
- what is the write amplification or perf impact as a function of # of tags/fields per span
  - in Cassandra backend every tag is an extra write
  - in ES it's extra indexing time on the server
- will the backend support correct server-side joins and LIMIT (broken with Cassandra today)
- how search with multiple tags would be handled
  - in Cassandra it's an AND across different spans from the same service name (weird)
  - in ES it's an AND across tags from the same span only (index document is one span)
- could the backend support latency aggregates out of the box (by service/endpoint)? This one is something I'd expect InfluxDB to be able to do easily, since it's fundamentally a TS db
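The difference between the two multi-tag search behaviours described above (Cassandra vs. ES) can be made concrete with a small in-memory sketch. The span layout, field names, and function names here are hypothetical and exist only for illustration; they are not Jaeger's schema or either backend's query code.

```python
from collections import defaultdict

# Hypothetical spans: trace t1 spreads the two tags over two spans,
# trace t2 carries both tags on a single span.
SPANS = [
    {"trace_id": "t1", "service": "frontend", "tags": {"http.status_code": "500"}},
    {"trace_id": "t1", "service": "frontend", "tags": {"error": "true"}},
    {"trace_id": "t2", "service": "frontend",
     "tags": {"http.status_code": "500", "error": "true"}},
]


def match_same_span(spans, service, wanted):
    """AND across tags of one span: a single span must carry every wanted tag."""
    return sorted({
        s["trace_id"]
        for s in spans
        if s["service"] == service
        and all(s["tags"].get(k) == v for k, v in wanted.items())
    })


def match_across_spans(spans, service, wanted):
    """AND across spans: each wanted tag may come from a different span
    of the same service within the trace."""
    seen = defaultdict(set)
    for s in spans:
        if s["service"] != service:
            continue
        for k, v in wanted.items():
            if s["tags"].get(k) == v:
                seen[s["trace_id"]].add(k)
    return sorted(t for t, keys in seen.items() if keys == set(wanted))


WANTED = {"http.status_code": "500", "error": "true"}
```

With `WANTED`, the same-span variant returns only `t2`, while the across-spans variant also returns `t1`, which is the behaviour called "weird" above.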

xjerod · 2017-07-16T03:28:50Z

From personal experience, I've easily done 6 million points per minute to a single node with no issues using the recommended 5000 points per request. Batching is the key, though: as your batches get smaller, write performance drops drastically.
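For scale, 6 million points per minute is 100,000 points per second, or about 20 write requests per second at 5000 points each. A minimal sketch of the batching step, assuming the points are already encoded as line-protocol strings; the function name `batches` is hypothetical, and the HTTP write itself is left out:

```python
from itertools import islice
from typing import Iterable, Iterator


def batches(lines: Iterable[str], size: int = 5000) -> Iterator[str]:
    """Group encoded points into newline-joined request bodies of at most `size` points."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield "\n".join(chunk)
```

Each yielded string would be the body of one write request; the last batch may be smaller than `size`.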

In terms of arbitrary tags / log fields, they do not need to be predefined. However, a field cannot have mixed types: once you set fieldA=int64, fieldA always has to be an int64.

For indexing, tags are always indexed and fields are never indexed. This means that tag cardinality is a big issue, since Influx creates an in-memory index for all tags (this might be okay with their new TSI). It also means that any query against a field looking for a specific value causes a scan of the data. This is usually okay since you're generally querying by time span, but it is something to keep in mind.
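To make the tag/field split concrete, here is a simplified sketch that encodes one span as an InfluxDB line-protocol point. The measurement name, tag keys, and field keys are assumptions chosen for illustration, and the escaping covers only the common cases. Whether the trace ID should be a tag (indexed, high cardinality) or a field (not indexed, scanned) is exactly the trade-off described above; it is a field here purely as an example.

```python
def escape_measurement(text: str) -> str:
    # Measurement names escape commas and spaces.
    for ch in (",", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def escape_tag(text: str) -> str:
    # Tag keys, tag values, and field keys escape commas, equals signs, and spaces.
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value) -> str:
    # Field values are typed: booleans, integers (suffix "i"), floats, quoted strings.
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def span_to_line(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Encode one point; tags are indexed by InfluxDB, fields are not."""
    if not fields:
        raise ValueError("a point needs at least one field")
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_tag(k)}={format_field(v)}" for k, v in sorted(fields.items())
    )
    return f"{escape_measurement(measurement)}{tag_part} {field_part} {timestamp_ns}"


line = span_to_line(
    "span",
    {"service": "frontend", "operation": "GET /"},  # hypothetical tag keys
    {"duration_ns": 1500, "trace_id": "abc"},        # hypothetical field keys
    1500000000000000000,
)
```

Because `duration_ns` is first written as an integer, every later point would have to write it as an integer as well.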

Aggregations can be easily implemented with their built-in aggregation functions and a GROUP BY on service and endpoint.
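A sketch of what such an aggregation might look like as an InfluxQL query, built from Python. The measurement `span`, the field `duration_ns`, and the tags `service` and `operation` are hypothetical names, not a schema anyone in this thread has committed to.

```python
import re

# InfluxQL duration literals such as 30m, 1h, 7d.
_DURATION = re.compile(r"^\d+(ns|u|ms|s|m|h|d|w)$")


def quote_ident(name: str) -> str:
    """Double-quote an InfluxQL identifier, escaping embedded backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def latency_query(measurement="span", field="duration_ns",
                  group_by=("service", "operation"), window="1h") -> str:
    """Build a mean-latency query grouped by the given tags over a recent time window."""
    if not _DURATION.match(window):
        raise ValueError(f"not a duration literal: {window!r}")
    groups = ", ".join(quote_ident(tag) for tag in group_by)
    return (
        f"SELECT MEAN({quote_ident(field)}) FROM {quote_ident(measurement)} "
        f"WHERE time > now() - {window} GROUP BY {groups}"
    )


query = latency_query()
```

Grouping works on tags, so this assumes service and endpoint are stored as tags rather than fields.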

goller · 2017-07-27T22:34:29Z

Hi @yurishkuro, we'd like to contribute InfluxDB as a trace storage backend. Currently, we are gaining experience writing spans with Telegraf into InfluxDB running the new TSI engine.

@jrbury is absolutely correct on all points. The TSI engine is built to handle much higher cardinality. Here is how we define cardinality: https://docs.influxdata.com/influxdb/v1.3/concepts/glossary/#series-cardinality

I believe that the trace id will dominate the cardinality.

Regarding your other questions:

> will the backend support correct server-side joins and LIMIT (broken with Cassandra today)

Influx does not have server-side joins per se, but it is able to group by any number of tags. Additionally, Influx has several meta queries using the SHOW keyword that are used to get information about tag sets. Both SELECT and SHOW queries support LIMIT.

> how search with multiple tags would be handled

Multiple tags can be handled with a WHERE clause. The WHERE clause would not need to be restricted to a single service name or a single span. I believe it should "just work."
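A rough sketch of a multi-tag search expressed as an InfluxQL SELECT with a WHERE clause and LIMIT, built from Python. The measurement and tag names are hypothetical. Note that a WHERE clause filters individual points, so how a match relates to spans depends on how spans map to points; this sketch assumes one point per span.

```python
def quote_ident(name: str) -> str:
    """Double-quote an InfluxQL identifier."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_string(value: str) -> str:
    """Single-quote an InfluxQL string literal (tag values are always strings)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def tag_search_query(tags: dict, measurement: str = "span", limit: int = 20) -> str:
    """Build a SELECT that ANDs one equality condition per tag and caps the row count."""
    if not tags:
        raise ValueError("at least one tag condition is required")
    if limit < 1:
        raise ValueError("limit must be positive")
    conditions = " AND ".join(
        f"{quote_ident(k)} = {quote_string(v)}" for k, v in sorted(tags.items())
    )
    return f"SELECT * FROM {quote_ident(measurement)} WHERE {conditions} LIMIT {int(limit)}"


query = tag_search_query({"service": "frontend", "error": "true"})
```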

> could the backend support latency aggregates out of the box (by service/endpoint)? This one is something I'd expect InfluxDB to be able to do easily, since it's fundamentally a TS db

Yes, I believe this should be in our wheelhouse for sure.

So, what do you think of us trying our hand at implementing the store?

yurishkuro · 2017-07-27T22:45:52Z

As if I could try to stop you!

Seriously though, if you have the cycles and the desire to do this, then by all means. I recommend doing it in some other repo so that you don't have to go through our code reviews until you have a working proof of concept and run some integration and stress tests. Note that we have some integration tests that (in theory) should work across different storage backends - ./plugin/storage/integration/...

yurishkuro · 2017-08-22T02:39:17Z

@goller just saw this https://github.com/influxdata/jaeger. Just curious - why are you going after zipkin's nomenclature ("binary annotations" etc. ) instead of OpenTracing, given that you're already operating on Jaeger's domain model? It seems like extra work. Note that Jaeger backend can both produce and consume Zipkin model if necessary.

goller · 2017-08-22T03:58:26Z

Probably a lack of experience on my part!


goller · 2017-08-22T04:06:29Z

@yurishkuro To better understand zipkin's model, we implemented a telegraf plugin here: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/zipkin

Our goal is to support OpenTracing for sure, but we figured we would start by storing zipkin's data model in influxdb via telegraf. That way both jaeger and zipkin could read data from it.

Do you think it would be better for the collection of spans to be stored using the OpenTracing naming?

codefromthecrypt · 2017-08-22T04:14:36Z

Since Chris is new to all this, he should know the jury is still out on whether there's a data model for OpenTracing. I'd be careful about pre-emptively labeling anything as such, as it might mislead people or clash with an actual spec: opentracing/specification#64. In other words, jaeger definitely wrote their model around naming inside OpenTracing, but that doesn't imply there's any official or stable means to do that. If you model based on jaeger, you are just modeling based on jaeger.


yurishkuro · 2017-08-22T04:52:23Z

@goller the zipkin model does not support all features of OpenTracing, such as KV-logs and span references. Because of that, the transformation from the Jaeger to the Zipkin data model can be lossy. If you're implementing a Jaeger backend with InfluxDB, it seems to make more sense for that backend to use the Jaeger data model and not be lossy.

yurishkuro · 2017-08-22T05:19:12Z

@goller btw jaeger-collector can accept Zipkin spans in various formats at :9411/api/v1/spans. It converts them to the Jaeger internal data model that SpanWriter/SpanReader operate on.
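As a rough illustration of that endpoint, the sketch below builds (but does not send) an HTTP POST carrying one Zipkin v1 JSON span. The host, the span values, and the helper name `zipkin_v1_request` are assumptions; only the port and path come from the comment above.

```python
import json
import urllib.request


def zipkin_v1_request(spans, host="localhost", port=9411):
    """Build a POST request carrying a JSON list of Zipkin v1 spans."""
    body = json.dumps(spans).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/api/v1/spans",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A made-up span in Zipkin v1 JSON shape; timestamps and durations are in microseconds.
example_span = {
    "traceId": "463ac35c9f6413ad",
    "id": "463ac35c9f6413ad",
    "name": "get",
    "timestamp": 1500000000000000,
    "duration": 1500,
    "annotations": [],
    "binaryAnnotations": [
        {"key": "http.path", "value": "/",
         "endpoint": {"serviceName": "frontend"}},
    ],
}

request = zipkin_v1_request([example_span])
# To actually send it to a running collector:
#     urllib.request.urlopen(request)
```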

codefromthecrypt · 2017-08-22T05:24:11Z

I'm going to stop commenting on this issue. Suffice it to say, please do not conflate this work with Zipkin, as lossiness is a point-of-view and point-in-time thing. Yuri's perspective on things is just that. He doesn't represent zipkin.

jacobmarble · 2019-02-15T22:24:14Z

FYI active work on this issue:
https://github.com/influxdata/jaeger/tree/influxdb

This branch works today with InfluxDB 2.0 alpha, but I won't open a PR until we've used it ourselves for a while.

yurishkuro · 2019-02-22T16:11:26Z

The plugin framework issue: #422

jacobmarble · 2019-05-10T21:59:49Z

FYI we have moved our active work to a new repo, which uses the gRPC framework:
https://github.com/influxdata/jaeger-influxdb

juanpabloaj · 2019-05-10T22:06:27Z

@jacobmarble is the repo available? I got 404

jacobmarble · 2019-05-15T04:20:30Z

Should be public now.


MattBoatman · 2021-05-27T15:09:19Z

@jacobmarble Did https://github.com/influxdata/jaeger-influxdb get moved to another location? The link from the docs 404s

jpkrohling · 2021-06-03T08:44:21Z

Looks like the repo is available :-)

jacobmarble · 2021-06-04T15:25:32Z

@MattBoatman I'm not sure why you got a 404.

Related: that repository will be archived in the next few months, as its replacement stabilizes. A new InfluxDB storage engine is in development, which handles traces much better than the current engine. This new Jaeger plugin is designed around a schema which is friendly to both OpenTelemetry and the new storage engine:
https://github.com/influxdata/influxdb-observability/tree/main/jaeger-query-plugin

MattBoatman · 2021-06-04T15:30:39Z

@jacobmarble I emailed the influx team and they restored the repo ;)
Good to know, I was just following the links from the jaeger docs

jkowall · 2022-11-04T15:45:10Z

There is a newer version of this which works with IOx, the new engine: https://github.com/influxdata/influxdb-observability/tree/main/jaeger-query-plugin. The older repo is only for v1 and v2 of InfluxDB.

jacobmarble · 2023-03-31T17:19:32Z

Last repo link, I promise:
https://github.com/influxdata/influxdb-observability

More specifically:
https://github.com/influxdata/influxdb-observability/tree/main/jaeger-influxdb

yurishkuro mentioned this issue Sep 28, 2017

Tag Search with Value Regex jaegertracing/jaeger-ui#87

Open

yurishkuro mentioned this issue Jan 8, 2018

Additional storage backends #638

Open

22 tasks

jpkrohling added the enhancement, help wanted, and area/storage labels Jun 29, 2018

Puneeth-n mentioned this issue Sep 18, 2018

Custom links should be rendered irrespective of OAuth2 settings in Chronograf influxdata/chronograf#4473

Closed

jpkrohling mentioned this issue Feb 26, 2019

How to support plugins #422

Closed

8 tasks

jacobmarble mentioned this issue May 10, 2019

gRPC storage plugins - how to debug? #1529

Closed

yurishkuro removed the help wanted label Aug 26, 2022

yurishkuro closed this as completed Nov 4, 2022
