Support archiving traces with ES storage #818
Comments
Is there a specific motivation for wanting a separate cluster for archiving, rather than just a separate keyspace/index?
It doesn't have to be a separate cluster, but it doesn't have to NOT be one either. The configuration is such that archive storage inherits most settings from the primary storage, and you can override some of them.
I am looking into this.
@yurishkuro Are the archived traces shown in the UI only when looking up a trace directly by its ID?
If the above is true we don't want to create an index for service names, therefore we will have to make changes to the span writer and reader. I also assume that we will create only one archive index per deployment. To be able to support multiple tenants, the index prefix would just be put in front of the archive index name, e.g. `foo-jaeger-span-archive`.
Yes, it only works for direct lookups by ID. It's primarily built to support long-lived hyperlinks that people can put in tickets, postmortem docs, etc.
We should also think about retention for the archive index. We could do it per time period as proposed in #628 (day, month, year), or just allow using a different index name, e.g. `jaeger-span-archive-2`. It might also be doable with a prefix. Note that deleting data out of an index is not the expected pattern in ES; whole indices are dropped instead.
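For illustration, retention by dropping a whole index is a single call. A minimal sketch with the Python elasticsearch client, assuming a hypothetical per-month archive index name:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Retention by dropping an entire index rather than deleting documents
# out of a live one. The per-month index name is hypothetical.
es.indices.delete(index="jaeger-span-archive-2018-11")
```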
^^ cc @jaegertracing/elasticsearch
Just as a counter-point, we would not use the native Jaeger archive at all. Instead, we currently rely on our own elasticsearch-curator configuration to route indices from "hot nodes" to "warm nodes". Specifically, in our Kube+AWS deployment this means moving from Elasticsearch nodes sized as r4.4xlarge with gp2 EBS volumes to r4.xlarge nodes with st1 EBS volumes. This might be a better recommendation for the Jaeger project, even though it adds complexity to the Elasticsearch deployment and its surrounding tooling. An older blog post, though still largely relevant in Elasticsearch 6.x, further explains this architecture: "Hot-Warm" Architecture in Elasticsearch 5.x.

I guess, specifically, this feels like possibly the wrong way to fix a performance limitation/regression in the Elasticsearch storage backend. If given time bounds on the query, Elasticsearch will "automatically" optimize which indexes need to be scanned, instead of walking the entire available data set of spans.
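For context, here is a minimal sketch of the index-level half of such a hot-warm setup, using the Python elasticsearch client. It assumes the ES nodes were started with a `node.attr.box_type` attribute of `hot` or `warm`; curator automates the same allocation-filtering call on a schedule.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Move an aged-out daily index to the warm tier by updating its shard
# allocation filter; ES then migrates its shards to nodes started with
# node.attr.box_type: warm. The index name is illustrative.
es.indices.put_settings(
    index="jaeger-span-2019-01-01",
    body={"index.routing.allocation.require.box_type": "warm"},
)
```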
This is all new to me; I will have to experiment and do some reading. But it seems we could have one archive index (or rather, an alias). This alias would point to one write index (archive-3) and several read indices (archive-1, archive-2). The write index would be rolled over (based on conditions - shards, time?) and moved to the read indices. I am not sure if the rolled-over index can be automatically assigned to another alias.
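As a rough sketch of that layout, using the Python elasticsearch client (the index and alias names are purely illustrative, not anything Jaeger defines):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Bootstrap: one concrete index behind both a write alias and a read alias.
# After each rollover the write alias would move to a fresh index, while
# the read alias accumulates the rolled-over indices.
es.indices.create(index="jaeger-span-archive-000001")
es.indices.update_aliases(body={
    "actions": [
        {"add": {"index": "jaeger-span-archive-000001",
                 "alias": "jaeger-span-archive-write"}},
        {"add": {"index": "jaeger-span-archive-000001",
                 "alias": "jaeger-span-archive-read"}},
    ]
})
```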
After playing with the rollover API, here is my proposal for how we could go forward and use it for the archive index. If it works well, we could start experimenting and use it for the main indices as well. First, a brief explanation of the rollover API: you call it against an alias, and when the specified conditions (index age, number of documents) are met, ES creates a new index and switches the alias over to it, so writes through the alias always land in the newest index.
Great news is that ES >= 6.4.4 supports `is_write_index` on aliases.

My proposal is to allow using two archive indices - one for writes and one for reads. By default the read index would be the same as the write index. This satisfies simple deployments with no extra configuration, while more complex deployments would have to create the archive aliases (or a single alias if ES 6 is used) before deploying, and run rollover from a cron job. cc @jaegertracing/elasticsearch any feedback is welcome.

The last thing we have to figure out is how to call rollover. We could use curator with a cron job. The curator would call rollover, parse the response, and add the old index (if ES 5) to the read alias (I am not sure if the rollover operation in curator returns an object with index info).
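To make the cron-job step concrete, here is a minimal sketch of what such a job could do with the Python elasticsearch client, continuing the illustrative alias names from above (the conditions are assumptions, not anything Jaeger prescribes):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Roll the write alias over to a new index if any condition is met.
resp = es.indices.rollover(
    alias="jaeger-span-archive-write",
    body={"conditions": {"max_age": "30d", "max_docs": 10000000}},
)

# On ES 5.x the rolled-over index has to be added to the read alias by
# hand; the rollover response names the old index, which answers the
# "how to get the old index name" question raised below.
if resp["rolled_over"]:
    es.indices.update_aliases(body={
        "actions": [
            {"add": {"index": resp["old_index"],
                     "alias": "jaeger-span-archive-read"}},
        ]
    })
```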
NB: are you only thinking of using this for archives? It seems useful for the main storage as well, since we're currently issuing queries over multiple indices rather than a single alias; using an alias would simplify the code & configuration.
My plan is to start with the archive index and then add an option for the main storage.
That's fair, although in Cassandra there is no difference between the main/archive storage implementations, just the configuration, so you would still need to make changes to the ES storage implementation - are you thinking of a fork or a feature flag?
A good question. I think a feature flag, as there will be a lot of similarities. Only the functions that derive index names should be different; maybe we could resolve that function in the constructor. My main blocker here is how to get the old index name after rollover and add it to the read alias. I will have to play with the curator.
@pavolloffay Would archiving traces only be supported with ES >= 6.4.4?
No, it will be supported for ES > 5.x; 6.4.4 would just leverage the `is_write_index` feature.
Requirement - what kind of business use case are you trying to solve?
We use the ES backend, and would like to be able to archive traces.
Problem - what in Jaeger blocks you from solving the requirement?
Archiving is only supported by the Cassandra storage plugin.
Proposal - what do you suggest to solve the problem or improve the existing situation?
I briefly looked at the implementation of archiving. It looks like most of the logic is built on things that already existed, and that it would be fairly straightforward to add an ES implementation. Two options come to mind:
1. Use a separate ES cluster for archived traces.
2. Use a dedicated archive index in the same cluster. The esCleaner.py script would need to ignore this index.

Any open questions to address
It seems like we'd need a different way of dividing up the indexes to support this if we go with option (1) above. Currently we create an index for each day, which probably doesn't make sense for archiving. Option (2) above would solve this inherently.