InfluxDBStore: various improvements (+lower memory usage) #171
Conversation
Seen when visiting the `/traces` page when a trace has no child spans.
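For illustration, a hypothetical guard of the kind such a fix typically involves; the types, field names, and helper below are assumptions for the sketch, not Appdash's actual code or the exact change in this PR:

```go
package traces

// Trace is a hypothetical stand-in for a trace with child spans.
type Trace struct {
	Name string
	Sub  []*Trace // child spans
}

// firstChild guards against a trace with no child spans; indexing
// Sub[0] unconditionally is the kind of access that panics when a
// trace has no children.
func firstChild(tr *Trace) (*Trace, bool) {
	if tr == nil || len(tr.Sub) == 0 {
		return nil, false
	}
	return tr.Sub[0], true
}
```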
Prior to this change we had four tags in our `spans` measurement:

- `name`, which is generally low-cardinality (maps 1:1 with span names).
- `span_id`, which is 100% unique.
- `trace_id` and `parent_id`, which are generally unique, but not 100%.

If we propose a hypothetical situation with 100 unique `name` tag values, and N=50,000 unique `span_id`, `trace_id`, and `parent_id` tag values, where N is the number of data points (maps 1:1 with Appdash spans), then we can calculate our total cardinality via the method described at https://docs.influxdata.com/influxdb/v0.13/concepts/glossary/#series-cardinality

```
100 (name) * 50,000 (span_id) * 50,000 (trace_id) * 50,000 (parent_id) == 12,500,000,000,000,000
```

For a dataset of only 50,000 spans! And having such Very High Cardinality™ in fact causes much higher RAM usage than is desirable. Quoting https://docs.influxdata.com/influxdb/v0.13/concepts/schema_and_data_layout/#discouraged-schema-design:

> Tags that specify highly variable information like UUIDs, hashes, and random
> strings can increase your series cardinality to uncomfortable levels. If you
> need that information in your database, consider storing the high-cardinality
> data as a field rather than a tag (note that query performance will be slower).

This change does mean that trace lookups become a linear scan, but trace lookups are far less common than writes in general (and this decreases RAM usage by a factor of almost 20x in production systems). If trace lookup times turn out to be poor after this change with a full 72 hours' worth of data, we can consider using a lower-cardinality `trace_id`-based tag (e.g. the first two bytes of that string) to reduce the linear scan time significantly. It's not yet clear whether that optimization is needed.

To take advantage of this new schema, users will need to `rm -rf ~/.influxdb` to remove the old database.
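For illustration, a minimal sketch of the lower-cardinality write path this describes, assuming the influxdb client/v2 API; the database name, measurement schema details, and function shape are assumptions rather than Appdash's actual code:

```go
package schema

import (
	"time"

	client "github.com/influxdata/influxdb/client/v2"
)

// writeSpan writes one span as a single point: only the low-cardinality
// span name is a tag, while the high-cardinality IDs become fields, so
// they no longer multiply series cardinality (at the cost of lookups by
// ID becoming linear scans).
func writeSpan(c client.Client, name, spanID, traceID, parentID string) error {
	bp, err := client.NewBatchPoints(client.BatchPointsConfig{
		Database:  "appdash", // hypothetical database name
		Precision: "ns",
	})
	if err != nil {
		return err
	}

	// Low-cardinality tag: the span name.
	tags := map[string]string{"name": name}

	// High-cardinality IDs stored as fields rather than tags.
	fields := map[string]interface{}{
		"span_id":   spanID,
		"trace_id":  traceID,
		"parent_id": parentID,
	}

	pt, err := client.NewPoint("spans", tags, fields, time.Now())
	if err != nil {
		return err
	}
	bp.AddPoint(pt)
	return c.Write(bp)
}
```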
I don't know if I have asked this before, but is InfluxDB appropriate for the Appdash use case? We aren't really storing time-series data. But otherwise LGTM
```
@@ -569,8 +568,8 @@ func (in *InfluxDBStore) init(server *influxDBServer.Server) error {
	in.server = server
	// TODO: Upgrade to client v2, see: github.com/influxdata/influxdb/blob/master/client/v2/client.go
```
Kudos on upgrading to v2 client @slimsag! - we might want to remove this TODO.
Ah good point, I missed that. Removed in ccf0b7a
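For reference, constructing the v2 client that the TODO referred to looks roughly like this; a sketch assuming the standard client/v2 package, with a placeholder address:

```go
package main

import (
	"log"
	"time"

	client "github.com/influxdata/influxdb/client/v2"
)

func main() {
	// Placeholder address; adjust for the InfluxDB instance in use.
	c, err := client.NewHTTPClient(client.HTTPConfig{
		Addr:    "http://localhost:8086",
		Timeout: 10 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Ping verifies connectivity and reports the server version.
	if _, version, err := c.Ping(5 * time.Second); err == nil {
		log.Println("connected to InfluxDB", version)
	}
}
```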
Hi @keegancsmith, here is an issue where @slimsag wrote about the motivations. In addition to that, I'd like to add the following points on why it can be a good use case for Appdash:
@chris-ramon cool thanks for the response and comments. Wasn't saying we made the wrong choice, was just looking for motivation :) The points you mentioned are great, and I'd love to learn more about it. Would you mind a few emails or VC to sate my curiosity on this?
@keegancsmith sounds good :) - I'll be sharing some detailed info expanding on the points mentioned above.
- Fixes an issue seen on the `/traces` page.
- Users will need to `rm -rf ~/.influxdb` to remove the old DB.