calculate TTL relative to now when inserting into cassandra #1448
I have tested the latest version like this:
There are 2 things to note in this cassandra output:
store/cassandra/cassandra.go
// - the timestamp of the last datapoint + ttl is the timestamp until when we want to keep this chunk
// - then we subtract the current time stamp to get the difference relative to now
// - the result is the ttl in seconds relative to now
relativeTtl := int64(t0+mdata.MaxChunkSpan()+ttl) - time.Now().Unix()
My concern with using MaxChunkSpan is that it makes the minimum TTL of any data equal to MaxChunkSpan. E.g., if you want raw data stored for 1h, but also have rollups that are stored for longer and use a chunkSpan of 6h, then the raw data will be stored for a lot longer than intended.
This is not a concern for any of our use cases, but might cause problems for other users and perhaps for end2end tests.
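To put a number on that concern, here is a small sketch of the arithmetic. It assumes the raw data uses a 10min chunk span (the comment above does not state it) and that a chunk is kept until `t0 + span + ttl`; the `expiry` helper and the example `t0` are made up for illustration.

```go
package main

import "fmt"

// expiry returns the unix timestamp until which a chunk is kept when the
// TTL is derived from t0 + span + ttl (hypothetical helper).
func expiry(t0, span, ttl uint32) uint32 {
	return t0 + span + ttl
}

func main() {
	const (
		t0           = 1567179600 // start of a raw chunk (example value)
		rawTTL       = 3600       // 1h retention for raw data
		rawSpan      = 600        // assumed 10min chunk span for raw data
		maxChunkSpan = 21600      // 6h chunk span used by the rollups
	)

	withMax := expiry(t0, maxChunkSpan, rawTTL) // using the largest span
	withOwn := expiry(t0, rawSpan, rawTTL)      // using the chunk's own span

	// Difference in seconds between the two expiry times.
	fmt.Println("extra seconds kept:", withMax-withOwn)
}
```

Under these assumptions the raw chunk would live 21000 seconds (5h50min) longer than a 1h retention plus its own span calls for.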
That's a valid concern, but I'm not sure how best to address it. There are basically two possibilities:
1) We read the data byte slice to determine the chunk span. This should only require reading the first 2 bytes, and can be done like here: https://github.com/grafana/metrictank/blob/master/mdata/chunk/itergen.go#L76
2) We add a span property to the chunk write request's payload. But this will require updating all locations that generate chunk write requests.
I'm leaning towards 1)
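A rough sketch of what option 1) could look like. Everything here is an assumption for illustration: the layout (byte 0 is a format marker, byte 1 is a span code), the `formatWithSpan` value, the `hypotheticalSpans` table and the `spanOfChunk` name are stand-ins, not metrictank's actual definitions, which live in the linked chunk package.

```go
package main

import (
	"errors"
	"fmt"
)

// hypotheticalSpans maps a one-byte span code to a chunk span in seconds.
// Illustrative values only; the real lookup table is defined in metrictank.
var hypotheticalSpans = []uint32{1, 5, 10, 15, 20, 30, 60, 90, 120, 180, 300, 600}

// formatWithSpan is an assumed format marker meaning "byte 1 holds a span code".
const formatWithSpan = 1

var errNoSpan = errors.New("chunk does not encode a known span")

// spanOfChunk inspects only the first two bytes of the serialized chunk.
func spanOfChunk(data []byte) (uint32, error) {
	if len(data) < 2 || data[0] != formatWithSpan {
		return 0, errNoSpan
	}
	code := int(data[1])
	if code >= len(hypotheticalSpans) {
		return 0, errNoSpan
	}
	return hypotheticalSpans[code], nil
}

func main() {
	// Two header bytes followed by an opaque payload.
	span, err := spanOfChunk([]byte{formatWithSpan, 11, 0xde, 0xad})
	fmt.Println(span, err)
}
```

The appeal of this approach is that callers generating chunk write requests stay unchanged; the cost is that chunks without an encoded span need a fallback.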
I've given 1) a try.
I ran another test with the latest version. I've started MT with an empty cassandra instance and this schema:
1s:1h:10min:1,1min:48h:6h:1
Then I fed it 72h of data for one metric. It created two tables for the two different TTLs: metric_1/metric_32. All the TTLs in these two tables look like what we want. In each table the TTLs are nicely adjusted to the span:
cqlsh:metrictank> select key, ts, TTL(data) from metric_1;
key | ts | ttl(data)
----------------------------------------+------------+-----------
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567179600 | 3815
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567179000 | 3215
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567178400 | 2615
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567177800 | 2015
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567177200 | 1415
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567176600 | 815
1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567176000 | 215
(7 rows)
cqlsh:metrictank> select key, ts, TTL(data) from metric_32;
key | ts | ttl(data)
-----------------------------------------------+------------+-----------
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567144800 | 159128
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567123200 | 137528
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567101600 | 115928
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567080000 | 94328
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567058400 | 72728
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567036800 | 51128
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567015200 | 29528
1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1566993600 | 7928
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567144800 | 159128
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567123200 | 137528
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567101600 | 115928
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567080000 | 94328
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567058400 | 72728
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567036800 | 51128
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567015200 | 29528
1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1566993600 | 7928
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567144800 | 159128
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567123200 | 137528
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567101600 | 115928
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567080000 | 94328
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567058400 | 72728
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567036800 | 51128
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567015200 | 29528
1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1566993600 | 7928
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567144800 | 159128
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567123200 | 137528
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567101600 | 115928
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567080000 | 94328
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567058400 | 72728
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567036800 | 51128
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567015200 | 29528
1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1566993600 | 7928
(32 rows)
LGTM, other than maybe renaming SpanOfChunk to ExtractChunkSpan.
Co-Authored-By: Anthony Woods <[email protected]>
Calculate the TTL relative to now before inserting into cassandra. If it is <=0 then we just skip the insert.