This repository was archived by the owner on Aug 23, 2023. It is now read-only.

calculate TTL relative to now when inserting into cassandra #1448

Merged
merged 5 commits on Aug 30, 2019

Conversation

@replay (Contributor) commented Aug 29, 2019

Calculate the TTL relative to now before inserting into Cassandra. If the resulting TTL is <= 0, we simply skip the insert.
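
Roughly, the idea is this (a minimal sketch with hypothetical names, not the exact diff):

// t0: chunk start, chunkSpan: its span, ttl: configured TTL (all in seconds).
// Returns the TTL to use for the insert and whether the insert is still worthwhile.
func relativeTTL(t0, chunkSpan, ttl uint32, now int64) (int64, bool) {
	// the chunk is needed until just past its last datapoint (t0 + chunkSpan) plus ttl
	remaining := int64(t0) + int64(chunkSpan) + int64(ttl) - now
	if remaining <= 0 {
		return 0, false // already expired, skip the insert
	}
	return remaining, true // bind this to the `USING TTL ?` placeholder of the insert query
}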

@replay (Contributor, Author) commented Aug 29, 2019

I have tested the latest version like this:

  • Spun up a Metrictank instance with an empty Cassandra cluster and this schema:
    1s:6h:2min:2,1min:2d:6h:1
  • Fed it with fakemetrics like this:
    fakemetrics backfill --offset 72h --speedup 10000 --kafka-mdm-addr kafka:9092 --mpo 1
    note that this offset goes further back in time than the longest retention in the storage schema (2d)
  • Selected the data from Cassandra:
cqlsh:metrictank> select key, ts, TTL(data) from metric_32;

 key                                           | ts         | ttl(data)
-----------------------------------------------+------------+-----------
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567080000 |    157484
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567058400 |    135884
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567036800 |    114284
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567015200 |     92684
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1566993600 |     71084
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1566972000 |     49484
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1566950400 |     27884
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1566928800 |      6284
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567080000 |    157484
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567058400 |    135884
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567036800 |    114284
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567015200 |     92684
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1566993600 |     71084
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1566972000 |     49484
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1566950400 |     27884
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1566928800 |      6284
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567080000 |    157484
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567058400 |    135884
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567036800 |    114284
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567015200 |     92684
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1566993600 |     71084
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1566972000 |     49484
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1566950400 |     27884
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1566928800 |      6284
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567080000 |    157484
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567058400 |    135884
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567036800 |    114284
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567015200 |     92684
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1566993600 |     71084
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1566972000 |     49484
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1566950400 |     27884
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1566928800 |      6284

(32 rows)

There are two things to note in this Cassandra output:

  1. The total number of chunks is 32. If all the data fed by fakemetrics had been stored, there would be 72h (offset) / 6h (chunkspan) * 4 (number of aggregates) = 48 chunks. But because the generated TTL was <= 0 for 16 of them, their inserts were skipped (see the quick check after this list).
  2. The TTLs shown in the last column make sense, considering that they should:
  • All be in the range 0 - 48 * 3600 (172800)
  • Each aggregate (min/max/cnt/sum) should have exactly one chunk with each unique TTL value.
    For example:
    1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 has one with 157484
    1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 also has one with 157484
  • The differences between consecutive TTLs of each aggregate should be one chunkspan, 6 * 3600 = 21600:
    157484 - 135884 = 21600
    135884 - 114284 = 21600
    114284 - 92684 = 21600
    ...
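
For reference, here is a back-of-envelope check of these counts (plain Go, using only the numbers from this test):

package main

import (
	"fmt"
	"time"
)

func main() {
	offset := 72 * time.Hour    // how far back fakemetrics backfilled
	chunkSpan := 6 * time.Hour  // chunkspan of the 1min rollup archive
	retention := 48 * time.Hour // its 2d retention
	aggs := 4                   // min/max/cnt/sum

	generated := int(offset/chunkSpan) * aggs // 48 chunks produced
	kept := int(retention/chunkSpan) * aggs   // 32 chunks whose TTL is still > 0
	fmt.Println(generated, kept, generated-kept) // prints: 48 32 16
}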

@replay replay requested a review from woodsaj August 29, 2019 22:28
@replay replay changed the title from "[WIP] calculate timestamp when inserting into cassandra" to "calculate timestamp when inserting into cassandra" Aug 29, 2019
@replay replay changed the title from "calculate timestamp when inserting into cassandra" to "calculate TTL relative to now when inserting into cassandra" Aug 29, 2019
// - the timestamp of the last datapoint plus ttl is the time until which we want to keep this chunk
// - then we subtract the current timestamp to get the difference relative to now
// - the result is the TTL in seconds relative to now
relativeTtl := int64(t0+mdata.MaxChunkSpan()+ttl) - time.Now().Unix()

Member commented:

My concern with using MaxChunkSpan is that it makes the effective minimum TTL of any data equal to MaxChunkSpan. E.g. if you want raw data stored for 1h, but also have rollups being stored for longer and those rollups use a chunkSpan of 6h, then the raw data will be stored for a lot longer than intended.

This is not a concern for any of our use cases, but it might cause problems for other users and perhaps for end-to-end tests.
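
To make that concrete (hypothetical numbers): with a raw ttl of 1h, a raw chunkspan of 10min and a rollup chunkSpan (= MaxChunkSpan) of 6h, a raw chunk that has just closed (now ≈ t0 + 10min) gets

relativeTtl = t0 + 6h + 1h - now ≈ 6h50min

instead of the intended ~1h, because MaxChunkSpan dominates the calculation.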

@replay (Contributor, Author) commented Aug 30, 2019

That's a valid concern, but I'm not sure how best to address it. There are basically two possibilities:

  1. We read the chunk's data byte slice to determine the chunk span. This should only require reading the first 2 bytes; it can be done similarly to https://github.com/grafana/metrictank/blob/master/mdata/chunk/itergen.go#L76
  2. We add a span property to the chunk write request's payload, but this would require updating all locations that generate chunk write requests.

I'm leaning towards 1)
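
Roughly, 1) could look like this (a sketch only; the helper name is hypothetical, and it assumes the layout used by mdata/chunk, where the payload starts with a format byte followed by a span code that indexes chunk.ChunkSpans):

import (
	"fmt"

	"github.com/grafana/metrictank/mdata/chunk"
)

// extractChunkSpan reads the span from the first two bytes of a serialized chunk.
func extractChunkSpan(data []byte) (uint32, error) {
	if len(data) < 2 {
		return 0, fmt.Errorf("chunk too short: %d bytes", len(data))
	}
	if chunk.Format(data[0]) != chunk.FormatStandardGoTszWithSpan {
		return 0, fmt.Errorf("chunk format %d does not encode a span", data[0])
	}
	code := chunk.SpanCode(data[1])
	if int(code) >= len(chunk.ChunkSpans) {
		return 0, fmt.Errorf("invalid span code %d", code)
	}
	return chunk.ChunkSpans[code], nil
}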

@replay (Contributor, Author) commented Aug 30, 2019

I've given 1) a try

@replay (Contributor, Author) commented Aug 30, 2019

I ran another test with the latest version. I started MT with an empty Cassandra instance and this schema:

1s:1h:10min:1,1min:48h:6h:1

Then I fed it with 72h of data for one metric. It created two tables, metric_1 and metric_32, for the two different TTLs. All the TTLs in these two tables look like what we want; in each table they are nicely aligned to the chunkspan:

cqlsh:metrictank> select key, ts, TTL(data) from metric_1;

 key                                    | ts         | ttl(data)
----------------------------------------+------------+-----------
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567179600 |      3815
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567179000 |      3215
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567178400 |      2615
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567177800 |      2015
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567177200 |      1415
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567176600 |       815
 1.d0a8110e69b0d874610aa08ab6740dfa_647 | 1567176000 |       215

(7 rows)
cqlsh:metrictank> select key, ts, TTL(data) from metric_32;

 key                                           | ts         | ttl(data)
-----------------------------------------------+------------+-----------
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567144800 |    159128
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567123200 |    137528
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567101600 |    115928
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567080000 |     94328
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567058400 |     72728
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567036800 |     51128
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1567015200 |     29528
 1.d0a8110e69b0d874610aa08ab6740dfa_sum_60_647 | 1566993600 |      7928
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567144800 |    159128
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567123200 |    137528
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567101600 |    115928
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567080000 |     94328
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567058400 |     72728
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567036800 |     51128
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1567015200 |     29528
 1.d0a8110e69b0d874610aa08ab6740dfa_min_60_647 | 1566993600 |      7928
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567144800 |    159128
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567123200 |    137528
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567101600 |    115928
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567080000 |     94328
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567058400 |     72728
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567036800 |     51128
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1567015200 |     29528
 1.d0a8110e69b0d874610aa08ab6740dfa_cnt_60_647 | 1566993600 |      7928
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567144800 |    159128
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567123200 |    137528
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567101600 |    115928
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567080000 |     94328
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567058400 |     72728
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567036800 |     51128
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1567015200 |     29528
 1.d0a8110e69b0d874610aa08ab6740dfa_max_60_647 | 1566993600 |      7928

(32 rows)
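
Spot-checking the numbers above against the schema:

metric_1:  7 rows; TTL spacing 3815 - 3215 = 600 (the 10min raw chunkspan); all TTLs <= 1h + 10min = 4200
metric_32: TTL spacing 159128 - 137528 = 21600 (the 6h rollup chunkspan); all TTLs <= 48h + 6h = 194400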

@woodsaj (Member) left a comment

LGTM, other than maybe renaming SpanOfChunk to ExtractChunkSpan.
