Conversation
this is an old optimization (?) that has been with us for a long time: #74 2029113

here's how it caused data loss at read time:
- when only 1 chunk of data had been filled: the "update" of the field is a no-op because len(chunks) == 1, so oldPos goes back to 0 (not sure if intentional or a bug), so reading the first chunk worked.
- once you have more than 1 chunk: the update of oldPos works and we start hitting cassandra. depending on how long the chunk takes to get saved to cassandra, we will miss data at read time. also, our chunk cache does not cache absence of data, hitting cassandra harder during this period.
- once the chunk is saved to cassandra, the problem disappears.
- once the circular buffer recycles the first time (effectively removing the first chunk), this optimization no longer applies, but at that point we still hit cassandra just as before.

This problem is now solved. However, removing that code enables another avenue for data loss at read time:
- when a read node starts without data backfill
- when a read node starts with data backfill, but the backfill doesn't have old data for the particular metric, IOW the data only covers 1 chunk's worth
- when a read node starts with data backfill, but since backfilling starts at arbitrary positions, the first chunk misses some data at the beginning

In all these cases, the first chunk is a partial chunk, whereas a full version of the chunk is most likely already in cassandra. To make sure this is not a problem, if the first chunk we used was partial, we set oldest to the first timestamp, so that the rest can be retrieved from cassandra. Typically, this will cause the "same" chunk (but a full version) to be retrieved from cassandra, which is then cached and seamlessly merged via Fix().

fix #78
fix #988
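A rough sketch of that behaviour (type and field names here are paraphrased, not the exact metrictank code, and "first timestamp" is read as the timestamp of the first point actually held in memory):

```go
package main

import "fmt"

// Illustrative stand-ins for the types involved; names paraphrase the PR,
// they are not the real metrictank definitions.
type Point struct {
	Val float64
	Ts  uint32
}

type Chunk struct {
	T0    uint32 // start timestamp of the chunk's span
	First bool   // true if this is the first chunk the ring buffer ever filled
}

// oldestForResult picks the "Oldest" timestamp reported to the caller.
// If the oldest in-memory chunk is the (possibly partial) first chunk, report
// the timestamp of the first point we actually have, so the caller knows it
// may still need to query cassandra for the full version of that chunk;
// Fix() then merges the full and partial copies seamlessly.
func oldestForResult(oldest Chunk, points []Point) uint32 {
	if oldest.First && len(points) > 0 {
		return points[0].Ts
	}
	return oldest.T0
}

func main() {
	c := Chunk{T0: 600, First: true}
	pts := []Point{{Ts: 660}, {Ts: 720}} // partial chunk: anything before 660 is missing
	fmt.Println(oldestForResult(c, pts)) // 660 -> caller also queries cassandra for 600..660
}
```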
Changes from 2d12b85 to 0a59537
@@ -8,5 +8,5 @@ import (
 type Result struct {
 	Points []schema.Point
 	Iters  []chunk.Iter
-	Oldest uint32
+	Oldest uint32 // timestamp of oldest point we have, to know when and when not we may need to query slower storage
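For context, a hypothetical illustration of what that field is used for (the real decision lives elsewhere in metrictank's read path):

```go
// needSlowerStorage reports whether the requested range starts before the
// oldest point we hold in memory, i.e. whether cassandra must also be queried.
func needSlowerStorage(oldest, from uint32) bool {
	return from < oldest
}
```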
"to know when and when not we may need" is a bit hard to parse. Maybe "to know whether we need"?
+1
@@ -342,8 +331,13 @@ func (a *AggMetric) Get(from, to uint32) (Result, error) {
		}
	}

	if oldestChunk.First {
It seems unnecessary that each chunk needs to have this First bool. Can you use the t0 and LastTs to be able to tell whether firstTs falls in this range? Something like a ContainsTimestamp function?
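A hypothetical sketch of that suggestion (the T0/LastTs field names follow the comment, not necessarily the real chunk.Chunk definition):

```go
// Derive "does this chunk cover the given timestamp" from the chunk's own
// time range instead of storing a separate First flag on it.
type Chunk struct {
	T0     uint32 // start of the chunk's time span
	LastTs uint32 // timestamp of the last point added so far
}

func (c *Chunk) ContainsTimestamp(ts uint32) bool {
	return ts >= c.T0 && ts <= c.LastTs
}
```

The read path could then check oldestChunk.ContainsTimestamp(firstTs) instead of a First flag.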
that's a good idea. I had been thinking for a while about what the simplest/lowest-overhead way to do this would be (ideally introducing only 1 new attribute instead of 2).
But note that the First field should have no overhead: it's a bool, so it fits in space that was previously padding (as we already had a bool on chunk.Chunk), so I still like it.
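The padding argument can be illustrated with simplified stand-in structs (these are not the real chunk.Chunk; they only demonstrate the layout point):

```go
package main

import (
	"fmt"
	"unsafe"
)

// Two uint32s (8 bytes) plus one bool (1 byte), padded to 12 bytes on amd64
// to keep the struct's 4-byte alignment.
type chunkOneBool struct {
	T0     uint32
	LastTs uint32
	Saved  bool
}

// A second bool lands in the formerly-padded bytes, so the size stays the same.
type chunkTwoBools struct {
	T0     uint32
	LastTs uint32
	Saved  bool
	First  bool
}

func main() {
	fmt.Println(unsafe.Sizeof(chunkOneBool{}))  // 12
	fmt.Println(unsafe.Sizeof(chunkTwoBools{})) // still 12
}
```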
It's less about the space and more about the branching of chunk types, when from a Chunk's perspective it is just a Chunk and has no concept of First-ness. Like, if Chunk was just a slice, it would be weird for the First member to live with the slice, right?
funny, I also thought about this. But what I realized is that chunk.Chunk is already not a pure chunk; rather, it is a "chunk as embedded in an AggMetric in metrictank". What I mean is it already has a bunch of fields for other uses within metrictank besides mere chunkness, so one more doesn't change anything IMHO.
That said, at some point I want to refactor this to clarify it.
If it doesn't bother you, that's ok. But I'll point out that the other elements of the Chunk are actually properties relevant to a Chunk's internals. First will be the first/only property that is actually about state outside of the Chunk, and that's what makes it stand out as awkward to me.
makes sense. You're looking at it from a "field describes which state" perspective; I was looking at it from a "field is used for what/by what" perspective.
Using the former perspective generally leads to cleaner code, I think. And you're right, it looks a bit more awkward to me too now. But it's the most efficient solution, so until we refactor I think it's OK.
Minor comments, but nothing objectionable enough to block.
thx for reviewing Sean!