Description
When caching index entries, we basically look up the entire row (to cache it) and then filter for the matching entries after loading the full row:
Lines 84 to 116 in 071502d
```go
type filteringBatchIter struct {
	query chunk.IndexQuery
	chunk.ReadBatchIterator
}

func (f *filteringBatchIter) Next() bool {
	for f.ReadBatchIterator.Next() {
		rangeValue, value := f.ReadBatchIterator.RangeValue(), f.ReadBatchIterator.Value()
		if len(f.query.RangeValuePrefix) != 0 && !bytes.HasPrefix(rangeValue, f.query.RangeValuePrefix) {
			continue
		}
		if len(f.query.RangeValueStart) != 0 && bytes.Compare(f.query.RangeValueStart, rangeValue) > 0 {
			continue
		}
		if len(f.query.ValueEqual) != 0 && !bytes.Equal(value, f.query.ValueEqual) {
			continue
		}
		return true
	}
	return false
}

// QueryFilter wraps a callback to ensure the results are filtered correctly;
// useful for the cache and Bigtable backend, which only ever fetches the whole
// row.
func QueryFilter(callback Callback) Callback {
	return func(query chunk.IndexQuery, batch chunk.ReadBatch) bool {
		return callback(query, &filteringBatch{query, batch})
	}
}
```
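For context, the `filteringBatch` referenced by `QueryFilter` is defined just above the quoted range in the same file; roughly, it wraps a `ReadBatch` so that its `Iterator()` yields the filtering iterator:

```go
type filteringBatch struct {
	query chunk.IndexQuery
	chunk.ReadBatch
}

func (f filteringBatch) Iterator() chunk.ReadBatchIterator {
	return &filteringBatchIter{
		query:             f.query,
		ReadBatchIterator: f.ReadBatch.Iterator(),
	}
}
```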
cortex/pkg/chunk/storage/caching_index_client.go
Lines 70 to 71 in 071502d
```go
// We cache the entire row, so filter client side.
callback = chunk_util.QueryFilter(callback)
```
We have now introduced an optimisation that does the following:
cortex/pkg/chunk/chunk_store.go
Lines 463 to 470 in 071502d
```go
set := FindSetMatches(matcher.Value)
for _, v := range set {
	var qs []IndexQuery
	qs, err = c.schema.GetReadQueriesForMetricLabelValue(from, through, userID, metricName, matcher.Name, v)
	if err != nil {
		break
	}
	queries = append(queries, qs...)
}
```
Basically, if we have a lookup that looks like `label=~"a|b|c|d"`, we are now splitting it into 4 different lookups: `label=a`, `label=b`, etc. Because each of these lookups hits the same cached row, we end up loading the entire row multiple times. We should push this optimisation down to `filteringBatchIter` and make sure we don't load the same row more than once.
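One possible shape for that fix (a minimal sketch only; `valueSetFilterIter` and its `values` field are hypothetical names, not the existing Cortex API) would be to keep a single query per row and let the iterator match each entry's value against the whole decomposed set, so the row is fetched and cached once:

```go
package util

import (
	"bytes"

	"github.com/cortexproject/cortex/pkg/chunk"
)

// valueSetFilterIter is a hypothetical variant of filteringBatchIter that
// accepts a set of values, so one cached row load can answer the whole
// label=~"a|b|c|d" lookup instead of four separate loads of the same row.
type valueSetFilterIter struct {
	values [][]byte // the decomposed set matches, e.g. a, b, c, d
	chunk.ReadBatchIterator
}

func (f *valueSetFilterIter) Next() bool {
	for f.ReadBatchIterator.Next() {
		value := f.Value()
		// Accept the entry if it matches any value in the set.
		for _, v := range f.values {
			if bytes.Equal(value, v) {
				return true
			}
		}
	}
	return false
}
```

With something like this, the `label=~"a|b|c|d"` case becomes one row load plus a few cheap byte comparisons per entry, rather than four separate loads of the same cached row.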