Collect dimensions only once per tsid when downsampling#145089
Conversation
|
Buildkite benchmark this with tsdb please |
|
Buildkite benchmark this with tsdb please |
|
Buildkite benchmark this with tsdb please |
|
Aggregate Last value |
|
Buildkite benchmark this with tsdb please |
|
Aggregate Last value |
|
Buildkite benchmark this with tsdb please |
|
Hi @gmarouli, I've created a changelog YAML for you. |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
| assert value.equals(this.lastValue) != false | ||
| : "Dimension value changed without tsid change [" + value + "] != [" + this.lastValue + "]"; | ||
| } | ||
| assert docValueCount == 1; |
There was a problem hiding this comment.
Is this needed? It should hold in practice, but there may be cases with multi-values (e.g. ips).
There was a problem hiding this comment.
You can have a list of objects as well, and use that instead? One of them should be set.
There was a problem hiding this comment.
I do not feel strongly about it. It's there because the original code had an equivalent assertion. The collectOnce(docValues.nextValue()); has the assertion that it should be empty. If a dimension had multiple values, when reading the second value it would trigger this assertion.
If we think this is not necessary I prefer to remove it. Do you agree?
There was a problem hiding this comment.
We do support multi-values in dimensions. The logic was added fairly recently. We may've missed updating the assert but the downsample index still gets all values iiuc?
There was a problem hiding this comment.
No, in this case it will only keep the last:
void collectOnce(final Object value) {
assert isEmpty;
Objects.requireNonNull(value);
this.lastValue = value;
this.isEmpty = false;
}
where var value = docValues.nextValue();, to the best of my understanding this is just one of the values.
There was a problem hiding this comment.
I will open a PR to fix this
I think flaky results again are showing the contender being worse because we rarely see aggregate 1m to reach 3.8m. |
|
Buildkite benchmark this with tsdb please |
💚 Build Succeeded
This build ran two tsdb benchmarks to evaluate performance impact of this PR. History
|
We are checking to see if there will be any improvement to downsampling if we only collect the dimension once per tsid.
We are checking to see if there will be any improvement to downsampling if we only collect the dimension once per tsid.