nik9000 commented Feb 23, 2021

One of the tests that I added in #68871 worked about 99.7% of the time
but failed to generate the right buckets on some seeds, because on those
seeds we would collect each segment with its own aggregator and get bad
counts. This "bad counts" problem is known in the terms aggregator; it's
the price we pay for distributed work. But we can work around it either
by forcing all the docs into a single segment or by collecting all of
the buckets on the shard. We want to test the code path where we don't
collect all buckets on the shard, so we opt for the former.
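
To illustrate the "bad counts" problem, here is a minimal sketch in plain Python. It is not Elasticsearch code: the segment contents, the `top_n` and `merge_partial` helpers, and the bucket limit of 2 are all hypothetical, chosen only to show how per-segment top-N collection followed by a merge can undercount, while a single segment cannot.

```python
from collections import Counter


def top_n(counts, n):
    """Return the n most frequent terms as a dict (ties broken by term)."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:n])


def merge_partial(partials, n):
    """Sum the partial top-n results and take the top n of the sum."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds counts term by term
    return top_n(merged, n)


# Hypothetical segments: each list holds the term of one document.
segment_a = ["a"] * 5 + ["b"] * 4 + ["c"] * 3
segment_b = ["c"] * 5 + ["d"] * 4 + ["a"] * 3

# One "aggregator" per segment: each keeps only its own top 2 buckets,
# so segment_a drops "c" and segment_b drops "a" before the merge.
per_segment = merge_partial(
    [top_n(Counter(segment_a), 2), top_n(Counter(segment_b), 2)], 2
)

# All docs forced into a single segment: the counts are exact.
single = top_n(Counter(segment_a + segment_b), 2)

print(per_segment)  # {'a': 5, 'c': 5}
print(single)       # {'a': 8, 'c': 8}
```

In this sketch the per-segment path reports 5 for both terms where the true count is 8, which is the kind of discrepancy that forcing a single segment avoids.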

Closes #69413

nik9000 added the >test, :Analytics/Aggregations, v8.0.0, and v7.13.0 labels Feb 23, 2021
elasticmachine added the Team:Analytics label Feb 23, 2021
@elasticmachine

Pinging @elastic/es-analytics-geo (Team:Analytics)

nik9000 merged commit 21edf4d into elastic:master Feb 23, 2021
nik9000 added a commit that referenced this pull request Feb 23, 2021

Labels

:Analytics/Aggregations, Team:Analytics, >test, v7.13.0, v8.0.0-alpha1

Development

Successfully merging this pull request may close these issues.

[CI] TermsAggregatorTests.testManyUniqueTerms failure
