Refactor downsampling fetchers and producers #140357

Merged
elasticsearchmachine merged 18 commits into elastic:main from gmarouli:refactor-downsampling-fetchers-and-producers
Feb 27, 2026
Conversation

@gmarouli
Contributor

@gmarouli gmarouli commented Jan 8, 2026

This refactoring merges the code of the FieldValueFetcher and the AbstractDownsampleFieldProducer.

Until now, the FieldValueFetcher was responsible for loading the values and for creating and exposing an AbstractDownsampleFieldProducer, which collected the raw values and produced a downsampled value per field.

In recent releases we have increased the number of field types that can be downsampled. As a result, we made AbstractDownsampleFieldProducer generic. Furthermore, with the introduction of different sampling methods, the AbstractDownsampleFieldProducer became much more complex than the FieldValueFetcher.

In this PR, we merge their functionality into AbstractFieldDownsampler, which provides the methods to both load the doc values and collect the values.
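
To illustrate the shape of the merged abstraction, here is a minimal sketch, not the actual Elasticsearch code. All names (`FieldDownsamplerSketch`, `MaxDownsamplerSketch`, `load`, `accept`, `result`) are hypothetical, and an in-memory map stands in for the doc values storage. The point is only that one class owns both the loading side (formerly the fetcher) and the collecting side (formerly the producer).

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a single abstraction that both loads and collects values.
abstract class FieldDownsamplerSketch<V> {
    private final String fieldName;
    private boolean empty = true;

    FieldDownsamplerSketch(String fieldName) {
        this.fieldName = fieldName;
    }

    String name() { return fieldName; }

    boolean isEmpty() { return empty; }

    // Loading side (formerly the fetcher's job): raw values of one document.
    protected abstract List<V> load(int docId);

    // Collecting side (formerly the producer's job): fold one raw value into the bucket state.
    protected abstract void accept(V value);

    // Downsampled value for the current bucket.
    abstract V result();

    // Clears the subclass state for the next bucket.
    protected abstract void clear();

    // Loads the values of a document and collects them in one step.
    final void collect(int docId) {
        for (V value : load(docId)) {
            accept(value);
            empty = false;
        }
    }

    final void reset() {
        clear();
        empty = true;
    }
}

// Example subclass (assumption): keeps the maximum of all collected values.
final class MaxDownsamplerSketch extends FieldDownsamplerSketch<Double> {
    private final Map<Integer, List<Double>> docValues; // in-memory stand-in for doc values
    private double max = Double.NEGATIVE_INFINITY;

    MaxDownsamplerSketch(String fieldName, Map<Integer, List<Double>> docValues) {
        super(fieldName);
        this.docValues = docValues;
    }

    @Override protected List<Double> load(int docId) {
        // Documents without a value for this field contribute nothing.
        return docValues.getOrDefault(docId, List.of());
    }

    @Override protected void accept(Double value) { max = Math.max(max, value); }

    @Override Double result() { return max; }

    @Override protected void clear() { max = Double.NEGATIVE_INFINITY; }
}
```

In this sketch a caller only ever holds one object per field: it calls `collect(docId)` for each document in a bucket, reads `result()`, and then calls `reset()` before the next bucket.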

Finally, with the possibility of an aggregate counter that is going to produce more than one downsampled value, we revert the code of the DownsampleShardIndexer to handle the different downsamplers (the old producers) depending on their type, in order to avoid reflection checks when collecting and writing the downsampled documents.
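
One way to read "handle the downsamplers depending on their type" is sketched below. This is an assumption about the general idea, not the code in this PR: the indexer partitions its downsamplers into typed lists once, at setup, so the per-document collect and write loops need no type checks. Every class name, the min/max pair, and the `.min`/`.max` field suffixes are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical common supertype; not the real Elasticsearch class.
interface DownsamplerSketch {
    String name();
}

// Produces exactly one downsampled value per field (here: the last value seen).
final class SingleValueSketch implements DownsamplerSketch {
    private final String name;
    private double last = Double.NaN;

    SingleValueSketch(String name) { this.name = name; }

    @Override public String name() { return name; }
    void collect(double value) { last = value; }
    void write(Map<String, Object> doc) { doc.put(name, last); }
}

// Produces more than one downsampled value per field (here: min and max).
final class MultiValueSketch implements DownsamplerSketch {
    private final String name;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    MultiValueSketch(String name) { this.name = name; }

    @Override public String name() { return name; }
    void collect(double value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
    }
    void write(Map<String, Object> doc) {
        doc.put(name + ".min", min);
        doc.put(name + ".max", max);
    }
}

final class ShardIndexerSketch {
    private final List<SingleValueSketch> singleValue = new ArrayList<>();
    private final List<MultiValueSketch> multiValue = new ArrayList<>();

    // The concrete type is inspected only here, once per downsampler.
    ShardIndexerSketch(List<DownsamplerSketch> downsamplers) {
        for (DownsamplerSketch d : downsamplers) {
            if (d instanceof SingleValueSketch) {
                singleValue.add((SingleValueSketch) d);
            } else if (d instanceof MultiValueSketch) {
                multiValue.add((MultiValueSketch) d);
            } else {
                throw new IllegalArgumentException("unsupported downsampler: " + d.name());
            }
        }
    }

    // Hot path: iterates the typed lists, no instanceof checks per document.
    void collect(Map<String, Double> rawDoc) {
        for (SingleValueSketch d : singleValue) {
            Double v = rawDoc.get(d.name());
            if (v != null) d.collect(v);
        }
        for (MultiValueSketch d : multiValue) {
            Double v = rawDoc.get(d.name());
            if (v != null) d.collect(v);
        }
    }

    // Writes one downsampled document from the current state of all downsamplers.
    Map<String, Object> write() {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (SingleValueSketch d : singleValue) d.write(doc);
        for (MultiValueSketch d : multiValue) d.write(doc);
        return doc;
    }
}
```

The trade-off in this sketch is that the indexer must know every downsampler kind up front, in exchange for keeping type dispatch out of the per-document loops.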

@gmarouli
Contributor Author

gmarouli commented Jan 8, 2026

Buildkite benchmark this with tsdb please

@elasticmachine
Collaborator

elasticmachine commented Jan 8, 2026

💚 Build Succeeded

This build ran two tsdb benchmarks to evaluate performance impact of this PR.

History

@gmarouli gmarouli added the >refactoring and :StorageEngine/Downsampling labels Feb 12, 2026
@gmarouli gmarouli marked this pull request as ready for review February 12, 2026 11:47
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@gmarouli gmarouli requested review from martijnvg and romseygeek and removed request for martijnvg February 12, 2026 13:32
Member

@martijnvg martijnvg left a comment

LGTM 👍

Apologies that it took such a long time to review.

@gmarouli
Contributor Author

> Apologies that it took such a long time to review.

Thank you @martijnvg, no worries, these things happen :)

@gmarouli gmarouli added the auto-merge-without-approval label Feb 27, 2026
@elasticsearchmachine elasticsearchmachine merged commit 51960a4 into elastic:main Feb 27, 2026
35 checks passed
@gmarouli gmarouli deleted the refactor-downsampling-fetchers-and-producers branch February 27, 2026 12:23
PeteGillinElastic pushed a commit to PeteGillinElastic/elasticsearch that referenced this pull request Feb 27, 2026
szybia added a commit to szybia/elasticsearch that referenced this pull request Feb 27, 2026
…cations

* upstream/main: (35 commits)
  Create ARM bulk sqrI8 implementation (elastic#142461)
  Rework get-snapshots predicates (elastic#143161)
  Refactor downsampling fetchers and producers (elastic#140357)
  ESQL: Unmute test and add extra logging to generative test validation (elastic#143168)
  Fix metadata fields being nullified/loaded by unmapped_fields setting (elastic#143155)
  Determine remote cluster version (elastic#142494)
  Populate failure message for aborted clones (elastic#143206)
  Allow kibana_system role to read and manage logs streams (elastic#143053)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsLength} elastic#143224
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsByteLength} elastic#143223
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:docs.DocsBitLength} elastic#143222
  Fix FloatVectorScorerSupplier bulkScore bug (elastic#143211)
  ESQL: Add data node execution for external sources (elastic#143209)
  [ESQL] Cleanup commands docs (elastic#143058)
  [ML]Fix latest transforms disregarding updates when sort and sync fields are non-monotonic (elastic#142856)
  Mute org.elasticsearch.index.mapper.IpFieldMapperTests testSyntheticSourceInObject elastic#143212
  Tests: Fix StoreDirectoryMetricsIT (elastic#143084)
  ESQL: Add distribution strategy for external sources (elastic#143194)
  CSV IT spec (elastic#142585)
  Fix VectorScorerOSQBenchmark.score to read corrections properly (elastic#143137)
  ...
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
Labels

auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!)
>refactoring
:StorageEngine/Downsampling (Downsampling, replacement for rollups: turn fine-grained time-based data into coarser-grained data)
Team:StorageEngine
v9.4.0

4 participants