feat(tsdb): add pipeline runtime and rename stage interfaces #145175
Merged
salvatore-campagna merged 5 commits into elastic:main on Mar 31, 2026
Add the pipeline runtime that connects the composable stage framework to the doc values consumer/producer: NumericEncodePipeline, NumericDecodePipeline, NumericBlockEncoder, NumericBlockDecoder, NumericCodecFactory, StageFactory, TransformEncoder, TransformDecoder. Rename the stage-level interfaces from NumericEncoder/NumericDecoder to TransformEncoder/TransformDecoder, freeing those names for the pipeline coordinators. This aligns with the architecture in the POC (elastic#141353) where transform stages and pipeline coordinators are separate concerns.
Block size must be a multiple of 128 (DocValuesForUtil constraint). Changed randomBlockSize() range from [4,9] to [7,9] (128 or 256 or 512).
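The arithmetic behind the new range can be checked with a short sketch. It assumes that the test helper picks a shift exponent and that the block size is `1 << shift`, which is inferred from the "128 or 256 or 512" note rather than taken from the test source; `BlockSizeCheck`, `blockSize`, and `isValid` are hypothetical names used only for illustration.

```java
public class BlockSizeCheck {
    // Hypothetical helper: block size derived from a shift exponent (assumption).
    static int blockSize(int shift) {
        return 1 << shift;
    }

    // The constraint named in the commit message: a multiple of 128.
    static boolean isValid(int blockSize) {
        return blockSize % 128 == 0;
    }

    public static void main(String[] args) {
        // Shifts 4..6 give 16, 32, 64, which violate the constraint;
        // shifts 7..9 give 128, 256, 512, which satisfy it.
        for (int shift = 4; shift <= 9; shift++) {
            int size = blockSize(shift);
            System.out.println("shift=" + shift + " size=" + size + " valid=" + isValid(size));
        }
    }
}
```

Under that assumption, the old lower bound of 4 could produce block sizes as small as 16, which would explain why the round-trip tests needed the fix.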
Collaborator
Pinging @elastic/es-storage-engine (Team:StorageEngine)
@@ -76,6 +76,16 @@ public void decode(final long[] values, final int valueCount, final DecodingCont
Contributor
I think the JIT does a better job of compiling with an intermediate sum variable as in:
I also saw a noticeable improvement using this when decoding deltas for binary doc value offsets.
Member
Good point. @salvatore-campagna let's do this in a separate PR? The micro benchmarks should show whether this indeed improves decoding performance.
Use a local sum accumulator instead of reading back from the array on each iteration. This avoids a data dependency on the previous array store and helps the JIT generate better code for the prefix-sum loop.
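The two loop shapes being compared can be sketched as follows. This is not the code from the PR: the class and method names are hypothetical, and the sketch assumes the first element holds the base value with the remaining elements holding deltas. It only illustrates that both forms compute the same prefix sum; any performance difference would need to be confirmed with the micro benchmarks.

```java
import java.util.Arrays;

public class DeltaDecodeSketch {
    // Reads the previous element back from the array on every iteration,
    // so each step depends on the store made by the step before it.
    static void decodeReadBack(long[] values, int valueCount) {
        for (int i = 1; i < valueCount; i++) {
            values[i] += values[i - 1];
        }
    }

    // Carries the running total in a local variable instead of re-reading the array.
    static void decodeWithAccumulator(long[] values, int valueCount) {
        long sum = 0;
        for (int i = 0; i < valueCount; i++) {
            sum += values[i];
            values[i] = sum;
        }
    }

    public static void main(String[] args) {
        long[] a = {10, 1, 2, -3, 5};
        long[] b = a.clone();
        decodeReadBack(a, a.length);
        decodeWithAccumulator(b, b.length);
        System.out.println(Arrays.toString(a)); // [10, 11, 13, 10, 15]
        System.out.println(Arrays.equals(a, b)); // true
    }
}
```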
martijnvg
approved these changes
Mar 31, 2026
szybia
added a commit
to szybia/elasticsearch
that referenced
this pull request
Mar 31, 2026
…rics * upstream/main: (428 commits) ESQL: DS: Add inference/RERANK tests (elastic#145229) Unmute MMR logical plan test (elastic#145311) Do not attempt marking store as corrupted if the check is rejected due to shutdown (elastic#145209) feat(tsdb): add pipeline runtime and rename stage interfaces (elastic#145175) Fix UnresolvedException on PromQL by(step) grouping (elastic#145307) ES|QL: Optimize MMR by reducing cache size and lookup (elastic#145014) Prometheus labels/series APIs: support multiple match[] selectors (elastic#145298) Move ClientScrollablePaginatedHitSource into Reindex Module (elastic#144100) mute test class for elastic#145277 CPS mode for ViewResolver (elastic#145219) [ESQL] Disables GroupedTopNBenchmark temporarily (elastic#145124) Make exponential_histogram the default histogram type for HTTP OTLP endpoint (elastic#145065) More tests requiring an explicit confidence interval (elastic#145232) ES|QL: Adding `USER_AGENT` command (elastic#144384) ESQL: enable Generative IT after more fixes (elastic#145112) Rework FieldMapper parameter tests to not use merge builders (elastic#145213) [ESQL] Fix ORC type support gaps (elastic#145074) [Test] Unmute FollowingEngineTests.testProcessOnceOnPrimary (elastic#145192) Add PrometheusSeriesRestAction for /_prometheus/api/v1/series endpoint (elastic#144494) Prometheus labels API: add rest action (elastic#144952) ...
ncordon
pushed a commit
to ncordon/elasticsearch
that referenced
this pull request
Apr 1, 2026
…#145175)
* feat(tsdb): add pipeline runtime and rename stage interfaces
Add the pipeline runtime that connects the composable stage framework to the doc values consumer/producer: NumericEncodePipeline, NumericDecodePipeline, NumericBlockEncoder, NumericBlockDecoder, NumericCodecFactory, StageFactory, TransformEncoder, TransformDecoder. Rename the stage-level interfaces from NumericEncoder/NumericDecoder to TransformEncoder/TransformDecoder, freeing those names for the pipeline coordinators. This aligns with the architecture in the POC (elastic#141353) where transform stages and pipeline coordinators are separate concerns.
* fix(tsdb): fix random block size in pipeline round-trip tests
Block size must be a multiple of 128 (DocValuesForUtil constraint). Changed randomBlockSize() range from [4,9] to [7,9] (128 or 256 or 512).
* perf(tsdb): use intermediate sum variable in delta decode loop
Use a local sum accumulator instead of reading back from the array on each iteration. This avoids a data dependency on the previous array store and helps the JIT generate better code for the prefix-sum loop.
mromaios
pushed a commit
to mromaios/elasticsearch
that referenced
this pull request
Apr 9, 2026
Summary
Part 1/4 of introducing the ES94 TSDB doc values codec. This PR adds the pipeline runtime that bridges the composable stage framework to the doc values consumer/producer, which is a prerequisite for any codec that uses pipeline-based encoding.
Builds on the stage framework from #143589 and the integer stages from #143934. Aligns interface naming with the POC architecture (#141353).
What's included
New pipeline runtime classes: NumericEncodePipeline, NumericDecodePipeline, NumericBlockEncoder/NumericBlockDecoder, NumericCodecFactory, StageFactory, TransformEncoder/TransformDecoder.
Stage interface rename: The stage-level NumericEncoder/NumericDecoder from #143934 are renamed to TransformEncoder/TransformDecoder, freeing those names for the pipeline coordinators (NumericEncoder/NumericDecoder) that the doc values consumer/producer interact with. This aligns with the POC architecture where transform stages and pipeline coordinators are separate concerns.
Monomorphic dispatch: The encode/decode loops use a switch on StageId with static methods (encodeStatic/decodeStatic) instead of virtual dispatch through the array, keeping each call site monomorphic for JIT inlining.
PipelineConfig refactoring: PipelineConfig now stores transform stages and the payload stage as separate fields. The builder separates them at construction time, making illegal states (e.g. two payloads) unrepresentable and eliminating instanceof checks in the pipeline construction path.
Testing
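Referring back to the monomorphic dispatch item under What's included, the idea can be sketched roughly as below. Only the names StageId and decodeStatic come from the PR description; the enum constants DELTA and OFFSET, the stage classes, the `min` parameter, and the reverse-order decode loop are assumptions made for illustration and may not match the real pipeline.

```java
import java.util.Arrays;

public class DispatchSketch {
    // Hypothetical stage ids; the real StageId values are not shown here.
    enum StageId { DELTA, OFFSET }

    static final class DeltaStage {
        // Prefix-sum decode, called statically so the call site has one target.
        static void decodeStatic(long[] values, int count) {
            long sum = 0;
            for (int i = 0; i < count; i++) {
                sum += values[i];
                values[i] = sum;
            }
        }
    }

    static final class OffsetStage {
        // Adds back a minimum that the (hypothetical) encoder subtracted.
        static void decodeStatic(long[] values, int count, long min) {
            for (int i = 0; i < count; i++) {
                values[i] += min;
            }
        }
    }

    // Switches on the stage id instead of calling an interface method on each
    // array element, so every case is a direct call to one static method.
    static void decode(StageId[] stages, long[] values, int count, long min) {
        // Assumption: stages are listed in encode order and undone in reverse.
        for (int i = stages.length - 1; i >= 0; i--) {
            switch (stages[i]) {
                case DELTA:
                    DeltaStage.decodeStatic(values, count);
                    break;
                case OFFSET:
                    OffsetStage.decodeStatic(values, count, min);
                    break;
            }
        }
    }

    public static void main(String[] args) {
        // [100, 101, 103] delta-encoded to [100, 1, 2], then offset by min = 1.
        long[] encoded = {99, 0, 1};
        decode(new StageId[] {StageId.DELTA, StageId.OFFSET}, encoded, encoded.length, 1L);
        System.out.println(Arrays.toString(encoded)); // [100, 101, 103]
    }
}
```

The alternative, a loop calling `stage.decode(...)` on an array of interface references, would see several receiver types at a single call site, which is the situation the switch is meant to avoid.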