
feat(tsdb): add composable pipeline framework for ES94 TSDB codec#143589

Merged
salvatore-campagna merged 27 commits into elastic:main from salvatore-campagna:feature/es94-pipeline-framework
Mar 9, 2026
Conversation

salvatore-campagna (Contributor) commented Mar 4, 2026

Summary

This PR introduces the foundation of the ES94 Deepstore Pipeline Codec, which replaces the monolithic ES819 encoding with a composable pipeline of transform and payload stages. It includes the type system, wire format, metadata I/O, block format, and context objects. No concrete stage implementations are included; those will land in subsequent PRs that build on this base.

What's included

Type system

| Component | Purpose |
| --- | --- |
| StageId | Wire format registry: byte identifiers for all pipeline stages |
| StageSpec | Sealed record hierarchy capturing stage-specific parameters (e.g., maxError) |
| PipelineDescriptor | Serialization of pipeline configuration: [stageCount][blockShift][dataType][stageIds] |
| FieldDescriptor | Versioned envelope wrapping PipelineDescriptor for forward compatibility |
| PipelineConfig | Fluent builder API for constructing pipeline specifications |
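To make the descriptor layout concrete, here is a minimal sketch of serializing the `[stageCount][blockShift][dataType][stageIds]` byte layout described above. The class and method names are illustrative only, not the actual ES94 `PipelineDescriptor` API:

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of the [stageCount][blockShift][dataType][stageIds]
// layout; names are illustrative, not the actual ES94 classes.
public final class DescriptorSketch {

    /** Serializes a pipeline descriptor into the documented byte layout. */
    public static byte[] serialize(byte dataType, int blockShift, byte[] stageIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(stageIds.length);              // [stageCount]
        out.write(blockShift);                   // [blockShift], e.g. 7 for 128-value blocks
        out.write(dataType);                     // [dataType]
        out.write(stageIds, 0, stageIds.length); // [stageIds], one byte per stage
        return out.toByteArray();
    }
}
```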

Encode/decode infrastructure

| Component | Purpose |
| --- | --- |
| BlockFormat | Per-block layout: [bitmap][payload][stage metadata], designed for single-pass sequential decoding |
| EncodingContext | Mutable per-block encoding state (bitmap, metadata offsets), reused via clear() |
| DecodingContext | Mutable per-block decoding state; delegates metadata reads to the underlying DataInput |
| MetadataBuffer | Auto-growing byte buffer for gather-scatter stage metadata during encoding |
| MetadataWriter / MetadataReader | Fluent interfaces for writing/reading stage metadata |
| PayloadEncoder / PayloadDecoder | Interfaces for terminal payload stages (e.g., bit-packing, Zstd/LZ4 compression) |

Design highlights

  • Gather-scatter metadata: stages write metadata forward during encoding; EncodingContext.writeStageMetadata flushes in reverse order so the decoder reads sequentially with no seeking or buffering
  • Position bitmap: 1 byte for pipelines with <= 8 stages, 2 bytes otherwise; tells the decoder which stages to reverse
  • Mutable context reuse: EncodingContext/DecodingContext are cleared and reused per block to avoid per-block GC pressure
  • Wire format stability: StageId byte assignments are declared upfront as the encoder/decoder contract; FieldDescriptor adds a version byte for future format evolution
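The position-bitmap idea above can be sketched in a few lines: one bit per stage, set when the stage actually transformed the block and therefore must be reversed on decode. This is a hypothetical illustration, not the actual ES94 implementation:

```java
// Illustrative sketch of the position bitmap: one bit per stage, set when
// the stage applied and must be reversed on decode. Hypothetical names.
public final class StageBitmapSketch {

    /** Marks a stage as applied in the bitmap. */
    public static int setApplied(int bitmap, int stageIndex) {
        return bitmap | (1 << stageIndex);
    }

    /** True if the decoder must reverse the stage at the given index. */
    public static boolean isApplied(int bitmap, int stageIndex) {
        return (bitmap & (1 << stageIndex)) != 0;
    }

    /** 1 byte suffices for pipelines with <= 8 stages, 2 bytes otherwise. */
    public static int bitmapBytes(int stageCount) {
        return stageCount <= 8 ? 1 : 2;
    }
}
```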

Testing

./gradlew :server:test --tests "org.elasticsearch.index.codec.tsdb.pipeline.*"

Introduce the foundation layer for the ES94 pipeline codec, which
replaces the monolithic ES819 encoding with a composable pipeline
of transform and payload stages.

This is PR 1 of a series. It contains no concrete stage implementations
- only the type system, wire format, metadata I/O, block format, and
context objects that subsequent PRs build on.

Key components:
- StageId: wire format registry of stage byte identifiers
- StageSpec: sealed hierarchy of stage specifications with parameters
- PipelineDescriptor: serialization/deserialization of pipeline configuration
- FieldDescriptor: versioned envelope for pipeline descriptors
- PipelineConfig: fluent builder API for pipeline construction
- BlockFormat: per-block encode/decode layout (bitmap + payload + metadata)
- EncodingContext / DecodingContext: mutable per-block state, reused via clear()
- MetadataBuffer / MetadataWriter / MetadataReader: gather-scatter metadata I/O
- PayloadEncoder / PayloadDecoder: interfaces for terminal payload stages
@salvatore-campagna salvatore-campagna marked this pull request as ready for review March 4, 2026 17:41
@elasticsearchmachine elasticsearchmachine added the needs:triage (Requires assignment of a team area label) label Mar 4, 2026
@salvatore-campagna salvatore-campagna added the >non-issue, :StorageEngine/TSDB (You know, for Metrics), and Team:StorageEngine labels and removed the needs:triage label Mar 4, 2026
elasticsearchmachine (Collaborator) commented:
Pinging @elastic/es-storage-engine (Team:StorageEngine)

@martijnvg martijnvg self-requested a review March 5, 2026 10:33
Comment on lines +37 to +50
ALP_DOUBLE_STAGE((byte) 0x06, "alpDouble"),
ALP_FLOAT_STAGE((byte) 0x07, "alpFloat"),
FPC_DOUBLE_STAGE((byte) 0x08, "fpcDouble"),
FPC_FLOAT_STAGE((byte) 0x09, "fpcFloat"),

BITPACK_PAYLOAD((byte) 0xA1, "bitPack"),
ZSTD_PAYLOAD((byte) 0xA2, "zstd"),
LZ4_PAYLOAD((byte) 0xA3, "lz4"),
GORILLA_DOUBLE_PAYLOAD((byte) 0xA4, "gorillaDouble"),
GORILLA_FLOAT_PAYLOAD((byte) 0xA5, "gorillaFloat"),
CHIMP_DOUBLE_PAYLOAD((byte) 0xA6, "chimpDouble"),
CHIMP_FLOAT_PAYLOAD((byte) 0xA7, "chimpFloat"),
CHIMP128_DOUBLE_PAYLOAD((byte) 0xA8, "chimp128Double"),
CHIMP128_FLOAT_PAYLOAD((byte) 0xA9, "chimp128Float");
Member:
I understand that this PR is preparing for the es94 doc values format, but can we leave these stages and related code out for now and add them back when needed?

Given that we first want to focus on getting es94 to a state that performs the same encoding techniques es819 is using? This keeps the PR smaller and easier to review.

Contributor Author:
I can keep a limited set to start with but I need a few of them to be able to do some testing with actual values.

martijnvg (Member) left a comment:
Thanks Salvatore, I did a first review round.

/**
* Writes stage metadata values to a buffer during encoding. Supports method chaining.
*/
public interface MetadataWriter {
Member:
Can you explain why we can't use StreamOutput here, or some other existing writable interface from ES or Lucene? Same question for the MetadataReader.

salvatore-campagna (Contributor Author) commented Mar 5, 2026:
The block layout is [bitmap][payload][stage metadata] (see the BlockFormat Javadoc). Metadata comes after the payload on disk, but transform stages like delta, offset, and GCD produce metadata before the payload during encoding (e.g. we do delta>offset>gcd>bitpack). So we buffer metadata in memory and flush it after the payload is written.

We chose this layout deliberately: if metadata came first on disk, encoding would be simpler (write directly to DataOutput), but the decoder would need to buffer or seek past metadata to reach the payload on every block read. Since decoding happens far more often than encoding, we push the buffering to the encode side (once) to give the decoder a clean single forward pass. With this layout decoding is always sequential: we read and cache the bitmap (1-2 bytes), then the full payload, which we decode using BitPack (first on decode, last during encode). The metadata for delta>offset>gcd is already ordered so that the decoder sees GCD metadata first, then offset metadata, then delta metadata. No jumps, no buffering, no going back and forth. This is important because:

  • it favors SIMD-friendly decoders (tight loops with no branches or jumps)
  • fewer branches help CPU prefetching and caching

MetadataWriter/MetadataReader are minimal interfaces (8 methods each) that decouple stages from this buffering strategy. Every stage just writes/reads metadata without knowing what goes on under the hood; stages don't need to know anything about the block format or the ordering. This also makes BWC far easier: since stages only see MetadataWriter/MetadataReader, we can change the block layout or metadata ordering without touching any stage implementation.
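The gather-scatter reverse flush described above can be sketched as follows: each stage appends its metadata as a chunk during encoding, and the flush emits chunks in reverse stage order so the decoder (which runs stages in reverse) reads them strictly sequentially. Names are hypothetical, not the actual MetadataBuffer implementation:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Sketch of gather-scatter metadata buffering: chunks are gathered in
// encode order and flushed last-stage-first to match decode order.
// Hypothetical names, not the actual ES94 MetadataBuffer.
public final class ReverseFlushSketch {
    private final List<byte[]> chunks = new ArrayList<>();

    /** Called by each stage, in encode order (e.g. delta, offset, gcd). */
    public void writeStageMetadata(byte[] metadata) {
        chunks.add(metadata);
    }

    /** Flushes chunks in reverse stage order and resets for the next block. */
    public byte[] flush() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = chunks.size() - 1; i >= 0; i--) {
            out.write(chunks.get(i), 0, chunks.get(i).length);
        }
        chunks.clear();
        return out.toByteArray();
    }
}
```

With this sketch, metadata written in delta, offset, gcd order comes back as gcd, offset, delta on disk, matching the order the decoder needs.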

Member:
I see, I agree that this is a minimal interface. Can you add your explanation of why you chose not to rely on e.g. DataOutput as class-level Javadoc?

}

/** Zstandard block compression payload. */
record ZstdPayload(int compressionLevel) implements StageSpec {
Member:
Some of these stage specs, like this one, should always be the final stage spec, right? If that is the case, maybe we should have a different interface for these payloads? Like a marker interface that extends StageSpec? Then we can add a build() method to the builders that accepts this marker interface?

Contributor Author:
Same reasoning as my reply for add(...): named terminal methods like bitPack() keep callers from depending on any StageSpec type directly. The builder is the only public contract.

Comment on lines +107 to +130
public LongBuilder delta() {
specs.add(new StageSpec.DeltaStage());
return this;
}

public LongBuilder offset() {
specs.add(new StageSpec.OffsetStage());
return this;
}

public LongBuilder gcd() {
specs.add(new StageSpec.GcdStage());
return this;
}

public LongBuilder patchedPFor() {
specs.add(new StageSpec.PatchedPForStage());
return this;
}

public LongBuilder xor() {
specs.add(new StageSpec.XorStage());
return this;
}
Member:
Maybe add one add(...) method that accepts a stage? That way we can reduce the number of methods here.

salvatore-campagna (Contributor Author) commented Mar 5, 2026:
I kept named transform methods (delta(), offset(), gcd()) because when we add type-specific transforms later, the compiler can enforce that they only appear on the right builder. For instance, we don't want users to use ALP or CHIMP for integer encoding. This way codec construction is safe and the likelihood of mistakes is reduced. With 3 transforms currently, the boilerplate is minimal.

For example, ALP is a double-specific transform. With named methods on typed builders, a caller physically cannot apply ALP to an integer pipeline; the code won't compile. A generic add(TransformSpec) would instead accept new AlpStage() on a LongBuilder, since both are just TransformSpec, silently producing an invalid pipeline that only fails at runtime. So I prefer the builder to restrict construction via the type system. Detecting these mismatches at runtime is much harder, and they could easily slip into production unnoticed.

salvatore-campagna (Contributor Author) commented Mar 5, 2026:
Also, named methods keep the concrete StageSpec record types out of the caller's API. Callers just write .delta() without importing or constructing specific records. With a generic add(TransformSpec) every call site would need new StageSpec.DeltaStage(), leaking internal types into the public surface. This might result in unwanted dependencies for consumers of the API. Beyond the type-safety argument: named methods keep the mapping layer and codec layer dependency-free. The mapping side just expresses intent (.delta().offset().gcd().bitPack()) without importing or constructing codec-internal types.
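The typed-builder argument above can be illustrated with a toy sketch: transform methods live on a type-specific builder, a terminal method ends the chain, and a double-only stage such as ALP simply has no method on the long builder, so misuse fails at compile time. Everything here is hypothetical scaffolding that mirrors the PR's API shape, not the actual PipelineConfig code:

```java
// Toy sketch of the typed fluent-builder idea: long-only transforms on
// LongBuilder, terminal bitPack() ends the chain. Hypothetical names;
// the pipeline is modeled as a string purely for illustration.
public final class BuilderSketch {

    public static final class LongBuilder {
        private final StringBuilder stages = new StringBuilder();

        public LongBuilder delta()  { stages.append("delta>");  return this; }
        public LongBuilder offset() { stages.append("offset>"); return this; }
        public LongBuilder gcd()    { stages.append("gcd>");    return this; }

        /** Terminal method: closes the chain, no further transforms possible. */
        public String bitPack() { return stages.append("bitPack").toString(); }

        // Note there is no alp() here: ALP is double-only, so applying it
        // to a long pipeline is impossible at compile time.
    }
}
```

A call site then reads as pure intent, without importing any stage record types: `new BuilderSketch.LongBuilder().delta().offset().gcd().bitPack()`.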

Comment on lines +132 to +154
public PipelineConfig bitPack() {
specs.add(new StageSpec.BitPackPayload());
return new PipelineConfig(PipelineDescriptor.DataType.LONG, blockSize, specs);
}

public PipelineConfig zstd() {
specs.add(new StageSpec.ZstdPayload());
return new PipelineConfig(PipelineDescriptor.DataType.LONG, blockSize, specs);
}

public PipelineConfig zstd(int compressionLevel) {
specs.add(new StageSpec.ZstdPayload(compressionLevel));
return new PipelineConfig(PipelineDescriptor.DataType.LONG, blockSize, specs);
}

public PipelineConfig lz4() {
specs.add(new StageSpec.Lz4Payload());
return new PipelineConfig(PipelineDescriptor.DataType.LONG, blockSize, specs);
}

public PipelineConfig lz4HighCompression() {
specs.add(new StageSpec.Lz4Payload(true));
return new PipelineConfig(PipelineDescriptor.DataType.LONG, blockSize, specs);
Member:
These are like build() methods, right? Based on my previous comment about final stage specs, let's have one build() method here?

Contributor Author:
See my comment above: #143589 (comment)

Remove transform and payload stages not needed for ES819 codec parity:
PatchedPFor, Xor, ALP, FPC, Zstd, Lz4, Gorilla, Chimp, Chimp128.
Retain only delta, offset, gcd, and bitpack. Additional stages will
be introduced in subsequent PRs alongside their implementations.

Add missing Javadoc tags to satisfy doclint: @param on record
components, @return on builder and id() methods, @throws on
writeStageMetadata, Javadoc on enum constants and clear().

Split StageSpec into TransformSpec (chainable transforms) and
PayloadSpec (terminal payload stages) for internal type safety.
coderabbitai bot commented Mar 6, 2026

Review skipped: auto reviews are limited based on label configuration. Required labels (at least one): Team:Delivery, Team:Search - Inference.

martijnvg (Member) left a comment:
Thanks Salvatore, some minor comments, LGTM otherwise.

*
* @return the pipeline configuration
*/
public PipelineConfig bitPack() {
Member:
Maybe rename to buildUsingBitpack() (or something else) to indicate that this is a terminal payload and no other transform stages can be added?

* @param pipelineLength the number of stages in the pipeline
* @param metadataCapacity the initial metadata buffer capacity in bytes
*/
EncodingContext(int blockSize, int pipelineLength, int metadataCapacity) {
Member:
This constructor is unused, maybe remove it?

salvatore-campagna and others added 5 commits March 6, 2026 11:00
…ning design rationale

Explains why we use dedicated interfaces instead of Lucene's DataOutput/DataInput
or Elasticsearch's StreamOutput/StreamInput: the block layout places metadata after
the payload on disk, requiring encode-side buffering to give the decoder a clean
single forward pass with no seeking or buffering.
3 participants