Skip to content

Use XPerFieldDocValuesFormat in AbstractTSDBSyntheticIdCodec#144744

Merged
tlrx merged 10 commits intoelastic:mainfrom
tlrx:2026/03/23-xperfielddocvaluesformat
Mar 26, 2026
Merged

Use XPerFieldDocValuesFormat in AbstractTSDBSyntheticIdCodec#144744
tlrx merged 10 commits intoelastic:mainfrom
tlrx:2026/03/23-xperfielddocvaluesformat

Conversation

@tlrx
Copy link
Copy Markdown
Member

@tlrx tlrx commented Mar 23, 2026

The TSDB optimized merge logic in DocValuesConsumerUtil requires the doc values producer to be an instance of XPerFieldDocValuesFormat.FieldsReader to access field-specific producers and use optimized merges.

Without explicitly overriding docValuesFormat() in AbstractTSDBSyntheticIdCodec, the codec would use Lucene's standard PerFieldDocValuesFormat, causing the optimized merge check to fail and fall back to the slower unoptimized path.

This commit fixes AbstractTSDBSyntheticIdCodec to also use XPerFieldDocValuesFormat.FieldsReader.

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@tlrx tlrx requested review from burqen, dnhatn, fcofdez, martijnvg and romseygeek and removed request for martijnvg March 24, 2026 09:13
@elastic elastic deleted a comment from elasticmachine Mar 24, 2026
@tlrx
Copy link
Copy Markdown
Member Author

tlrx commented Mar 24, 2026

Buildkite benchmark this with tsdb-metricsgen-270m please

@fcofdez
Copy link
Copy Markdown
Contributor

fcofdez commented Mar 24, 2026

I've executed the tsdb-metricsgen track with this change and compared against a baseline run without synthetic ids enabled, and these are the preliminary results (comparing against all runs):

Cumulative merge time of primary shards                                                                         -5.24%          20
Cumulative merge count of primary shards                                                                        +3.35%          20
Min cumulative merge time across primary shard                                                                  -7.59%          20
Median cumulative merge time across primary shard                                                               -2.98%          20
Max cumulative merge time across primary shard                                                                  -5.09%          20

Min Throughput                                                 index                                            -0.20%          20
Mean Throughput                                                index                                            +2.01%          20
Median Throughput                                              index                                            +1.96%          20
Max Throughput                                                 index                                            +2.09%          20

Then I pulled results from a previous run where this fix was not applied yet:

Cumulative merge time of primary shards                                                                                                                                +31.47%          25
Cumulative merge count of primary shards                                                                                                                                +3.35%          25
Min cumulative merge time across primary shard                                                                                                                         +25.14%          25
Median cumulative merge time across primary shard                                                                                                                      +35.17%          25
Max cumulative merge time across primary shard                                                                                                                         +33.93%          25

Min Throughput                                                                                                         index                                            -2.45%          25
Mean Throughput                                                                                                        index                                            +0.31%          25
Median Throughput                                                                                                      index                                            -0.05%          25
Max Throughput                                                                                                         index                                            +1.24%          25

As we can observe, with the change, the merge times are similar/better than the baseline.

Copy link
Copy Markdown
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

We should look into making it easier to test that a Codec will use the optimized merge paths here as this is very easy to miss. I'll open an issue and see if I can build something useful.

@romseygeek
Copy link
Copy Markdown
Contributor

I opened #144834

Copy link
Copy Markdown
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tlrx!

@tlrx
Copy link
Copy Markdown
Member Author

tlrx commented Mar 24, 2026

Benchmark results are here but contender has a higher merge time, which is surprising and not aligned with our other benchmarks 🤔

Buildkite benchmark this with tsdb-metricsgen-270m please

Copy link
Copy Markdown
Contributor

@burqen burqen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff! Just have small comment.

@tlrx
Copy link
Copy Markdown
Member Author

tlrx commented Mar 25, 2026

Buildkite benchmark this with tsdb-metricsgen-270m please

@elasticmachine
Copy link
Copy Markdown
Collaborator

elasticmachine commented Mar 25, 2026

💚 Build Succeeded

This build ran two tsdb-metricsgen-270m benchmarks to evaluate performance impact of this PR.

History

@tlrx tlrx merged commit fd5c450 into elastic:main Mar 26, 2026
36 checks passed
@tlrx tlrx deleted the 2026/03/23-xperfielddocvaluesformat branch March 26, 2026 08:50
@tlrx
Copy link
Copy Markdown
Member Author

tlrx commented Mar 26, 2026

Thanks everyone!

szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 26, 2026
* upstream/main: (146 commits)
  Revert "[Native] Gradle-related tweaks to improve handling of the simdvec native library (elastic#144539)"
  Fix ArrayIndexOutOfBoundsException in fetch phase with partial results (elastic#144385)
  ESQL: Correctly manage NULL data type for SUM (elastic#144942)
  [ESQL] Fixes GroupedTopNBenchmark not executing (elastic#144944)
  Fix reader context leak when query response serialization fails (elastic#144708)
  Validate individual offset values in BULK_OFFSETS bounds checks (elastic#144643)
  Merge main21 source set into main in simdvec (elastic#144921)
  [TEST] Unmute TsidExtractingIdFieldMapperTests (elastic#144848)
  [Native] Gradle-related tweaks to improve handling of the simdvec native library (elastic#144539)
  Fix `ThreadedActionListenerTests#testRejectionHandling` (elastic#144795)
  Add new DLM Frozen Tier Transition execution plugin and service (elastic#144595)
  Prometheus: execute query_range via parsed EsqlStatement plan (elastic#144416)
  Investigate `testBulkIndexingRequestSplitting` failure (elastic#144766)
  Add test utility for wrapping directories in FilterDirectory layer (elastic#143563)
  Fix ES|QL decay tests with negative scale (elastic#144657)
  Fix circuit breaker leak in percolator query construction (elastic#144827)
  Use XPerFieldDocValuesFormat in AbstractTSDBSyntheticIdCodec (elastic#144744)
  [DOCS] Document how reindex work in CPS (elastic#144016)
  Fix Int4 vector library tests failing on Java 21 (elastic#144830)
  [DiskBBQ] Fix index sorting on flush (elastic#144938)
  ...
seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 26, 2026
…#144744)

The TSDB optimized merge logic in DocValuesConsumerUtil requires the doc values producer to be an instance of XPerFieldDocValuesFormat.FieldsReader to access field-specific producers and use optimized merges.

Without explicitly overriding docValuesFormat() in AbstractTSDBSyntheticIdCodec, the codec would use Lucene's standard PerFieldDocValuesFormat, causing the optimized merge check to fail and fall back to the slower unoptimized path.

This commit fixes AbstractTSDBSyntheticIdCodec to also use XPerFieldDocValuesFormat.FieldsReader.
seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 27, 2026
…#144744)

The TSDB optimized merge logic in DocValuesConsumerUtil requires the doc values producer to be an instance of XPerFieldDocValuesFormat.FieldsReader to access field-specific producers and use optimized merges.

Without explicitly overriding docValuesFormat() in AbstractTSDBSyntheticIdCodec, the codec would use Lucene's standard PerFieldDocValuesFormat, causing the optimized merge check to fail and fall back to the slower unoptimized path.

This commit fixes AbstractTSDBSyntheticIdCodec to also use XPerFieldDocValuesFormat.FieldsReader.
mamazzol pushed a commit to mamazzol/elasticsearch that referenced this pull request Mar 30, 2026
…#144744)

The TSDB optimized merge logic in DocValuesConsumerUtil requires the doc values producer to be an instance of XPerFieldDocValuesFormat.FieldsReader to access field-specific producers and use optimized merges.

Without explicitly overriding docValuesFormat() in AbstractTSDBSyntheticIdCodec, the codec would use Lucene's standard PerFieldDocValuesFormat, causing the optimized merge check to fail and fall back to the slower unoptimized path.

This commit fixes AbstractTSDBSyntheticIdCodec to also use XPerFieldDocValuesFormat.FieldsReader.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants