
time series es819 binary dv use up to a 1mb block size#143049

Merged
martijnvg merged 10 commits into elastic:main from
martijnvg:binary_dv_increase_block_size_1mb
Mar 10, 2026

@martijnvg
Member

@martijnvg martijnvg commented Feb 25, 2026

Change the time series doc values format to increase the block size threshold for binary doc values from 128kb to 1mb. The value count threshold is increased from 1024 to 32768. This change is gated behind an index setting, and the index setting is gated behind a feature flag, which allows experimenting with this change in benchmarks.
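To illustrate how the two thresholds interact, here is a minimal sketch. It assumes a block is closed as soon as either the byte threshold or the value count threshold is reached; the PR description does not state this, and the class, method, and constant names are hypothetical, not taken from the Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: groups variable-length binary values into blocks. */
public class BlockThresholdSketch {
    // Threshold values come from the PR description; the names are illustrative.
    static final int OLD_BLOCK_BYTES = 128 * 1024;
    static final int OLD_BLOCK_COUNT = 1024;
    static final int NEW_BLOCK_BYTES = 1024 * 1024;
    static final int NEW_BLOCK_COUNT = 32768;

    /** Returns the number of values placed in each block. */
    static List<Integer> blockSizes(int[] valueLengths, int maxBytes, int maxCount) {
        List<Integer> blocks = new ArrayList<>();
        int bytes = 0;
        int count = 0;
        for (int len : valueLengths) {
            bytes += len;
            count++;
            // Assumption: close the block once either threshold is reached.
            if (bytes >= maxBytes || count >= maxCount) {
                blocks.add(count);
                bytes = 0;
                count = 0;
            }
        }
        if (count > 0) {
            blocks.add(count); // trailing partial block
        }
        return blocks;
    }

    public static void main(String[] args) {
        int[] lengths = new int[100_000];
        Arrays.fill(lengths, 64); // 100k values of 64 bytes each
        System.out.println("old blocks: " + blockSizes(lengths, OLD_BLOCK_BYTES, OLD_BLOCK_COUNT).size());
        System.out.println("new blocks: " + blockSizes(lengths, NEW_BLOCK_BYTES, NEW_BLOCK_COUNT).size());
    }
}
```

With small values the count threshold tends to close blocks first under the old limits, which may be why both limits were raised together; larger blocks give the compressor more data to work with per block.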

Ad hoc benchmark runs against the clickbench rally track show a good decrease (16.18GB to 15.20GB) in disk usage and somewhat higher indexing throughput, while query latency stays within noise boundaries: https://esbench-metrics.kb.us-east-2.aws.elastic-cloud.com:9243/app/r/s/lApCT

and no much higher value count threshold.
@martijnvg
Member Author

Buildkite benchmark this with clickbench-columnar-mode please

@martijnvg
Member Author

Buildkite benchmark this with clickbench-columnar-mode please

@martijnvg
Member Author

Buildkite benchmark this with elastic-logs-logsdb please

@martijnvg martijnvg marked this pull request as ready for review March 6, 2026 14:40
@martijnvg martijnvg requested a review from a team as a code owner March 6, 2026 14:40
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@martijnvg martijnvg changed the title from "es819 binary dv experiment with 1mb block size" to "time series es819 binary dv use up to a 1mb block size" Mar 6, 2026
      static final DocValuesFormat ES_819_3_TSDB_DOC_VALUES_FORMAT = new ES819Version3TSDBDocValuesFormat();
    - static final DocValuesFormat ES_819_3_TSDB_DOC_VALUES_FORMAT_LARGE_NUMERIC_BLOCK = new ES819Version3TSDBDocValuesFormat(true);
    + static final DocValuesFormat ES_819_3_TSDB_DOC_VALUES_FORMAT_LARGE_BINARY_BLOCK = new ES819Version3TSDBDocValuesFormat(true, false);
    + static final DocValuesFormat ES_819_3_TSDB_DOC_VALUES_FORMAT_LARGE_NUMERIC_BLOCK = new ES819Version3TSDBDocValuesFormat(false, true);
Contributor

Looks like the booleans are flipped? Should be:

    static final DocValuesFormat ES_819_3_TSDB_DOC_VALUES_FORMAT_LARGE_BINARY_BLOCK = new ES819Version3TSDBDocValuesFormat(false, true);
    static final DocValuesFormat ES_819_3_TSDB_DOC_VALUES_FORMAT_LARGE_NUMERIC_BLOCK = new ES819Version3TSDBDocValuesFormat(true, false);
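
Two adjacent boolean arguments are easy to swap, as happened here. One way to make such a mix-up harder is to name each flag at the call site. The sketch below is purely illustrative: `FormatOptions` and its methods are hypothetical and are not part of `ES819Version3TSDBDocValuesFormat`, and the parameter order (numeric first, binary second) is only inferred from the correction suggested above.

```java
public class FormatFlagsSketch {
    /** Hypothetical stand-in for the two-boolean constructor shown in the diff. */
    record FormatOptions(boolean largeNumericBlock, boolean largeBinaryBlock) {
        static FormatOptions defaults() {
            return new FormatOptions(false, false);
        }

        // Each method names the flag it sets, so call sites cannot silently swap them.
        FormatOptions withLargeNumericBlock() {
            return new FormatOptions(true, largeBinaryBlock);
        }

        FormatOptions withLargeBinaryBlock() {
            return new FormatOptions(largeNumericBlock, true);
        }
    }

    public static void main(String[] args) {
        FormatOptions binary = FormatOptions.defaults().withLargeBinaryBlock();
        FormatOptions numeric = FormatOptions.defaults().withLargeNumericBlock();
        System.out.println(binary);
        System.out.println(numeric);
    }
}
```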

Member Author

yikes 😱

@martijnvg
Member Author

Buildkite benchmark this with clickbench-columnar-mode please

@martijnvg
Member Author

Running another benchmark just to make sure that larger binary blocks are being used.

@elasticmachine
Collaborator

elasticmachine commented Mar 9, 2026

💚 Build Succeeded

This build ran two clickbench-columnar-mode benchmarks to evaluate performance impact of this PR.

@martijnvg martijnvg requested a review from parkertimmins March 9, 2026 16:46
Contributor

@parkertimmins parkertimmins left a comment

👍

@martijnvg martijnvg enabled auto-merge (squash) March 10, 2026 08:27
@martijnvg martijnvg merged commit effbfb2 into elastic:main Mar 10, 2026
36 checks passed
martijnvg added a commit to martijnvg/rally-tracks that referenced this pull request Mar 10, 2026
By setting `index.use_time_series_doc_values_format_large_binary_block_size` to `true`. This was recently added via elastic/elasticsearch#143049.
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 10, 2026
…locations

* upstream/main: (126 commits)
  Update KnnIndexTester to use more settings from datasets (elastic#143869)
  fix: dynamic template vector array is overridden by automatic dense_vector mapping (elastic#143733)
  ES|QL: Don't reuse the same alias for _fork column (elastic#143909)
  Close and initialize clients after each node upgrade in logsdb rolling upgrade tests. (elastic#143823)
  ESQL: Added GroupedTopNOperator for LIMIT BY, compute only (elastic#143476)
  Handle views in ResolveIndexAction (elastic#143561)
  Improve reindex rethrottle API in stateless (elastic#143771)
  Use a copy of the SearchExecutionContext for each Percolator execution (elastic#142765)
  Log the stacktrace when we encounter a deprecation warning for `default_metric` (elastic#143929)
  ESQL: evaluate ReferenceAttributes to potentially FieldAttributes for full-text functions restriction (elastic#143893)
  Add ClusterStateSerializationStats Serializatation Tests (elastic#142703)
  Adds Coordination Diagnostics Tests (elastic#142709)
  Upgrade Elasticsearch to Apache Lucene 10.4 (elastic#141882)
  ESQL: Add configurable bracket-based multi-value support for CSV reader (elastic#143890)
  time series es819 binary dv use up to a 1mb block size (elastic#143049)
  Dynamically enable / disable plugins in correspondence to stateless mode. (elastic#142147)
  ES|QL: Implement first/last_over_time for tdigest (elastic#143832)
  Document CHANGE_POINT limitation (elastic#143877)
  Fix OperationsOnSeqNoDisabledIndicesIT (elastic#143892)
  [Test] Test that sequence numbers are not pruned with retention lease (elastic#143825)
  ...
martijnvg added a commit to elastic/rally-tracks that referenced this pull request Mar 11, 2026
By setting index.use_time_series_doc_values_format_large_binary_block_size to true in elastic/logs and elastic/security benchmarks. This was recently added via elastic/elasticsearch#143049.