Skip to content

Fix circuit breaker leak in BreakingTDigestHolder#143873

Merged
JonasKunz merged 6 commits intoelastic:mainfrom
JonasKunz:tdigestbuilder-cranky-fix
Mar 11, 2026
Merged

Fix circuit breaker leak in BreakingTDigestHolder#143873
JonasKunz merged 6 commits intoelastic:mainfrom
JonasKunz:tdigestbuilder-cranky-fix

Conversation

@JonasKunz
Copy link
Copy Markdown
Contributor

Fixes #143860.

There indeed was a code path in BreakingTDigestHolder which could cause a circuit breaker memory leak if the breaker trips.

This confused me a little, because it should have been caught by BreakingTDigestHolderTests.testCrankyCircuitBreaker, as I ran that with many iterations before merging the PR that introduced it.
As it turns out, I simply forgot to assert the circuit breaker reserved bytes 🤦

@JonasKunz JonasKunz self-assigned this Mar 9, 2026
@JonasKunz JonasKunz added >test Issues or PRs that are addressing/adding tests :StorageEngine/ES|QL Timeseries / metrics / PromQL / logsdb capabilities in ES|QL labels Mar 9, 2026
@JonasKunz JonasKunz requested a review from kkrik-es March 9, 2026 14:59
@elasticsearchmachine elasticsearchmachine added external-contributor Pull request authored by a developer outside the Elasticsearch team v9.4.0 Team:StorageEngine labels Mar 9, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

this.breaker = breaker;
this.delegate = new BreakingBytesRefBuilder(breaker, CIRCUIT_BREAKER_LABEL);
breaker.addEstimateBytesAndMaybeBreak(SHALLOW_SIZE, CIRCUIT_BREAKER_LABEL);
BreakingBytesRefBuilder delegate = new BreakingBytesRefBuilder(breaker, CIRCUIT_BREAKER_LABEL);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be called after adding the estimated bytes to the breaker, no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BreakingBytesRefBuilder performs the memory accounting for itself in the constructor.
The addEstimateBytesAndMaybeBreak is just for the shallow object size of BreakingBytesStreamOutput.
Technically, BreakingBytesStreamOutput has already been allocated here because we are in the constructor and we adjust the breaker afterwards. However, the object is so small that this overshoot doesn't really matter, BreakingBytesRefBuilder does the same.

You shouldn't do this if e.g. you are allocating large arrays: In that case you should make sure to adjust the breaker before you allocate. For objects this usually means adding a factory method which does the accounting and only calling the constructor at the end of it. However, for such small objects this is not worth the additional lines of code.

@JonasKunz JonasKunz requested a review from kkrik-es March 10, 2026 07:57
@JonasKunz JonasKunz enabled auto-merge (squash) March 10, 2026 12:40
@JonasKunz JonasKunz merged commit d1518cc into elastic:main Mar 11, 2026
36 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 11, 2026
…elocations

* upstream/main: (54 commits)
  [ES|QL|DS] Wire parallel parsing into production for text formats (elastic#143997)
  ESQL: Allow EXTERNAL commands be run part of the CsvTests suite (elastic#143970)
  [ESQL] Push stats to external source via metadata (elastic#143940)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:approximation.Approximate stats with stats where} elastic#144051
  Refactored SortedNumericDocValuesSyntheticFieldLoader into a Layer (elastic#143912)
  Enable extended doc_values params feature flag in RandomizedRollingUpgradeIT (elastic#143918)
  Mute org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT test {csv-spec:approximation.Approximate stats with sample} elastic#144022
  Ensure we use float values for rolling upgrade float vectors (elastic#144032)
  Remove sensitive info from reindex task description (elastic#143635)
  Fix HistogramUnionState.equals (elastic#143990)
  Use dedicated IndexRouting API in ShardSplittingQuery (elastic#143776)
  Engine/Store DistributedArchitectureGuide doc (elastic#143818)
  Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testDeletesAreBatched elastic#144034
  Avoid serializing exceptions as JSON in remote write endpoint (elastic#143987)
  allow testLoadDocSequenceReturnsCorrectResultsText to circuit break, it happens in serverless occasionally (elastic#144023)
  [ESQL] Adds memory accounting to GroupedLimitOperator (elastic#143941)
  Adjust ESIntegTestCase.getLiveDocs method to account for pruned sequence numbers (elastic#143999)
  Support target bucket count in `TBUCKET` with explicit from/to date range (elastic#142747)
  TSDBDocValuesFormatSingleNodeTests with and without synthetic id (elastic#144002)
  Fix circuit breaker leak in BreakingTDigestHolder (elastic#143873)
  ...
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external-contributor Pull request authored by a developer outside the Elasticsearch team :StorageEngine/ES|QL Timeseries / metrics / PromQL / logsdb capabilities in ES|QL Team:StorageEngine >test Issues or PRs that are addressing/adding tests v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] HistogramMergeTDigestAggregatorFunctionTests testSimpleWithCranky failing

3 participants