Skip to content

Fix AsyncSearchIndexServiceTests.testCircuitBreaker failure#144058

Merged
chrisparrinello merged 8 commits intoelastic:mainfrom
chrisparrinello:async-circuit-breaker-fail
Mar 13, 2026
Merged

Fix AsyncSearchIndexServiceTests.testCircuitBreaker failure#144058
chrisparrinello merged 8 commits intoelastic:mainfrom
chrisparrinello:async-circuit-breaker-fail

Conversation

@chrisparrinello
Copy link
Copy Markdown
Contributor

@chrisparrinello chrisparrinello commented Mar 11, 2026

Closes #143411.

The circuit breaker test did not take into account the overhead in ByteArrayWrapper when picking a size of the circuit breaker to be large enough to accommodate the page and ByteArrayWrapper. With an unlucky random seed, the test would fail immediately for large enough pages.

@chrisparrinello chrisparrinello marked this pull request as ready for review March 12, 2026 14:54
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Mar 12, 2026
@chrisparrinello chrisparrinello added >test Issues or PRs that are addressing/adding tests :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch and removed needs:triage Requires assignment of a team area label labels Mar 12, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

* ByteArrayWrapper shallow size (object header + fields) + RamUsageEstimator.sizeOf(array), which exceeds PAGE_SIZE_IN_BYTES.
*/
private static final int MIN_CIRCUIT_BREAKER_LIMIT_FOR_INITIAL_BUFFER = Math.toIntExact(
RamUsageEstimator.sizeOf(new byte[PageCacheRecycler.PAGE_SIZE_IN_BYTES]) + 64L
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, we had 16* 1024 = 16384

In the new implementation, PAGE_SIZE_IN_BYTES = 1 << 14 -> 16384 (I guess that was the reason added the 16*1024 in the tests)

The RamUsageEstimator.sizeOf(new byte[PageCacheRecycler.PAGE_SIZE_IN_BYTES]) returns 16400

Why do we add +64 in the formula?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That buffer is wrapped in a ByteArrayWrapper whose size is also taken into account in the circuit breaker. If the random circuit breaker size is on the lower end between PAGE_SIZE_IN_BYTES and PAGE_SIZE_IN_BYTES plus the overhead, the test fails.

I dug into the code some more and 64 might be a little high. Running RamUsageEstimator.shallowSizeOfInstance(ByteArrayWrapper.class) shows a size of 40 bytes. I've changed to that and ran 1k iterations without a failure.

Copy link
Copy Markdown
Contributor

@drempapis drempapis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you Chris for the details

@chrisparrinello chrisparrinello merged commit fbb6260 into elastic:main Mar 13, 2026
35 checks passed
@chrisparrinello chrisparrinello deleted the async-circuit-breaker-fail branch March 13, 2026 18:12
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 13, 2026
…elocations

* upstream/main: (72 commits)
  [Test] Randomly disable sequence numbers in CcrTimeSeriesDataStreamsIT (elastic#143930)
  Fix AsyncSearchIndexServiceTests.testCircuitBreaker failure (elastic#144058)
  Refine GenerativeIT some more, this time with accounting for some added (elastic#144220)
  ESQL: Physical Planning on the Lookup Node (elastic#143707)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:approximation.Approximate stats by with zero variance} elastic#144240
  Trigger counter metrics in test for delta temporality measurements (elastic#144193)
  fix capabiltiy approximation_v3 (elastic#144230)
  [ci] Add PR pipeline for testing ipv6 and fix tests not working with ipv6 (elastic#140473)
  update (elastic#144095)
  Make from/to optional in TBUCKET when Kibana timestamp filter is present (elastic#144057)
  Extract reroute behavior from create-index request classes (elastic#144140)
  ESQL: Fix release build only failures (elastic#144122)
  ES|QL query approximation: move sample correction to data node (elastic#144005)
  Add indexing pressure tracking to OTLP endpoints (elastic#144009)
  Fix replica writes after _seq_no doc values are pruned (elastic#144180)
  allow tests to configure supportsLoadingConfig (elastic#144061)
  [ES|QL] Unmute testGiantTextFieldInSubqueryIntermediateResultsWithSort (elastic#144126)
  [ESQL][DOCS] Add CPS page (unpublished for moment) (elastic#144206)
  ESQL: Forbid "load" unmapped_fields for certain commands (elastic#144115)
  Add CCS Remote Views Detection (elastic#143384)
  ...
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Mar 16, 2026
…144058)

Closes elastic#143411.

The circuit breaker test did not take into account the overhead in ByteArrayWrapper when picking a size of the circuit breaker to be large enough to accommodate the page and ByteArrayWrapper. With an unlucky random seed, the test would fail immediately for large enough pages.
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
…144058)

Closes elastic#143411.

The circuit breaker test did not take into account the overhead in ByteArrayWrapper when picking a size of the circuit breaker to be large enough to accommodate the page and ByteArrayWrapper. With an unlucky random seed, the test would fail immediately for large enough pages.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch >test Issues or PRs that are addressing/adding tests v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AsyncSearchIndexServiceTests.testCircuitBreaker fails

3 participants