[Saved Objects] Fix scroll_count test 502 on ECH by reducing import batch size #264965
Merged
Conversation
…atch size

Reduces the import batch size in `beforeAll` from 6,000 to 1,000 objects per call to avoid the ECH HAProxy timeout when importing large datasets.

Fixes elastic#262663

Made-with: Cursor
Contributor
Pinging @elastic/kibana-core (Team:Core)
jesuswr
approved these changes
Apr 22, 2026
tiansivive
pushed a commit
to tiansivive/kibana
that referenced
this pull request
Apr 23, 2026
…atch size (elastic#264965)

## Summary

Fixes elastic#262663

The `scroll_count - more than 10k objects` test was failing on ECH (Elastic Cloud Hosted) with a 502 during `beforeAll` setup, while passing on Serverless.

The failure was **not** in the `scroll_count` route itself, but in the test setup step that imports 12,000 visualizations to populate the dataset. The setup used 2 batches of 6,000 objects each via `POST /api/saved_objects/_import`.

On ECH, traffic flows through HAProxy which enforces a client-side timeout (typically 60 s). Importing 6,000 objects in a single call requires streaming and parsing a ~1 MB NDJSON multipart payload, multiple bulk-index calls to Elasticsearch, and building a response with `successResults` containing one entry per object (~600 KB JSON). If Kibana takes longer than the HAProxy timeout, the proxy closes the connection and returns 502.

**Fix:** Reduce the import batch size from 6,000 to 1,000 objects per call (12 batches total instead of 2). Each batch is ~165 KB request / ~100 KB response, well within ECH proxy limits and timeout. The total number of objects imported (12,000) and the test assertion remain unchanged.

## Test plan

- [x] Existing test `returns the correct count for each included types` validates the fix end-to-end (still asserts `{ visualization: 12000 }`)
- [ ] Verify the test passes on `cloud-stateful-classic` (ECH) in CI

Made with [Cursor](https://cursor.com)
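The batched setup described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual FTR test code: `importAllBatches`, `importBatch`, and the object shape are assumed names standing in for whatever helper posts each NDJSON payload to `POST /api/saved_objects/_import`.

```typescript
// Hypothetical sketch of the fixed beforeAll setup: import 12,000
// visualizations in 1,000-object NDJSON batches instead of 2 x 6,000,
// so each request stays well under the ECH HAProxy timeout.
const BATCH_SIZE = 1000;
const TOTAL = 12000;

async function importAllBatches(
  // importBatch is a stand-in for the call that POSTs one NDJSON payload
  // to /api/saved_objects/_import as a multipart file upload.
  importBatch: (ndjson: string) => Promise<void>
): Promise<void> {
  for (let start = 0; start < TOTAL; start += BATCH_SIZE) {
    // Build one newline-delimited JSON document per saved object.
    const ndjson = Array.from({ length: BATCH_SIZE }, (_, i) => {
      const n = start + i;
      return JSON.stringify({
        type: 'visualization',
        id: `vis-${n}`,
        attributes: { title: `vis ${n}` },
      });
    }).join('\n');
    await importBatch(ndjson); // 12 sequential requests of 1,000 objects each
  }
}
```

Note that these batches run sequentially; as the follow-up PR #266628 below explains, that sequencing later collided with the `beforeAll` hook timeout.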
SoniaSanzV
pushed a commit
to SoniaSanzV/kibana
that referenced
this pull request
Apr 27, 2026
…atch size (elastic#264965)
1 task
gsoldevila
added a commit
that referenced
this pull request
May 6, 2026
…ning import batches concurrently (#266628)

## Summary

Fixes #262663 (re-opened after #264965)

### What happened

PR #264965 fixed the per-request 502 errors on ECH by reducing the import batch size from 6,000 → 1,000 objects per call, keeping each request safely under HAProxy's ~60 s per-connection timeout.

However, the 12 batches now run **sequentially**, and their cumulative time exceeds the `beforeAll` hook's 120 s limit:

```
"beforeAll" hook timeout of 120000ms exceeded.
```

Each 1,000-object `_import` call on ECH takes ~10–15 s (NDJSON multipart parsing + Kibana bulk-index + response building). 12 × 10–15 s = 120–180 s.

### Fix

Run the import batches **concurrently** (`CONCURRENCY = 3`) rather than sequentially, and increase the hook timeout to 5 minutes as a generous safety net.

```
12 batches ÷ 3 concurrent = 4 rounds × ~15 s/round ≈ 60 s total
```

Even if server load slows concurrent batches to ~60 s each, 4 rounds × 60 s = 240 s — still well within the 300 s limit. Each individual request remains at 1,000 objects, so no per-request HAProxy concern.

This keeps the full Kibana `_import` pipeline in the setup path (proper SO migration, correct index routing, correct namespace handling), avoiding any coupling to internal storage format details.

### Why not insert directly via `esClient.bulk`?

A direct ES bulk insert was considered but rejected: it would bypass Kibana's saved-objects migration pipeline, requiring the test to manually track the correct index name (`ANALYTICS_SAVED_OBJECT_INDEX`), document format, and migration version fields. Any change to the `visualization` type registration or storage model would silently break the test setup without a compile-time or schema error.

## Test plan

- [ ] Verify `returns the correct count for each included types` passes on `cloud-stateful-classic` (ECH) in CI

Made with [Cursor](https://cursor.com)
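The bounded-concurrency pattern described in the fix can be sketched as below. This is a minimal illustrative helper, not the actual test code from #266628: `runWithConcurrency` and `CONCURRENCY` are assumed names, and the worker stands in for one 1,000-object `_import` call.

```typescript
// Hypothetical sketch of running import batches with bounded concurrency:
// at most `concurrency` workers are in flight at once, so with 12 batches
// and CONCURRENCY = 3 the imports complete in ~4 rounds.
async function runWithConcurrency<T, R>(
  items: T[],
  concurrency: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each "lane" repeatedly claims the next unclaimed index and processes it,
  // so all lanes stay busy until the item list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an index before awaiting (single-threaded JS)
      results[i] = await worker(items[i]);
    }
  }
  await Promise.all(Array.from({ length: concurrency }, () => lane()));
  return results;
}
```

Results are stored by original index, so the caller sees them in batch order regardless of which lane finished first.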
Summary
Fixes #262663
The `scroll_count - more than 10k objects` test was failing on ECH (Elastic Cloud Hosted) with a 502 during `beforeAll` setup, while passing on Serverless.

The failure was not in the `scroll_count` route itself, but in the test setup step that imports 12,000 visualizations to populate the dataset. The setup used 2 batches of 6,000 objects each via `POST /api/saved_objects/_import`.

On ECH, traffic flows through HAProxy which enforces a client-side timeout (typically 60 s). Importing 6,000 objects in a single call requires streaming and parsing a ~1 MB NDJSON multipart payload, multiple bulk-index calls to Elasticsearch, and building a response with `successResults` containing one entry per object (~600 KB JSON). If Kibana takes longer than the HAProxy timeout, the proxy closes the connection and returns 502.

Fix: Reduce the import batch size from 6,000 to 1,000 objects per call (12 batches total instead of 2). Each batch is ~165 KB request / ~100 KB response, well within ECH proxy limits and timeout. The total number of objects imported (12,000) and the test assertion remain unchanged.
Test plan
- [x] Existing test `returns the correct count for each included types` validates the fix end-to-end (still asserts `{ visualization: 12000 }`)
- [ ] Verify the test passes on `cloud-stateful-classic` (ECH) in CI

Made with Cursor