[Streams] Add scalability performance journeys for Streams #252288
pipeline_file: .buildkite/pipelines/performance/streams_weekly.yml
provider_settings:
  trigger_mode: none
  build_branches: true
I'm not sure if this should be true. Afaik, this means something like: "start a build when a new branch appears". Let's start with this and observe if it starts builds for random branches
Good catch, changed to false. It’s intended to run scheduled weekly on main, no need to have true here
💛 Build succeeded, but was flaky
flash1293 left a comment
Seems like the description isn't fully up to date anymore.
About the mapping - for classic streams we should test with more fields (up to 10k), it's something that does happen in practice (but can also happen on a separate PR).
What's our story for getting notified about these / acting on them?
 * 'import' for content pack bulk import (Phase 5B, scales to 1000+)
 * @param count - Number of child streams to create
 */
export async function createLargeWiredHierarchy(
Looks like this isn't used anywhere?
Slipped through during cleanup, thanks for catching this
I think it's best to handle that in a separate PR. Once this one is merged, we'll have the dedicated pipeline set up, which will make it easier to test in the exact environment these journeys will run in. I'll create a follow-up PR after this lands.
There's a separate issue for that: https://github.com/elastic/streams-program/issues/938. Once this PR is merged and we have some data flowing, I'll start working on it.
  log.info('Wired stream hierarchy created');
}

async function ensureScaleParentStream(kibanaServer: KibanaServer, log: ToolingLog): Promise<void> {
🟡 Medium synthtrace_data/streams_data.ts:333
ensureScaleParentStream calls forkStream without retry logic for HTTP 422 lock contention. When the Streams backend is under load, the fork can fail immediately and propagate the error, causing setupLargeWiredHierarchy to fail even though other mutation operations in this file implement exponential backoff retries for the same condition.
Evidence trail:
x-pack/performance/synthtrace_data/streams_data.ts lines 333-350 (ensureScaleParentStream with no retry); lines 127-161 (createSingleClassicStream with retry logic for isLockContentionError); lines 398-410 (fork loop with lock contention retry); lines 786-806 (setupLargeWiredHierarchy calling ensureScaleParentStream)
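The retry behaviour the review comment asks for can be sketched roughly as below. This is a minimal illustration, not the code in `streams_data.ts`: the names `withLockRetry`, `isLockContention`, `RetryOptions`, and `HttpLikeError` are hypothetical, and it assumes lock contention surfaces as an error carrying HTTP status 422, as the comment describes.

```typescript
// Hypothetical error shape; the real HTTP client may expose the status differently.
interface HttpLikeError {
  status?: number;
  response?: { status?: number };
}

// Assumption: a 422 response signals lock contention in the Streams backend.
function isLockContention(err: unknown): boolean {
  const e = err as HttpLikeError | null | undefined;
  const status = e?.status ?? e?.response?.status;
  return status === 422;
}

interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the second attempt
  maxDelayMs: number; // upper bound for any single delay
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waiting
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `op`, retrying with exponential backoff only when the failure looks
// like lock contention. Any other error, or the final failed attempt, is rethrown.
async function withLockRetry<T>(op: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      if (!isLockContention(err) || attempt >= opts.maxAttempts) {
        throw err;
      }
      // Delay doubles on each attempt: base, 2x base, 4x base, capped at maxDelayMs.
      const delay = Math.min(opts.baseDelayMs * Math.pow(2, attempt - 1), opts.maxDelayMs);
      await sleep(delay);
    }
  }
}
```

Under these assumptions, the fork call inside `ensureScaleParentStream` would be passed as `op`, so a transient 422 under load is retried instead of failing the whole `setupLargeWiredHierarchy` run.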
Approvability verdict: Needs human review. 1 blocking correctness issue found. CODEOWNERS file was modified by a non-owner, which requires human review.
…52288) ## Summary Adds six `@kbn/journeys` performance journeys that validate the Streams feature at scale, covering the primary user flows across listing, detail, and management pages. These journeys are excluded from PR CI and run only in scheduled performance pipelines. The heavy wired-hierarchy journey (`streams_wired_hierarchy`) runs in a dedicated weekly Streams-only pipeline due to its large data setup. ### Journeys | Journey | Scale | What it exercises | | ------------------------- | -------------------------------- | --------------------------------------------------------------- | | `streams_listing_page` | 5 000 classic + 3 wired children | Load, search, expand/collapse, navigate to detail | | `streams_data_quality` | 3 wired children | Navigate to data quality tab, verify KPI metrics | | `streams_processing_step` | 3 wired children | Open processor form, configure grok processor, save | | `streams_retention` | 3 wired children | Open retention modal, toggle inherit, set custom retention | | `streams_field_mapping` | 3 wired children + 200 fields | Open schema flyout, add keyword field, review & submit | | `streams_wired_hierarchy` | 1 000 wired children | Expand/collapse large tree, search children, navigate to detail | ### Data creation strategies - **Classic streams at scale**: Uses ES `_bulk` API to auto-create 5 000 unmanaged data streams in batches of 250, bypassing the Streams backend global lock. Raises `cluster.max_shards_per_node` to accommodate the shard count. - **Wired hierarchy at scale**: Uses batched content pack imports (20 batches of 50 children) with retry logic for transient 409/422 errors, followed by a final root routing update via the ingest API. Handles idempotency for "already exists" conflicts caused by timed-out-but-successful prior attempts. - **Small wired hierarchy**: Serial fork of 3 children from `logs.otel` for the lightweight journeys. 
### Infrastructure - `.buildkite/pipelines/performance/streams_weekly.yml` — weekly Streams-only pipeline (`JOURNEYS_GROUP=streams`) on `kb-static-scalability-2` - `.buildkite/pipeline-resource-definitions/kibana-streams-performance-weekly.yml` + `.buildkite/pipeline-resource-definitions/locations.yml` — Buildkite pipeline resource definition - `streams_heavy_config.ts` — extended FTR config with 1-hour mocha timeout (covers `beforeSteps` data setup) - `streams` journey group added to `run_performance_cli.ts` for `--group streams` execution - Metrics are reported via `report_performance_metrics.sh`. ### Follow-ups - Add a dedicated classic-stream mapping-at-scale journey that exercises very large field counts (up to 10k), in a separate PR. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Tests** - Added comprehensive Streams performance journeys (listing, data quality, field mapping, processing, retention, wired hierarchy), supporting heavy-profile runs with extended timeouts and a new journey group; included extensive setup utilities for bulk data, large wired hierarchies, and scaled test orchestration. - **Chores** - Updated test metadata and scheduling to disable certain streams jobs in scheduled pipelines, expanded project references/dependencies, and added ownership entries for Streams performance tests. <!-- end of auto-generated comment: release notes by coderabbit.ai --> ### Buildkite run https://buildkite.com/elastic/kibana-single-user-performance/builds/19020#019d002d-9541-4221-b66d-ad31c1b71df0 --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Matches the upper bound called out in elastic#252288 review. Buildkite streams-performance pipeline will be triggered manually against this branch before merge, so we will see at 10k whether the schema editor loads within journey timeouts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
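## Summary

Adds six `@kbn/journeys` performance journeys that validate the Streams feature at scale, covering the primary user flows across listing, detail, and management pages. These journeys are excluded from PR CI and run only in scheduled performance pipelines. The heavy wired-hierarchy journey (`streams_wired_hierarchy`) runs in a dedicated weekly Streams-only pipeline due to its large data setup.

### Journeys

| Journey | Scale | What it exercises |
| ------------------------- | -------------------------------- | --------------------------------------------------------------- |
| `streams_listing_page` | 5 000 classic + 3 wired children | Load, search, expand/collapse, navigate to detail |
| `streams_data_quality` | 3 wired children | Navigate to data quality tab, verify KPI metrics |
| `streams_processing_step` | 3 wired children | Open processor form, configure grok processor, save |
| `streams_retention` | 3 wired children | Open retention modal, toggle inherit, set custom retention |
| `streams_field_mapping` | 3 wired children + 200 fields | Open schema flyout, add keyword field, review & submit |
| `streams_wired_hierarchy` | 1 000 wired children | Expand/collapse large tree, search children, navigate to detail |

### Data creation strategies

- **Classic streams at scale**: Uses ES `_bulk` API to auto-create 5 000 unmanaged data streams in batches of 250, bypassing the Streams backend global lock. Raises `cluster.max_shards_per_node` to accommodate the shard count.
- **Wired hierarchy at scale**: Uses batched content pack imports (20 batches of 50 children) with retry logic for transient 409/422 errors, followed by a final root routing update via the ingest API. Handles idempotency for "already exists" conflicts caused by timed-out-but-successful prior attempts.
- **Small wired hierarchy**: Serial fork of 3 children from `logs.otel` for the lightweight journeys.

### Infrastructure

- `.buildkite/pipelines/performance/streams_weekly.yml`: weekly Streams-only pipeline (`JOURNEYS_GROUP=streams`) on `kb-static-scalability-2`
- `.buildkite/pipeline-resource-definitions/kibana-streams-performance-weekly.yml` + `.buildkite/pipeline-resource-definitions/locations.yml`: Buildkite pipeline resource definition
- `streams_heavy_config.ts`: extended FTR config with 1-hour mocha timeout (covers `beforeSteps` data setup)
- `streams` journey group added to `run_performance_cli.ts` for `--group streams` execution
- Metrics are reported via `report_performance_metrics.sh`.

### Follow-ups

- Add a dedicated classic-stream mapping-at-scale journey that exercises very large field counts (up to 10k), in a separate PR.

## Summary by CodeRabbit

- **Tests**
  - Added comprehensive Streams performance journeys (listing, data quality, field mapping, processing, retention, wired hierarchy), supporting heavy-profile runs with extended timeouts and a new journey group; included extensive setup utilities for bulk data, large wired hierarchies, and scaled test orchestration.
- **Chores**
  - Updated test metadata and scheduling to disable certain streams jobs in scheduled pipelines, expanded project references/dependencies, and added ownership entries for Streams performance tests.

### Buildkite run

https://buildkite.com/elastic/kibana-single-user-performance/builds/19020#019d002d-9541-4221-b66d-ad31c1b71df0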
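The batching arithmetic behind the data creation strategies (5 000 classic streams in batches of 250, 1 000 wired children in 20 batches of 50) can be illustrated with a small sketch. It is not the PR's implementation: `chunk`, `classicStreamNames`, and `toBulkBody` are hypothetical helpers, the `logs-perf-<n>-default` naming scheme is an assumption, and auto-creation is assumed to rely on an index template with data streams enabled that matches those names. The one grounded detail is that writes to a data stream through `_bulk` use the `create` action and need an `@timestamp` field.

```typescript
// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical naming scheme for the unmanaged classic data streams.
function classicStreamNames(count: number, prefix: string = 'logs-perf'): string[] {
  return Array.from({ length: count }, (_, i) => `${prefix}-${i}-default`);
}

// Builds an NDJSON `_bulk` body that writes one seed document per stream.
// Each document is preceded by a `create` action line naming its target stream.
function toBulkBody(names: readonly string[], timestamp: string): string {
  return names
    .map((name) => {
      const action = JSON.stringify({ create: { _index: name } });
      const doc = JSON.stringify({ '@timestamp': timestamp, message: 'seed' });
      return `${action}\n${doc}\n`;
    })
    .join('');
}

// 5 000 classic streams in batches of 250, and 1 000 wired children in batches of 50.
const classicBatches = chunk(classicStreamNames(5000), 250);
const wiredBatches = chunk(
  Array.from({ length: 1000 }, (_, i) => i),
  50
);
console.log(classicBatches.length, wiredBatches.length);
```

Each entry in `classicBatches` would become one `_bulk` request body via `toBulkBody`, which keeps request sizes bounded while avoiding one Streams API call per stream.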