[Observability Onboarding] Add data detection & loading indicators to onboarding flows #257870
Conversation

Pinging @elastic/obs-onboarding-team (Team:obs-onboarding)
Caution: Review failed. An error occurred during the review process. Please try again later.

📝 Walkthrough

Replaces fixed timeouts with event-driven data detection across onboarding flows; adds server-side has-data routes querying Elasticsearch; introduces a time-window detection hook and progress UI components; propagates onboardingId into install commands; and updates tests to assert data-received indicators.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User as User/Browser
    participant UI as Onboarding UI
    participant Hook as useTimeWindowDataDetection
    participant Server as Onboarding Server Route
    participant ES as Elasticsearch
    User->>UI: Blur window (leave tab)
    UI->>Hook: trigger monitoring (sessionStartTime)
    Hook->>Server: GET /internal/.../has-data?start=...
    Server->>ES: search logs/metrics since start (terminate_after:1, size:0)
    ES-->>Server: search result (hits>0 / 0)
    Server-->>Hook: { hasData: boolean, hasLogs?, hasMetrics? }
    alt hasData == true
        Hook->>UI: hasData = true
        UI->>UI: render GetStartedPanel (action links)
    else hasData == false
        Hook->>Hook: if elapsed > troubleshootingDelay -> isTroubleshootingVisible = true
        UI->>UI: render ProgressIndicator (and troubleshooting if true)
    end
    Note over Hook,UI: Polling repeats at FETCH_INTERVAL until hasData
```
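The server-side check the diagram describes (a search with `size: 0` and `terminate_after: 1`, filtered to documents ingested after the session started) can be sketched as a small query builder. This is a hypothetical illustration; the function name, field name, and request shape are assumptions, not the PR's actual code.

```typescript
// Hypothetical sketch of the has-data request body: we only need to know
// whether ANY matching document exists, so we ask for zero hits back and
// let each shard stop after its first match.
interface HasDataQuery {
  size: number;
  terminate_after: number;
  query: { bool: { filter: Array<Record<string, unknown>> } };
}

function buildHasDataQuery(start: string): HasDataQuery {
  return {
    size: 0, // no documents returned, only the hit count matters
    terminate_after: 1, // stop after the first matching document per shard
    query: {
      bool: {
        // restrict the check to data ingested after the onboarding session began
        filter: [{ range: { '@timestamp': { gte: start } } }],
      },
    },
  };
}
```

The response's `hits.total` being greater than zero is then mapped to the `hasData: boolean` payload the hook polls for.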
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 7
🧹 Nitpick comments (8)
x-pack/solutions/observability/plugins/observability_onboarding/server/routes/cloudforwarder/route.ts (2)
70-74: Avoid duplicating `logType` keys across schema and index map.
`logType` is declared inline in the route and also encoded in `CLOUDFORWARDER_INDEX_PATTERNS`. If they drift, `indexPattern` can become `undefined` at runtime. Prefer a shared runtime key source. Also applies to: line 86.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/cloudforwarder/route.ts` around lines 70 - 74, The route duplicates the logType keys inline in the params schema and in CLOUDFORWARDER_INDEX_PATTERNS which can drift; extract a shared constant (e.g., LOG_TYPE_KEYS or CLOUDFORWARDER_LOG_TYPES) that lists the allowed keys ("vpcflow", "elbaccess", "cloudtrail") and use that constant both when building the t.keyof(...) in the params schema and when constructing CLOUDFORWARDER_INDEX_PATTERNS, so the source of truth for log types is a single symbol referenced from the route's params and the index pattern map.
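The "single source of truth" the prompt suggests can be sketched as follows. The constant names follow the comment's examples and the index patterns come from this review; the surrounding code is an assumption, not the actual route module.

```typescript
// Single source of truth for the allowed CloudForwarder log types.
const CLOUDFORWARDER_LOG_TYPES = ['vpcflow', 'elbaccess', 'cloudtrail'] as const;
type LogType = (typeof CLOUDFORWARDER_LOG_TYPES)[number];

// Index-pattern map keyed by the shared type: adding a new log type to the
// array above forces a compile error here until a pattern is provided.
const CLOUDFORWARDER_INDEX_PATTERNS = {
  vpcflow: 'logs-aws.vpcflow.otel-*',
  elbaccess: 'logs-aws.elbaccess.otel-*',
  cloudtrail: 'logs-aws.cloudtrail.otel-*',
} satisfies Record<LogType, string>;

// The io-ts keyof codec in the params schema can be derived from the same
// array, e.g. (sketch, assuming io-ts is in scope):
//   t.keyof(Object.fromEntries(CLOUDFORWARDER_LOG_TYPES.map((k) => [k, null])))
```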
9-9: Replace wildcard `io-ts` import with explicit imports.
This keeps the module surface explicit and aligns with the TS import guideline.

♻️ Proposed refactor

```diff
-import * as t from 'io-ts';
+import { keyof as ioKeyof, string as ioString, type as ioType } from 'io-ts';

-      params: t.type({
-        query: t.type({
-          logType: t.keyof({ vpcflow: null, elbaccess: null, cloudtrail: null }),
-          start: t.string,
+      params: ioType({
+        query: ioType({
+          logType: ioKeyof({ vpcflow: null, elbaccess: null, cloudtrail: null }),
+          start: ioString,
         }),
       }),
```

As per coding guidelines, "Prefer explicit import/exports over wildcard imports/exports in TypeScript."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/cloudforwarder/route.ts` at line 9, Replace the wildcard import "import * as t from 'io-ts';" with explicit named imports of only the symbols used in this file: scan for all occurrences of "t." in route.ts (e.g., t.type, t.string, t.union, etc.), collect those identifiers, and change the import to "import { <Identifiers> } from 'io-ts';" so the module surface is explicit and aligns with the TS import guideline.

x-pack/solutions/observability/plugins/observability_onboarding/common/aws_cloudforwarder.ts (1)
31-35: Prefer an immutable typed map for index patterns.
`Record<LogType, string>` widens the object and leaves properties mutable. Consider `as const` + `satisfies Readonly<Record<LogType, string>>` for safer config semantics.

♻️ Proposed refactor

```diff
-export const CLOUDFORWARDER_INDEX_PATTERNS: Record<LogType, string> = {
+export const CLOUDFORWARDER_INDEX_PATTERNS = {
   vpcflow: 'logs-aws.vpcflow.otel-*',
   elbaccess: 'logs-aws.elbaccess.otel-*',
   cloudtrail: 'logs-aws.cloudtrail.otel-*',
-};
+} as const satisfies Readonly<Record<LogType, string>>;
```

As per coding guidelines, "Prefer `readonly` and `as const` for immutable structures in TypeScript."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/common/aws_cloudforwarder.ts` around lines 31 - 35, CLOUDFORWARDER_INDEX_PATTERNS is declared as a mutable Record<LogType, string>; make it an immutable, fully-typed map by appending as const and using the satisfies Readonly<Record<LogType, string>> pattern so the object literal is inferred with literal types and its properties are readonly; update the declaration for CLOUDFORWARDER_INDEX_PATTERNS (referencing the LogType type) accordingly.

x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts (1)
57-68: Consider typing the dynamic endpoint more safely.
The `as any` cast on line 60 bypasses type checking for the dynamic endpoint. While this is a pragmatic workaround for routes not known at compile time, it reduces type safety. Consider defining a union type for known endpoints or using a more specific type that captures the expected response shape:

```typescript
type HasDataEndpoint =
  | '/internal/observability_onboarding/otel_host/has-data'
  | '/internal/observability_onboarding/otel_apm/has-data'
  | '/internal/observability_onboarding/cloudforwarder/has-data';
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts` around lines 57 - 68, The dynamic endpoint passed into callApi inside the useFetcher in use_time_window_data_detection.ts is cast with `as any`, losing type safety; define a union type (e.g., HasDataEndpoint) listing the allowed endpoint string literals and use that type for the `endpoint` variable (and the callApi argument) so the call becomes type-checked against known routes, and update any function signatures or generics around useFetcher/callApi to accept HasDataEndpoint (and the expected response shape) instead of using `any`.

x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/index.tsx (2)
45-52: Consider memoizing `logsLocatorParams` to maintain referential stability.
The `logsLocatorParams` object is recreated on every render since it depends on `useWiredStreams`. If `logsLocator?.getRedirectUrl(logsLocatorParams)` or downstream components perform equality checks, this could cause unnecessary recalculations.

♻️ Suggested improvement

```diff
+ import React, { useEffect, useState, useMemo } from 'react';
  ...
  const useWiredStreams = ingestionMode === 'wired';
- const logsLocatorParams = useWiredStreams ? { dataViewSpec: WIRED_ECS_DATA_VIEW_SPEC } : {};
+ const logsLocatorParams = useMemo(
+   () => (useWiredStreams ? { dataViewSpec: WIRED_ECS_DATA_VIEW_SPEC } : {}),
+   [useWiredStreams]
+ );
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/index.tsx` around lines 45 - 52, logsLocatorParams is recreated each render causing referential instability; wrap its creation in a memo so it only changes when useWiredStreams or WIRED_ECS_DATA_VIEW_SPEC change. Replace the inline object creation with a memoized value (using React.useMemo) keyed on useWiredStreams and WIRED_ECS_DATA_VIEW_SPEC so calls like logsLocator?.getRedirectUrl(logsLocatorParams) receive a stable reference and avoid unnecessary recalculations.
75-109: Consider memoizing `kubernetesActionLinks` to prevent unnecessary re-renders.
This array is recreated on every render. Since it depends on `metricsOnboardingEnabled`, `dashboardLocator`, `logsLocator`, and `logsLocatorParams`, memoizing it would improve performance and prevent unnecessary re-renders of child components.

♻️ Suggested improvement

```diff
+ const kubernetesActionLinks: ActionLink[] = useMemo(
+   () => [
      ...(metricsOnboardingEnabled
        ? [
            {
              id: CLUSTER_OVERVIEW_DASHBOARD_ID,
              label: i18n.translate(
                'xpack.observability_onboarding.kubernetesPanel.exploreDashboard',
                { defaultMessage: 'Explore Kubernetes cluster' }
              ),
              title: i18n.translate(
                'xpack.observability_onboarding.kubernetesPanel.monitoringCluster',
                {
                  defaultMessage: 'Overview your Kubernetes cluster with this pre-made dashboard',
                }
              ),
              requires: 'metrics' as const,
              href:
                dashboardLocator?.getRedirectUrl({
                  dashboardId: CLUSTER_OVERVIEW_DASHBOARD_ID,
                }) ?? '',
            },
          ]
        : []),
      {
        id: 'logs',
        title: i18n.translate('xpack.observability_onboarding.kubernetesPanel.logsTitle', {
          defaultMessage: 'View and analyze your logs:',
        }),
        label: i18n.translate('xpack.observability_onboarding.kubernetesPanel.logsLabel', {
          defaultMessage: 'Explore logs',
        }),
        requires: 'logs' as const,
        href: logsLocator?.getRedirectUrl(logsLocatorParams) ?? '',
      },
+   ],
+   [metricsOnboardingEnabled, dashboardLocator, logsLocator, logsLocatorParams]
+ );
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/index.tsx` around lines 75 - 109, The kubernetesActionLinks array is rebuilt on every render; wrap its creation in React's useMemo to memoize it and avoid unnecessary re-renders, using [metricsOnboardingEnabled, dashboardLocator, logsLocator, logsLocatorParams] as the dependency array so kubernetesActionLinks is recomputed only when those values change (keep the existing structure and values and reference the same symbols: kubernetesActionLinks, metricsOnboardingEnabled, dashboardLocator, logsLocator, logsLocatorParams).

x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx (1)
93-97: Consider ensuring `onDataReceived` callback stability to prevent repeated invocations.
The effect depends on `onDataReceived`, which is passed as an inline arrow function from the parent (`() => setDataReceived(true)`). Each parent render creates a new function reference, causing this effect to re-run. While functionally harmless (setting state to the same value is a no-op), it's inefficient.

♻️ Suggested fix in the parent component (index.tsx)

```diff
+ import React, { useEffect, useState, useCallback } from 'react';
  ...
  const [dataReceived, setDataReceived] = useState(false);
+ const handleDataReceived = useCallback(() => setDataReceived(true), []);
  ...
  <DataIngestStatus
    onboardingId={data.onboardingId}
    onboardingFlowType="kubernetes"
    dataset="kubernetes"
    integration="kubernetes"
    actionLinks={kubernetesActionLinks}
-   onDataReceived={() => setDataReceived(true)}
+   onDataReceived={handleDataReceived}
  />
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx` around lines 93 - 97, The effect in useEffect in data_ingest_status.tsx depends on onDataReceived which is passed from the parent as an inline arrow, causing a new function reference each render; fix by stabilizing the callback in the parent (where you currently pass () => setDataReceived(true)): wrap that handler with useCallback (e.g., const handleDataReceived = useCallback(() => setDataReceived(true), [setDataReceived]) ) and pass handleDataReceived into the child so useEffect (in data_ingest_status.tsx) no longer retriggers due to a changing onDataReceived reference.

x-pack/solutions/observability/plugins/observability_onboarding/server/routes/kubernetes/route.ts (1)
141-192: Split the scripted fallback out of the main polling query.
The `_rt` clause is runtime-scripted, and this handler polls broad `logs-*`/`logs.*` and `metrics-*`/`metrics.*` patterns without a time window. That makes the "no data yet" path much more expensive than it needs to be. Consider keeping the indexed terms in the first search and moving the runtime fallback to a second query scoped to the passthrough streams.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/kubernetes/route.ts` around lines 141 - 192, The current combined search builds query (variable query) with both indexed terms and the scripted runtime field 'resource.attributes.onboarding.id._rt' via runtimeMappings and then polls broad index patterns in the two searches (logsResult, metricsResult), which is inefficient; change it to a two-step approach: 1) call elasticsearch.client.asCurrentUser.search for logs and metrics using the existing query but remove runtime_mappings and drop any should-clause term that references 'resource.attributes.onboarding.id._rt' so the first fast check only uses indexed fields (termQuery results); 2) only if those first searches return no hits, run a second targeted search scoped to passthrough streams (narrow index pattern or filter for logs.otel/passthrough streams) that includes runtime_mappings (runtimeMappings) and a query containing just the runtime '_rt' should-clause; update the code around runtimeMappings, query, and the two search calls (logsResult, metricsResult) accordingly.
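The two-step search the comment proposes can be sketched as a pair of query builders: a cheap first pass on indexed fields, and a runtime-field fallback that only runs when the first pass finds nothing. The field names (`resource.attributes.onboarding.id` and its scripted `_rt` variant) come from the review; the builder functions themselves are hypothetical.

```typescript
type EsQuery = Record<string, unknown>;

// Step 1: fast check using only indexed fields, bounded by the session start
// so the "no data yet" polling path stays cheap.
function buildIndexedQuery(onboardingId: string, start: string): EsQuery {
  return {
    bool: {
      filter: [{ range: { '@timestamp': { gte: start } } }],
      must: [{ term: { 'resource.attributes.onboarding.id': onboardingId } }],
    },
  };
}

// Step 2: runtime-scripted fallback, run only when step 1 returned no hits,
// and scoped to the passthrough streams rather than broad logs-*/metrics-*.
// The runtime mapping for the '_rt' field would accompany this query.
function buildRuntimeFallbackQuery(onboardingId: string): EsQuery {
  return {
    bool: {
      should: [{ term: { 'resource.attributes.onboarding.id._rt': onboardingId } }],
      minimum_should_match: 1,
    },
  };
}
```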
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/host_otel.spec.ts`:
- Around line 72-76: Replace the fixed 2-minute sleep (page.waitForTimeout(2 *
60000)) with a condition-based wait that polls the actual signals that OTEL data
has landed for this branch; for example, use page.waitForSelector to wait for a
specific dashboard panel or Discover hit-count element to appear, or use
page.waitForResponse/ page.waitForFunction to wait for the API that returns
documents/metrics (replace the timeout call in host_otel.spec.ts where
page.waitForTimeout is used). Ensure the new wait targets branch-specific UI
elements or network endpoints that indicate data readiness (e.g., Discover
document count, a saved search result, or a specific visualization DOM node) and
include a reasonable timeout for the wait operation instead of a fixed 2-minute
delay.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/kubernetes_otel.spec.ts`:
- Around line 105-109: Remove the unconditional await page.waitForTimeout(2 *
60000); and replace it with a deterministic polling-based readiness check:
implement a retry loop (or use page.waitForFunction / page.waitForResponse) that
periodically verifies the specific readiness signal (e.g., the presence of a
dashboard/Discover element, expected network response, or API endpoint returning
data) until success or a max timeout, keeping the existing overall timeout but
eliminating the fixed 120s sleep; target the call site by replacing the
page.waitForTimeout invocation in kubernetes_otel.spec.ts and reference
page.waitForFunction, page.waitForResponse, or a helper like
waitForCondition/checkDataReady to perform the checks.
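The `waitForCondition`-style helper referenced in the prompt above can be sketched generically. This is a hedged, framework-agnostic sketch, not code from the PR; in Playwright itself, `page.waitForFunction` or `expect(locator).toBeVisible({ timeout })` achieve the same without a hand-rolled loop.

```typescript
// Poll an async readiness check until it passes or a deadline expires,
// instead of sleeping for a fixed duration.
async function waitForCondition(
  check: () => Promise<boolean>,
  { intervalMs = 1_000, timeoutMs = 120_000 } = {}
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  // Always run the check at least once, then keep retrying until the deadline.
  for (;;) {
    if (await check()) return; // readiness signal observed
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The test-suite benefit is that fast environments finish as soon as data lands, while slow ones still get the full timeout budget.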
In
`@x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_host_flow.page.ts`:
- Around line 67-72: The assertDataReceivedIndicator helper currently uses the
default expect timeout and has no explicit return type; update the method
signature to declare an explicit return type (Promise<void>) and call
toBeVisible with an ingestion-safe timeout option (e.g., await
expect(this.dataReceivedIndicator, '...').toBeVisible({ timeout:
INGESTION_SAFE_TIMEOUT_MS })) or use the project's shared ingestion timeout
constant, ensuring you reference the assertDataReceivedIndicator method and the
dataReceivedIndicator locator when making the change.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_kubernetes_flow.page.ts`:
- Around line 89-94: The assertDataReceivedIndicator method uses Playwright's
default expect timeout which is too short for ingestion waits and lacks an
explicit return type; update the public method signature to return Promise<void>
and call expect(this.dataReceivedIndicator, ...) with an increased timeout
option (e.g., { timeout: 30000 } or a shared constant) so the assertion waits
longer for onboarding data to arrive while keeping the descriptive message
intact.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts`:
- Around line 15-30: Change both function signatures to accept a typed
Elasticsearch error (e.g., an Error object with optional statusCode/status and
meta.body fields) instead of any; update isNoShardsAvailableError to use that
typed parameter when inspecting meta.body.error.type and
meta.body.error.root_cause[0].type, and modify throwHasDataSearchError to
rethrow the original status code by boomifying the original error (use
Boom.boomify or equivalent) with statusCode taken from error.statusCode ||
error.status || error.meta?.statusCode before falling back to Boom.internal, and
include the original error message; reference the functions
isNoShardsAvailableError and throwHasDataSearchError when applying these
changes.
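The status-code propagation described in the prompt above can be sketched as a small resolver. The error shape (`statusCode`/`status`/`meta`) follows the comment's wording; the Boom.boomify call is omitted so the sketch stays dependency-free, and the function name is an assumption.

```typescript
// Minimal shape of an Elasticsearch-client-like error, per the review comment.
interface EsLikeError {
  message: string;
  statusCode?: number;
  status?: number;
  meta?: { statusCode?: number };
}

// Resolve the original HTTP status to rethrow: prefer the transport-level
// statusCode, then a generic status, then the client metadata, else 500.
// In the real handler this value would feed Boom.boomify(error, { statusCode }).
function resolveStatusCode(error: EsLikeError): number {
  return error.statusCode ?? error.status ?? error.meta?.statusCode ?? 500;
}
```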
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_apm/route.ts`:
- Around line 62-65: The route's input schema (the params object with params:
t.type({ query: t.type({ start: t.string }) })) accepts any string for
query.start but that value is immediately used as a date/timestamp; update the
route boundary to validate start as a timestamp (e.g., use an io-ts number type
or a branded/refined string that matches an epoch/ms timestamp/ISO date) and add
explicit error handling to return a typed 400 Bad Request when validation fails;
adjust the params schema and the route handler code that reads
params.query.start so it only proceeds when the validated/parsed timestamp is
present, and return a clear bad-request response (rather than letting the
downstream search error) when parsing/validation fails.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_host/route.ts`:
- Around line 87-90: The route's query codec currently allows any string for
`start` (params -> query -> start) so malformed timestamps aren't rejected;
update the codec for `params` (the query.start field in route.ts) to validate
that `start` is a proper timestamp (e.g., an ISO8601 string or numeric epoch)
using a refined/branded io-ts type or a custom validator instead of plain
t.string, and ensure the handler returns a 400 typed error when validation fails
so malformed `start` values are rejected at the route boundary.
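The `start` validation asked for in the two prompts above can be sketched as a plain predicate; in io-ts it would become a branded/refined string codec. The accepted formats (ISO-8601 or epoch-millis string) are an assumption about the route's contract, not taken from the PR.

```typescript
// Illustrative validator for the `start` query parameter: accept a purely
// numeric epoch-millis string, or any date string Date.parse understands
// (ISO-8601 in practice). Returns false for garbage like 'not-a-date',
// letting the route respond with a typed 400 instead of a search error.
function isValidStart(start: string): boolean {
  if (/^\d+$/.test(start)) return true; // epoch milliseconds
  return !Number.isNaN(Date.parse(start)); // ISO-8601 / parseable date
}
```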
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 0794ac4b-2162-41a9-9bc2-7fada33ebb33
📒 Files selected for processing (20)

- x-pack/solutions/observability/plugins/observability_onboarding/common/aws_cloudforwarder.ts
- x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/host_otel.spec.ts
- x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/kubernetes_otel.spec.ts
- x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_host_flow.page.ts
- x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_kubernetes_flow.page.ts
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/cloudforwarder/index.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/index.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_apm/index.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.test.ts
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.ts
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/otel_kubernetes_panel.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/get_started_panel.tsx
- x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts
- x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts
- x-pack/solutions/observability/plugins/observability_onboarding/server/routes/cloudforwarder/route.ts
- x-pack/solutions/observability/plugins/observability_onboarding/server/routes/kubernetes/route.ts
- x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_apm/route.ts
- x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_host/route.ts
```ts
/**
 * Additional buffer to ensure data has propagated
 * to dashboards and Discover before navigating.
 */
await page.waitForTimeout(2 * 60000);
```
Avoid fixed 2-minute buffer; wait on branch-specific readiness instead.
Line 76 adds a static delay that slows all runs and is still less reliable than condition-based waiting.
Proposed fix
```diff
- await page.waitForTimeout(2 * 60000);
+ const readinessLinkTestId =
+   useWiredStreams || isLogsEssentialsMode
+     ? 'observabilityOnboardingDataIngestStatusActionLink-logs'
+     : 'observabilityOnboardingDataIngestStatusActionLink-metrics';
+ await page.getByTestId(readinessLinkTestId).waitFor({
+   state: 'visible',
+   timeout: 2 * 60_000,
+ });
```
```ts
/**
 * Additional buffer to ensure data has propagated
 * to dashboards and Discover before navigating.
 */
await page.waitForTimeout(2 * 60000);
```
Remove unconditional 2-minute sleep.
Line 109 introduces a fixed 120s delay on every run, which inflates suite time and still isn’t a deterministic readiness check.
Proposed fix
```diff
- await page.waitForTimeout(2 * 60000);
+ const readinessLinkTestId =
+   useWiredStreams || isLogsEssentialsMode
+     ? 'observabilityOnboardingDataIngestStatusActionLink-logs'
+     : 'observabilityOnboardingDataIngestStatusActionLink-kubernetes_otel-cluster-overview';
+ await page.getByTestId(readinessLinkTestId).waitFor({
+   state: 'visible',
+   timeout: 2 * 60_000,
+ });
```
```ts
public async assertDataReceivedIndicator() {
  await expect(
    this.dataReceivedIndicator,
    'Data received indicator should be visible'
  ).toBeVisible();
}
```
Use an ingestion-safe timeout and explicit return type in the new assertion helper.
Line 71 relies on default expect timeout, which can be too short for real data arrival in stateful onboarding tests.
Proposed fix
```diff
- public async assertDataReceivedIndicator() {
+ public async assertDataReceivedIndicator(timeoutMs: number = 5 * 60_000): Promise<void> {
    await expect(
      this.dataReceivedIndicator,
      'Data received indicator should be visible'
-   ).toBeVisible();
+   ).toBeVisible({ timeout: timeoutMs });
  }
```

As per coding guidelines, "Prefer explicit return types for public APIs and exported functions in TypeScript."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
public async assertDataReceivedIndicator(timeoutMs: number = 5 * 60_000): Promise<void> {
  await expect(
    this.dataReceivedIndicator,
    'Data received indicator should be visible'
  ).toBeVisible({ timeout: timeoutMs });
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_host_flow.page.ts`
around lines 67 - 72, The assertDataReceivedIndicator helper currently uses the
default expect timeout and has no explicit return type; update the method
signature to declare an explicit return type (Promise<void>) and call
toBeVisible with an ingestion-safe timeout option (e.g., await
expect(this.dataReceivedIndicator, '...').toBeVisible({ timeout:
INGESTION_SAFE_TIMEOUT_MS })) or use the project's shared ingestion timeout
constant, ensuring you reference the assertDataReceivedIndicator method and the
dataReceivedIndicator locator when making the change.
```ts
public async assertDataReceivedIndicator() {
  await expect(
    this.dataReceivedIndicator,
    'Data received indicator should be visible'
  ).toBeVisible();
}
```
Increase assertion timeout for ingestion waits and type the method return.
Line 93 currently uses the default Playwright expect timeout, which is often too short for onboarding data ingestion. This can fail before data arrives.
Proposed fix

```diff
-  public async assertDataReceivedIndicator() {
+  public async assertDataReceivedIndicator(timeoutMs: number = 5 * 60_000): Promise<void> {
     await expect(
       this.dataReceivedIndicator,
       'Data received indicator should be visible'
-    ).toBeVisible();
+    ).toBeVisible({ timeout: timeoutMs });
   }
```

As per coding guidelines, "Prefer explicit return types for public APIs and exported functions in TypeScript."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_kubernetes_flow.page.ts`
around lines 89 - 94, The assertDataReceivedIndicator method uses Playwright's
default expect timeout which is too short for ingestion waits and lacks an
explicit return type; update the public method signature to return Promise<void>
and call expect(this.dataReceivedIndicator, ...) with an increased timeout
option (e.g., { timeout: 30000 } or a shared constant) so the assertion waits
longer for onboarding data to arrive while keeping the descriptive message
intact.
```ts
export function isNoShardsAvailableError(error: any): boolean {
  const errorType = error?.meta?.body?.error?.type;
  const rootCauseType = error?.meta?.body?.error?.root_cause?.[0]?.type;

  if (
    errorType === 'search_phase_execution_exception' &&
    rootCauseType === 'no_shard_available_action_exception'
  ) {
    return true;
  }

  return false;
}

export function throwHasDataSearchError(error: any): never {
  throw Boom.internal(`Elasticsearch responded with an error. ${error.message}`);
```
🧩 Analysis chain

🏁 Script executed:

```shell
cat -n x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts
```

Repository: elastic/kibana
Length of output: 2046

🏁 Script executed:

```shell
rg -l "throwHasDataSearchError|isNoShardsAvailableError" x-pack/solutions/observability --type ts --type tsx
```

Repository: elastic/kibana
Length of output: 2046

🏁 Script executed:

```shell
rg "@elastic/elasticsearch" x-pack/solutions/observability/plugins/observability_onboarding/server --type ts -A 2 | head -30
```

Repository: elastic/kibana
Length of output: 2046
Type error parameter and preserve Elasticsearch error status codes.
Both functions currently accept error: any and throwHasDataSearchError unconditionally wraps failures in Boom.internal(), which always returns HTTP 500. This erases upstream Elasticsearch error status codes—401/403/400 responses become indistinguishable from actual internal errors in the UI. Type the error parameter to match the Elasticsearch client's error contract and preserve the original status code when available.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts`
around lines 15 - 30, Change both function signatures to accept a typed
Elasticsearch error (e.g., an Error object with optional statusCode/status and
meta.body fields) instead of any; update isNoShardsAvailableError to use that
typed parameter when inspecting meta.body.error.type and
meta.body.error.root_cause[0].type, and modify throwHasDataSearchError to
rethrow the original status code by boomifying the original error (use
Boom.boomify or equivalent) with statusCode taken from error.statusCode ||
error.status || error.meta?.statusCode before falling back to Boom.internal, and
include the original error message; reference the functions
isNoShardsAvailableError and throwHasDataSearchError when applying these
changes.
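The status-code preservation the prompt asks for can be sketched without the Elasticsearch client. A dependency-free illustration, where the simplified `EsError` shape and the helper name are assumptions:

```typescript
// Assumed (simplified) shape of an Elasticsearch client error; the real
// client's errors carry more fields.
interface EsError {
  message: string;
  statusCode?: number;
  meta?: { statusCode?: number };
}

// Surface the upstream status code when one exists, falling back to 500 only
// for genuinely unknown failures, so 401/403/400 stay distinguishable from
// internal errors.
function resolveStatusCode(error: EsError): number {
  return error.statusCode ?? error.meta?.statusCode ?? 500;
}
```

In the route handler this value would feed something like `Boom.boomify(error, { statusCode })` instead of an unconditional `Boom.internal`.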
```ts
params: t.type({
  query: t.type({
    start: t.string,
  }),
```
Validate start as a timestamp at the route boundary.
t.string is too loose for a value that is immediately used in a date range query. A malformed start currently goes down the generic search-error path instead of returning a clear bad-request response.
As per coding guidelines, **/*.{ts,tsx}: Handle errors explicitly; return typed errors from APIs when possible.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_apm/route.ts`
around lines 62 - 65, The route's input schema (the params object with params:
t.type({ query: t.type({ start: t.string }) })) accepts any string for
query.start but that value is immediately used as a date/timestamp; update the
route boundary to validate start as a timestamp (e.g., use an io-ts number type
or a branded/refined string that matches an epoch/ms timestamp/ISO date) and add
explicit error handling to return a typed 400 Bad Request when validation fails;
adjust the params schema and the route handler code that reads
params.query.start so it only proceeds when the validated/parsed timestamp is
present, and return a clear bad-request response (rather than letting the
downstream search error) when parsing/validation fails.
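A dependency-free sketch of the refinement such a codec could wrap, accepting either ISO 8601 strings or epoch milliseconds (the helper name is illustrative; the actual route would express this as an io-ts refinement):

```typescript
// Accept only strings that parse to a real point in time: an all-digit
// epoch-milliseconds value, or anything Date.parse understands (ISO 8601).
function isValidStartTimestamp(start: string): boolean {
  if (/^\d+$/.test(start)) {
    return true; // epoch milliseconds
  }
  return !Number.isNaN(Date.parse(start));
}
```

A route guard would reject the request with a 400 when this returns `false`, before the value ever reaches the Elasticsearch range query.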
```ts
params: t.type({
  query: t.type({
    start: t.string,
  }),
```
Validate start as a timestamp at the route boundary.
Right now any string gets through the API contract, so a malformed start falls into the generic search-error path instead of being rejected as a bad request. Tightening the query codec here will keep this endpoint deterministic.
As per coding guidelines, **/*.{ts,tsx}: Handle errors explicitly; return typed errors from APIs when possible.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_host/route.ts`
around lines 87 - 90, The route's query codec currently allows any string for
`start` (params -> query -> start) so malformed timestamps aren't rejected;
update the codec for `params` (the query.start field in route.ts) to validate
that `start` is a proper timestamp (e.g., an ISO8601 string or numeric epoch)
using a refined/branded io-ts type or a custom validator instead of plain
t.string, and ensure the handler returns a 400 typed error when validation fails
so malformed `start` values are rejected at the route boundary.
Actionable comments posted: 2
🧹 Nitpick comments (6)
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.ts (1)
19-28: Consider adding a runtime guard for an empty `onboardingId`. The `onboardingId` parameter is required at the type level, but an empty string could still be passed at runtime, resulting in `onboarding.id=` being set in the Helm command. If the caller can guarantee non-empty values, this is fine; otherwise, consider adding a guard.

Optional defensive validation

```diff
 }: {
   isMetricsOnboardingEnabled: boolean;
   isManagedOtlpServiceAvailable: boolean;
   managedOtlpEndpointUrl: string;
   elasticsearchUrl: string;
   apiKeyEncoded: string;
   agentVersion: string;
   useWiredStreams?: boolean;
   onboardingId: string;
 }): string {
+  if (!onboardingId) {
+    throw new Error('onboardingId is required');
+  }
   const ingestEndpointUrl = isManagedOtlpServiceAvailable
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.ts` around lines 19 - 28, The code accepts onboardingId but doesn't guard against an empty string; update the command-building logic in build_install_stack_command (where onboardingId is appended to the Helm/CLI args) to only include the onboarding.id flag when onboardingId is a non-empty trimmed string (e.g., check onboardingId && onboardingId.trim().length > 0) so you don't emit `--set onboarding.id=`; reference the onboardingId parameter and the function that constructs the Helm command to apply this conditional guard and ensure any whitespace-only values are treated as empty.x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts (2)
25-34: Add explicit return type for the exported function. As per coding guidelines, public APIs should have explicit return types.

♻️ Add explicit return type

```diff
+interface UseTimeWindowDataDetectionResult {
+  hasData: boolean;
+  isTroubleshootingVisible: boolean;
+}
+
 export function useTimeWindowDataDetection({
   isMonitoringActive,
   sessionStartTime,
   fetchInterval,
   troubleshootingDelay,
   flowType,
   onboardingId,
   endpoint,
   extraQueryParams,
-}: UseTimeWindowDataDetectionOptions) {
+}: UseTimeWindowDataDetectionOptions): UseTimeWindowDataDetectionResult {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts` around lines 25 - 34, The exported function useTimeWindowDataDetection currently lacks an explicit return type; update its declaration to include the precise return type that the hook returns (e.g., ReturnType or a specific interface describing the returned values) to satisfy the public API guideline. Locate the function definition useTimeWindowDataDetection(...) and add the explicit return type annotation after the parameter list, ensuring you import or declare any return type interface (or use ReturnType<typeof useTimeWindowDataDetectionImplementation> if applicable) so TypeScript infers no any types for the exported API.
81-91: The telemetry step is hardcoded to 'logs-ingest'. This may not be accurate for all flow types using this hook (e.g., metrics-only or traces flows). Consider making the step configurable via the options interface if different flows need distinct telemetry labels.

🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts` around lines 81 - 91, The telemetry step string is hardcoded to 'logs-ingest' in the useEffect inside the useTimeWindowDataDetection hook; update the hook signature (useTimeWindowDataDetection) and its options interface to accept a telemetryStep (or derive it from flowType) and replace the hardcoded 'logs-ingest' with that value when calling analytics.reportEvent; ensure callers that invoke useTimeWindowDataDetection (and any tests) supply the appropriate telemetryStep for metrics/traces-only flows and keep existing behavior by defaulting telemetryStep to 'logs-ingest' if no option is provided, leaving other symbols (hasDataResponse, dataReceivedTelemetrySent, analytics, OBSERVABILITY_ONBOARDING_TELEMETRY_EVENT) unchanged.x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts (1)
29-31: Consider preserving the original error for debugging. The current implementation only includes `error.message` in the Boom error. For better debugging in server logs, consider preserving the original error.

♻️ Optional: Preserve original error for debugging

```diff
 export function throwHasDataSearchError(error: any): never {
-  throw Boom.internal(`Elasticsearch responded with an error. ${error.message}`);
+  throw Boom.internal(`Elasticsearch responded with an error. ${error.message}`, error);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts` around lines 29 - 31, The thrown Boom error in throwHasDataSearchError only includes error.message; update throwHasDataSearchError to preserve the original error by attaching it to the Boom error (e.g., set the original error on the Boom object's data or cause) so logs/debuggers can access full error details while keeping the existing message produced by Boom.internal.x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx (1)
98-114: Consider deferring `sessionStartTime` initialization for consistency. Unlike `otel_apm` (which sets `sessionStartTime` when monitoring becomes active to narrow the detection window), this flow initializes it on mount. For APM, this reduces false positives from other services already ingesting data.

For host monitoring, this may be less critical since existing OTel host data is less common, but the pattern inconsistency across flows could be worth aligning if false positives become an issue.

🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx` around lines 98 - 114, sessionStartTime is created on mount but should be set when monitoring actually becomes active to match the otel_apm flow and avoid false positives; change the initialization of sessionStartTime so it is set when isMonitoringStepActive turns true (useState to hold a nullable string and a useEffect that sets sessionStartTime = new Date().toISOString() when isMonitoringStepActive becomes true), keep the useWindowBlurDataMonitoringTrigger usage and ensure useTimeWindowDataDetection receives the updated sessionStartTime (or a safe empty/undefined value until set) so the detection window only starts once monitoring is active.x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx (1)
34-41: Prefer a const-arrow exported component with an explicit return type. This exported React component currently uses a function declaration with an inferred return type.

As per coding guidelines: "Prefer const arrow functions in TypeScript." and "Prefer explicit return types for public APIs and exported functions in TypeScript."

Proposed refactor

```diff
-export function DataIngestStatus({
+export const DataIngestStatus = ({
   onboardingId,
   onboardingFlowType,
   dataset,
   integration,
   actionLinks,
   onDataReceived,
-}: Props) {
+}: Props): JSX.Element => {
@@
-}
+};
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx` around lines 34 - 41, Replace the exported function declaration DataIngestStatus(...) with a const arrow component and add an explicit return type: declare "export const DataIngestStatus: (props: Props) => JSX.Element" or "export const DataIngestStatus: React.FC<Props>" and implement it as "const DataIngestStatus = ({ onboardingId, onboardingFlowType, dataset, integration, actionLinks, onDataReceived }: Props) => { return ( ... ); }"; keep the same prop names and body, ensure Props type is used in the signature and the export remains the same.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx`:
- Around line 61-63: The readiness/filtering currently only checks for requires
=== 'metrics' and defaults unknown requires to hasData, which misses traces-only
links; add explicit handling for requires === 'traces' by introducing a
needsTraces boolean (e.g., const needsTraces = actionLinks.some(al =>
al.requires === 'traces')) and update isReady to consider traces (const isReady
= needsMetrics ? hasMetrics : needsTraces ? hasTraces : hasData), and likewise
adjust any filtering logic that maps requires -> readiness so that 'traces'
checks use hasTraces while preserving the existing default behavior for unknown
values.
- Around line 93-97: The effect currently depends on both isReady and
onDataReceived so when the parent passes a new inline callback the effect can
re-run and call onDataReceived again; to fix, keep the effect dependent only on
isReady and call a stable ref to the latest callback: create a ref (e.g.,
latestOnDataReceivedRef) and update it whenever onDataReceived prop changes,
then in the useEffect for readiness (useEffect with [isReady]) call
latestOnDataReceivedRef.current?.() when isReady is true; update references to
onDataReceived in this component to use the ref so repeated identity changes
don’t retrigger the readiness side effect.
---
Nitpick comments:
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx`:
- Around line 34-41: Replace the exported function declaration
DataIngestStatus(...) with a const arrow component and add an explicit return
type: declare "export const DataIngestStatus: (props: Props) => JSX.Element" or
"export const DataIngestStatus: React.FC<Props>" and implement it as "const
DataIngestStatus = ({ onboardingId, onboardingFlowType, dataset, integration,
actionLinks, onDataReceived }: Props) => { return ( ... ); }"; keep the same
prop names and body, ensure Props type is used in the signature and the export
remains the same.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.ts`:
- Around line 19-28: The code accepts onboardingId but doesn't guard against an
empty string; update the command-building logic in build_install_stack_command
(where onboardingId is appended to the Helm/CLI args) to only include the
onboarding.id flag when onboardingId is a non-empty trimmed string (e.g., check
onboardingId && onboardingId.trim().length > 0) so you don't emit `--set
onboarding.id=`; reference the onboardingId parameter and the function that
constructs the Helm command to apply this conditional guard and ensure any
whitespace-only values are treated as empty.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx`:
- Around line 98-114: sessionStartTime is created on mount but should be set
when monitoring actually becomes active to match the otel_apm flow and avoid
false positives; change the initialization of sessionStartTime so it is set when
isMonitoringStepActive turns true (useState to hold a nullable string and a
useEffect that sets sessionStartTime = new Date().toISOString() when
isMonitoringStepActive becomes true), keep the
useWindowBlurDataMonitoringTrigger usage and ensure useTimeWindowDataDetection
receives the updated sessionStartTime (or a safe empty/undefined value until
set) so the detection window only starts once monitoring is active.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts`:
- Around line 25-34: The exported function useTimeWindowDataDetection currently
lacks an explicit return type; update its declaration to include the precise
return type that the hook returns (e.g., ReturnType or a specific interface
describing the returned values) to satisfy the public API guideline. Locate the
function definition useTimeWindowDataDetection(...) and add the explicit return
type annotation after the parameter list, ensuring you import or declare any
return type interface (or use ReturnType<typeof
useTimeWindowDataDetectionImplementation> if applicable) so TypeScript infers no
any types for the exported API.
- Around line 81-91: The telemetry step string is hardcoded to 'logs-ingest' in
the useEffect inside the useTimeWindowDataDetection hook; update the hook
signature (useTimeWindowDataDetection) and its options interface to accept a
telemetryStep (or derive it from flowType) and replace the hardcoded
'logs-ingest' with that value when calling analytics.reportEvent; ensure callers
that invoke useTimeWindowDataDetection (and any tests) supply the appropriate
telemetryStep for metrics/traces-only flows and keep existing behavior by
defaulting telemetryStep to 'logs-ingest' if no option is provided, leaving
other symbols (hasDataResponse, dataReceivedTelemetrySent, analytics,
OBSERVABILITY_ONBOARDING_TELEMETRY_EVENT) unchanged.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts`:
- Around line 29-31: The thrown Boom error in throwHasDataSearchError only
includes error.message; update throwHasDataSearchError to preserve the original
error by attaching it to the Boom error (e.g., set the original error on the
Boom object's data or cause) so logs/debuggers can access full error details
while keeping the existing message produced by Boom.internal.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: e3b66f59-4623-4721-8dd9-6d0e055d9328
📒 Files selected for processing (20)
x-pack/solutions/observability/plugins/observability_onboarding/common/aws_cloudforwarder.ts
x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/host_otel.spec.ts
x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/kubernetes_otel.spec.ts
x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_host_flow.page.ts
x-pack/solutions/observability/plugins/observability_onboarding/e2e/playwright/stateful/pom/pages/otel_kubernetes_flow.page.ts
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/cloudforwarder/index.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/index.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_apm/index.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.test.ts
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/build_install_stack_command.ts
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_kubernetes/otel_kubernetes_panel.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/get_started_panel.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts
x-pack/solutions/observability/plugins/observability_onboarding/server/lib/handle_has_data_search_error.ts
x-pack/solutions/observability/plugins/observability_onboarding/server/routes/cloudforwarder/route.ts
x-pack/solutions/observability/plugins/observability_onboarding/server/routes/kubernetes/route.ts
x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_apm/route.ts
x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_host/route.ts
```ts
const needsMetrics = actionLinks.some((actionLink) => actionLink.requires === 'metrics');
const isReady = needsMetrics ? hasMetrics : hasData;
```
requires: 'traces' is not handled explicitly in readiness/filtering.
Line 103 defaults unknown requires values to hasData, so a traces-only link can appear before traces are confirmed. Line 62 also ignores traces in isReady.
Proposed fix

```diff
 const hasData = data?.hasData ?? false;
 const hasLogs = data?.hasLogs ?? hasData;
 const hasMetrics = data?.hasMetrics ?? hasData;
+const hasTraces = data?.hasTraces ?? false;

 const needsMetrics = actionLinks.some((actionLink) => actionLink.requires === 'metrics');
-const isReady = needsMetrics ? hasMetrics : hasData;
+const needsTraces = actionLinks.some((actionLink) => actionLink.requires === 'traces');
+const isReady = needsMetrics ? hasMetrics : needsTraces ? hasTraces : hasData;
@@
 if (requires === 'metrics') {
   return hasMetrics;
 }
+if (requires === 'traces') {
+  return hasTraces;
+}
+
 return hasData;
```

Also applies to: 102-114
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx`
around lines 61 - 63, The readiness/filtering currently only checks for requires
=== 'metrics' and defaults unknown requires to hasData, which misses traces-only
links; add explicit handling for requires === 'traces' by introducing a
needsTraces boolean (e.g., const needsTraces = actionLinks.some(al =>
al.requires === 'traces')) and update isReady to consider traces (const isReady
= needsMetrics ? hasMetrics : needsTraces ? hasTraces : hasData), and likewise
adjust any filtering logic that maps requires -> readiness so that 'traces'
checks use hasTraces while preserving the existing default behavior for unknown
values.
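The `requires` → readiness mapping described above can be expressed as a small pure function. A sketch with illustrative type and helper names (not the component's actual symbols):

```typescript
type RequiredSignal = 'metrics' | 'traces' | undefined;

interface DataFlags {
  hasData: boolean;
  hasMetrics: boolean;
  hasTraces: boolean;
}

// Each explicitly-typed link waits for its own signal; links without a
// `requires` value keep the existing default of the generic hasData flag.
function isLinkReady(requires: RequiredSignal, flags: DataFlags): boolean {
  if (requires === 'metrics') {
    return flags.hasMetrics;
  }
  if (requires === 'traces') {
    return flags.hasTraces;
  }
  return flags.hasData;
}
```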
```ts
useEffect(() => {
  if (isReady) {
    onDataReceived?.();
  }
}, [isReady, onDataReceived]);
```
onDataReceived can fire repeatedly after readiness.
At Line 93, the effect re-runs whenever onDataReceived identity changes. With inline parent callbacks, this can trigger duplicate side effects after isReady becomes true.
Proposed fix

```diff
 const [checkDataStartTime] = useState(Date.now());
 const [dataReceivedTelemetrySent, setDataReceivedTelemetrySent] = useState(false);
+const [dataReceivedNotified, setDataReceivedNotified] = useState(false);
@@
 useEffect(() => {
-  if (isReady) {
-    onDataReceived?.();
-  }
-}, [isReady, onDataReceived]);
+  if (!isReady || dataReceivedNotified) {
+    return;
+  }
+
+  onDataReceived?.();
+  setDataReceivedNotified(true);
+}, [isReady, onDataReceived, dataReceivedNotified]);
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
useEffect(() => {
  if (!isReady || dataReceivedNotified) {
    return;
  }

  onDataReceived?.();
  setDataReceivedNotified(true);
}, [isReady, onDataReceived, dataReceivedNotified]);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx`
around lines 93 - 97, The effect currently depends on both isReady and
onDataReceived so when the parent passes a new inline callback the effect can
re-run and call onDataReceived again; to fix, keep the effect dependent only on
isReady and call a stable ref to the latest callback: create a ref (e.g.,
latestOnDataReceivedRef) and update it whenever onDataReceived prop changes,
then in the useEffect for readiness (useEffect with [isReady]) call
latestOnDataReceivedRef.current?.() when isReady is true; update references to
onDataReceived in this component to use the ref so repeated identity changes
don’t retrigger the readiness side effect.
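The ref-based fix boils down to a "fire once" latch that is indifferent to callback identity. A framework-free sketch of that behavior (the factory name is an illustrative assumption, not the component's API):

```typescript
// Returns a notifier that invokes whichever callback it is handed exactly
// once, the first time readiness is observed; later calls — including ones
// with a new callback identity — are no-ops, mirroring the ref pattern.
function createReadinessNotifier(): (isReady: boolean, onDataReceived?: () => void) => void {
  let notified = false;
  return (isReady, onDataReceived) => {
    if (!isReady || notified) {
      return;
    }
    notified = true;
    onDataReceived?.();
  };
}
```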
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx`:
- Around line 113-132: The osType currently passed to useTimeWindowDataDetection
is read from the live selectedTab on each poll, so change the component to
"freeze" the OS when monitoring starts: add a new state like sessionOsType and
set it inside the same effect that sets sessionStartTime (useEffect that checks
isMonitoringStepActive && sessionStartTime === null), initializing it from
OS_TYPE_MAP[selectedTab]; then pass that frozen value to
useTimeWindowDataDetection via extraQueryParams (e.g., extraQueryParams: {
osType: sessionOsType ?? OS_TYPE_MAP[selectedTab] }) so subsequent polls keep
the original OS filter even if selectedTab changes. Ensure symbols touched are
sessionStartTime, sessionOsType (new), selectedTab, OS_TYPE_MAP, and
useTimeWindowDataDetection.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts`:
- Around line 41-52: The hook currently only initializes checkDataStartTime once
and never resets noDataPollCount, causing subsequent monitoring sessions to
reuse prior timers and counters; update the useEffect that watches
isMonitoringActive (and the similar effect around lines 84-96) to reset both
setCheckDataStartTime(Date.now()) and setNoDataPollCount(0) whenever a new
sessionStartTime/isMonitoringActive transition begins (i.e., when starting a new
detection window), and ensure the effect dependencies include the session start
identifier so the state is reinitialized for each new session.
In
`@x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_host/route.ts`:
- Around line 111-139: The current handler uses Promise.all to run the two
searches so a no_shard_available error from either call causes the whole
operation to be treated as "no data"; change the logic so logs and metrics
probes are handled independently (e.g., use Promise.allSettled for the two
searches or run them in separate try/catch blocks) and compute hasLogs from the
logsResult outcome and hasMetrics from the metricsResult outcome, treating only
the failing probe with isNoShardsAvailableError as { hasX: false } while
rethrowing non-shard errors via throwHasDataSearchError; update references to
logsResult, metricsResult, isNoShardsAvailableError, and throwHasDataSearchError
in the route handler accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 4e20b41b-512c-4e8e-a584-a07c7618bfaa
📒 Files selected for processing (3)
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/otel_logs/index.tsx
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/use_time_window_data_detection.ts
x-pack/solutions/observability/plugins/observability_onboarding/server/routes/otel_host/route.ts
// Set sessionStartTime when monitoring begins (first blur) rather than on
// mount, to narrow the time-window and reduce false positives from other
// OTel collectors already ingesting data on the same cluster.
const [sessionStartTime, setSessionStartTime] = useState<string | null>(null);
useEffect(() => {
  if (isMonitoringStepActive && sessionStartTime === null) {
    setSessionStartTime(new Date().toISOString());
  }
}, [isMonitoringStepActive, sessionStartTime]);

const { hasData, isTroubleshootingVisible } = useTimeWindowDataDetection({
  isMonitoringActive: isMonitoringStepActive && sessionStartTime !== null,
  sessionStartTime: sessionStartTime ?? '',
  fetchInterval: FETCH_INTERVAL,
  troubleshootingDelay: SHOW_TROUBLESHOOTING_DELAY,
  flowType: 'otel_logs',
  onboardingId: setupData?.onboardingId ?? '',
  endpoint: '/internal/observability_onboarding/otel_host/has-data',
  extraQueryParams: { osType: OS_TYPE_MAP[selectedTab] },
});
Freeze the OS filter when monitoring starts.
Line 131 recomputes osType from the live tab selection on every poll. If the user starts the Linux flow, blurs the window, and then clicks another platform tab while waiting, the query switches to that new OS and the success state is delayed until the shared hook drops the filter after 30 misses.
💡 Suggested fix
- const [sessionStartTime, setSessionStartTime] = useState<string | null>(null);
+ const [sessionStartTime, setSessionStartTime] = useState<string | null>(null);
+ const [monitoringOsType, setMonitoringOsType] = useState<string | null>(null);
useEffect(() => {
if (isMonitoringStepActive && sessionStartTime === null) {
setSessionStartTime(new Date().toISOString());
+ setMonitoringOsType(OS_TYPE_MAP[selectedTab]);
}
- }, [isMonitoringStepActive, sessionStartTime]);
+ }, [isMonitoringStepActive, sessionStartTime, selectedTab]);
const { hasData, isTroubleshootingVisible } = useTimeWindowDataDetection({
isMonitoringActive: isMonitoringStepActive && sessionStartTime !== null,
sessionStartTime: sessionStartTime ?? '',
fetchInterval: FETCH_INTERVAL,
troubleshootingDelay: SHOW_TROUBLESHOOTING_DELAY,
flowType: 'otel_logs',
onboardingId: setupData?.onboardingId ?? '',
endpoint: '/internal/observability_onboarding/otel_host/has-data',
- extraQueryParams: { osType: OS_TYPE_MAP[selectedTab] },
+ extraQueryParams: monitoringOsType ? { osType: monitoringOsType } : undefined,
});
I ran into this behavior; it's kinda weird. Should we even do the osType filter? It's another thing that can go wrong
@flash1293 That's a fair point. The sessionStartTime already handles the main false-positive risk — any data that arrived before the user started the flow is excluded. The osType filter only guards against the narrow edge case of another OTel collector from a different OS that began ingesting after the session start on the same cluster. That edge case barely justifies the extra complexity, especially since the fallback in useTimeWindowDataDetection already silently drops the filter after FALLBACK_POLL_THRESHOLD misses anyway (meaning in bad cases it self-heals, but delays detection in the meantime).
Dropping osType entirely would:
- Eliminate the tab-switching race I flagged
- Remove the fallback dependency for this flow
- Keep the logic consistent with the other flows (OTel APM, Cloud Forwarder) that don't use an OS-scoped filter
So yes, I'd suggest removing extraQueryParams here:
const { hasData, isTroubleshootingVisible } = useTimeWindowDataDetection({
isMonitoringActive: isMonitoringStepActive && sessionStartTime !== null,
sessionStartTime: sessionStartTime ?? '',
fetchInterval: FETCH_INTERVAL,
troubleshootingDelay: SHOW_TROUBLESHOOTING_DELAY,
flowType: 'otel_logs',
onboardingId: setupData?.onboardingId ?? '',
endpoint: '/internal/observability_onboarding/otel_host/has-data',
- extraQueryParams: { osType: OS_TYPE_MAP[selectedTab] },
});

And the `OS_TYPE_MAP` constant can be removed as well if no other caller needs it.
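For reference, the self-healing fallback mentioned above can be sketched as a pure function. This is a plain-TypeScript sketch with hypothetical names (the real hook's internals differ); the threshold of 30 matches the "30 misses" behavior described earlier: once the miss count reaches it, the OS filter is dropped and the query goes unfiltered.

```typescript
// Hypothetical sketch: after FALLBACK_POLL_THRESHOLD consecutive polls with
// no data, drop the osType filter entirely so a wrong filter can only delay
// detection, never block it forever.
const FALLBACK_POLL_THRESHOLD = 30;

function buildHasDataQueryParams(
  noDataPollCount: number,
  osType: string | undefined
): Record<string, string> {
  if (osType === undefined || noDataPollCount >= FALLBACK_POLL_THRESHOLD) {
    return {}; // unfiltered query
  }
  return { osType };
}
```

Removing the osType filter altogether, as suggested, makes this fallback path unnecessary for the flow.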
const [checkDataStartTime, setCheckDataStartTime] = useState<number | null>(null);
const [dataReceivedTelemetrySent, setDataReceivedTelemetrySent] = useState(false);
const [noDataPollCount, setNoDataPollCount] = useState(0);
const {
  services: { analytics },
} = useKibana<ObservabilityOnboardingContextValue>();

useEffect(() => {
  if (isMonitoringActive && checkDataStartTime === null) {
    setCheckDataStartTime(Date.now());
  }
}, [isMonitoringActive, checkDataStartTime]);
Reset retry state for each monitoring session.
Lines 48-52 only initialize checkDataStartTime once, and noDataPollCount is never cleared. When a caller starts a second detection window with a new sessionStartTime, this shared hook reuses the previous troubleshooting timer and fallback counter, so retries can show troubleshooting immediately and skip straight to the unfiltered query.
💡 Suggested fix
const [checkDataStartTime, setCheckDataStartTime] = useState<number | null>(null);
const [dataReceivedTelemetrySent, setDataReceivedTelemetrySent] = useState(false);
const [noDataPollCount, setNoDataPollCount] = useState(0);
@@
- useEffect(() => {
- if (isMonitoringActive && checkDataStartTime === null) {
- setCheckDataStartTime(Date.now());
- }
- }, [isMonitoringActive, checkDataStartTime]);
+ useEffect(() => {
+ if (!isMonitoringActive) {
+ setCheckDataStartTime(null);
+ setNoDataPollCount(0);
+ return;
+ }
+
+ setCheckDataStartTime(Date.now());
+ setNoDataPollCount(0);
+ }, [isMonitoringActive, sessionStartTime]);
@@
useEffect(() => {
+ if (!isMonitoringActive) {
+ return;
+ }
+
const pendingStatusList = [FETCH_STATUS.LOADING, FETCH_STATUS.NOT_INITIATED];
if (pendingStatusList.includes(hasDataStatus) || hasDataResponse?.hasData === true) {
return;
}
@@
- }, [hasDataResponse?.hasData, refetchHasData, hasDataStatus, fetchInterval]);
+ }, [isMonitoringActive, hasDataResponse?.hasData, refetchHasData, hasDataStatus, fetchInterval]);Also applies to: 84-96
const [logsResult, metricsResult] = await Promise.all([
  elasticsearch.client.asCurrentUser.search({
    index: ['logs-*.otel-*', 'logs.otel', 'logs.otel.*'],
    ignore_unavailable: true,
    allow_partial_search_results: true,
    size: 0,
    terminate_after: 1,
    query,
  }),
  elasticsearch.client.asCurrentUser.search({
    index: ['metrics-*.otel-*'],
    ignore_unavailable: true,
    allow_partial_search_results: true,
    size: 0,
    terminate_after: 1,
    query,
  }),
]);

const hasLogs = (logsResult.hits.total as estypes.SearchTotalHits).value > 0;
const hasMetrics = (metricsResult.hits.total as estypes.SearchTotalHits).value > 0;

return { hasData: hasLogs || hasMetrics };
} catch (error) {
  if (isNoShardsAvailableError(error)) {
    return { hasData: false };
  }

  throwHasDataSearchError(error);
Handle the log and metric probes independently.
Line 111 uses Promise.all(), so a no_shard_available_action_exception from either search makes the whole handler return { hasData: false }, even when the other index set already has documents. For an "any data?" route, this should be Promise.allSettled() or per-branch catches so shard issues only zero out the failing probe.
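The failure mode is easy to reproduce in isolation. A minimal sketch in plain TypeScript (hypothetical probe functions, not the actual Elasticsearch calls): `Promise.all` rejects as soon as either probe fails and discards the other result, while `Promise.allSettled` reports each outcome independently.

```typescript
// Two probes: logs succeeds with one hit, metrics fails (simulating a
// no_shard_available rejection). These are stand-ins, not the real ES calls.
const probeLogs = async (): Promise<number> => 1;
const probeMetrics = async (): Promise<number> => {
  throw new Error('no_shard_available_action_exception');
};

// Promise.all: the metrics rejection masks the successful logs probe.
async function hasDataWithAll(): Promise<boolean> {
  try {
    const [logs, metrics] = await Promise.all([probeLogs(), probeMetrics()]);
    return logs > 0 || metrics > 0;
  } catch {
    return false; // whole handler degrades to "no data"
  }
}

// Promise.allSettled: each probe is judged on its own outcome.
async function hasDataWithAllSettled(): Promise<boolean> {
  const [logs, metrics] = await Promise.allSettled([probeLogs(), probeMetrics()]);
  const hasLogs = logs.status === 'fulfilled' && logs.value > 0;
  const hasMetrics = metrics.status === 'fulfilled' && metrics.value > 0;
  return hasLogs || hasMetrics;
}
```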
That's a good point; we've seen this happening in telemetry.
@flash1293, here's a fix using Promise.allSettled so a shard failure on one index set doesn't mask data in the other:
const [logsResult, metricsResult] = await Promise.allSettled([
elasticsearch.client.asCurrentUser.search({
index: ['logs-*.otel-*', 'logs.otel', 'logs.otel.*'],
ignore_unavailable: true,
allow_partial_search_results: true,
size: 0,
terminate_after: 1,
query,
}),
elasticsearch.client.asCurrentUser.search({
index: ['metrics-*.otel-*'],
ignore_unavailable: true,
allow_partial_search_results: true,
size: 0,
terminate_after: 1,
query,
}),
]);
const resolveProbe = (
result: PromiseSettledResult<estypes.SearchResponse>
): boolean => {
if (result.status === 'fulfilled') {
return (result.value.hits.total as estypes.SearchTotalHits).value > 0;
}
if (isNoShardsAvailableError(result.reason)) {
return false;
}
throwHasDataSearchError(result.reason);
};
const hasLogs = resolveProbe(logsResult);
const hasMetrics = resolveProbe(metricsResult);
return { hasData: hasLogs || hasMetrics };

The outer try/catch can be removed (or kept as a last-resort guard) since each probe is now handled individually: a shard error on metrics won't hide already-ingested logs, which matches what you've been seeing in telemetry.
Timestamp: 2026-03-06T12:00:53.329Z
Learning: Applies to x-pack/solutions/observability/plugins/observability_agent_builder/server/routes/ai_insights/**/*.{ts,tsx,js} : Follow AI Insights architecture guidelines including prefetched context, hardcoded prompts, prompt design, UI integration, conversation handoff, and testing as documented in server/routes/ai_insights/AGENTS.md
Thanks, Robert. I'm not sure whether we should do it even for wired streams in that case. If it's that slow, then we probably should fall back to a non-onboarding ID keyed strategy for looking for data flowing or not flowing to avoid that performance issue. If we do this for wired streams, then we will still have that issue if the user is using wired streams, most likely it will time out for them, which is not really a good experience. Once we improve the querying performance on the wired streams side, we can still add it back. |
…r wired streams

Replace the Painless runtime mapping (which timed out on clusters with real data volume) with a time-range-only fallback for wired stream indices. Classic data streams still use fast indexed onboarding ID fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```ts
if (result.status === 'fulfilled') {
  return (result.value.hits.total as estypes.SearchTotalHits).value > 0;
}
```
🟢 Low kubernetes/resolve_has_data_probes.ts:19
When result.value.hits.total is a number rather than a SearchTotalHits object, the type assertion (result.value.hits.total as estypes.SearchTotalHits).value produces undefined, and undefined > 0 evaluates to false. This causes resolveProbe to return false (no data) when documents were actually found. Consider handling both cases: if total is a number, use it directly; otherwise use total.value.
```diff
 if (result.status === 'fulfilled') {
-  return (result.value.hits.total as estypes.SearchTotalHits).value > 0;
+  const total = result.value.hits.total;
+  const count = typeof total === 'number' ? total : (total as estypes.SearchTotalHits).value;
+  return count > 0;
 }
```

🤖 Copy this AI Prompt to have your agent fix this:
In file x-pack/solutions/observability/plugins/observability_onboarding/server/routes/kubernetes/resolve_has_data_probes.ts around lines 19-21:
When `result.value.hits.total` is a `number` rather than a `SearchTotalHits` object, the type assertion `(result.value.hits.total as estypes.SearchTotalHits).value` produces `undefined`, and `undefined > 0` evaluates to `false`. This causes `resolveProbe` to return `false` (no data) when documents were actually found. Consider handling both cases: if `total` is a number, use it directly; otherwise use `total.value`.
Evidence trail:
1. x-pack/solutions/observability/plugins/observability_onboarding/server/routes/kubernetes/resolve_has_data_probes.ts:19-20 (REVIEWED_COMMIT) - shows the problematic code using `(result.value.hits.total as estypes.SearchTotalHits).value > 0`
2. src/platform/packages/shared/kbn-search-connectors/utils/fetch_with_pagination.ts:52-67 (REVIEWED_COMMIT) - shows the `totalToPaginateTotal` function that explicitly handles `number | SearchTotalHits | undefined` as input type, proving `hits.total` can be a number
3. Various other Kibana files use defensive patterns like `(body.hits.total as estypes.SearchTotalHits).value ?? body.hits.total ?? 0` (see git_grep results for `hits.total as estypes.SearchTotalHits`)
@flash1293 Yeah, right now it might work fine but on clusters with high data volume the runtime mapping would time out. I dropped it and switched wired stream detection to the same time-window approach used by the OTel host and APM flows. |
```diff
@@ -67,34 +77,88 @@ export function DataIngestStatus({ onboardingId, ingestionMode }: Props) {
   }, FETCH_INTERVAL);

   return () => clearTimeout(timeout);
-}, [data?.hasData, refetch, status]);
+}, [isReady, hasPreExistingData, refetch, status]);
```
🟠 High kubernetes/data_ingest_status.tsx:68
When hasPreExistingData is true but hasMetrics is false while needsMetrics is true, the polling useEffect at line 71 incorrectly stops refetching because hasPreExistingData triggers the early return. The UI shows "Waiting for metrics to be shipped" with a loading spinner that never updates, since the polling is disabled while the component is still waiting for metrics. Consider removing hasPreExistingData from the early return condition so polling continues until isReady is true.
```diff
 const pendingStatusList = [FETCH_STATUS.LOADING, FETCH_STATUS.NOT_INITIATED];
-if (pendingStatusList.includes(status) || isReady || hasPreExistingData) {
+if (pendingStatusList.includes(status) || isReady) {
   return;
 }
```

🤖 Copy this AI Prompt to have your agent fix this:
In file x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx around lines 68-80:
When `hasPreExistingData` is true but `hasMetrics` is false while `needsMetrics` is true, the polling useEffect at line 71 incorrectly stops refetching because `hasPreExistingData` triggers the early return. The UI shows "Waiting for metrics to be shipped" with a loading spinner that never updates, since the polling is disabled while the component is still waiting for metrics. Consider removing `hasPreExistingData` from the early return condition so polling continues until `isReady` is true.
Evidence trail:
x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx:
- Lines 64-65: `isReady = needsMetrics ? hasMetrics : hasData`
- Lines 68-78: useEffect polling with early return condition `if (... || isReady || hasPreExistingData)`
- Lines 114-117: Progress title shows "Waiting for metrics" when `hasData && needsMetrics && !hasMetrics`
- Line 139: ProgressIndicator `isLoading={needsMetrics ? !hasMetrics : !hasData}`
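For context, the polling pattern being debated here, refetch on an interval until a ready flag flips, with cleanup so the timer cannot leak, can be sketched outside React. Names mirror the component but the `refetch`/`isReady` wiring is illustrative, not the PR's actual code.

```typescript
// Keep scheduling refetches until isReady() reports true, and return a
// cleanup function (the analogue of the useEffect return value).
function startPolling(
  refetch: () => void,
  isReady: () => boolean,
  intervalMs: number
): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const tick = () => {
    if (isReady()) return; // data arrived: stop scheduling further fetches
    refetch();
    timer = setTimeout(tick, intervalMs);
  };
  timer = setTimeout(tick, intervalMs);
  return () => {
    if (timer !== undefined) clearTimeout(timer); // cleanup on unmount
  };
}
```

Whether `hasPreExistingData` should also short-circuit this loop is exactly the design question discussed above.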
This is intentional by design. When hasPreExistingData is true, time-range-based detection is unreliable (it would produce false positives from data that was already flowing before the user started onboarding). So we deliberately stop polling and skip the progress indicator entirely.
@rStelmach Some fair points that came out of automated code review:

- Troubleshooting UX when `hasPreExistingData && !hasData`: in `useTimeWindowDataDetection`, `isTroubleshootingVisible` still treats `hasData === false` as "no data" (`noDataConfirmed`) and does not exclude `hasPreExistingData`.
- Telemetry

Another point: if we fall back to plain data availability without the onboarding ID, then we don't show the link to the next page right away but require the window blur. That feels a little weird: we only give the user the link to the next page if they perform this fairly hidden action. Maybe it's fine... Thoughts @LucaWintergerst ? |
```ts
if (hasData === true) {
  setDataReceivedTelemetrySent(true);
  analytics.reportEvent(OBSERVABILITY_ONBOARDING_TELEMETRY_EVENT.eventType, {
```
🟢 Low kubernetes/data_ingest_status.tsx:87
analytics.reportEvent(...) is called without optional chaining on lines 87 and 95, but analytics can be undefined (as shown by the analytics?. usage in GetStartedPanel). When the effect runs while analytics is undefined, it throws a TypeError. Consider adding optional chaining (analytics?.reportEvent(...)) to both calls.
```diff
-analytics.reportEvent(OBSERVABILITY_ONBOARDING_TELEMETRY_EVENT.eventType, {
+analytics?.reportEvent(OBSERVABILITY_ONBOARDING_TELEMETRY_EVENT.eventType, {
```

🤖 Copy this AI Prompt to have your agent fix this:
In file x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx around line 87:
`analytics.reportEvent(...)` is called without optional chaining on lines 87 and 95, but `analytics` can be `undefined` (as shown by the `analytics?.` usage in `GetStartedPanel`). When the effect runs while `analytics` is undefined, it throws a `TypeError`. Consider adding optional chaining (`analytics?.reportEvent(...)`) to both calls.
Evidence trail:
1. x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/kubernetes/data_ingest_status.tsx lines 87, 95: `analytics.reportEvent(...)` without optional chaining
2. x-pack/solutions/observability/plugins/observability_onboarding/public/application/quickstart_flows/shared/get_started_panel.tsx line 58: `analytics?.reportEvent(...)` with optional chaining
3. src/platform/plugins/shared/kibana_react/public/context/types.ts line 15: `export type KibanaServices = Partial<CoreStart>;`
4. src/platform/plugins/shared/kibana_react/public/context/context.tsx lines 17-20: default context value is `services: {}`
@flash1293 I addressed your comments, including the window blur one |
Sorry for finding new edge cases here, but I just realized that using the session start as the beginning of the time range can be a bit problematic. For example, say I already have APM data streaming in, but with a ~30-second delay, which isn't unusual. I open the APM onboarding page, the session start is recorded, and since no data falls in the window yet, polling begins. After a while, data from the existing shippers arrives and the UI says "your data is flowing now", even though that data has nothing to do with the new API key I just created; it's just the regular data arriving with a slight delay. So maybe we shouldn't use session start but something like a couple of minutes before session start to catch these false positives, and in that case fall back to the "data is already flowing" case. |
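One way to sketch the proposed lookback, with all names illustrative rather than the PR's actual API: widen the query window by a couple of minutes and classify any hit older than the session start as pre-existing data rather than proof that the new onboarding worked.

```typescript
// Hypothetical lookback window: query from a couple of minutes before the
// session start, then use the first hit's timestamp to decide whether data
// was already flowing before onboarding began.
const LOOKBACK_MS = 2 * 60 * 1000;

interface DetectionWindow {
  start: string; // ISO timestamp the has-data query should use as its lower bound
  isPreExisting: (firstHitTimestamp: string, sessionStart: Date) => boolean;
}

function makeDetectionWindow(sessionStart: Date): DetectionWindow {
  const start = new Date(sessionStart.getTime() - LOOKBACK_MS).toISOString();
  return {
    start,
    // A hit older than sessionStart means data was flowing before onboarding began.
    isPreExisting: (firstHitTimestamp, session) =>
      new Date(firstHitTimestamp).getTime() < session.getTime(),
  };
}
```

Under this scheme, delayed data from existing shippers lands in the lookback portion of the window and triggers the "data is already flowing" path instead of a false "your data arrived" signal.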
@rStelmach Just tested on the serverless instance from above, and the comment #257870 (comment) is still a problem, see recording: apm_issue.mov |
Looks like the APM flow is not checking for pre-existing data at all |
💚 Build Succeeded
flash1293
left a comment
LGTM. I'm not comfortable with the cloudforwarder bit, though; could we reach out to the team owning the forwarder itself so they can check and confirm? If something's wrong it can go in via a bugfix PR.
Things I'm not sure about:
- Should it check WAF data streams as well?
- Are the ones in the list right now the correct ones?
- Some general issue I'm not seeing right now
## Summary

PR #257870 introduced a `checkPreExistingData` mechanism to the shared `DataIngestStatus` component that conditionally hides the `ProgressIndicator` when pre-existing data is detected but no new data has arrived yet. This broke the Ensemble nightly Playwright tests for the Kubernetes EA, OTel Kubernetes, and OTel Host onboarding flows, which were asserting the visibility of the progress indicator text ("We are monitoring your cluster" / "We are monitoring your host") as proof that data ingestion was detected. The fix updates each flow's page object to assert the "Explore logs" action link instead, an element that is always rendered once data (or pre-existing data) is detected, regardless of whether the progress indicator is shown or hidden.




closes: elastic/observability-dev#5296
Summary 📚
This PR adds data detection and loading indicators to all observability onboarding flows. Previously, flows either had no data detection (OTel Host, OTel APM, Cloud Forwarder) or lacked separation between logs and metrics arrival (OTel K8s). Now there is a progress indicator while data is being ingested, troubleshooting guidance if data doesn't arrive, and contextual action links once the relevant data types are confirmed.
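The "contextual action links once the relevant data types are confirmed" behavior can be sketched as a simple filter over a `requires` field. The types and helper below are illustrative, not the PR's exact code.

```typescript
// Each action link declares which data type must have arrived before it shows.
type Requires = 'any' | 'logs' | 'metrics' | 'traces';

interface ActionLink {
  label: string;
  requires: Requires;
}

interface DataState {
  hasLogs: boolean;
  hasMetrics: boolean;
  hasTraces: boolean;
}

// Return only the links whose data requirement is satisfied.
function visibleLinks(links: ActionLink[], state: DataState): ActionLink[] {
  const satisfied: Record<Requires, boolean> = {
    any: state.hasLogs || state.hasMetrics || state.hasTraces,
    logs: state.hasLogs,
    metrics: state.hasMetrics,
    traces: state.hasTraces,
  };
  return links.filter((link) => satisfied[link.requires]);
}
```

This is why an "Explore logs" link can appear as soon as logs arrive while a dashboard link keeps waiting for metrics.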
Detection strategies per flow
| Flow | Indices checked | Detection strategy |
| --- | --- | --- |
| OTel Host | `logs-*.otel-*`, `logs.otel`, `metrics-*.otel-*` | `@timestamp >= sessionStart` |
| OTel APM | `traces-apm*`, `logs-apm*`, `metrics-apm*`, `*-*.otel-*`, `apm-*` | `@timestamp >= sessionStart` |
| Cloud Forwarder | `logs-aws.{vpcflow,elbaccess,cloudtrail}.otel-*` | `@timestamp >= sessionStart` |
| OTel Kubernetes | `logs-*`, `logs.*`, `metrics-*`, `metrics.*` | `onboarding.id` via Helm `--set` processor |
| Kubernetes EA | `logs-*`, `logs.*`, `metrics-*`, `metrics.*` | `fields.onboarding_id` (existing) |

What's new
Client-side
- `useTimeWindowDataDetection` hook: a shared hook replacing ~150 lines of duplicated polling/state/telemetry logic across the OTel Host, OTel APM, and Cloud Forwarder flows. Handles a configurable polling interval, telemetry on data received, and troubleshooting visibility after a delay. Treats fetch failures as "no data" to prevent an infinite spinner.
- `DataIngestStatus` generalized: now accepts `actionLinks` with a `requires` field (`'any' | 'logs' | 'metrics' | 'traces'`) for conditional link visibility. Separates telemetry (fires on any data) from step completion (fires when all required data types arrive). Drives both the K8s EA and OTel K8s flows.
- `ActionLink.requires` on `GetStartedPanel`: action links declare which data type they need, so Dashboard links wait for metrics while Explore Logs links appear as soon as logs arrive.
- Step status moves from `current` to `complete` based on actual data arrival.

Server-side
- New `has-data` routes: `/cloudforwarder/has-data`, `/otel_apm/has-data`, `/otel_host/has-data`, plus an enhanced `/kubernetes/has-data` (now returns `hasLogs`/`hasMetrics` separately).
- Shared search error helpers (`isNoShardsAvailableError` / `throwHasDataSearchError`): deduplicate identical catch blocks across all has-data routes.
- Wired streams support: queries the `logs`/`metrics` root indices (per [Observability Onboarding] Wired streams onboarding #249705), adds a runtime field with a `._rt` suffix to avoid shadowing indexed fields on classic streams, and queries both `resource.attributes.onboarding.id` and `labels.onboarding_id` for broad correlation coverage.
- `logType` validation: `t.keyof({ vpcflow, elbaccess, cloudtrail })` replaces the unsafe `t.string` + `as LogType` cast.

OTel K8s Helm command
- Injects a `resource/onboarding_id` processor via `--set` flags to tag data with the `onboarding.id` resource attribute.
- The chart's built-in processors occupy `processors[0..7]`; custom processors start at index 8.

E2E Ensemble tests
- Updated `data-test-subj` selectors for the new `GetStartedPanel`-based action links.

Notes 📓
Why not inject `onboarding.id` into every flow for exact correlation?
onboarding.id) as an OTel resource attribute is permanent, it stays on all events from that collector for its entire lifetime, not just during onboarding. For K8s OTel this trade-off is acceptable because the Helm chart is an explicit deployment that can be reconfigured, and multi-cluster disambiguation benefits from a stable ID. For OTel Host, OTel APM, and Cloud Forwarder, we chose time-window detection (@timestamp >= start) to avoid persisting a non-functional field in production data.Known edge cases of the detection strategies used in this PR:
- Time-window detection: if OTel data streams already exist (e.g. `logs-*.otel-*`, `logs.otel`) from a previous setup, the `@timestamp >= sessionStart` query will match pre-existing data and report "ready" before the new collector actually sends anything.
- OTel K8s: the `onboarding.id` resource attribute injected via Helm is permanent; it stays on all events from that collector for its entire lifetime, not just during onboarding. This was an intentional trade-off for exact correlation (no false positives), but it means a non-functional field persists in production data.
- K8s EA, `fields.onboarding_id` (existing, unchanged): same persistence concern; the Elastic Agent integration injects `onboarding_id` into all events permanently.

How to test 🧪
Use this guide to test each flow: https://github.com/elastic/observability-dev/tree/main/docs/obs-onboarding/onboarding-flows
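For reference while testing, the time-window detection described in the summary reduces to an existence check: a `size: 0` search with `terminate_after: 1` over the flow's indices, filtered to `@timestamp >= sessionStart`. A rough sketch of the request shape follows; the exact request the routes build may differ.

```typescript
// Build an existence-only search request: no hits returned (size: 0) and the
// shard search stops after the first match (terminate_after: 1), so the route
// only learns *whether* data arrived, not what it is.
function buildHasDataQuery(indices: string[], sessionStart: string) {
  return {
    index: indices.join(','),
    size: 0,
    terminate_after: 1,
    track_total_hits: 1, // counting beyond "at least one" is unnecessary
    query: {
      range: { '@timestamp': { gte: sessionStart } },
    },
  };
}
```

A `hits.total` of at least 1 in the response is then mapped to `hasData: true` by the route.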
Demos 🎥
otel-k8s-demo.mov
otel-demo.mov