[APM][Scout] Stabilize service map embeddable test panel time range#268459

Merged
jennypavlova merged 1 commit into elastic:main from jennypavlova:fix-service-map-dash-test
May 8, 2026

Conversation

@jennypavlova
Member

@jennypavlova jennypavlova commented May 8, 2026

Summary

Fixes the flaky Service map embeddable Scout test by removing the
implicit dependency on wall-clock drift between global setup and test
execution.

Why it was flaky

  • Global setup ingests `serviceMapMultiEnv` synthtrace data at
    `now-15m..now` (anchored to the moment global setup runs).
  • The Service Map embeddable factory hard-codes a panel-level default
    `time_range: { from: 'now-15m', to: 'now' }` that overrides the
    dashboard's range, and that range cannot be configured from the
    editor flyout at panel creation time.
  • For the panel's window to contain the synth window, the entire
    parallel suite (global setup + everything that runs before this spec)
    has to complete in well under ~15 minutes. Once the cumulative
    `testNow - setupNow` drift exceeds the 15-minute panel window, the
    synth data slides fully outside the panel's range and
    `service-map-test` is missing from both the suggestions endpoint
    (combo box) and the rendered service map.
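
The drift arithmetic above can be sketched as a small interval check. This is an illustration only, not code from the spec: `syntheticDataVisible` and its parameters are hypothetical names, and it assumes the panel always queries `now - panelWindow .. now` at view time.

```typescript
// Hypothetical helper (not part of the spec): does a panel that queries
// [testNow - panelWindow, testNow] still overlap synth data ingested at
// [setupNow - synthWindow, setupNow]?
const MINUTE_MS = 60_000;

function syntheticDataVisible(
  setupNowMs: number,
  testNowMs: number,
  panelWindowMs: number,
  synthWindowMs: number = 15 * MINUTE_MS
): boolean {
  const synthFrom = setupNowMs - synthWindowMs;
  const synthTo = setupNowMs;
  const panelFrom = testNowMs - panelWindowMs;
  const panelTo = testNowMs;
  // Two closed intervals overlap when each one starts before the other ends.
  return synthFrom <= panelTo && panelFrom <= synthTo;
}

const setupNow = Date.UTC(2026, 4, 8, 12, 0, 0);

// 10 minutes of drift, 15-minute panel window: part of the data is still in range.
console.log(syntheticDataVisible(setupNow, setupNow + 10 * MINUTE_MS, 15 * MINUTE_MS)); // true
// 20 minutes of drift, 15-minute panel window: the data has slid out of range.
console.log(syntheticDataVisible(setupNow, setupNow + 20 * MINUTE_MS, 15 * MINUTE_MS)); // false
// 20 minutes of drift, 24-hour panel window: the data is covered again.
console.log(syntheticDataVisible(setupNow, setupNow + 20 * MINUTE_MS, 24 * 60 * MINUTE_MS)); // true
```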

Why we have to use the edit flow

The editor flyout exposes only `service_name`, `environment` and
`kuery`; it has no UI for `time_range`. The factory always seeds the
new panel with the 15-minute default, so the only way to widen the
panel's queried window is the dashboard-level Customize panel flow
after the panel exists. To make that widened window actually take
effect for the suggestions endpoint, we then have to re-open the editor
in edit mode — `onEdit` passes
`timeRangeManager.api.timeRange$.getValue()` (now 24h) into the
flyout, so the combo boxes' suggestions are resolved in the wider
window. We can't apply the filters first and widen later, because the
first `selectSingleOption` is exactly what was failing.
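
To make the ordering constraint concrete, here is a simplified model of the flow described above. It is not Kibana's implementation: `TimeRangeSubject`, `ServiceMapPanelModel`, `customizeTimeRange` and `openEditor` are hypothetical stand-ins, and `now-24h` is only an illustrative value for "Last 24 hours". The only behavior taken from this description is that the factory seeds the 15-minute default and that edit mode reads the panel's current range via `getValue()`.

```typescript
// Simplified model, not Kibana's implementation. All names are hypothetical.
interface TimeRange {
  from: string;
  to: string;
}

// Tiny stand-in for an observable subject that exposes its current value.
class TimeRangeSubject {
  private value: TimeRange;
  constructor(initial: TimeRange) {
    this.value = initial;
  }
  getValue(): TimeRange {
    return this.value;
  }
  next(value: TimeRange): void {
    this.value = value;
  }
}

const FACTORY_DEFAULT: TimeRange = { from: 'now-15m', to: 'now' };

class ServiceMapPanelModel {
  // The factory always seeds the new panel with the 15-minute default.
  readonly timeRange$ = new TimeRangeSubject(FACTORY_DEFAULT);

  // Stand-in for the dashboard-level "Customize panel" flow.
  customizeTimeRange(range: TimeRange): void {
    this.timeRange$.next(range);
  }

  // Stand-in for onEdit: the flyout is given the panel's current range,
  // which is what the combo box suggestions are resolved against.
  openEditor(): { suggestionsTimeRange: TimeRange } {
    return { suggestionsTimeRange: this.timeRange$.getValue() };
  }
}

const panel = new ServiceMapPanelModel();
const rangeAtCreation = panel.openEditor().suggestionsTimeRange;
panel.customizeTimeRange({ from: 'now-24h', to: 'now' });
const rangeAfterCustomize = panel.openEditor().suggestionsTimeRange;

console.log(rangeAtCreation.from, rangeAfterCustomize.from); // now-15m now-24h
```

In this model, filters applied before `customizeTimeRange` would be resolved against the 15-minute window, which is the situation the spec has to avoid.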

What changed

`x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel_tests/service_map/service_map_embeddable.spec.ts`:

  1. Add the Service Map panel without filters (just save the empty
    editor) → panel exists with the factory's default 15-minute custom
    range and the `CUSTOM_TIME_RANGE_BADGE` is asserted.
  2. Open the Customize panel flyout, ensure the custom time range
    toggle is on, set the panel's range to Last 24 hours via the
    panel-scoped `superDatePicker` quick menu, save.
  3. Re-open the editor in edit mode via
    `embeddablePanelAction-editPanel` and apply
    `service_name` / `environment` / `kuery`. With the 24h panel
    range, the `apmServiceMapEditorServiceNameComboBox` reliably
    resolves `service-map-test` even with realistic setup-to-test
    delay.
  4. The remaining assertions (visibility, popover, maximize, fills,
    disable-custom-time-range, View full service map) run against the
    24h panel range, so they no longer depend on the 15-minute synth
    window aligning with wall-clock "now".
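
Steps 2 and 3 can be sketched as follows. This is not the code from the spec: it is written against a minimal `PageLike` interface (which a Playwright `Page` should satisfy through `getByTestId()`, assuming the test id attribute is configured to match Kibana's test subjects) so that it stays self-contained. Only `embeddablePanelAction-editPanel` and `apmServiceMapEditorServiceNameComboBox` come from this description; every entry in `HYPOTHETICAL_IDS` is a placeholder and would have to be replaced with the real test subjects.

```typescript
// Minimal structural subset of a Playwright-style page, for illustration.
interface LocatorLike {
  click(): Promise<void>;
}
interface PageLike {
  getByTestId(testId: string): LocatorLike;
}

// Test subjects named in this PR description.
const EDIT_PANEL_ACTION = 'embeddablePanelAction-editPanel';
const SERVICE_NAME_COMBO_BOX = 'apmServiceMapEditorServiceNameComboBox';

// ASSUMPTION: placeholder ids, not verified against Kibana.
const HYPOTHETICAL_IDS = {
  customizePanelAction: 'hypotheticalCustomizePanelAction',
  quickMenuButton: 'hypotheticalSuperDatePickerQuickMenuButton',
  last24Hours: 'hypotheticalQuickRangeLast24Hours',
  saveCustomizePanel: 'hypotheticalSaveCustomizePanelButton',
};

async function widenPanelRangeThenEdit(page: PageLike): Promise<void> {
  // Step 2: widen the panel-level range before touching any filter.
  await page.getByTestId(HYPOTHETICAL_IDS.customizePanelAction).click();
  await page.getByTestId(HYPOTHETICAL_IDS.quickMenuButton).click();
  await page.getByTestId(HYPOTHETICAL_IDS.last24Hours).click();
  await page.getByTestId(HYPOTHETICAL_IDS.saveCustomizePanel).click();

  // Step 3: re-open the editor in edit mode; only now open the combo box,
  // so its suggestions are resolved in the widened window.
  await page.getByTestId(EDIT_PANEL_ACTION).click();
  await page.getByTestId(SERVICE_NAME_COMBO_BOX).click();
}

// Recording fake used to check the order of interactions without a browser.
function createRecordingPage(): { page: PageLike; clicks: string[] } {
  const clicks: string[] = [];
  const page: PageLike = {
    getByTestId: (testId) => ({
      click: async () => {
        clicks.push(testId);
      },
    }),
  };
  return { page, clicks };
}
```

The point of the ordering is that the first interaction with the service name combo box happens only after the widened range has been saved.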

`global.setup.ts` is unchanged — synth still ingests at
`now-15m..now`. 24h easily covers any realistic delay between global
setup and this spec running.

References

Closes #265639

Test plan

  • CI green
  • Local run: `npx playwright test --config x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel.playwright.config.ts --project local --grep stateful-classic x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel_tests/service_map/service_map_embeddable.spec.ts`
  • Repeat-each smoke (`--repeat-each=10`) to confirm flake is gone

Made with Cursor

The flake on this test was caused by a time-drift between when synthtrace
ingested data in global setup (`now-15m..now` at setup time) and when the
spec actually ran. The Service Map embeddable factory hard-codes a default
panel-level `time_range: { from: 'now-15m', to: 'now' }` that overrides the
dashboard's range, and that range cannot be configured from the editor
flyout itself. So even with the dashboard widened, the panel queried only
`now-15m..now` at view time -- if the parallel test suite spent more than
~15 minutes between setup and this spec running, the synth window slid
fully outside the panel's window and `service-map-test` was missing from
both the suggestions endpoint (combo box) and the rendered service map.

Restructure the spec so the panel is added with no filters first, then
its custom time range is widened to "Last 24 hours" via the Customize
panel flow before re-opening the editor in edit mode to apply filters.
The editor flyout in edit mode is fed the panel's current time range,
so the suggestions endpoint now resolves `service-map-test` in a 24h
window, and the rendered service map sees the synth data regardless of
realistic setup-to-test drift.

Closes elastic#265639

Co-authored-by: Cursor <cursoragent@cursor.com>
@jennypavlova jennypavlova requested review from a team as code owners May 8, 2026 14:12
@jennypavlova jennypavlova added release_note:skip Skip the PR/issue when compiling release notes backport:version Backport to applied version labels v9.3.0 v9.4.0 Team:obs-presentation Focus: APM UI, Infra UI, Hosts UI, Universal Profiling, Obs Overview and left Navigation labels May 8, 2026
@infra-vault-gh-plugin-prod

Pinging @elastic/obs-presentation-team (Team:obs-presentation)

@jennypavlova
Member Author

/flaky scoutConfig:x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel.playwright.config.ts:50

@kibanamachine
Contributor

Flaky Test Runner

✅ Build triggered - kibana-flaky-test-suite-runner#12204

  • x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel.playwright.config.ts x50

Contributor

@sbelastic sbelastic left a comment

Assuming flaky tests pass, LGTM.

Note: as a future improvement, instead of widening the time range, we should consider taking advantage of `page.clock.install()` and `page.clock.fastForward()`.
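
A rough sketch of what that suggestion could look like, for illustration only. `pinBrowserClock` is a hypothetical helper, and the sketch assumes two things that are not established in this PR: that the spec can learn the time at which global setup ingested the data, and that the panel's relative range is resolved against the browser clock. `ClockLike` mirrors only the two Playwright clock methods mentioned above, so `page.clock` should be usable in its place.

```typescript
// Minimal structural subset of Playwright's clock, for a self-contained sketch.
interface ClockLike {
  install(options?: { time?: number | string | Date }): Promise<void>;
  fastForward(ticks: number | string): Promise<void>;
}

// Hypothetical helper: pin the browser's "now" to the moment the synth data
// was ingested, then optionally advance it by a known amount.
async function pinBrowserClock(
  clock: ClockLike,
  setupNow: Date,
  advanceMs: number = 0
): Promise<void> {
  // Install fake timers starting at the ingest time, so relative ranges
  // such as now-15m..now line up with the ingested window.
  await clock.install({ time: setupNow });
  if (advanceMs > 0) {
    // Move the fake clock forward without waiting in real time.
    await clock.fastForward(advanceMs);
  }
}
```

In a Playwright test this would be called as `await pinBrowserClock(page.clock, setupNow)` before navigating to the dashboard.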

@jennypavlova
Member Author

> Note: In a future improvement/implementation we should consider instead of having a wider time range, to take advantage of page.clock.install() and page.clock.fastForward()

@sbelastic that's a good idea. I would add that, in terms of UX, it would be nice to have the time picker at the top, before the filters, so it is clear why a service is not visible, for example when the user is looking for a service that is not present in the last 15 minutes (this only applies during creation; afterwards the time range is visible in the panel).

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#12204

[✅] x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel.playwright.config.ts (--arch stateful --domain classic): 50/50 tests passed.
[✅] x-pack/solutions/observability/plugins/apm/test/scout/ui/parallel.playwright.config.ts (--arch serverless --domain observability_complete): 50/50 tests passed.

@jennypavlova jennypavlova merged commit 799f6f1 into elastic:main May 8, 2026
107 checks passed
@kibanamachine
Contributor

Starting backport for target branches: 9.3, 9.4

https://github.com/elastic/kibana/actions/runs/25566111245

@kibanamachine
Contributor

💔 All backports failed

Branch 9.3: Backport failed because of merge conflicts

You might need to backport the following PRs to 9.3:
- test: stabilize profiling Scout has_setup_with_integrations API tests (#268361)
- [scout] migrate Lens API tests (#267993)
- chore(axios,security-solution): remove axios from telemetry/role scripts (#267944)

Branch 9.4: Backport failed because of merge conflicts

You might need to backport the following PRs to 9.4:
- test: stabilize profiling Scout has_setup_with_integrations API tests (#268361)
- Fix broken translation string (#268460)
- [scout] migrate Lens API tests (#267993)
- [One Workflow] Unskip Workflow editor: validation performance tests (#268149)
- [scout] fix cleanup in SO managerment find test (#268349)

Manual backport

To create the backport manually run:

node scripts/backport --pr 268459

Questions?

Please refer to the Backport tool documentation

@kibanamachine
Contributor

Friendly reminder: Looks like this PR hasn't been backported yet.
To create backports automatically, add a `backport:*` label, or prevent reminders by adding the `backport:skip` label.
You can also create backports manually by running `node scripts/backport --pr 268459` locally.
cc: @jennypavlova

Labels

  • backport missing (Added to PRs automatically when they are determined to be missing a backport.)
  • backport:version (Backport to applied version labels)
  • release_note:skip (Skip the PR/issue when compiling release notes)
  • Team:obs-presentation (Focus: APM UI, Infra UI, Hosts UI, Universal Profiling, Obs Overview and left Navigation)
  • v9.3.0
  • v9.4.0
  • v9.5.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Failing test: Service map embeddable - adds Service map panel with service name, environment and KQL filter

3 participants