
[FTR] Configure undici timeouts on KbnClient dispatcher #270932

Merged: rStelmach merged 4 commits into elastic:main from rStelmach:kbn-client-undici-timeout-fix on May 25, 2026

Conversation

rStelmach commented May 25, 2026

Summary

Set connect.timeout = 60s on the undici Agent used by KbnClientRequester (https path only).

Why

#268531 migrated KbnClient from axios to native fetch but did not override undici's 10s connect.timeout default. Axios had no equivalent cutoff, so FTR callers talking to a busy local Kibana started failing once that PR landed.

The kibana-streams-performance weekly pipeline went red in builds #9, #11, #12, and #13 with:

ConnectTimeoutError: Connect Timeout Error (attempted address: localhost:5620, timeout: 10000ms)

The 10000ms value is undici's default connect timeout. Bisect: build #8 was the last green run (2026-05-11) and build #9 the first red one (2026-05-18), with #268531 landing in that window.
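
For anyone triaging similar failures, here is a minimal sketch of how this error can be recognised in code. It assumes Node's native fetch, which rejects with a `TypeError` whose `cause` is the underlying undici error (code `UND_ERR_CONNECT_TIMEOUT`, name `ConnectTimeoutError`). The helper name `isConnectTimeout` is hypothetical and not part of KbnClient.

```typescript
// Hypothetical helper, not part of kbn-client: reports whether a rejected
// fetch() looks like undici's connect timeout.
function isConnectTimeout(error: unknown): boolean {
  // Native fetch wraps the undici error: TypeError('fetch failed', { cause }).
  const cause = error instanceof Error ? (error as { cause?: unknown }).cause : undefined;
  // Fall back to the error itself in case undici is called directly.
  const candidate = (cause ?? error) as { name?: unknown; code?: unknown } | null | undefined;
  if (candidate === null || typeof candidate !== 'object') return false;
  return candidate.code === 'UND_ERR_CONNECT_TIMEOUT' || candidate.name === 'ConnectTimeoutError';
}
```

A check like this only classifies the failure; it does not change the timeout itself.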

What changed

src/platform/packages/shared/kbn-kbn-client/src/kbn_client/kbn_client_requester.ts: one constant, one option on the https Agent. http branch unchanged.
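
A rough sketch of the shape of this change, not the actual diff. The constant name `CONNECT_TIMEOUT_MS`, the function `dispatcherOptionsFor`, and the `AgentOptionsSketch` type are assumptions made for illustration; any TLS options the real Agent sets are omitted because they are not described here.

```typescript
// Assumed constant name; the PR only says "one constant".
const CONNECT_TIMEOUT_MS = 60_000; // 60s, versus undici's 10s default

// Minimal, illustrative subset of undici's Agent options.
interface AgentOptionsSketch {
  connect: { timeout: number };
}

// Returns Agent options for https targets and undefined for http,
// mirroring the "https Agent only, http branch unchanged" split.
function dispatcherOptionsFor(url: string): AgentOptionsSketch | undefined {
  const { protocol } = new URL(url);
  return protocol === 'https:' ? { connect: { timeout: CONNECT_TIMEOUT_MS } } : undefined;
}
```

With undici available, the https result would be passed to `new Agent(options)` and the Agent handed to `fetch` through its `dispatcher` option. For http the dispatcher stays unset, so undici's defaults still apply there.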

Related

Regression introduced in #268531. Companion streams perf PR: #270636.

Validation

https://buildkite.com/elastic/kibana-streams-performance/builds/14

rStelmach added 2 commits May 25, 2026 11:53
Since elastic#268531 migrated KbnClient from axios to native fetch, the
undici Agent is constructed without timeout overrides, so all FTR
Kibana traffic inherits undici's default 10s connect timeout and 5
min headers/body timeouts. Axios had no equivalent connect timeout,
which masked any Kibana stall before the new client landed.

The weekly kibana-streams-performance pipeline started failing in
builds #9, #11, #12 and #13 (2026-05-18 onward) with
"ConnectTimeoutError ... timeout: 10000ms" while POSTing a large
multipart content/import payload to Kibana under heavy load.

Set generous undici timeouts on the dispatcher for both http and
https targets:

- connect timeout: 60s (vs 10s default)
- headers timeout: 5 min (matches undici default, made explicit)
- body timeout:    10 min

These match the effectively-unlimited behaviour FTR relied on
under axios while keeping a finite safeguard. Always constructing
a dispatcher also lets us drop the optional spread in the fetch
call.

Self-review caught two no-op/speculative changes that should not ship:

- headersTimeout: 5*60_000 equals undici's default (300e3), so setting
  it explicitly is misleading.
- bodyTimeout: 10*60_000 was a speculative bump with no evidence that
  any current FTR caller hits the 5-min default. The original failure
  was a ConnectTimeoutError, full stop.

Also reverting the http branch back to a null dispatcher. Sibling
patterns in kbn-test-saml-auth/fetch_kibana_version and
kbn-synthtrace/cli/utils/ssl construct an https-only Agent and pass
undefined for http; the original kbn-client code matched that idiom
and no http caller has reported a stall.

Net change vs main is now a single override: connect.timeout = 60s on
the https Agent, plus the comment explaining why.
rStelmach added the release_note:skip and backport:skip labels May 25, 2026
rStelmach marked this pull request as ready for review May 25, 2026 12:56
rStelmach requested review from a team as code owners May 25, 2026 12:56
rStelmach added the Team:obs-onboarding and Feature:Streams labels May 25, 2026
@infra-vault-gh-plugin-prod

Pinging @elastic/obs-onboarding-team (Team:obs-onboarding)

@kibanamachine

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • Scout Lane #46 - serverless-observability_complete / default / local-serverless-observability_complete - Hosts Page - Empty State - should show onboarding page when no data is present

Metrics

✅ unchanged

rStelmach merged commit b00b51d into elastic:main on May 25, 2026
31 checks passed

Labels

- backport:skip (This PR does not require backporting)
- Feature:Streams (This is the label for the Streams Project)
- release_note:skip (Skip the PR/issue when compiling release notes)
- Team:obs-onboarding (Observability Onboarding Team)
- v9.5.0

4 participants