Fix ES health check poller retries #248496
Conversation
Pinging @elastic/kibana-core (Team:Core)
Added expected info API call counts
💚 Build Succeeded
```diff
         },
-        complete: done,
+        complete: () => {
+          expect(internalClient.nodes.info).toHaveBeenCalledTimes(1);
```
Nit: just thinking, maybe we can increase the health retries here to something like 1000 to really make it clear that we only expect 1 network request for successful calls.
```ts
          expect(result.isCompatible).toBeDefined();
        },
        complete: () => {
          done();
```
Can we add an expect against `internalClient.nodes.info` here for the expected number of calls?
```ts
      nodeInfosSuccessOnce(createNodes('5.1.1', '5.1.2', '5.1.3')); // emit
      nodeInfosSuccessOnce(createNodes('5.1.1', '5.1.2', '5.1.3')); // ignore
      nodeInfosSuccessOnce(createNodes('5.0.0', '5.1.0', '5.2.0')); // emit, different from previous version
      nodeInfosSuccessOnce(createNodes('5.1.0', '5.1.0', '5.1.0')); // emit, no warning nodes, used to detect end of test
```
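The expectations above hinge on the poller emitting only when the reported node versions differ from the previous poll. A minimal sketch of that emit/ignore behaviour, where the helper names (`versionKey`, `dedupePolls`) are illustrative and not Kibana's actual code:

```typescript
// Illustrative sketch of the emit/ignore behaviour exercised in the test
// above; names here are hypothetical, not Kibana's actual helpers.

type NodeInfo = { version: string };

// Stable key for one poll result: sorted, joined version list.
const versionKey = (nodes: NodeInfo[]): string =>
  nodes.map((n) => n.version).sort().join(',');

// Keep only polls whose versions differ from the previous poll.
function dedupePolls(polls: NodeInfo[][]): NodeInfo[][] {
  const emitted: NodeInfo[][] = [];
  let prevKey: string | undefined;
  for (const nodes of polls) {
    const key = versionKey(nodes);
    if (key !== prevKey) {
      emitted.push(nodes); // versions changed: emit
      prevKey = key;
    } // otherwise ignore: same versions as the last poll
  }
  return emitted;
}

const mk = (...versions: string[]): NodeInfo[] =>
  versions.map((version) => ({ version }));

const emits = dedupePolls([
  mk('5.1.1', '5.1.2', '5.1.3'), // emit
  mk('5.1.1', '5.1.2', '5.1.3'), // ignore
  mk('5.0.0', '5.1.0', '5.2.0'), // emit
  mk('5.1.0', '5.1.0', '5.1.0'), // emit
]);
console.log(emits.length); // 3
```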
```diff
       .map((key) => nodesInfoResponse.nodes[key])
-      .map((node) => Object.assign({}, node, { name: getHumanizedNodeName(node) }));
+      .map((node) => Object.assign({}, node, { name: getHumanizedNodeName(node) }))
+      .sort(sortNodes); // Sorting ensures stable ordering for comparison
```
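The `sortNodes` comparator itself is not shown in this hunk; a hypothetical comparator like the one below illustrates why sorting gives the stable ordering the comment refers to, so node arrays compare equal across polls regardless of response order:

```typescript
// Hypothetical comparator (assumed shape; the real sortNodes is not in this
// diff): sorting by a stable key makes node lists comparable across polls.

type NodeSummary = { name: string; version: string };

const sortNodes = (a: NodeSummary, b: NodeSummary): number =>
  a.name.localeCompare(b.name) || a.version.localeCompare(b.version);

const nodes: NodeSummary[] = [
  { name: 'node-b', version: '5.1.0' },
  { name: 'node-a', version: '5.1.0' },
];

// Without sorting, [node-b, node-a] and [node-a, node-b] would compare as
// different even though the cluster state is the same.
console.log([...nodes].sort(sortNodes).map((n) => n.name).join(','));
```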
Starting backport for target branches: 8.19, 9.1, 9.2, 9.3 https://github.com/elastic/kibana/actions/runs/20878265645
## Summary

Ensure that our `retry` logic retries the network request the configured number of times.

(cherry picked from commit 33fa57c)
💔 Some backports could not be created
Note: Successful backport PRs will be merged automatically after passing CI.

**Manual backport**

To create the backport manually run:

**Questions?**

Please refer to the Backport tool documentation
# Backport

This will backport the following commits from `main` to `9.3`:

- [Fix ES health check poller (#248496)](#248496)

### Questions?

Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Jean-Louis Leysens <jeanlouis.leysens@elastic.co>
## Summary

Ensure that our `retry` logic retries the network request the configured number of times.
Starting backport for target branches: 8.19, 9.1, 9.2, 9.3 https://github.com/elastic/kibana/actions/runs/20916590759
💔 All backports failed
**Manual backport**

To create the backport manually run:

**Questions?**

Please refer to the Backport tool documentation
## Summary

Ensure that our `retry` logic retries the network request the configured number of times.

## Description

After introducing retries to `pollEsNodesVersion`, a failed API request to the nodes info API would not actually retry the network request. Each "retry" was just a no-op delayed by the `healthCheckInterval`. So, for example, if a request timed out we would wait `healthCheckInterval * healthCheckRetry` ms and then emit the failure. The nodes info API would only be called again on the next `healthCheckInterval`.

This PR changes the behavior to actually call the API during retries. Assuming the API call fails continuously, this results in `healthCheckRetry` API requests, and the failure is emitted only if all of them fail.
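The bug described above can be sketched without Kibana internals: a retry loop that re-awaits a stored promise never issues a new request, while one that re-invokes a request factory does. All names below are illustrative, not the actual implementation:

```typescript
// Illustrative sketch only, not Kibana's actual code. It shows why a "retry"
// that re-awaits a stored promise never re-issues the request, while a retry
// that re-invokes a request factory does.

// Fake nodes-info call: fails `failures` times, then succeeds, counting how
// many real requests were made.
const makeNodesInfo = (failures: number) => {
  let calls = 0;
  const request = async (): Promise<string> => {
    calls++;
    if (calls <= failures) throw new Error('socket timeout');
    return 'node versions ok';
  };
  return { request, callCount: () => calls };
};

// Buggy shape: the request fires exactly once; every "retry" re-awaits the
// same already-rejected promise, so no new network call is made.
async function retryStored(p: Promise<string>, retries: number): Promise<string> {
  let lastErr: unknown;
  for (let i = 0; i < retries; i++) {
    try {
      return await p;
    } catch (e) {
      lastErr = e; // delay between attempts elided
    }
  }
  throw lastErr;
}

// Fixed shape: each attempt invokes the factory, issuing a fresh request.
async function retryFactory(fn: () => Promise<string>, retries: number): Promise<string> {
  let lastErr: unknown;
  for (let i = 0; i < retries; i++) {
    try {
      return await fn();
    } catch (e) {
      lastErr = e;
    }
  }
  throw lastErr;
}

async function demo() {
  const buggy = makeNodesInfo(2);
  await retryStored(buggy.request(), 3).catch(() => {});
  console.log('buggy real requests:', buggy.callCount()); // 1

  const fixed = makeNodesInfo(2);
  const result = await retryFactory(fixed.request, 3);
  console.log('fixed real requests:', fixed.callCount(), '->', result); // 3 -> node versions ok
}

demo();
```

With the fixed shape, a transient failure is absorbed within a single `healthCheckInterval` tick instead of surfacing as a failure that only clears on the next poll.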