Expose isRetryableEsClientError#228315

Merged
rudolf merged 23 commits into elastic:main from rudolf:expose-is-retryable
Oct 15, 2025

rudolf commented Jul 16, 2025

Summary

For plugins to have robust error handling, they need to retry transient errors. This PR exposes the isRetryableEsClientError utility to help plugins determine which errors are safe to retry.

Note for reviewers

Core's isRetryableEsClientError typically retries on more status codes than most plugins do. Instead of keeping each plugin's existing status codes, I've adopted the default behavior, which means your plugins will retry on more types of responses after this change. Please check carefully that this makes sense for your plugin.

In particular, the following two status codes are now retried that plugins weren't previously retrying:

    • 429 TooManyRequests (ES circuit breaker)
    • 504 GatewayTimeout (rare, but happens if cloud proxy cannot establish socket connection to ES due to ES thread pool being overloaded)
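
As a rough illustration of how a plugin might use such a predicate, here is a minimal sketch of a retry loop with exponential backoff. It does not import the real utility: `isRetryableSketch`, `ASSUMED_RETRY_STATUS_CODES`, `EsLikeError`, and `retryTransient` are hypothetical names made up for this example, and the status-code list is an assumption (only 429 and 504 being retried, and 401/403 not being retried, come from this PR).

```typescript
// Hypothetical error shape; real ES client errors carry more fields.
interface EsLikeError {
  statusCode?: number;
}

// Assumed list, for illustration only. This PR confirms that 429 and 504 are
// retried and that 401/403 are not; 503 is an assumption.
const ASSUMED_RETRY_STATUS_CODES: readonly number[] = [429, 503, 504];

// Stand-in for the exposed utility, NOT the real implementation.
function isRetryableSketch(e: EsLikeError): boolean {
  return e.statusCode !== undefined && ASSUMED_RETRY_STATUS_CODES.includes(e.statusCode);
}

// Runs `operation`, retrying only when the error looks transient.
async function retryTransient<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (e) {
      const retryable = isRetryableSketch(e as EsLikeError);
      // Give up on non-transient errors or once the attempt budget is spent.
      if (!retryable || attempt >= maxAttempts) throw e;
      // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With this shape, a 429 or 504 response is retried up to `maxAttempts` times, while a 401 or 403 is rethrown immediately.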

Checklist

Check the PR satisfies following conditions.

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines
  • Review the backport guidelines and apply applicable backport:* labels.

Identify risks

Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss.

Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.

rudolf added the Team:Core label Jul 16, 2025
rudolf force-pushed the expose-is-retryable branch from 7479946 to 0eaa367 August 21, 2025 19:52
Comment on lines +23 to +24
401, // AuthorizationException
403, // AuthenticationException
@rudolf (Contributor Author)

The shared function doesn't retry on auth errors, since they are usually not transient.

@gsoldevila (Contributor) Sep 22, 2025

I wonder if we should populate the ones below by spreading the default const ...DEFAULT_RETRY_STATUS_CODES (that would require exposing the const 🤔 )

Contributor

Or in other words, is there a scenario where it does not make sense to retry on any of the values below? I'm asking because if we add a new status code to our DEFAULT list in the future, we might forget to also add it to the "custom" lists.

Contributor

Another option would be to use an incremental approach:

{
  alsoRetryOn: [401, 403],
  doNotRetryOn: [xxx, yyy],
}

this way users can automatically benefit from updates in the DEFAULT_RETRY_STATUS_CODES list, and they can add their custom exceptions.
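
The incremental approach suggested above could be sketched roughly as follows. This is not code from the PR: `RetryOverrides` and `resolveRetryStatusCodes` are hypothetical names, and the contents of `DEFAULT_RETRY_STATUS_CODES` are assumed for illustration because the real list is not shown in this thread.

```typescript
// Assumed defaults for illustration; the real DEFAULT_RETRY_STATUS_CODES
// constant is not shown in this conversation.
const DEFAULT_RETRY_STATUS_CODES: readonly number[] = [429, 503, 504];

// Hypothetical options shape, mirroring the comment's proposal.
interface RetryOverrides {
  alsoRetryOn?: number[];
  doNotRetryOn?: number[];
}

// Starts from the defaults, adds the extra codes, then removes the exclusions.
// Exclusions win if a code appears in both lists.
function resolveRetryStatusCodes(overrides: RetryOverrides = {}): number[] {
  const { alsoRetryOn = [], doNotRetryOn = [] } = overrides;
  const excluded = new Set(doNotRetryOn);
  const merged = new Set([...DEFAULT_RETRY_STATUS_CODES, ...alsoRetryOn]);
  return Array.from(merged).filter((code) => !excluded.has(code));
}

// Example: a caller that still wants auth errors retried but not 504.
const codes = resolveRetryStatusCodes({ alsoRetryOn: [401, 403], doNotRetryOn: [504] });
```

Because callers only state their deviations, a status code added to the default list later would be picked up automatically, which is the maintenance concern raised in the comment.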

// index while snapshots are in progress. This should have been a 503
// so once https://github.com/elastic/elasticsearch/issues/65883 is
// fixed we can remove this.
e?.body?.error?.type === 'snapshot_in_progress_exception'))
@rudolf (Contributor Author)

snapshot_in_progress_exception no longer blocks other API calls

"@kbn/core-elasticsearch-client-server-internal": "link:src/core/packages/elasticsearch/client-server-internal",
"@kbn/core-elasticsearch-server": "link:src/core/packages/elasticsearch/server",
"@kbn/core-elasticsearch-server-internal": "link:src/core/packages/elasticsearch/server-internal",
"@kbn/core-elasticsearch-server-utils": "link:src/core/packages/elasticsearch/server-utils",
@rudolf (Contributor Author)

I had to introduce a new package to avoid circular dependencies:
server → client-server-mocks → client-server-internal → server

Comment on lines -11 to -12
401, // AuthorizationException
403, // AuthenticationException
@rudolf (Contributor Author)

This PR will remove retrying on the 401/403 status codes. I think the cases index gets initialized in start, which means it happens after saved object migrations have run. Since both use the internal user it should not be possible for migrations to succeed and cases to fail here, but worth checking my reasoning.

rudolf marked this pull request as ready for review August 22, 2025 15:14
rudolf requested review from a team as code owners August 22, 2025 15:14
rudolf requested a review from hop-dev August 22, 2025 15:14
@elasticmachine

Pinging @elastic/kibana-core (Team:Core)

botelastic bot added the Team:actionable-obs label Aug 22, 2025
@elasticmachine

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

rudolf merged commit c499684 into elastic:main Oct 15, 2025
13 checks passed
@kibanamachine

Starting backport for target branches: 9.1

https://github.com/elastic/kibana/actions/runs/18537134683

@elasticmachine

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

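| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-elasticsearch-server-internal | 33 | 32 | -1 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-elasticsearch-server-internal | 38 | 36 | -2 |
| @kbn/core-elasticsearch-server-utils | - | 3 | +3 |
| total | | | +1 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 695 | 694 | -1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 797 | 796 | -1 |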

@kibanamachine

💔 All backports failed

| Status | Branch | Result |
| --- | --- | --- |
| | 9.1 | Backport failed because of merge conflicts |

Manual backport

To create the backport manually run:

node scripts/backport --pr 228315

Questions ?

Please refer to the Backport tool documentation

kibanamachine added the backport missing label Oct 16, 2025
@kibanamachine

Friendly reminder: Looks like this PR hasn’t been backported yet.
To create backports automatically, add a backport:* label, or prevent reminders by adding the backport:skip label.
You can also create backports manually by running node scripts/backport --pr 228315 locally.
cc: @rudolf

mgadewoll pushed a commit to tkajtoch/kibana that referenced this pull request Oct 17, 2025
rudolf added the backport:skip label and removed the v9.1.4 and backport:version labels Oct 17, 2025
kibanamachine removed the backport missing label Oct 17, 2025
nickpeihl pushed a commit to nickpeihl/kibana that referenced this pull request Oct 23, 2025
NicholasPeretti pushed a commit to NicholasPeretti/kibana that referenced this pull request Oct 27, 2025
rudolf added a commit to rudolf/kibana that referenced this pull request Dec 1, 2025
(cherry picked from commit c499684)

# Conflicts:
#	x-pack/platform/plugins/shared/cases/tsconfig.json
rudolf added a commit to rudolf/kibana that referenced this pull request Dec 1, 2025
(cherry picked from commit c499684)

# Conflicts:
#	src/core/packages/saved-objects/migration-server-internal/src/actions/catch_retryable_es_client_errors.ts
#	x-pack/platform/plugins/shared/alerting/tsconfig.json
#	x-pack/platform/plugins/shared/cases/server/cases_analytics/tasks/synchronization_task/synchronization_sub_task.ts
#	x-pack/platform/plugins/shared/cases/server/cases_analytics/utils.ts
#	x-pack/platform/plugins/shared/cases/tsconfig.json
#	x-pack/platform/plugins/shared/streams/tsconfig.json
#	x-pack/solutions/observability/plugins/observability/tsconfig.json
#	x-pack/solutions/observability/plugins/slo/tsconfig.json
#	x-pack/solutions/security/plugins/security_solution/tsconfig.json

rudolf commented Dec 1, 2025

💚 All backports created successfully

Status Branch Result
9.2
9.1
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

rudolf added a commit that referenced this pull request Dec 22, 2025
# Backport

This will backport the following commits from `main` to `9.2`:
- [Expose isRetryableEsClientError
(#228315)](#228315)

<!--- Backport version: 10.2.0 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

<!--BACKPORT [{"author":{"name":"Rudolf
Meijering","email":"skaapgif@gmail.com"},"sourceCommit":{"committedDate":"2025-10-15T17:21:24Z","message":"Expose
isRetryableEsClientError (#228315)\n\n## Summary\n\nFor plugins to have
robust error handling they need to retry transient\nerrors. This PR
exposes `isRetryableEsClientError` utility to help\nplugins know which
errors are safe to retry.\n\n## Note for reviewers\nCore's
isRetryableEsClientError typically retries on more status codes\nthan
what most plugins do. Instead of keeping the existing plugin
status\ncodes I've adopted the default behavior which means your plugins
will be\nretrying on more types of responses after this change.
Carefully check\nthat this makes sense.\n\nIn particular the two
following status codes get retried that plugins\nweren't previously
retrying:\n * - 429 TooManyRequests (ES circuit breaker)\n* - 504
GatewayTimeout (rare, but happens if cloud proxy cannot\nestablish
socket connection to ES due to ES thread pool being\noverloaded)\n \n###
Checklist\n\nCheck the PR satisfies following conditions. \n\nReviewers
should verify this PR satisfies this list as well.\n\n- [ ] Any text
added follows [EUI's
writing\nguidelines](https://elastic.github.io/eui/#/guidelines/writing),
uses\nsentence case text and includes
[i18n\nsupport](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)\n-
[
]\n[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)\nwas
added for features that require explanation or tutorials\n- [ ] [Unit or
functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere
updated or added to match the most common scenarios\n- [ ] If a plugin
configuration key changed, check if it needs to be\nallowlisted in the
cloud and added to the
[docker\nlist](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)\n-
[ ] This was checked for breaking HTTP API changes, and any
breaking\nchanges have been approved by the breaking-change committee.
The\n`release_note:breaking` label should be applied in these
situations.\n- [ ] [Flaky
Test\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\nused on any tests changed\n- [ ] The PR description includes the
appropriate Release Notes section,\nand the correct `release_note:*`
label is applied per
the\n[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\n-
[ ] Review the
[backport\nguidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing)\nand
apply applicable `backport:*` labels.\n\n### Identify risks\n\nDoes this
PR introduce any risks? For example, consider risks like hard\nto test
bugs, performance regression, potential of data loss.\n\nDescribe the
risk, its severity, and mitigation for each identified\nrisk. Invite
stakeholders and evaluate how to proceed before merging.\n\n- [ ] [See
some
risk\nexamples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)\n-
[ ] ...\n\n---------\n\nCo-authored-by: kibanamachine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c4996849bdd293721ec32a158320af89d79e6fcd","branchLabelMapping":{"^v9.3.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Team:Core","release_note:skip","backport:skip","Team:actionable-obs","v9.3.0"],"title":"Expose
isRetryableEsClientError","number":228315,"url":"https://github.com/elastic/kibana/pull/228315","mergeCommit":{"message":"Expose
isRetryableEsClientError (#228315)\n\n## Summary\n\nFor plugins to have
robust error handling they need to retry transient\nerrors. This PR
exposes `isRetryableEsClientError` utility to help\nplugins know which
errors are safe to retry.\n\n## Note for reviewers\nCore's
isRetryableEsClientError typically retries on more status codes\nthan
what most plugins do. Instead of keeping the existing plugin
status\ncodes I've adopted the default behavior which means your plugins
will be\nretrying on more types of responses after this change.
Carefully check\nthat this makes sense.\n\nIn particular the two
following status codes get retried that plugins\nweren't previously
retrying:\n * - 429 TooManyRequests (ES circuit breaker)\n* - 504
GatewayTimeout (rare, but happens if cloud proxy cannot\nestablish
socket connection to ES due to ES thread pool being\noverloaded)\n \n###
Checklist\n\nCheck the PR satisfies following conditions. \n\nReviewers
should verify this PR satisfies this list as well.\n\n- [ ] Any text
added follows [EUI's
writing\nguidelines](https://elastic.github.io/eui/#/guidelines/writing),
uses\nsentence case text and includes
[i18n\nsupport](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)\n-
[
]\n[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)\nwas
added for features that require explanation or tutorials\n- [ ] [Unit or
functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere
updated or added to match the most common scenarios\n- [ ] If a plugin
configuration key changed, check if it needs to be\nallowlisted in the
cloud and added to the
[docker\nlist](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)\n-
[ ] This was checked for breaking HTTP API changes, and any
breaking\nchanges have been approved by the breaking-change committee.
The\n`release_note:breaking` label should be applied in these
situations.\n- [ ] [Flaky
Test\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\nused on any tests changed\n- [ ] The PR description includes the
appropriate Release Notes section,\nand the correct `release_note:*`
label is applied per
the\n[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\n-
[ ] Review the
[backport\nguidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing)\nand
apply applicable `backport:*` labels.\n\n### Identify risks\n\nDoes this
PR introduce any risks? For example, consider risks like hard\nto test
bugs, performance regression, potential of data loss.\n\nDescribe the
risk, its severity, and mitigation for each identified\nrisk. Invite
stakeholders and evaluate how to proceed before merging.\n\n- [ ] [See
some
risk\nexamples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)\n-
[ ] ...\n\n---------\n\nCo-authored-by: kibanamachine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c4996849bdd293721ec32a158320af89d79e6fcd"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.3.0","branchLabelMappingKey":"^v9.3.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/228315","number":228315,"mergeCommit":{"message":"Expose
isRetryableEsClientError (#228315)\n\n## Summary\n\nFor plugins to have
robust error handling they need to retry transient\nerrors. This PR
exposes `isRetryableEsClientError` utility to help\nplugins know which
errors are safe to retry.\n\n## Note for reviewers\nCore's
isRetryableEsClientError typically retries on more status codes\nthan
what most plugins do. Instead of keeping the existing plugin
status\ncodes I've adopted the default behavior which means your plugins
will be\nretrying on more types of responses after this change.
Carefully check\nthat this makes sense.\n\nIn particular the two
following status codes get retried that plugins\nweren't previously
retrying:\n * - 429 TooManyRequests (ES circuit breaker)\n* - 504
GatewayTimeout (rare, but happens if cloud proxy cannot\nestablish
socket connection to ES due to ES thread pool being\noverloaded)\n \n###
Checklist\n\nCheck the PR satisfies following conditions. \n\nReviewers
should verify this PR satisfies this list as well.\n\n- [ ] Any text
added follows [EUI's
writing\nguidelines](https://elastic.github.io/eui/#/guidelines/writing),
uses\nsentence case text and includes
[i18n\nsupport](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)\n-
[
]\n[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)\nwas
added for features that require explanation or tutorials\n- [ ] [Unit or
functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere
updated or added to match the most common scenarios\n- [ ] If a plugin
configuration key changed, check if it needs to be\nallowlisted in the
cloud and added to the
[docker\nlist](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)\n-
[ ] This was checked for breaking HTTP API changes, and any
breaking\nchanges have been approved by the breaking-change committee.
The\n`release_note:breaking` label should be applied in these
situations.\n- [ ] [Flaky
Test\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\nused on any tests changed\n- [ ] The PR description includes the
appropriate Release Notes section,\nand the correct `release_note:*`
label is applied per
the\n[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\n-
[ ] Review the
[backport\nguidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing)\nand
apply applicable `backport:*` labels.\n\n### Identify risks\n\nDoes this
PR introduce any risks? For example, consider risks like hard\nto test
bugs, performance regression, potential of data loss.\n\nDescribe the
risk, its severity, and mitigation for each identified\nrisk. Invite
stakeholders and evaluate how to proceed before merging.\n\n- [ ] [See
some
risk\nexamples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)\n-
[ ] ...\n\n---------\n\nCo-authored-by: kibanamachine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c4996849bdd293721ec32a158320af89d79e6fcd"}}]}]
BACKPORT-->

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
rudolf added a commit that referenced this pull request Dec 23, 2025
# Backport

This will backport the following commits from `main` to `9.1`:
- [Expose isRetryableEsClientError
(#228315)](#228315)

<!--- Backport version: 10.2.0 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

---------

Co-authored-by: Gerard Soldevila <gerard.soldevila@elastic.co>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
rudolf added a commit that referenced this pull request Dec 23, 2025
# Backport

This will backport the following commits from `main` to `8.19`:
- [Expose isRetryableEsClientError
(#228315)](#228315)

<!--- Backport version: 10.2.0 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

---------

Co-authored-by: Gerard Soldevila <gerard.soldevila@elastic.co>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Labels

- `backport:skip`: This PR does not require backporting
- `release_note:skip`: Skip the PR/issue when compiling release notes
- `Team:actionable-obs`: Formerly "obs-ux-management", responsible for SLO, o11y alerting, significant events, & synthetics.
- `Team:Core`: Platform Core services: plugins, logging, config, saved objects, http, ES client, i18n, etc t//
- `v8.19.10`
- `v9.1.10`
- `v9.2.3`
- `v9.3.0`
