
@kibanamachine
Contributor

Backport

This will backport the following commits from main to 9.1:
- Lock manager: Fix setup bug (#230519)

Questions?

Please refer to the Backport tool documentation

Fixes elastic#230499

The lock manager runs `setupLockManagerIndex` via lodash `once`, so it runs only on the first call rather than on every call. However, if that first call to `setupLockManagerIndex` errors out (e.g. because Elasticsearch isn't ready yet), every subsequent call returns the cached rejected promise and fails as well, leaving all lock managers in that node instance broken (since `once` keeps its state at module scope).

This leads to issues like the following: a timeout exception thrown from streams, but with a call stack that originates from the slo plugin's setup routine, because that routine produced the cached rejected promise:
```
[2025-08-01T18:36:25.080+00:00][ERROR][plugins.streams] TimeoutError: Request timed out
    at KibanaTransport._request (/usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:564:50)
    at processTicksAndRejections (node:internal/process/task_queues:105:5)
    at runNextTicks (node:internal/process/task_queues:69:3)
    at listOnTimeout (node:internal/timers:549:9)
    at processTimers (node:internal/timers:523:7)
    at /usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:631:32
    at KibanaTransport.request (/usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:627:20)
    at KibanaTransport.request (/usr/share/kibana/node_modules/@kbn/core-elasticsearch-client-server-internal/src/create_transport.js:60:16)
    at Cluster.putComponentTemplate (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/api/api/cluster.js:600:16)
    at ensureTemplatesAndIndexCreated (/usr/share/kibana/node_modules/@kbn/lock-manager/src/setup_lock_manager_index.js:56:3)
    at setupLockManagerIndex (/usr/share/kibana/node_modules/@kbn/lock-manager/src/setup_lock_manager_index.js:110:3)
    at LockManager.acquire (/usr/share/kibana/node_modules/@kbn/lock-manager/src/lock_manager_client.js:53:5)
    at withLock (/usr/share/kibana/node_modules/@kbn/lock-manager/src/lock_manager_client.js:242:20)
    at /usr/share/kibana/node_modules/@kbn/slo-plugin/server/plugin.js:176:7
```

This PR fixes the problem by dropping `once` and caching the setup promise manually: the result is kept only if the promise resolves successfully, and errors are passed through to the caller so that a later call can retry the setup.
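The difference between the two caching strategies can be sketched as follows. This is an illustrative reduction, not the actual Kibana implementation; `onceAsync` and `onceSuccessful` are hypothetical names, with `onceAsync` mimicking lodash `once` semantics for an async function.

```typescript
// Mimics lodash `once` applied to an async function: the first returned
// promise is cached forever -- including a rejected one.
function onceAsync<T>(fn: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) cached = fn();
    return cached; // a rejected promise stays cached: every later call fails
  };
}

// The fixed pattern: cache the promise, but drop it again on failure so the
// next call retries the setup instead of replaying the cached rejection.
function onceSuccessful<T>(fn: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = fn().catch((err) => {
        cached = undefined; // clear the cache so the next call retries
        throw err; // pass the error through to the current caller
      });
    }
    return cached;
  };
}
```

With `onceSuccessful`, a transient "Elasticsearch isn't ready yet" failure only affects the callers that raced the failed attempt; the next `acquire` triggers a fresh setup.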

(cherry picked from commit b4f8488)
@kibanamachine kibanamachine added the backport This PR is a backport of another PR label Aug 5, 2025
@kibanamachine kibanamachine enabled auto-merge (squash) August 5, 2025 11:48
@kibanamachine kibanamachine merged commit bca9737 into elastic:9.1 Aug 5, 2025
16 checks passed
@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

cc @flash1293

flash1293 added a commit that referenced this pull request Aug 6, 2025
# Backport

This will backport the following commits from `main` to `8.19`:
- [Lock manager: Fix setup bug (#230519)](https://github.com/elastic/kibana/pull/230519)

<!--- Backport version: 10.0.1 -->

### Questions?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)
