[8.16] [Response Ops][Task Manager] Propagate `msearch` error status code so backpressure mechanism responds correctly (#197501) (#198034)

# Backport

This will backport the following commits from `main` to `8.16`:
 - [[Response Ops][Task Manager] Propagate `msearch` error status code so backpressure mechanism responds correctly (#197501)](#197501)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

### Backport metadata

- Author: Ying Mao <[email protected]>
- Source branch: `main`
- Source commit: `043e18b6a097f4405ff37a99396c0c8c92db6b44` (committed 2024-10-28T15:43:51Z)
- Source pull request: [#197501](https://github.com/elastic/kibana/pull/197501)
- Labels: `release_note:skip`, `Feature:Task Manager`, `Team:ResponseOps`, `v9.0.0`, `backport:prev-minor`, `v8.16.0`, `v8.17.0`
- Branch label mapping: `^v9.0.0$` → `main`, `^v8.17.0$` → `8.x`, `^v(\d+).(\d+).\d+$` → `$1.$2`
- Suggested target branches: `8.16`, `8.x`
- Target pull request states:
  - `main` (`v9.0.0`, source branch): MERGED, https://github.com/elastic/kibana/pull/197501
  - `8.16` (`v8.16.0`): NOT_CREATED
  - `8.x` (`v8.17.0`): NOT_CREATED

### Source commit message

[Response Ops][Task Manager] Propagate `msearch` error status code so backpressure mechanism responds correctly (#197501)

Resolves https://github.com/elastic/response-ops-team/issues/240

## Summary

Creating an `MsearchError` class that preserves the status code from any
msearch errors. These errors are already piped to the managed
configuration observable that watches for and responds to ES errors from
the update by query claim strategy so I updated that filter to filter
for msearch 429 and 503 errors as well.

## To Verify

1. Make sure you're using the mget claim strategy
(`xpack.task_manager.claim_strategy: 'mget'`) and start ES and Kibana.
2. Inject a 429 error into an msearch response.

```
--- a/x-pack/plugins/task_manager/server/task_store.ts
+++ b/x-pack/plugins/task_manager/server/task_store.ts
@@ -571,6 +571,8 @@ export class TaskStore {
     });
     const { responses } = result;

+    responses[0].status = 429;
+
     const versionMap = this.createVersionMap([]);
```

3. See task manager log the msearch errors and eventually reduce polling
capacity

```
[2024-10-23T15:35:59.255-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
[2024-10-23T15:35:59.756-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
[2024-10-23T15:36:00.257-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
[2024-10-23T15:36:00.757-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
...

[2024-10-23T15:36:06.267-04:00][WARN ][plugins.taskManager] Poll interval configuration is temporarily increased after Elasticsearch returned 19 "too many request" and/or "execute [inline] script" error(s).
[2024-10-23T15:36:06.268-04:00][WARN ][plugins.taskManager] Capacity configuration is temporarily reduced after Elasticsearch returned 19 "too many request" and/or "execute [inline] script" error(s).
```

---------

Co-authored-by: Elastic Machine <[email protected]>

Co-authored-by: Ying Mao <[email protected]>
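
### Illustrative sketch

The summary describes an `MsearchError` class that preserves the msearch status code, plus a filter that treats 429 and 503 as backpressure signals. The following is a minimal sketch of that idea, not the code from the PR: only the `MsearchError` name, the 429/503 codes, and the log message text come from the commit message. The `statusCode` field, `isBackpressureError`, `checkMsearchResponses`, and the `'unknown'` fallback are hypothetical names and choices made for illustration.

```typescript
// Sketch only: the real class lives in the task_manager plugin and may differ.
class MsearchError extends Error {
  // Hypothetical field name; the commit only says the status code is preserved.
  public readonly statusCode?: number;

  constructor(statusCode?: number) {
    // Message text mirrors the log lines quoted in the commit message.
    // The 'unknown' fallback is an assumption for a missing status.
    super(`Unexpected status code from taskStore::msearch: ${statusCode ?? 'unknown'}`);
    this.name = 'MsearchError';
    this.statusCode = statusCode;
    // Keep instanceof working when compiled to older JS targets.
    Object.setPrototypeOf(this, MsearchError.prototype);
  }
}

// Status codes the commit message says the filter should react to.
const BACKPRESSURE_STATUS_CODES = new Set<number>([429, 503]);

// Hypothetical predicate standing in for the updated filter on the
// managed configuration observable.
function isBackpressureError(err: unknown): boolean {
  return (
    err instanceof MsearchError &&
    err.statusCode !== undefined &&
    BACKPRESSURE_STATUS_CODES.has(err.statusCode)
  );
}

// Hypothetical helper: throw on the first msearch sub-response whose
// status is not 200, carrying the status code along with the error.
function checkMsearchResponses(responses: Array<{ status?: number }>): void {
  for (const { status } of responses) {
    if (status !== 200) {
      throw new MsearchError(status);
    }
  }
}
```

With a shape like this, the injected `responses[0].status = 429` from the verification steps would surface as an error the filter recognises, while a 500 would still be reported but would not count toward the capacity reduction.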