[SecuritySolution][SIEM migrations] Implement background task API #197997
semd merged 20 commits into elastic:main
…ns/migration_rule_background_task
…ns/migration_rule_background_task' into 10850/siem_migrations/migration_rule_background_task
Pinging @elastic/security-threat-hunting (Team:Threat Hunting)
Pinging @elastic/security-solution (Team: SecuritySolution)
…ns/migration_rule_background_task' into 10850/siem_migrations/migration_rule_background_task
x-pack/plugins/security_solution/server/lib/siem_migrations/rules/api/create.ts
x-pack/plugins/security_solution/server/lib/siem_migrations/rules/api/start.ts
const matchPrebuiltRuleNode = getMatchPrebuiltRuleNode({ model, prebuiltRulesMap, logger });
const translationNode = getTranslateQueryNode({ inferenceClient, connectorId, logger });

const translateRuleGraph = new StateGraph(migrateRuleState)
Can we have some comments about how StateGraph works? A URL link would do too.
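For orientation on the question above: a state graph chains node functions that each receive the current state and return a partial update, while edges (optionally conditional) pick the next node. The sketch below is a hand-rolled TypeScript model of that idea, not the LangGraph API. `MiniStateGraph`, `MigrateRuleState`, the node names, and the node bodies are all hypothetical; they only mirror the match-then-translate flow described in this PR.

```typescript
// Hypothetical state shape; the real `migrateRuleState` is not shown in this thread.
interface MigrateRuleState {
  originalQuery: string;
  prebuiltRuleId?: string;
  translatedQuery?: string;
}

const END = '__end__';
type NodeFn<S> = (state: S) => Partial<S>; // a node returns a partial state update
type Router<S> = (state: S) => string; // an edge returns the next node name, or END

// Minimal illustrative graph runner (an assumption, not the library class).
class MiniStateGraph<S extends object> {
  private readonly nodes = new Map<string, NodeFn<S>>();
  private readonly routers = new Map<string, Router<S>>();
  private entry: string = END;

  addNode(name: string, fn: NodeFn<S>): this {
    this.nodes.set(name, fn);
    return this;
  }

  setEntry(name: string): this {
    this.entry = name;
    return this;
  }

  addEdge(from: string, to: string): this {
    this.routers.set(from, () => to);
    return this;
  }

  addConditionalEdge(from: string, router: Router<S>): this {
    this.routers.set(from, router);
    return this;
  }

  // Run nodes from the entry point, merging each partial update into the state.
  invoke(initial: S): S {
    let state: S = { ...initial };
    let current = this.entry;
    while (current !== END) {
      const node = this.nodes.get(current);
      if (!node) throw new Error(`Unknown node: ${current}`);
      state = { ...state, ...node(state) };
      const router = this.routers.get(current);
      current = router ? router(state) : END;
    }
    return state;
  }
}

// Stand-in for the prebuilt rules lookup; real matching uses an LLM in this PR.
const prebuiltRules = new Map<string, string>([['type=ADD_USER', 'prebuilt-add-user']]);

const graph = new MiniStateGraph<MigrateRuleState>()
  .addNode('matchPrebuiltRule', (s) => ({ prebuiltRuleId: prebuiltRules.get(s.originalQuery) }))
  .addNode('translateQuery', (s) => ({ translatedQuery: `/* translated */ ${s.originalQuery}` }))
  .setEntry('matchPrebuiltRule')
  // Only fall through to translation when no prebuilt rule matched.
  .addConditionalEdge('matchPrebuiltRule', (s) => (s.prebuiltRuleId ? END : 'translateQuery'))
  .addEdge('translateQuery', END);
```

Calling `graph.invoke({ originalQuery: '...' })` returns the final state; the conditional edge is what lets the translation node be skipped when a prebuilt rule is found.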
gsoldevila left a comment
LGTM. inference is ai-infra, so it is part of the platform group (valid dependency).
x-pack/plugins/security_solution/common/siem_migrations/model/api/rules/rules_migration.gen.ts
💚 Build Succeeded
cc @semd
Starting backport for target branches: 8.18, 8.x https://github.com/elastic/kibana/actions/runs/11708722593
💔 All backports failed
Manual backport
To create the backport manually run:
Questions? Please refer to the Backport tool documentation
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation
…astic#197997)

## Summary

It implements the background task to execute the rule migrations and the API to manage them. It also contains a basic implementation of the langGraph agent workflow that will perform the migration using generative AI.

> [!NOTE]
> This feature needs the `siemMigrationsEnabled` experimental flag enabled to work. Otherwise, the new API routes won't be registered, and the `SiemRuleMigrationsService` _setup_ won't be called. So no migration task code can be reached, and no data stream/template will be installed to ES.

### The rule migration task implementation:

- Retrieve a batch of N rule migration documents (50 rules initially, we may change that later) with `status: pending`.
- Update those documents to `status: processing`.
- Execute the migration for each of the N migrations in parallel.
- If there is any error, update the document with `status: error`.
- For each rule migration that finishes, we store the result and also update `status: finished`.
- When the whole batch of rules is finished, the task checks whether there are still migration documents with `status: pending`. If so, it processes the next batch after a delay (10 seconds initially, we may change that later).
- If the task is stopped (via API call or server shutdown), we do a bulk update to move all the `status: processing` documents back to `status: pending`.

### Task API

- `POST /internal/siem_migrations/rules` (implemented [here](elastic/security-team#10654)) -> Creates the migration on the backend and stores the original rules. It returns the `migration_id`.
- `GET /internal/siem_migrations/rules/stats` -> Retrieves the stats for all the existing migrations, aggregated by `migration_id`.
- `GET /internal/siem_migrations/rules/{migration_id}` -> Retrieves all the migration rule documents of a specific migration.
- `PUT /internal/siem_migrations/rules/{migration_id}/start` -> Starts the background task for a specific migration.
- `GET /internal/siem_migrations/rules/{migration_id}/stats` -> Retrieves the stats of a specific migration task. The UI will poll this endpoint.
- `PUT /internal/siem_migrations/rules/{migration_id}/stop` -> Stops the execution of a specific running migration task. When a migration is stopped, the executing task is aborted and all the rules in the batch being processed are moved back to pending; all finished rules remain stored. When the Kibana server shuts down, all the running migrations are stopped automatically. To resume the migration we can call `{migration_id}/start` again, and it will resume from the rules batch where it left off.

#### Stats (UI polling) response example:

```
{
  "status": "running",
  "rules": {
    "total": 34,
    "finished": 20,
    "pending": 4,
    "processing": 10,
    "failed": 0
  },
  "last_updated_at": "2024-10-29T15:04:49.618Z"
}
```

### LLM agent Graph

The initial implementation of the agent graph that is executed per rule:

The first node tries to match the original rule with an Elastic prebuilt rule. If it does not succeed, the second node will try to translate the query as a custom rule using the ES|QL knowledge base. This composes previous PoCs:

- elastic#193900
- elastic#196651

## Testing locally

Enable the flag

```
xpack.securitySolution.enableExperimental: ['siemMigrationsEnabled']
```

cURL request examples:

<details>
<summary>Rules migration `create` POST request</summary>

```
curl --location --request POST 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1' \
--header 'Content-Type: application/json' \
--data '[
  {
    "id": "f8c325ea-506e-4105-8ccf-da1492e90115",
    "vendor": "splunk",
    "title": "Linux Auditd Add User Account Type",
    "description": "The following analytic detects the suspicious add user account type. This behavior is critical for a SOC to monitor because it may indicate attempts to gain unauthorized access or maintain control over a system. Such actions could be signs of malicious activity. If confirmed, this could lead to serious consequences, including a compromised system, unauthorized access to sensitive data, or even a wider breach affecting the entire network. Detecting and responding to these signs early is essential to prevent potential security incidents.",
    "query": "sourcetype=\"linux:audit\" type=ADD_USER \n| rename hostname as dest \n| stats count min(_time) as firstTime max(_time) as lastTime by exe pid dest res UID type \n| `security_content_ctime(firstTime)` \n| `security_content_ctime(lastTime)`\n| search *",
    "query_language": "spl",
    "mitre_attack_ids": [
      "T1136"
    ]
  },
  {
    "id": "7b87c556-0ca4-47e0-b84c-6cd62a0a3e90",
    "vendor": "splunk",
    "title": "Linux Auditd Change File Owner To Root",
    "description": "The following analytic detects the use of the '\''chown'\'' command to change a file owner to '\''root'\'' on a Linux system. It leverages Linux Auditd telemetry, specifically monitoring command-line executions and process details. This activity is significant as it may indicate an attempt to escalate privileges by adversaries, malware, or red teamers. If confirmed malicious, this action could allow an attacker to gain root-level access, leading to full control over the compromised host and potential persistence within the environment.",
    "query": "`linux_auditd` `linux_auditd_normalized_proctitle_process`\r\n| rename host as dest \r\n| where LIKE (process_exec, \"%chown %root%\") \r\n| stats count min(_time) as firstTime max(_time) as lastTime by process_exec proctitle normalized_proctitle_delimiter dest \r\n| `security_content_ctime(firstTime)` \r\n| `security_content_ctime(lastTime)`\r\n| `linux_auditd_change_file_owner_to_root_filter`",
    "query_language": "spl",
    "mitre_attack_ids": [
      "T1222"
    ]
  }
]'
```

</details>

<details>
<summary>Rules migration `start` task request</summary>

- Assuming the connector `azureOpenAiGPT4o` is already created in the local environment.
- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/start' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1' \
--header 'Content-Type: application/json' \
--data '{
  "connectorId": "azureOpenAiGPT4o"
}'
```

</details>

<details>
<summary>Rules migration `stop` task request</summary>

- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stop' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```

</details>

<details>
<summary>Rules migration task `stats` request</summary>

- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stats' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```

</details>

<details>
<summary>Rules migration rules documents request</summary>

- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```

</details>

<details>
<summary>Rules migration all stats request</summary>

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/stats' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```

</details>

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit cc66320)

# Conflicts:
#	x-pack/plugins/security_solution/common/api/quickstart_client.gen.ts
#	x-pack/plugins/security_solution/server/types.ts
#	x-pack/test/api_integration/services/security_solution_api.gen.ts
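The batch loop from "The rule migration task implementation" in the commit message above can be sketched roughly as follows. This is a minimal in-memory TypeScript illustration under stated assumptions, not the code in this PR: `RuleMigrationDoc`, `TaskOptions`, `StopHandle`, and `runMigrationTask` are hypothetical names, the documents live in a plain array instead of an ES data stream, and stopping is modeled with a simple flag rather than a real abort of in-flight LLM calls.

```typescript
type Status = 'pending' | 'processing' | 'finished' | 'error';

// Hypothetical document shape, reduced to what the loop needs.
interface RuleMigrationDoc {
  id: string;
  status: Status;
  result?: string;
}

// Hypothetical stop handle: set `stopped = true` on a stop call or server shutdown.
interface StopHandle {
  stopped: boolean;
}

interface TaskOptions {
  batchSize: number; // 50 in the description above
  batchDelayMs: number; // 10 seconds in the description above
  stop: StopHandle;
  migrate: (doc: RuleMigrationDoc) => Promise<string>; // stand-in for the agent graph
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runMigrationTask(docs: RuleMigrationDoc[], opts: TaskOptions): Promise<void> {
  while (!opts.stop.stopped) {
    // Retrieve a batch of pending documents and mark them as processing.
    const batch = docs.filter((d) => d.status === 'pending').slice(0, opts.batchSize);
    if (batch.length === 0) return; // nothing left to migrate
    batch.forEach((d) => (d.status = 'processing'));

    // Migrate the batch in parallel; each document records its own outcome.
    await Promise.all(
      batch.map(async (doc) => {
        try {
          const result = await opts.migrate(doc);
          if (opts.stop.stopped) return; // stay in processing so it is reset below
          doc.result = result;
          doc.status = 'finished';
        } catch {
          if (!opts.stop.stopped) doc.status = 'error';
        }
      })
    );

    if (opts.stop.stopped) break;
    // Delay the next batch only if pending documents remain.
    if (docs.some((d) => d.status === 'pending')) await sleep(opts.batchDelayMs);
  }
  // Stopped: move in-flight documents back to pending so a later start can resume.
  docs.filter((d) => d.status === 'processing').forEach((d) => (d.status = 'pending'));
}
```

Because a stopped run only resets `processing` documents, results already marked `finished` survive, and calling the function again picks up the remaining `pending` ones.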
…PI (#197997) (#199209) # Backport This will backport the following commits from `main` to `8.x`: - [[SecuritySolution][SIEM migrations] Implement background task API (#197997)](#197997) <!--- Backport version: 8.9.8 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Sergi Massaneda","email":"sergi.massaneda@elastic.co"},"sourceCommit":{"committedDate":"2024-11-06T17:25:24Z","message":"[SecuritySolution][SIEM migrations] Implement background task API (#197997)\n\n## Summary\r\n\r\nIt implements the background task to execute the rule migrations and the\r\nAPI to manage them. It also contains a basic implementation of the\r\nlangGraph agent workflow that will perform the migration using\r\ngenerative AI.\r\n\r\n> [!NOTE] \r\n> This feature needs `siemMigrationsEnabled` experimental flag enabled\r\nto work. Otherwise, the new API routes won't be registered, and the\r\n`SiemRuleMigrationsService` _setup_ won't be called. 
So no migration\r\ntask code can be reached, and no data stream/template will be installed\r\nto ES.\r\n\r\n### The rule migration task implementation:\r\n\r\n- Retrieve a batch of N rule migration documents (50 rules initially, we\r\nmay change that later) with `status: pending`.\r\n- Update those documents to `status: processing`.\r\n- Execute the migration for each of the N migrations in parallel.\r\n- If there is any error update the document with `status: error`.\r\n- For each rule migration that finishes we set the result to the\r\nstorage, and also update `status: finished`.\r\n- When all the batch of rules is finished the task will check if there\r\nare still migration documents with `status: pending` if so it will\r\nprocess the next batch with a delay (10 seconds initially, we may change\r\nthat later).\r\n- If the task is stopped (via API call or server shut-down), we do a\r\nbulk update for all the `status: processing` documents back to `status:\r\npending`.\r\n\r\n### Task API\r\n\r\n- `POST /internal/siem_migrations/rules` (implemented\r\n[here](elastic/security-team#10654)) ->\r\nCreates the migration on the backend and stores the original rules. It\r\nreturns the `migration_id`\r\n- `GET /internal/siem_migrations/rules/stats` -> Retrieves the stats for\r\nall the existing migrations, aggregated by `migration_id`.\r\n- `GET /internal/siem_migrations/rules/{migration_id}` -> Retrieves all\r\nthe migration rule documents of a specific migration.\r\n- `PUT /internal/siem_migrations/rules/{migration_id}/start` -> Starts\r\nthe background task for a specific migration.\r\n- `GET /internal/siem_migrations/rules/{migration_id}/stats` ->\r\nRetrieves the stats of a specific migration task. The UI will do polling\r\nto this endpoint.\r\n- `PUT /internal/siem_migrations/rules/{migration_id}/stop` -> Stops the\r\nexecution of a specific migration running task. 
When a migration is\r\nstopped, the executing task is aborted and all the rules in the batch\r\nbeing processed are moved back to pending, all finished rules will\r\nremain stored. When the Kibana server shuts down all the running\r\nmigrations are stopped automatically. To resume the migration we can\r\ncall `{migration_id}/start` again and it will take it from the same\r\nrules batch it was left.\r\n\r\n#### Stats (UI polling) response example:\r\n```\r\n{\r\n \"status\": \"running\",\r\n \"rules\": {\r\n \"total\": 34,\r\n \"finished\": 20,\r\n \"pending\": 4,\r\n \"processing\": 10,\r\n \"failed\": 0\r\n },\r\n \"last_updated_at\": \"2024-10-29T15:04:49.618Z\"\r\n}\r\n```\r\n\r\n### LLM agent Graph\r\n\r\nThe initial implementation of the agent graph that is executed per rule:\r\n\r\n\r\n\r\nThe first node tries to match the original rule with an Elastic prebuilt\r\nrule. If it does not succeed, the second node will try to translate the\r\nquery as a custom rule using the ES|QL knowledge base, this composes\r\nprevious PoCs:\r\n- https://github.com/elastic/kibana/pull/193900\r\n- https://github.com/elastic/kibana/pull/196651\r\n\r\n\r\n\r\n## Testing locally\r\n\r\nEnable the flag\r\n```\r\nxpack.securitySolution.enableExperimental: ['siemMigrationsEnabled']\r\n```\r\n\r\ncURL request examples:\r\n\r\n<details>\r\n <summary>Rules migration `create` POST request</summary>\r\n\r\n```\r\ncurl --location --request POST 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \\\r\n--header 'Content-Type: application/json' \\\r\n--data '[\r\n {\r\n \"id\": \"f8c325ea-506e-4105-8ccf-da1492e90115\",\r\n \"vendor\": \"splunk\",\r\n \"title\": \"Linux Auditd Add User Account Type\",\r\n \"description\": \"The following analytic detects the suspicious add user account type. 
This behavior is critical for a SOC to monitor because it may indicate attempts to gain unauthorized access or maintain control over a system. Such actions could be signs of malicious activity. If confirmed, this could lead to serious consequences, including a compromised system, unauthorized access to sensitive data, or even a wider breach affecting the entire network. Detecting and responding to these signs early is essential to prevent potential security incidents.\",\r\n \"query\": \"sourcetype=\\\"linux:audit\\\" type=ADD_USER \\n| rename hostname as dest \\n| stats count min(_time) as firstTime max(_time) as lastTime by exe pid dest res UID type \\n| `security_content_ctime(firstTime)` \\n| `security_content_ctime(lastTime)`\\n| search *\",\r\n \"query_language\":\"spl\",\r\n \"mitre_attack_ids\": [\r\n \"T1136\"\r\n ]\r\n },\r\n {\r\n \"id\": \"7b87c556-0ca4-47e0-b84c-6cd62a0a3e90\",\r\n \"vendor\": \"splunk\",\r\n \"title\": \"Linux Auditd Change File Owner To Root\",\r\n \"description\": \"The following analytic detects the use of the '\\''chown'\\'' command to change a file owner to '\\''root'\\'' on a Linux system. It leverages Linux Auditd telemetry, specifically monitoring command-line executions and process details. This activity is significant as it may indicate an attempt to escalate privileges by adversaries, malware, or red teamers. 
If confirmed malicious, this action could allow an attacker to gain root-level access, leading to full control over the compromised host and potential persistence within the environment.\",\r\n \"query\": \"`linux_auditd` `linux_auditd_normalized_proctitle_process`\\r\\n| rename host as dest \\r\\n| where LIKE (process_exec, \\\"%chown %root%\\\") \\r\\n| stats count min(_time) as firstTime max(_time) as lastTime by process_exec proctitle normalized_proctitle_delimiter dest \\r\\n| `security_content_ctime(firstTime)` \\r\\n| `security_content_ctime(lastTime)`\\r\\n| `linux_auditd_change_file_owner_to_root_filter`\",\r\n \"query_language\": \"spl\",\r\n \"mitre_attack_ids\": [\r\n \"T1222\"\r\n ]\r\n }\r\n]'\r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration `start` task request</summary>\r\n\r\n- Assuming the connector `azureOpenAiGPT4o` is already created in the\r\nlocal environment.\r\n- Using the {{`migration_id`}} from the first POST request response\r\n\r\n```\r\ncurl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/start' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \\\r\n--header 'Content-Type: application/json' \\\r\n--data '{\r\n \"connectorId\": \"azureOpenAiGPT4o\"\r\n}'\r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration `stop` task request</summary>\r\n\r\n- Using the {{`migration_id`}} from the first POST request response.\r\n\r\n```\r\ncurl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stop' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n\r\n<details>\r\n <summary>Rules migration task `stats` request</summary>\r\n\r\n- Using the {{`migration_id`}} from the first POST request 
response.\r\n\r\n```\r\ncurl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stats' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration rules documents request</summary>\r\n\r\n- Using the {{`migration_id`}} from the first POST request response.\r\n\r\n```\r\ncurl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration all stats request</summary>\r\n\r\n```\r\ncurl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/stats' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>","sha":"cc66320e970443cede6b9c9a4ab67fb16062e1a4","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:Threat Hunting","Team: SecuritySolution","backport:prev-minor","v8.18.0"],"number":197997,"url":"https://github.com/elastic/kibana/pull/197997","mergeCommit":{"message":"[SecuritySolution][SIEM migrations] Implement background task API (#197997)\n\n## Summary\r\n\r\nIt implements the background task to execute the rule migrations and the\r\nAPI to manage them. It also contains a basic implementation of the\r\nlangGraph agent workflow that will perform the migration using\r\ngenerative AI.\r\n\r\n> [!NOTE] \r\n> This feature needs `siemMigrationsEnabled` experimental flag enabled\r\nto work. 
Otherwise, the new API routes won't be registered, and the\r\n`SiemRuleMigrationsService` _setup_ won't be called. So no migration\r\ntask code can be reached, and no data stream/template will be installed\r\nto ES.\r\n\r\n### The rule migration task implementation:\r\n\r\n- Retrieve a batch of N rule migration documents (50 rules initially, we\r\nmay change that later) with `status: pending`.\r\n- Update those documents to `status: processing`.\r\n- Execute the migration for each of the N migrations in parallel.\r\n- If there is any error update the document with `status: error`.\r\n- For each rule migration that finishes we set the result to the\r\nstorage, and also update `status: finished`.\r\n- When all the batch of rules is finished the task will check if there\r\nare still migration documents with `status: pending` if so it will\r\nprocess the next batch with a delay (10 seconds initially, we may change\r\nthat later).\r\n- If the task is stopped (via API call or server shut-down), we do a\r\nbulk update for all the `status: processing` documents back to `status:\r\npending`.\r\n\r\n### Task API\r\n\r\n- `POST /internal/siem_migrations/rules` (implemented\r\n[here](elastic/security-team#10654)) ->\r\nCreates the migration on the backend and stores the original rules. It\r\nreturns the `migration_id`\r\n- `GET /internal/siem_migrations/rules/stats` -> Retrieves the stats for\r\nall the existing migrations, aggregated by `migration_id`.\r\n- `GET /internal/siem_migrations/rules/{migration_id}` -> Retrieves all\r\nthe migration rule documents of a specific migration.\r\n- `PUT /internal/siem_migrations/rules/{migration_id}/start` -> Starts\r\nthe background task for a specific migration.\r\n- `GET /internal/siem_migrations/rules/{migration_id}/stats` ->\r\nRetrieves the stats of a specific migration task. 
The UI will do polling\r\nto this endpoint.\r\n- `PUT /internal/siem_migrations/rules/{migration_id}/stop` -> Stops the\r\nexecution of a specific migration running task. When a migration is\r\nstopped, the executing task is aborted and all the rules in the batch\r\nbeing processed are moved back to pending, all finished rules will\r\nremain stored. When the Kibana server shuts down all the running\r\nmigrations are stopped automatically. To resume the migration we can\r\ncall `{migration_id}/start` again and it will take it from the same\r\nrules batch it was left.\r\n\r\n#### Stats (UI polling) response example:\r\n```\r\n{\r\n \"status\": \"running\",\r\n \"rules\": {\r\n \"total\": 34,\r\n \"finished\": 20,\r\n \"pending\": 4,\r\n \"processing\": 10,\r\n \"failed\": 0\r\n },\r\n \"last_updated_at\": \"2024-10-29T15:04:49.618Z\"\r\n}\r\n```\r\n\r\n### LLM agent Graph\r\n\r\nThe initial implementation of the agent graph that is executed per rule:\r\n\r\n\r\n\r\nThe first node tries to match the original rule with an Elastic prebuilt\r\nrule. 
If it does not succeed, the second node will try to translate the\r\nquery as a custom rule using the ES|QL knowledge base, this composes\r\nprevious PoCs:\r\n- https://github.com/elastic/kibana/pull/193900\r\n- https://github.com/elastic/kibana/pull/196651\r\n\r\n\r\n\r\n## Testing locally\r\n\r\nEnable the flag\r\n```\r\nxpack.securitySolution.enableExperimental: ['siemMigrationsEnabled']\r\n```\r\n\r\ncURL request examples:\r\n\r\n<details>\r\n <summary>Rules migration `create` POST request</summary>\r\n\r\n```\r\ncurl --location --request POST 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \\\r\n--header 'Content-Type: application/json' \\\r\n--data '[\r\n {\r\n \"id\": \"f8c325ea-506e-4105-8ccf-da1492e90115\",\r\n \"vendor\": \"splunk\",\r\n \"title\": \"Linux Auditd Add User Account Type\",\r\n \"description\": \"The following analytic detects the suspicious add user account type. This behavior is critical for a SOC to monitor because it may indicate attempts to gain unauthorized access or maintain control over a system. Such actions could be signs of malicious activity. If confirmed, this could lead to serious consequences, including a compromised system, unauthorized access to sensitive data, or even a wider breach affecting the entire network. 
Detecting and responding to these signs early is essential to prevent potential security incidents.\",\r\n \"query\": \"sourcetype=\\\"linux:audit\\\" type=ADD_USER \\n| rename hostname as dest \\n| stats count min(_time) as firstTime max(_time) as lastTime by exe pid dest res UID type \\n| `security_content_ctime(firstTime)` \\n| `security_content_ctime(lastTime)`\\n| search *\",\r\n \"query_language\":\"spl\",\r\n \"mitre_attack_ids\": [\r\n \"T1136\"\r\n ]\r\n },\r\n {\r\n \"id\": \"7b87c556-0ca4-47e0-b84c-6cd62a0a3e90\",\r\n \"vendor\": \"splunk\",\r\n \"title\": \"Linux Auditd Change File Owner To Root\",\r\n \"description\": \"The following analytic detects the use of the '\\''chown'\\'' command to change a file owner to '\\''root'\\'' on a Linux system. It leverages Linux Auditd telemetry, specifically monitoring command-line executions and process details. This activity is significant as it may indicate an attempt to escalate privileges by adversaries, malware, or red teamers. If confirmed malicious, this action could allow an attacker to gain root-level access, leading to full control over the compromised host and potential persistence within the environment.\",\r\n \"query\": \"`linux_auditd` `linux_auditd_normalized_proctitle_process`\\r\\n| rename host as dest \\r\\n| where LIKE (process_exec, \\\"%chown %root%\\\") \\r\\n| stats count min(_time) as firstTime max(_time) as lastTime by process_exec proctitle normalized_proctitle_delimiter dest \\r\\n| `security_content_ctime(firstTime)` \\r\\n| `security_content_ctime(lastTime)`\\r\\n| `linux_auditd_change_file_owner_to_root_filter`\",\r\n \"query_language\": \"spl\",\r\n \"mitre_attack_ids\": [\r\n \"T1222\"\r\n ]\r\n }\r\n]'\r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration `start` task request</summary>\r\n\r\n- Assuming the connector `azureOpenAiGPT4o` is already created in the\r\nlocal environment.\r\n- Using the {{`migration_id`}} from the first POST request 
response\r\n\r\n```\r\ncurl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/start' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \\\r\n--header 'Content-Type: application/json' \\\r\n--data '{\r\n \"connectorId\": \"azureOpenAiGPT4o\"\r\n}'\r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration `stop` task request</summary>\r\n\r\n- Using the {{`migration_id`}} from the first POST request response.\r\n\r\n```\r\ncurl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stop' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n\r\n<details>\r\n <summary>Rules migration task `stats` request</summary>\r\n\r\n- Using the {{`migration_id`}} from the first POST request response.\r\n\r\n```\r\ncurl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stats' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration rules documents request</summary>\r\n\r\n- Using the {{`migration_id`}} from the first POST request response.\r\n\r\n```\r\ncurl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n<details>\r\n <summary>Rules migration all stats request</summary>\r\n\r\n```\r\ncurl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/stats' \\\r\n--header 'kbn-xsrf;' \\\r\n--header 
'x-elastic-internal-origin: security-solution' \\\r\n--header 'elastic-api-version: 1' \r\n```\r\n</details>\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>","sha":"cc66320e970443cede6b9c9a4ab67fb16062e1a4"}},"sourceBranch":"main","suggestedTargetBranches":["8.18"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","labelRegex":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/197997","number":197997,"mergeCommit":{"message":"[SecuritySolution][SIEM migrations] Implement background task API (#197997)\n\n## Summary\r\n\r\nIt implements the background task to execute the rule migrations and the\r\nAPI to manage them. It also contains a basic implementation of the\r\nlangGraph agent workflow that will perform the migration using\r\ngenerative AI.\r\n\r\n> [!NOTE] \r\n> This feature needs `siemMigrationsEnabled` experimental flag enabled\r\nto work. Otherwise, the new API routes won't be registered, and the\r\n`SiemRuleMigrationsService` _setup_ won't be called. 
BACKPORT-->

## Summary

It implements the background task to execute the rule migrations and the API to manage them. It also contains a basic implementation of the LangGraph agent workflow that will perform the migration using generative AI.

> [!NOTE]
> This feature needs the `siemMigrationsEnabled` experimental flag enabled to work. Otherwise, the new API routes won't be registered and the `SiemRuleMigrationsService` _setup_ won't be called, so no migration task code can be reached and no data stream/template will be installed to ES.

### The rule migration task implementation:

- Retrieve a batch of N rule migration documents (50 rules initially, we may change that later) with `status: pending`.
- Update those documents to `status: processing`.
- Execute the migration for each of the N migrations in parallel.
- If there is any error, update the document with `status: error`.
- For each rule migration that finishes, we store the result and also update the document to `status: finished`.
- When the whole batch of rules is finished, the task checks whether there are still migration documents with `status: pending`. If so, it processes the next batch after a delay (10 seconds initially, we may change that later).
- If the task is stopped (via API call or server shut-down), we do a bulk update to move all the `status: processing` documents back to `status: pending`.

### Task API

- `POST /internal/siem_migrations/rules` (implemented [here](elastic/security-team#10654)) -> Creates the migration on the backend and stores the original rules. It returns the `migration_id`.
- `GET /internal/siem_migrations/rules/stats` -> Retrieves the stats for all the existing migrations, aggregated by `migration_id`.
- `GET /internal/siem_migrations/rules/{migration_id}` -> Retrieves all the migration rule documents of a specific migration.
- `PUT /internal/siem_migrations/rules/{migration_id}/start` -> Starts the background task for a specific migration.
- `GET /internal/siem_migrations/rules/{migration_id}/stats` -> Retrieves the stats of a specific migration task. The UI will poll this endpoint.
- `PUT /internal/siem_migrations/rules/{migration_id}/stop` -> Stops the execution of a specific running migration task. When a migration is stopped, the executing task is aborted and all the rules in the batch being processed are moved back to pending. All finished rules remain stored. When the Kibana server shuts down, all the running migrations are stopped automatically. To resume the migration we can call `{migration_id}/start` again, and it will pick up from the rules batch where it left off.

#### Stats (UI polling) response example:
```
{
  "status": "running",
  "rules": {
    "total": 34,
    "finished": 20,
    "pending": 4,
    "processing": 10,
    "failed": 0
  },
  "last_updated_at": "2024-10-29T15:04:49.618Z"
}
```

### LLM agent Graph

The initial implementation of the agent graph is executed once per rule.

The first node tries to match the original rule with an Elastic prebuilt rule. If it does not succeed, the second node tries to translate the query as a custom rule using the ES|QL knowledge base. This composes previous PoCs:
- https://github.com/elastic/kibana/pull/193900
- https://github.com/elastic/kibana/pull/196651

## Testing locally

Enable the flag
```
xpack.securitySolution.enableExperimental: ['siemMigrationsEnabled']
```

cURL request examples:

<details>
  <summary>Rules migration `create` POST request</summary>

```
curl --location --request POST 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1' \
--header 'Content-Type: application/json' \
--data '[
    {
        "id": "f8c325ea-506e-4105-8ccf-da1492e90115",
        "vendor": "splunk",
        "title": "Linux Auditd Add User Account Type",
        "description": "The following analytic detects the suspicious add user account type. This behavior is critical for a SOC to monitor because it may indicate attempts to gain unauthorized access or maintain control over a system. Such actions could be signs of malicious activity. If confirmed, this could lead to serious consequences, including a compromised system, unauthorized access to sensitive data, or even a wider breach affecting the entire network. Detecting and responding to these signs early is essential to prevent potential security incidents.",
        "query": "sourcetype=\"linux:audit\" type=ADD_USER \n| rename hostname as dest \n| stats count min(_time) as firstTime max(_time) as lastTime by exe pid dest res UID type \n| `security_content_ctime(firstTime)` \n| `security_content_ctime(lastTime)`\n| search *",
        "query_language": "spl",
        "mitre_attack_ids": [
            "T1136"
        ]
    },
    {
        "id": "7b87c556-0ca4-47e0-b84c-6cd62a0a3e90",
        "vendor": "splunk",
        "title": "Linux Auditd Change File Owner To Root",
        "description": "The following analytic detects the use of the '\''chown'\'' command to change a file owner to '\''root'\'' on a Linux system. It leverages Linux Auditd telemetry, specifically monitoring command-line executions and process details. This activity is significant as it may indicate an attempt to escalate privileges by adversaries, malware, or red teamers. If confirmed malicious, this action could allow an attacker to gain root-level access, leading to full control over the compromised host and potential persistence within the environment.",
        "query": "`linux_auditd` `linux_auditd_normalized_proctitle_process`\r\n| rename host as dest \r\n| where LIKE (process_exec, \"%chown %root%\") \r\n| stats count min(_time) as firstTime max(_time) as lastTime by process_exec proctitle normalized_proctitle_delimiter dest \r\n| `security_content_ctime(firstTime)` \r\n| `security_content_ctime(lastTime)`\r\n| `linux_auditd_change_file_owner_to_root_filter`",
        "query_language": "spl",
        "mitre_attack_ids": [
            "T1222"
        ]
    }
]'
```
</details>

<details>
  <summary>Rules migration `start` task request</summary>

- Assuming the connector `azureOpenAiGPT4o` is already created in the local environment.
- Using the `migration_id` from the first POST request response in place of `{{migration_id}}`.

```
curl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/start' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1' \
--header 'Content-Type: application/json' \
--data '{
    "connectorId": "azureOpenAiGPT4o"
}'
```
</details>

<details>
  <summary>Rules migration `stop` task request</summary>

- Using the `migration_id` from the first POST request response in place of `{{migration_id}}`.

```
curl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stop' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

<details>
  <summary>Rules migration task `stats` request</summary>

- Using the `migration_id` from the first POST request response in place of `{{migration_id}}`.

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stats' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

<details>
  <summary>Rules migration rules documents request</summary>

- Using the `migration_id` from the first POST request response in place of `{{migration_id}}`.

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

<details>
  <summary>Rules migration all stats request</summary>

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/stats' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>
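For reviewers who want to reason about the status transitions listed under "The rule migration task implementation", the loop can be approximated with a small in-memory sketch. This is not the code in this PR: `RuleMigrationDoc`, `TaskOptions`, `runMigrationTask`, `migrate` and `isStopped` are hypothetical names, the array stands in for the ES data stream, and the stop check is a simplification of the real abort handling.

```typescript
// Hypothetical in-memory sketch of the batch loop. Names are illustrative, not the PR's code.
type Status = 'pending' | 'processing' | 'finished' | 'error';

interface RuleMigrationDoc {
  id: string;
  status: Status;
  result?: string;
}

interface TaskOptions {
  batchSize: number; // the description mentions 50 initially
  delayMs: number; // the description mentions 10 seconds initially
  migrate: (doc: RuleMigrationDoc) => Promise<string>; // stand-in for the per-rule agent run
  isStopped: () => boolean; // stand-in for the stop API call or server shut-down
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runMigrationTask(docs: RuleMigrationDoc[], opts: TaskOptions): Promise<void> {
  while (!opts.isStopped()) {
    // Retrieve a batch of pending documents and mark them as processing.
    const batch = docs.filter((doc) => doc.status === 'pending').slice(0, opts.batchSize);
    if (batch.length === 0) return;
    for (const doc of batch) doc.status = 'processing';

    // Run the batch in parallel; a failure only affects its own document.
    await Promise.all(
      batch.map(async (doc) => {
        try {
          const result = await opts.migrate(doc);
          if (opts.isStopped()) return; // leave it in processing so it is reset below
          doc.result = result;
          doc.status = 'finished';
        } catch {
          if (!opts.isStopped()) doc.status = 'error';
        }
      })
    );

    if (opts.isStopped()) {
      // Bulk reset: anything still processing goes back to pending.
      for (const doc of docs) {
        if (doc.status === 'processing') doc.status = 'pending';
      }
      return;
    }
    // Wait before the next batch only if there is more work to do.
    if (docs.some((doc) => doc.status === 'pending')) await sleep(opts.delayMs);
  }
}
```

Because stopped rules return to `pending` and finished ones are left untouched, calling the function again on the same documents resumes the migration, which mirrors the `start` after `stop` behaviour described in the Task API section.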
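The `rules` object in the stats response can be read as a per-status count of the migration rule documents. The sketch below shows one way to derive it. The `countRules` and `isMigrationDone` helpers are hypothetical, and mapping the document `status: error` to the `failed` counter is an assumption based on the field names in the example response, not something this description states.

```typescript
// Hypothetical helpers; the error -> failed mapping is an assumption.
type RuleStatus = 'pending' | 'processing' | 'finished' | 'error';

interface RuleCounts {
  total: number;
  finished: number;
  pending: number;
  processing: number;
  failed: number;
}

function countRules(statuses: RuleStatus[]): RuleCounts {
  const counts: RuleCounts = {
    total: statuses.length,
    finished: 0,
    pending: 0,
    processing: 0,
    failed: 0,
  };
  for (const status of statuses) {
    if (status === 'error') {
      counts.failed += 1;
    } else {
      counts[status] += 1;
    }
  }
  return counts;
}

// A polling client could stop once nothing is pending or processing.
function isMigrationDone(counts: RuleCounts): boolean {
  return counts.pending === 0 && counts.processing === 0;
}
```

With 20 finished, 4 pending and 10 processing documents, `countRules` produces the same counts as the response example in this description.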
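The two-node flow in the "LLM agent Graph" section (try a prebuilt rule match first, fall back to query translation) can be illustrated without the LangGraph API. The sketch below is plain TypeScript with hypothetical names throughout. In particular, the exact title lookup and the `translate` callback are stand-ins for the LLM-backed nodes, and `prebuilt-1` in the usage note is a made-up ID.

```typescript
// Hypothetical sketch of the match-then-translate routing; it does not use LangGraph.
interface OriginalRule {
  title: string;
  query: string;
}

interface MigrateRuleState {
  original: OriginalRule;
  prebuiltRuleId?: string;
  translatedQuery?: string;
}

// A node reads the state and returns the fields it wants to update.
type GraphNode = (state: MigrateRuleState) => Partial<MigrateRuleState>;

// Stand-in for the prebuilt rule matching node: an exact title lookup.
const makeMatchPrebuiltRuleNode =
  (prebuiltRulesByTitle: Map<string, string>): GraphNode =>
  (state) => {
    const prebuiltRuleId = prebuiltRulesByTitle.get(state.original.title);
    return prebuiltRuleId === undefined ? {} : { prebuiltRuleId };
  };

// Stand-in for the query translation node.
const makeTranslateQueryNode =
  (translate: (query: string) => string): GraphNode =>
  (state) => ({ translatedQuery: translate(state.original.query) });

function migrateRule(
  original: OriginalRule,
  matchNode: GraphNode,
  translateNode: GraphNode
): MigrateRuleState {
  let state: MigrateRuleState = { original };
  state = { ...state, ...matchNode(state) };
  // Conditional edge: only translate when no prebuilt rule matched.
  if (state.prebuiltRuleId === undefined) {
    state = { ...state, ...translateNode(state) };
  }
  return state;
}
```

For example, with a map containing `'Linux Auditd Add User Account Type' -> 'prebuilt-1'`, the first sample rule from the `create` request would end with `prebuiltRuleId` set and no `translatedQuery`, while the second would go through the translation node.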