[Automatic Import] Adding support for larger samples in ECS graph#190426
Merged
P1llus merged 22 commits into elastic:main on Aug 23, 2024
Member
Author
@spong FYI on the dependency bump we talked about.
Contributor
Pinging @elastic/security-scalability (Team:Security-Scalability)
bhapas
reviewed
Aug 14, 2024
Contributor
bhapas
left a comment
Overall looks good. Just minor questions / comments
x-pack/plugins/integration_assistant/server/graphs/ecs/chunk.ts
Outdated
x-pack/plugins/integration_assistant/server/graphs/ecs/graph.ts
Outdated
Member
Author
For the last failing type checks, I am waiting on guidance from the code owners to see if we can resolve the stricter type checking on agent state, which might be a result of bumping the dependencies.
2 tasks
P1llus
added a commit
that referenced
this pull request
Aug 22, 2024
## Summary

**NOTE** I will need help testing this before we merge it!

I spoke with @spong about an upcoming PR we have here: #190426, which bumps the langgraph version from 0.0.31 to 0.0.34. Unfortunately, this caused a lot of type errors in the default assistant.

After some more discussion we proposed to open a PR that removes some of the more complex layers and fixes up the type issues. Though I have not worked on this graph before, the changes hopefully make sense 👍

Graph flow: [image]

The PR changes the items below to remove some of the abstractions and resolve some of the type issues, and also adds a few general improvements:

- Moves `llmType`, `bedrockChatEnabled`, `isStream` and `conversationId` to be invoke parameters rather than compile parameters. This allows them to be used in state, and removes the need to pass them everywhere as parameters. Adding them to the state also makes them available in langsmith.
- Removes the constants defining each node with wrappers and instead exposes the nodes directly as async functions. This removes a lot of boilerplate code and makes reading the stack traces much easier.
- Moved to a single `stepRouter` used for the current conditional edges. This makes it very easy to extend the routing between existing or new nodes, and much easier to understand which conditions are routed where.
- Exports a common `NodeType` object constant (no need for the extra compile overhead of Enums here, we are only using strings), to make the node name strings auto-complete and prevent hardcoded names for the router.
- Added a `modelInput` node to be the starter node. This was first added because adding nodes inside if conditions usually creates errors, so it was created to be able to set the `hasRespondStep` state. However, this node is also nice to have as an entry point when you find yourself wanting to change the state based on the invoke parameters or other conditions retrieved from other parts of the stack before it continues to any of the other nodes.
- Added a `yarn draw-graph` command that outputs to `docs/img/default_assistant_graph.png`. This is then also included in the readme. This makes it easier for other teams (like me) making changes to understand the intended graph workflows.

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
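The `NodeType` object constant and single `stepRouter` described above could look roughly like the sketch below. Only the names `NodeType`, `stepRouter`, `modelInput` and `hasRespondStep` come from the commit message; the other node names, the `AgentState` shape and the routing conditions are hypothetical and only illustrate the pattern, not the actual assistant graph.

```typescript
// Object constant instead of an enum: plain strings at runtime,
// but still auto-completable and type-checked.
// All node names except `modelInput` are hypothetical.
const NodeType = {
  MODEL_INPUT: 'modelInput',
  AGENT: 'agent',
  TOOLS: 'tools',
  RESPOND: 'respond',
  END: 'end',
} as const;

// Union of the string values: 'modelInput' | 'agent' | 'tools' | ...
type NodeName = (typeof NodeType)[keyof typeof NodeType];

// Hypothetical, heavily simplified state; the real graph state has more fields.
interface AgentState {
  lastNode: NodeName;
  hasRespondStep: boolean;
  agentFinished: boolean;
}

// A single router that decides the next node for every conditional edge.
function stepRouter(state: AgentState): NodeName {
  switch (state.lastNode) {
    case NodeType.MODEL_INPUT:
      return NodeType.AGENT;
    case NodeType.AGENT:
      if (!state.agentFinished) {
        return NodeType.TOOLS;
      }
      return state.hasRespondStep ? NodeType.RESPOND : NodeType.END;
    case NodeType.TOOLS:
      return NodeType.AGENT;
    default:
      return NodeType.END;
  }
}
```

Keeping all routing in one function means a new node only needs a new `NodeType` entry and a new `case`, rather than another conditional-edge callback.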
Member
Author
|
@elasticmachine merge upstream |
2 tasks
💚 Build Succeeded
kibanamachine
added a commit
to kibanamachine/kibana
that referenced
this pull request
Aug 26, 2024
…astic#190426)

## Summary

This PR prepares the ECS Mapping graph to support larger samples by chunking the samples, running certain parts of the graph concurrently, and merging the results, rather than trying to use one large context.

More details below, but in general there is only a slight modification to the actual code; most of the changed lines come from moving code to new files and updating tests.

There are also some minor tweaks to the ECS graph code in general. The related changes are listed below:

1. Moved some code out of graph.ts to make it a bit smaller (moved model* functions to a new model.ts, moved state to its own file).
2. Added chunkSize as an optional input to the graph (defaults to 10 fields with an actual string value per chunk), just to allow it to be overwritten if necessary later.
3. Renamed the `samples` state to `prefixedSamples` and `formattedSamples` to `combinedSamples`, as it got really confusing at some point when debugging. I also updated the function argument names that used them to the new names, to make it clearer which sample type they are using.
4. Renamed `modifySamples` to `prefixSamples` to clarify what it actually modifies.
5. Moved the `mapping`, `invalid`, `duplicate`, `missing` and `validate` nodes to their own subgraph. The `combinedSamples` state is now set when invoking the subgraph; the value will be its related `chunk`, so it only needs to work on this smaller subset of data.
6. The `currentMapping` state is now only used by the subgraph. Once all the subgraphs have finished, they post their own results to the `finalMapping` state. This state uses a reducer function that combines the existing state with the new, so all results from the X subgraphs running will be merged into the same resulting object as before this PR.

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

(cherry picked from commit 8e66a3e)
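To illustrate the chunking idea from the summary above, here is a minimal sketch that splits one combined sample object into smaller objects of at most `chunkSize` leaf fields each. It is not the code from `chunk.ts`: the function names (`collectLeaves`, `setPath`, `chunkFields`) are hypothetical, and as a simplification it counts every leaf value rather than only fields with an actual string value.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Collect a [path, value] pair for every leaf field in a nested object.
function collectLeaves(obj: JsonObject, prefix: string[] = []): Array<[string[], Json]> {
  const leaves: Array<[string[], Json]> = [];
  for (const [key, value] of Object.entries(obj)) {
    const path = [...prefix, key];
    if (isPlainObject(value)) {
      leaves.push(...collectLeaves(value, path));
    } else {
      leaves.push([path, value]);
    }
  }
  return leaves;
}

// Write a value at a nested path, creating intermediate objects as needed.
function setPath(target: JsonObject, path: string[], value: Json): void {
  let node = target;
  for (const key of path.slice(0, -1)) {
    const next = node[key];
    if (isPlainObject(next)) {
      node = next;
    } else {
      const created: JsonObject = {};
      node[key] = created;
      node = created;
    }
  }
  node[path[path.length - 1]] = value;
}

// Split one large sample into chunks of at most `chunkSize` leaf fields,
// keeping the original nesting inside each chunk.
function chunkFields(sample: JsonObject, chunkSize = 10): JsonObject[] {
  if (chunkSize < 1) {
    throw new RangeError('chunkSize must be at least 1');
  }
  const leaves = collectLeaves(sample);
  const chunks: JsonObject[] = [];
  for (let i = 0; i < leaves.length; i += chunkSize) {
    const chunk: JsonObject = {};
    for (const [path, value] of leaves.slice(i, i + chunkSize)) {
      setPath(chunk, path, value);
    }
    chunks.push(chunk);
  }
  return chunks;
}
```

Each resulting chunk keeps its full field paths, so a mapping produced for one chunk can later be merged with the others without losing where a field came from.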
Contributor
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation
P1llus
added a commit
to P1llus/kibana
that referenced
this pull request
Aug 27, 2024
[Elastic Assistant] Update default assistant graph (#190686)

(cherry picked from commit b660d42)

# Conflicts:
# x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/execute_tools.ts
# x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/generate_chat_title.ts
# x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/run_agent.ts
# x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/should_continue.ts
P1llus
added a commit
that referenced
this pull request
Aug 27, 2024
…191386)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Elastic Assistant] Update default assistant graph (#190686)](#190686)

<!--- Backport version: 8.9.8 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)
P1llus
added a commit
that referenced
this pull request
Aug 27, 2024
…aph (#190426) (#191314)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Automatic Import] Adding support for larger samples in ECS graph (#190426)](#190426)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Marius Iversen <marius.iversen@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Summary
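This PR prepares the ECS Mapping graph to support larger samples by chunking the samples, running certain parts of the graph concurrently, and merging the results, rather than trying to use one large context.
More details below, but in general there is only a slight modification to the actual code; most of the changed lines come from moving code to new files and updating tests.
There are also some minor tweaks to the ECS graph code in general. The related changes are listed below:
1. Moved some code out of graph.ts to make it a bit smaller (moved model* functions to a new model.ts, moved state to its own file).
2. Added chunkSize as an optional input to the graph (defaults to 10 fields with an actual string value per chunk), just to allow it to be overwritten if necessary later.
3. Renamed the `samples` state to `prefixedSamples` and `formattedSamples` to `combinedSamples`, as it got really confusing at some point when debugging. I also updated the function argument names that used them to the new names, to make it clearer which sample type they are using.
4. Renamed `modifySamples` to `prefixSamples` to clarify what it actually modifies.
5. Moved the `mapping`, `invalid`, `duplicate`, `missing` and `validate` nodes to their own subgraph. The `combinedSamples` state is now set when invoking the subgraph; the value will be its related `chunk`, so it only needs to work on this smaller subset of data.
6. The `currentMapping` state is now only used by the subgraph. Once all the subgraphs have finished, they post their own results to the `finalMapping` state. This state uses a reducer function that combines the existing state with the new, so all results from the X subgraphs running will be merged into the same resulting object as before this PR.
Checklist
Delete any items that are not applicable to this PR.
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
For maintainers
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)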
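
The reducer behaviour described for the `finalMapping` state can be sketched as a recursive merge that folds each subgraph result into the accumulated mapping. This is an illustration under assumptions, not the PR's implementation: `Mapping`, `mergeMappings` and `mapAllChunks` are hypothetical names, and the `run` callback stands in for invoking the subgraph on one chunk.

```typescript
// Hypothetical shape of a mapping result: nested objects with scalar leaves.
type Mapping = { [key: string]: Mapping | string | number | boolean | null };

function isMapping(value: unknown): value is Mapping {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Reducer: combine the existing state with a new partial result.
// Nested objects are merged recursively; on a scalar conflict the update wins.
function mergeMappings(existing: Mapping, update: Mapping): Mapping {
  const merged: Mapping = { ...existing };
  for (const [key, value] of Object.entries(update)) {
    const current = merged[key];
    merged[key] =
      isMapping(current) && isMapping(value) ? mergeMappings(current, value) : value;
  }
  return merged;
}

// Run one task per chunk concurrently, then fold all partial results
// into a single object with the reducer.
async function mapAllChunks(
  chunks: Mapping[],
  run: (chunk: Mapping) => Promise<Mapping>
): Promise<Mapping> {
  const results = await Promise.all(chunks.map((chunk) => run(chunk)));
  return results.reduce<Mapping>((acc, result) => mergeMappings(acc, result), {});
}
```

Because the reducer does not mutate either input and the chunks carry disjoint fields, the order in which the concurrent results arrive does not change the merged object's contents.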