[Automatic Import] Adding support for larger samples in ECS graph#190426

Merged
P1llus merged 22 commits intoelastic:mainfrom
P1llus:automatic_import_ecs_chunking
Aug 23, 2024

Member

@P1llus P1llus commented Aug 13, 2024

Summary

This PR prepares the ECS Mapping graph to support larger samples by chunking them, running certain parts of the graph concurrently, and merging the results, rather than trying to fit everything into one large context.

More details below, but in general there is only a slight modification to the actual code; most of the changed lines come from moving code into new files and updating tests.

There are also some minor tweaks to the ECS graph code in general. The related changes are:

  1. Moved some code out of graph.ts to make it a bit smaller (moved the model* functions to a new model.ts and moved state to its own file).
  2. Added chunkSize as an optional input to the graph (defaults to 10 fields with an actual string value per chunk), just so it can be overridden later if necessary.
  3. Renamed the samples state to prefixedSamples and formattedSamples to combinedSamples, as it got really confusing at some point when debugging. I also updated the function argument names that used them to the new names, to make it clearer which sample type they are using.
  4. Renamed modifySamples to prefixSamples to clarify what it actually modifies.
  5. Moved the mapping, invalid, duplicate, missing and validate nodes to their own subgraph. The combinedSamples state is now set when invoking the subgraph; its value is the related chunk, so the subgraph only needs to work on this smaller subset of data.
  6. The currentMapping state is now only used by the subgraph. Once all the subgraphs have finished, they post their own results to the finalMapping state. This state uses a reducer function that combines the existing state with the new, so the results from all X running subgraphs are merged into the same resulting object as before this PR.
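
To illustrate items 2, 5 and 6, here is a minimal, library-free sketch of the chunk, run concurrently, then merge flow. It is not the code from this PR: `chunkFields`, `mergeMappings`, `mapChunk` and `mapAllChunks` are hypothetical names, the samples are assumed to be a flat object, the per-chunk work is a placeholder for the subgraph, and the reducer is assumed to be a shallow merge.

```typescript
// Hypothetical types for illustration; the real graph state is richer.
type Samples = Record<string, unknown>;
type Mapping = Record<string, unknown>;

// Split a flat object of fields into chunks of at most `chunkSize` fields.
function chunkFields(samples: Samples, chunkSize = 10): Samples[] {
  if (chunkSize < 1) {
    throw new RangeError('chunkSize must be at least 1');
  }
  const keys = Object.keys(samples);
  const chunks: Samples[] = [];
  for (let i = 0; i < keys.length; i += chunkSize) {
    const chunk: Samples = {};
    for (const key of keys.slice(i, i + chunkSize)) {
      chunk[key] = samples[key];
    }
    chunks.push(chunk);
  }
  return chunks;
}

// Reducer: combine the existing state with a new partial result.
// Assumption: a shallow merge is enough for this flat example.
function mergeMappings(existing: Mapping, incoming: Mapping): Mapping {
  return { ...existing, ...incoming };
}

// Placeholder for the per-chunk subgraph. It only records each field name
// with a null target instead of asking a model for an ECS mapping.
async function mapChunk(chunk: Samples): Promise<Mapping> {
  const result: Mapping = {};
  for (const field of Object.keys(chunk)) {
    result[field] = null;
  }
  return result;
}

// Run every chunk concurrently, then fold the partial results together.
async function mapAllChunks(samples: Samples, chunkSize = 10): Promise<Mapping> {
  const partials = await Promise.all(chunkFields(samples, chunkSize).map(mapChunk));
  return partials.reduce(mergeMappings, {});
}
```

Because each chunk only sees its own subset of fields, the context for any single call stays small, and the reducer is the only place where the partial results meet.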

Checklist

Delete any items that are not applicable to this PR.

For maintainers

@P1llus P1llus added release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting v8.16.0 Team:Security-Scalability Security Integrations Scalability Team labels Aug 13, 2024
Member Author

P1llus commented Aug 13, 2024

@spong FYI on the dependency bump we talked about.

@P1llus P1llus marked this pull request as ready for review August 14, 2024 09:28
@P1llus P1llus requested a review from a team as a code owner August 14, 2024 09:28
@elasticmachine
Contributor

Pinging @elastic/security-scalability (Team:Security-Scalability)

Contributor

@bhapas bhapas left a comment

Overall looks good. Just minor questions / comments

Contributor

@bhapas bhapas left a comment

LGTM

Member Author

P1llus commented Aug 15, 2024

For the last failing type checks, I am waiting on guidance from the code owners to see whether we can resolve the stricter type checking on agent state, which may have resulted from bumping the dependencies.

P1llus added a commit that referenced this pull request Aug 22, 2024
## Summary

**NOTE** I will need help testing this before we merge it!

I spoke with @spong about an upcoming PR we have here:
#190426, which bumps the langgraph
version from 0.0.31 to 0.0.34. Unfortunately, this caused a lot of type
errors in the default assistant.

After some more discussion, we proposed opening a PR that removes some of
the more complex layers and fixes the type issues. Though I have not
worked on this graph before, the changes hopefully make sense 👍

Graph flow:

![image](https://github.com/user-attachments/assets/911190c1-2cdc-429f-bd1b-2b4a6a343729)


The PR changes the items below to remove some of the abstractions and
resolve some of the type issues, and also adds a few general
improvements:

- Moves `llmType`, `bedrockChatEnabled`, `isStream` and `conversationId`
to be invoke parameters rather than compile parameters. This allows them
to be used in state and removes the need to pass them everywhere as
parameters. Adding them to the state also makes them available in
LangSmith.
- Removes the constants that defined each node with wrappers and instead
exposes the nodes directly as async functions. This removes a lot of
boilerplate code and makes stack traces much easier to read.
- Moves to a single `stepRouter` used for the current conditional edges.
This makes it very easy to extend the routing between existing or new
nodes, and much easier to understand which conditions are routed where.
- Exports a common `NodeType` object constant (no need for the extra
compile overhead of enums here, since we are only using strings), so that
node name strings auto-complete and the router avoids hardcoded names.
- Adds a `modelInput` node as the starter node. It was first created
because adding nodes inside if conditions usually causes errors, so it
exists to set the `hasRespondStep` state. However, this node is also nice
to have as an entrypoint in case you find yourself wanting to change the
state based on the invoke parameters, or on other conditions retrieved
from other parts of the stack, before the graph continues to any of the
other nodes.
- Adds a `yarn draw-graph` command that outputs to
`docs/img/default_assistant_graph.png`. The image is also included in
the readme. This makes it easier for other teams (like me) making changes
to understand the intended graph workflows.
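
The `NodeType` constant and single `stepRouter` described above could look roughly like the sketch below. Only the names `NodeType`, `stepRouter`, `modelInput` and `hasRespondStep` come from the description; the other node names, the `agentFinished` flag, the `RouterState` shape and the routing conditions are assumptions made for illustration, not the code in the assistant graph.

```typescript
// A plain object with `as const` gives auto-completed string literals
// without the extra compiled output of a TypeScript enum.
// Node names other than `modelInput` are hypothetical.
const NodeType = {
  MODEL_INPUT: 'modelInput',
  AGENT: 'agent',
  TOOLS: 'tools',
  RESPOND: 'respond',
  END: 'end',
} as const;

// Union of the string values: 'modelInput' | 'agent' | 'tools' | ...
type NodeName = (typeof NodeType)[keyof typeof NodeType];

// Hypothetical slice of graph state that the router reads.
interface RouterState {
  hasRespondStep: boolean;
  agentFinished: boolean; // assumption: set once the agent has a final answer
}

// One router for all conditional edges: every routing decision lives here,
// so adding a new route means adding a new condition in this function.
function stepRouter(state: RouterState): NodeName {
  if (!state.agentFinished) {
    return NodeType.TOOLS;
  }
  if (state.hasRespondStep) {
    return NodeType.RESPOND;
  }
  return NodeType.END;
}
```

Because `stepRouter` returns `NodeName`, a typo in a node name fails at compile time rather than producing a dangling edge at runtime.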


### Checklist

Delete any items that are not applicable to this PR.

- [x]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials

### For maintainers

- [x] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Member Author

P1llus commented Aug 23, 2024

@elasticmachine merge upstream

@kibana-ci

💚 Build Succeeded

Metrics [docs]

Unknown metric groups

ESLint disabled in files

| id | before | after | diff |
| --- | --- | --- | --- |
| integrationAssistant | 3 | 4 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| integrationAssistant | 10 | 11 | +1 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@P1llus P1llus merged commit 8e66a3e into elastic:main Aug 23, 2024
@P1llus P1llus added backport:prev-minor and removed backport:skip This PR does not require backporting labels Aug 26, 2024
@P1llus P1llus self-assigned this Aug 26, 2024
@P1llus P1llus added the v8.15.1 label Aug 26, 2024
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Aug 26, 2024
…astic#190426)

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

### For maintainers

- [x] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit 8e66a3e)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.15

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

P1llus added a commit to P1llus/kibana that referenced this pull request Aug 27, 2024
(cherry picked from commit b660d42)

# Conflicts:
#	x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/execute_tools.ts
#	x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/generate_chat_title.ts
#	x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/run_agent.ts
#	x-pack/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/should_continue.ts
P1llus added a commit that referenced this pull request Aug 27, 2024
…191386)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Elastic Assistant] Update default assistant graph
(#190686)](#190686)

<!--- Backport version: 8.9.8 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

P1llus added a commit that referenced this pull request Aug 27, 2024
…aph (#190426) (#191314)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Automatic Import] Adding support for larger samples in ECS graph
(#190426)](#190426)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Marius Iversen <marius.iversen@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Labels

release_note:skip Skip the PR/issue when compiling release notes Team:Security-Scalability Security Integrations Scalability Team v8.15.1 v8.16.0

5 participants