[SIEM][Detection Engine] Speeds up value list imports by enabling streaming of files. #70685

Merged
FrankHassanabad merged 14 commits into elastic:master from FrankHassanabad:stream-imports
Jul 9, 2020

@FrankHassanabad commented Jul 3, 2020

Summary

  • Changes the value list imports to use a streaming model
  • Adds a lightweight, hand-rolled multipart parser for the incoming text
  • Adds a buffer pause and resume: while an async operation is in progress (such as creating a list from the attachment file), incoming data continues to be buffered, but lines are not emitted until the buffer is resumed
  • Adds data slicing when the buffer grows larger than the maximum, so that buffering too quickly in memory does not exceed the limits of Elasticsearch
  • Adds unit tests
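
The pause/resume and slicing behaviour listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `PausableLineBuffer`, `maxBatch`, and `onLines` are hypothetical names, and emitting only full slices on resume (leaving the remainder for `end()`) is a design choice made for this sketch.

```typescript
type LinesListener = (lines: string[]) => void;

// Hypothetical helper: buffers lines, emits them in slices of at most
// `maxBatch`, and holds everything back while paused.
class PausableLineBuffer {
  private pending: string[] = [];
  private paused = false;
  private readonly maxBatch: number;
  private readonly onLines: LinesListener;

  constructor(maxBatch: number, onLines: LinesListener) {
    if (maxBatch < 1) {
      throw new Error('maxBatch must be at least 1');
    }
    this.maxBatch = maxBatch;
    this.onLines = onLines;
  }

  // Stop emitting; push() keeps accumulating lines in memory.
  public pause(): void {
    this.paused = true;
  }

  // Start emitting again, sending the backlog in bounded slices.
  public resume(): void {
    this.paused = false;
    this.drain(false);
  }

  public push(line: string): void {
    this.pending.push(line);
    if (!this.paused) {
      this.drain(false);
    }
  }

  // Input is finished: flush whatever is left, including a partial slice.
  public end(): void {
    this.paused = false;
    this.drain(true);
  }

  private drain(flushPartial: boolean): void {
    while (
      this.pending.length >= this.maxBatch ||
      (flushPartial && this.pending.length > 0)
    ) {
      // splice removes and returns up to maxBatch lines from the front.
      this.onLines(this.pending.splice(0, this.maxBatch));
    }
  }
}

// Usage: five lines arrive while paused, then are released in slices of two.
const batches: string[][] = [];
const buffer = new PausableLineBuffer(2, (lines) => {
  batches.push(lines);
});
buffer.pause();
['a', 'b', 'c', 'd', 'e'].forEach((line) => buffer.push(line));
buffer.resume(); // emits ['a', 'b'] and ['c', 'd']
buffer.end(); // emits the remaining ['e']
```

Capping each emitted slice keeps the size of any single downstream write bounded, no matter how much data piled up while the buffer was paused.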

Checklist

@FrankHassanabad

@elasticmachine merge upstream


@rylnd left a comment

I got a walkthrough of the streaming logic from @FrankHassanabad, and the code looks good!

I tested this against #67068, and it decreased the import time of a 1.2 kB file from 52 s to 444 ms. So yeah, this is an improvement, to say the least 😉.


-  public _read(): void {
-    // No operation but this is required to be implemented
+  public _read(): void {}

I was gonna say we should leave the comment, but the TypeScript error generated if one removes this method should be straightforward enough 👍
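
For readers unfamiliar with why an empty `_read()` shows up at all, here is a minimal sketch, assuming the class in question extends Node.js's `stream.Readable` and is fed by pushes from elsewhere. `LineEmitter`, `emitLine`, and `finish` are hypothetical names for illustration, not identifiers from this PR.

```typescript
import { Readable } from 'stream';

// Hypothetical push-driven stream: lines arrive from an outside source
// (for example a parser callback), not because a consumer asked for them.
class LineEmitter extends Readable {
  constructor() {
    super({ objectMode: true }); // each chunk is one line, not raw bytes
  }

  // Nothing to do on demand: data is pushed from emitLine() instead.
  // Node's base Readable still expects subclasses to supply this method.
  public _read(): void {}

  public emitLine(line: string): void {
    this.push(line);
  }

  public finish(): void {
    this.push(null); // signals end-of-stream to consumers
  }
}

// Usage: push two lines, then pull them back out of the internal buffer.
const emitter = new LineEmitter();
emitter.emitLine('127.0.0.1');
emitter.emitLine('127.0.0.2');
const first: unknown = emitter.read();
const second: unknown = emitter.read();
```

Because the data source drives the stream, the on-demand hook has no work to do, which is why the method body can stay empty.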

@yctercero left a comment

LGTM. Saw Ryland pulled down to test, so was just looking at the code. This is really awesome!

@kibanamachine

💚 Build Succeeded

Build metrics

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@FrankHassanabad FrankHassanabad merged commit 3863921 into elastic:master Jul 9, 2020
@elasticmachine

Pinging @elastic/siem (Team:SIEM)

@FrankHassanabad FrankHassanabad deleted the stream-imports branch July 9, 2020 02:15
@MindyRS added the "Team: SecuritySolution" label (Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.) Sep 23, 2021
@elasticmachine

Pinging @elastic/security-solution (Team: SecuritySolution)

Labels

release_note:enhancement, Team: SecuritySolution, Team:SIEM, v7.9.0, v8.0.0

6 participants