Skip to content

[Streams 🌊] Prevent routing simulation timeout and improve cancellations#226374

Merged
tonyghiani merged 20 commits intoelastic:mainfrom
tonyghiani:219494-prevent-routing-timeout
Jul 9, 2025
Merged

[Streams 🌊] Prevent routing simulation timeout and improve cancellations#226374
tonyghiani merged 20 commits intoelastic:mainfrom
tonyghiani:219494-prevent-routing-timeout

Conversation

@tonyghiani
Copy link
Contributor

@tonyghiani tonyghiani commented Jul 3, 2025

📓 Summary

Closes #219494

Refactor the samples fetching logic and address the following issues:

  • Add a timeout (10 seconds) to avoid requests on incomplete or unmatching conditions to keep running with no feedback.
  • Update cancellation logic, previously was not catching correctly the request abortions.
  • Sample data fetching performed on the creation form only.

@tonyghiani tonyghiani changed the title 219494 prevent routing timeout [Streams 🌊] Prevent routing simulation timeout and improve cancellations Jul 3, 2025
conditionUpdateDebounceTime: 500,
},
}).createMachine({
/** @xstate-layout N4IgpgJg5mDOIC5QCcD2BXALgSwHZQGUBDAWwAcAbOAYjSz0NMrgDpkwAzd2ACwG0ADAF1EoMqljYcqXKJAAPRADYAjCwDsAJiUCBSgJwqAHAGYVAViXqlAGhABPRABZLLc9aUmr6rSpMBffzs6HHxicipYWgxQxgjWdDIIIkwwAGEZCClsGUERJBBxSWlZAsUEL3MWIxUVJ2N9ARV9cxVbB2cTExYdXU16vyNzIydA4JiGcOZYFggwACMMXABjBgzcLJLqeVhMFLAWIg5U5AAKZczsmQBVJP2AEQWl5bAAFWwSMABKaPowpkisye6BWa0uJTyciKV1KoHKJnUVTq6hMTicAjM6IEljsjgQnm6mi0NRM+nq5j0+jGIBCkwBrA4YEwyx4DFmqGW6E+uEwMwoqCIWXw1AgMgOeAAbqgANYHC4UKjLTD3DlcsA82CQgrQkpycrmExGFhOKxGAyIymafS4xD6O0sO0Y7HqCwYlTU2n-eIzRnM1n4dmc7m8lj8wUMahgZBoZAsSgpDioZAkFjyxXK1XBzXCKESGF6xAG-QsPzuQxE-RKE02hBGdRG8zmJx2pxGTReTzqD0TL3TFi+lls0VB9WYDIgkNhoVQEViliSmVy1AKsBKlUjjXjnlasR53VlQtoh3qASNJ31EbWjoIPwqdRuUwmPpafp+bt-OJ9gf+qCBtU8rdJwFadI2jJM4woBMkxTNNVwzDdeUAndCj3HJYQUQ8qk8Iw22MLo-HUK88RUTQBE0Y1zErAR1CcDxdHMQIghAXBUDmeACk9T9IlzYo0ILBAsMNUxhhwusTRUGtNCMAQS10AR0TvTRtBcd9YimQE5kWEFVnwdZNj47VUJkfj0WLZsVGomjaKaPQa0sGS2jLEZsUbRtVLpb1+yZQd8B4-MDwE7psOE0SxNUGsT26RtzOaC9K00dze0Bb8h0zUd2N3XjjICizyJRFp9FMQihn0JSIsMFhsWbA1DSJQiqSYzj1IZbyfz-LNQ2Ahg-P3OFEFdDRSUoorCsosrryMfR70aVtRpafCGMansuJav1UoQmZRVwMAeoMvrayaB1SMRA1NAsk0TBrSajXrCxrFPWp9BMBKlo-ZqfVa9b-zHJZeV27L9ostRgpMEScJo8Lr1qDFjS8JSDWk9xnsSlaPrWgNh2+wC+S63zDKy9DyjqYsQbBsKa0emSntUIlaZcFoUferz0d-THg2x9ltv+wnEGk7oyIsO8alIswJKhs7yOpto2wNJGXsCIA */
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2025-07-03 at 11 44 06

@tonyghiani tonyghiani marked this pull request as ready for review July 3, 2025 12:33
@tonyghiani tonyghiani requested review from a team as code owners July 3, 2025 12:33
@tonyghiani tonyghiani added release_note:skip Skip the PR/issue when compiling release notes Team:obs-onboarding Observability Onboarding Team backport:version Backport to applied version labels Feature:Streams This is the label for the Streams Project v9.1.0 labels Jul 3, 2025
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

@Kerry350 Kerry350 self-requested a review July 3, 2025 12:40
Copy link
Contributor

@flash1293 flash1293 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't test thoroughly yet

But it seems like there is an issue if no results are returned - it's not handled properly but shown as an error:
Screenshot 2025-07-04 at 15 01 47

Investigated and this is happening in case of intermediate results of the async search, which can be an empty object.

@tonyghiani
Copy link
Contributor Author

@flash1293 I could replicate the issue and simulate longer and more complex requests to handle this scenario. The type lies when it comes to partial results, but we can specify to always get partial results to get it right, and then handle the search correctly.
Here is a quick video with an increasingly more specific condition over a ~800mb dataset, it times out after 10s (we can tune this value) when it doesn't get any partial result, and it constantly gives feedback on the ongoing data fetching:

Screen.Recording.2025-07-07.at.11.49.33.mov

Copy link
Contributor

@Kerry350 Kerry350 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion to the state machine etc looks good to me barring the ongoing discussion about partial results.

The type lies when it comes to partial results, but we can specify to always get partial results to get it right, and then handle the search correctly.

This is interesting. So I can't remember the specifics as I did the original code ages ago, but there were caveats around aggregations and using partial results. If you're always able to specify this now, that sounds good 👌

@tonyghiani
Copy link
Contributor Author

@flash1293 @Kerry350 thanks for your first reviews, this should be ready for another look

@flash1293
Copy link
Contributor

Seems like there is still a bug here - if I change a condition that does match something to one that doesn't, the old result keeps showing instead of switching to "no results found":
Kapture 2025-07-08 at 12 01 33

@tonyghiani
Copy link
Contributor Author

Seems like there is still a bug here - if I change a condition that does match something to one that doesn't, the old result keeps showing instead of switching to "no results found":

Yeah good catch, I missed a check in the response filtering, fixed!

Copy link
Contributor

@flash1293 flash1293 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and works fine for me now! I'll leave a detailed review of the fetching logic to Kerry, but it looks fine to me.

One note: During testing I noticed that we reset the data grid state when refreshing which is quite annoying:

Image

But I guess we are touching this table anyway soon as part of the routing improvements, we can probably fix there

Copy link
Contributor

@Kerry350 Kerry350 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, LGTM 👌

@tonyghiani tonyghiani enabled auto-merge (squash) July 9, 2025 08:41
@tonyghiani tonyghiani merged commit 00333f2 into elastic:main Jul 9, 2025
12 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 9.1

https://github.com/elastic/kibana/actions/runs/16166971911

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
streamsApp 585.0KB 589.9KB +4.9KB
Unknown metric groups

ESLint disabled line counts

id before after diff
streamsApp 8 7 -1

Total ESLint disabled count

id before after diff
streamsApp 12 11 -1

History

@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
9.1 Backport failed because of merge conflicts

You might need to backport the following PRs to 9.1:
- [Streams] Significant events view (#220197)

Manual backport

To create the backport manually run:

node scripts/backport --pr 226374

Questions ?

Please refer to the Backport tool documentation

@tonyghiani tonyghiani added backport:version Backport to applied version labels and removed backport:version Backport to applied version labels labels Jul 9, 2025
@tonyghiani tonyghiani deleted the 219494-prevent-routing-timeout branch July 9, 2025 12:41
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 9.1

https://github.com/elastic/kibana/actions/runs/16169591225

@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
9.1 Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 226374

Questions ?

Please refer to the Backport tool documentation

tonyghiani added a commit to tonyghiani/kibana that referenced this pull request Jul 9, 2025
…ons (elastic#226374)

## 📓 Summary

Closes elastic#219494

Refactor the samples fetching logic and address the following issues:
- Add a timeout (10 seconds) to avoid requests on incomplete or
unmatching conditions to keep running with no feedback.
- Update cancellation logic, previously was not catching correctly the
request abortions.
- Sample data fetching performed on the creation form only.

(cherry picked from commit 00333f2)

# Conflicts:
#	x-pack/platform/plugins/shared/streams_app/public/components/data_management/stream_detail_enrichment/processor_outcome_preview.tsx
tonyghiani added a commit that referenced this pull request Jul 9, 2025
…ellations (#226374) (#227246)

# Backport

This will backport the following commits from `main` to `9.1`:
- [[Streams 🌊] Prevent routing simulation timeout and improve
cancellations (#226374)](#226374)

<!--- Backport version: 10.0.1 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

<!--BACKPORT [{"author":{"name":"Marco Antonio
Ghiani","email":"marcoantonio.ghiani01@gmail.com"},"sourceCommit":{"committedDate":"2025-07-09T10:31:52Z","message":"[Streams
🌊] Prevent routing simulation timeout and improve cancellations
(#226374)\n\n## 📓 Summary\n\nCloses #219494 \n\nRefactor the samples
fetching logic and address the following issues:\n- Add a timeout (10
seconds) to avoid requests on incomplete or\nunmatching conditions to
keep running with no feedback.\n- Update cancellation logic, previously
was not catching correctly the\nrequest abortions.\n- Sample data
fetching performed on the creation form
only.","sha":"00333f2c8adf96c1ef823ab4db6755d54f4a8925","branchLabelMapping":{"^v9.2.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:obs-ux-logs","backport:version","Feature:Streams","v9.1.0","v9.2.0"],"title":"[Streams
🌊] Prevent routing simulation timeout and improve
cancellations","number":226374,"url":"https://github.com/elastic/kibana/pull/226374","mergeCommit":{"message":"[Streams
🌊] Prevent routing simulation timeout and improve cancellations
(#226374)\n\n## 📓 Summary\n\nCloses #219494 \n\nRefactor the samples
fetching logic and address the following issues:\n- Add a timeout (10
seconds) to avoid requests on incomplete or\nunmatching conditions to
keep running with no feedback.\n- Update cancellation logic, previously
was not catching correctly the\nrequest abortions.\n- Sample data
fetching performed on the creation form
only.","sha":"00333f2c8adf96c1ef823ab4db6755d54f4a8925"}},"sourceBranch":"main","suggestedTargetBranches":["9.1"],"targetPullRequestStates":[{"branch":"9.1","label":"v9.1.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v9.2.0","branchLabelMappingKey":"^v9.2.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/226374","number":226374,"mergeCommit":{"message":"[Streams
🌊] Prevent routing simulation timeout and improve cancellations
(#226374)\n\n## 📓 Summary\n\nCloses #219494 \n\nRefactor the samples
fetching logic and address the following issues:\n- Add a timeout (10
seconds) to avoid requests on incomplete or\nunmatching conditions to
keep running with no feedback.\n- Update cancellation logic, previously
was not catching correctly the\nrequest abortions.\n- Sample data
fetching performed on the creation form
only.","sha":"00333f2c8adf96c1ef823ab4db6755d54f4a8925"}}]}] BACKPORT-->
kertal pushed a commit to kertal/kibana that referenced this pull request Jul 25, 2025
…ons (elastic#226374)

## 📓 Summary

Closes elastic#219494 

Refactor the samples fetching logic and address the following issues:
- Add a timeout (10 seconds) to avoid requests on incomplete or
unmatching conditions to keep running with no feedback.
- Update cancellation logic, previously was not catching correctly the
request abortions.
- Sample data fetching performed on the creation form only.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:version Backport to applied version labels Feature:Streams This is the label for the Streams Project release_note:skip Skip the PR/issue when compiling release notes Team:obs-onboarding Observability Onboarding Team v9.1.0 v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Streams 🌊] Prevent routing samples query overload with timeouts

5 participants