
Revert "[Incident Management] Add page attachment modal (#231186)"#236869

Closed
fkanout wants to merge 7 commits into elastic:main from fkanout:revert-add-page-attachement-modal

Conversation

@fkanout
Contributor

@fkanout fkanout commented Sep 30, 2025

This reverts commit 807b177.

Depends on #236872 (review)

@fkanout fkanout self-assigned this Sep 30, 2025
@fkanout fkanout requested a review from a team as a code owner September 30, 2025 05:46
@fkanout fkanout added the release_note:skip Skip the PR/issue when compiling release notes label Sep 30, 2025
@fkanout fkanout requested a review from a team as a code owner September 30, 2025 05:46
@fkanout fkanout added backport:skip This PR does not require backporting Team:actionable-obs Formerly "obs-ux-management", responsible for SLO, o11y alerting, significant events, & synthetics. v9.2.0 labels Sep 30, 2025
@elasticmachine
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@github-actions github-actions bot added the author:obs-ux-management PRs authored by the obs ux management team label Sep 30, 2025
@mgiota mgiota self-requested a review September 30, 2025 09:35
Contributor

@mgiota mgiota left a comment


I tested this on top of the related revert and verified that the modal that used to open from the comment textarea no longer appears.

Screen.Recording.2025-09-30.at.13.58.56.mov

fkanout and others added 5 commits October 1, 2025 06:56
…cy to avoid Rate limit errors. (elastic#236535)

## Problem

Resolves elastic/security-team#14004

> [!TIP]
> Enable the experimental feature below before using this feature:
> ```
> xpack.securitySolution.enableExperimental:
>   - automaticDashboardsMigration
> ```


This PR improves the CPU performance, token usage, and error rate of
Dashboard Migrations.

Before this PR, a dashboard migration run led to many panels erroring
out with a `Rate Limits error`, as can be seen in the screenshot below
from `main`; almost all panels are failing.

<details>
<summary>Rate Limit Error Screenshot </summary>

<img width="2271" height="1202" alt="Image"
src="https://github.com/user-attachments/assets/36949d7f-a029-4bfd-a834-4899a4b9a600"
/>

</details>


This would also choke up the Kibana Task Manager, as you might observe
when desk testing. It is hard to reproduce because the issue occurs
intermittently, depending on which part of the graph is running,
especially after 2-3 minutes when the graph is well underway.

## Explanation

The cause was too many Translation graphs running at once, with each
Translation graph triggering *all* of its Panel graphs in parallel. The
current limit on the Dashboard graph is 10, and there is no limit on
panels.

That means that with 10 Dashboard graphs running (our current
concurrency limit), each with 5 panels, there will be 50 tasks running
at the same time and making calls to the LLM. This chokes up the system
and triggers more LLM rate-limit errors.

> [!TIP]
> The objective of this PR was to tune the concurrency of Dashboard
migrations so as to reduce the above-mentioned issues.
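The nested limits described here can be sketched with a simple semaphore. This is an illustrative TypeScript sketch under assumed names (`Semaphore`, `translatePanel`, `migrateDashboard` are all hypothetical), not the actual Kibana implementation:

```typescript
// Illustrative sketch of nested concurrency limits (3 dashboards x 4 panels).
// Not the actual Kibana code; all names here are hypothetical.

class Semaphore {
  private readonly queue: Array<() => void> = [];
  private active = 0;
  constructor(private readonly limit: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active < this.limit) {
      this.active++;
    } else {
      // Wait for a finished task to hand its slot to us.
      await new Promise<void>((resolve) => this.queue.push(resolve));
    }
    try {
      return await task();
    } finally {
      const next = this.queue.shift();
      if (next) {
        next(); // transfer the slot directly to the next waiter
      } else {
        this.active--;
      }
    }
  }
}

const DASHBOARD_CONCURRENCY = 3;
const PANEL_CONCURRENCY = 4;
const dashboardLimit = new Semaphore(DASHBOARD_CONCURRENCY);

// Placeholder for the per-panel LLM translation call.
async function translatePanel(panel: string): Promise<string> {
  return `translated:${panel}`;
}

// At most 3 dashboards run at once; each caps its own panels at 4,
// so no more than 3 * 4 = 12 LLM calls are in flight at any time.
async function migrateDashboard(panels: string[]): Promise<string[]> {
  return dashboardLimit.run(async () => {
    const panelLimit = new Semaphore(PANEL_CONCURRENCY);
    return Promise.all(
      panels.map((p) => panelLimit.run(() => translatePanel(p)))
    );
  });
}
```

With this shape, the worst case drops from 10 × 5 = 50 concurrent LLM calls to 3 × 4 = 12.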

## TLDR;

After testing multiple concurrency configurations, I think we can go
with a 3x4 config (3 dashboards, each with 4 panels, concurrently). The
section below details those experiments and the corresponding
justification.

Feel free to skip the section and start testing instead.

<details>

<summary><h2> Solution and Experiments </h2></summary>

All changes apply to `Dashboard` migrations only.

I settled on a 3x4 concurrency setting, meaning at most 3 dashboards,
each with at most 4 panels, running concurrently. I arrived at this
configuration after the series of experiments below.

- All tests were done on `Elastic LLM`.
- 3 retries have been enabled for all LLM nodes, so if a rate-limit
error occurs, the call is retried. If those retries fail for a panel,
the process is aborted; the `Aborted` errors in the traces below are
the result of multiple retries after the rate limit was hit.
- All tests were done on the same dashboard data set, as explained below:
   - 7 dashboards
   - 4 fully/partially translatable
   - 2 have errors (no panels found)
   - 1 (`Content Overview`) errors with an unknown error in the `Index
pattern` node. Not sure of the cause; let's ignore it for this node.
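The retry behaviour described above (up to 3 retries on rate-limit errors, aborting the panel when they are exhausted) could look roughly like the following. This is a hypothetical TypeScript sketch with illustrative names, not the actual `RetryPolicy` implementation:

```typescript
// Hypothetical sketch: retry an LLM call up to 3 times on rate-limit
// errors, with exponential backoff; abort the panel once retries run out.
// Names and signatures are illustrative, not the real Kibana API.

const MAX_RETRIES = 3;

async function withRetries<T>(
  invokeLlm: () => Promise<T>,
  isRateLimitError: (err: unknown) => boolean,
  baseDelayMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await invokeLlm();
    } catch (err) {
      if (!isRateLimitError(err) || attempt >= MAX_RETRIES) {
        // Non-retryable error, or retries exhausted: abort this panel.
        throw err;
      }
      // Exponential backoff before the next attempt.
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** attempt)
      );
    }
  }
}
```

A call that hits the rate limit twice and then succeeds completes normally; one that fails all 3 retries surfaces the error, which is what shows up as `Aborted` in the traces.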


| Dashboard Concurrency | Panels Concurrency | Error % | Token Usage | Time Taken | Langsmith Trace |
|---|---|---|---|---|---|
|5|10|89%|~95K|~3m|https://ela.st/5x10|
|5|4|71%|~148K|3m 20s|https://ela.st/5x4|
|3|4|50%|120K|~3m|https://ela.st/3x4-concurrency|
|1|10|40%|~292K|~6m|https://ela.st/1x10|

> [!TIP]
> In the 3x4 and 1x10 runs, I did not see any instance of a rate-limit error.
> These error rates are only for comparison; they will improve with more
dashboards that are bound to succeed. Our test dataset also included
some dashboards that were failures.

See the screenshots for all of the runs mentioned above.

<details>
<summary>5x10</summary>
<img width="2279" height="769" alt="image"
src="https://github.com/user-attachments/assets/cb2a3664-8bf4-497c-b2b6-3da099fc7768"
/>

</details>



<details>
<summary>5x4</summary>
<img width="2281" height="727" alt="image"
src="https://github.com/user-attachments/assets/af16de94-f787-4e46-9818-52e882b2190c"
/>

</details>




<details>
<summary>3x4</summary>
<img width="2288" height="804" alt="image"
src="https://github.com/user-attachments/assets/be0dbf2a-003c-4770-ae67-0d74902ab7a5"
/>

</details>

<details>
<summary>1x10</summary>
<img width="2305" height="759" alt="image"
src="https://github.com/user-attachments/assets/34b91294-973d-4506-b431-09cfbd44ecc9"
/>

</details>


## Final run on the deployed project

With the selected 3x4 configuration, I did a final run and the results
were much better. However, there were still `Rate limit` errors here and
there. I think we can merge this in, watch the performance over time,
and make some more tweaks to the `RetryPolicy`.

There were 40 dashboards in total, and most of them had valid data.

| Dashboard Concurrency | Panels Concurrency | Error % | Token Usage | Time Taken | Langsmith Trace |
|---|---|---|---|---|---|
|3|4|37%|380K|~21m|https://ela.st/3x4-project|



<img width="1107" height="606" alt="image"
src="https://github.com/user-attachments/assets/6c476571-4cf8-43fe-8c74-5fab3a414be4"
/>

Results are available here :
https://keepkibana-pr-236535-security-b9fbee.kb.eu-west-1.aws.qa.elastic.cloud/app/security/siem_migrations/dashboards/fe0ab206-2249-47b4-8bce-d55811efd214

Credentials can be found
[here](https://p.elstc.co/paste/QAIlMYEH#-Zf/fUtX2xUUQ83uSLaNRf9QxJtBJVkSCuMhwracC+h)

</details>

## Testing Guidelines

### Things to test

1. First, run the given dashboards migration on `main` and note the following:
- Rate-limit errors in each panel (can be observed from the Comments
section). This is easier to do with the small data set given below.
- <img width="1216" height="269" alt="image"
src="https://github.com/user-attachments/assets/bf4385c6-98ba-4d5b-9fae-dea7a76317aa"
/>


- Performance of Kibana while the migration is running. Pay specific
attention to the following (this is easier with the big dataset, also
given below):
     - Time server requests are taking
     - Time taken during a hard refresh
     - Navigation lags

2. Next, repeat the same steps on this PR branch; the results should be
much better.

### Data

- [7
Dashboards](https://drive.google.com/drive/folders/1D3BibV4AnBmIs7En5WPFSuEbIkNucG49?usp=drive_link)
(great for checking the rate-limit errors)
- [40
Dashboards](https://drive.google.com/drive/folders/1D3BibV4AnBmIs7En5WPFSuEbIkNucG49?usp=drive_link)
(great for checking Kibana performance)

Both macros and lookups are available in the same folder.
@fkanout fkanout requested a review from a team as a code owner October 1, 2025 07:17
@fkanout
Contributor Author

fkanout commented Oct 1, 2025

closing this in favor of #237064

@fkanout fkanout closed this Oct 1, 2025
@elasticmachine
Contributor

elasticmachine commented Oct 1, 2025

💔 Build Failed

Failed CI Steps

History

cc @fkanout

