-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Security Solution] Reduce Rules Management e2e flakiness #164099
[Security Solution] Reduce Rules Management e2e flakiness #164099
Conversation
f531f33
to
675a113
Compare
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
Pinging @elastic/security-solution (Team: SecuritySolution) |
30f26da
to
675a113
Compare
b81d8e8
to
9ab4358
Compare
f21d406
to
a640340
Compare
💛 Build succeeded, but was flaky
Failed CI StepsMetrics [docs]
History
To update your PR or re-run it, just comment with: cc @maximpn |
💔 All backports failed
Manual backportTo create the backport manually run:
Questions ?Please refer to the Backport tool documentation |
@maximpn please backport to 8.10 manually ^^^ |
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
…4099) **Relates to: elastic#161507 **Fixes: elastic#163704 **Fixes: elastic#163182 **Fixes: elastic#163558 **Fixes: elastic#163974 **Fixes: elastic#153914 **Fixes: elastic#164079 **Fixes: elastic#164279 ## Summary While working on fixing Rules Management flaky tests I've noticed similar fails in different tests. This PR addresses common pitfalls to reduce a number of reasons causing e2e tests flakiness and as a result reduce a number of flaky tests. ## Details The common reasons causing e2e tests flakiness for the rules tables are - Auto-refresh Auto-refresh functionality is enabled by default and the table gets auto-refreshed every 60 seconds. If a test takes more than 60 seconds the table fetches updated rules. Having rules enabled by default and sorted by `Enabled` column makes the sorting order undetermined and as rules get updated due to execution ES return them in undetermined order. This update can happen between commands working on the same element and indexed access like `eq()` would access different elements. - Missing selectors Some tests or helper functions have expectations for an element absence like `should('not.exist')` without checking an element is visible before like `should('be.visible')`. This way a referenced element may disappear from the codebase after refactoring and the expectation still fulfils. - Checking for `should('not.visible')` while an element is removed from the DOM It most applicable to popovers as it first animates to be hidden and then removed from the DOM. Cypress first need to find an element to check its visibility. Replacing `should('not.visible')` with `should('not.exist')` and taking into concern from the account previous bullet fixes the problem. - Modifying ES data without refreshing (`_delete_by_query` in particular) Due to high performance ES architecture data isn't updated instantly. Having such behavior in tests leads to undetermined state depending on a number of environmental factors. As UI doesn't always auto-refreshes to fetch the recent updates in short period of time test fail. `_delete_by_query` may take some time to update the data but it doesn't support `refresh=wait_for` as it stated in [docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#_refreshing_shards). Adding `?refresh=true` or just `?refresh` to `_delete_by_query` ES request urls fixes the problem. ### What was done to address mentioned reasons above? - Auto-refresh functionality disabled in tests where it's not necessary. - `enabled: false` field was added to rule creators to have disabled rules as the majority of tests don't need enabled rules. - `waitForRulesTableToBeLoaded` was removed and replaced with `expectManagementTableRules` at some tests. - `should('not.visible')` replaced with `should('not.exist')` in `deleteRuleFromDetailsPage()` - `?refresh` added to `_delete_by_query` ES data update requests The other changes get rid of global constants and improve readability. ## Flaky test runs [All Cypress tests under `detection_response` folder (100 runs)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2920) (value lists export is flaky but it's out of scope of this PR) (cherry picked from commit 40df521) # Conflicts: # x-pack/test/security_solution_cypress/cypress/e2e/detection_response/rule_creation/custom_query_rule.cy.ts
…4099) (#164903) # Backport This will backport the following commits from `main` to `8.10`: - [[Security Solution] Reduce Rules Management e2e flakiness (#164099)](#164099) <!--- Backport version: 8.9.8 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Maxim Palenov","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-08-23T14:58:04Z","message":"[Security Solution] Reduce Rules Management e2e flakiness (#164099)\n\n**Relates to: https://github.com/elastic/kibana/issues/161507**\r\n**Fixes: https://github.com/elastic/kibana/issues/163704**\r\n**Fixes: https://github.com/elastic/kibana/issues/163182**\r\n**Fixes: https://github.com/elastic/kibana/issues/163558**\r\n**Fixes: https://github.com/elastic/kibana/issues/163974**\r\n**Fixes: https://github.com/elastic/kibana/issues/153914**\r\n**Fixes: https://github.com/elastic/kibana/issues/164079**\r\n**Fixes: https://github.com/elastic/kibana/issues/164279**\r\n\r\n## Summary\r\n\r\nWhile working on fixing Rules Management flaky tests I've noticed similar fails in different tests. This PR addresses common pitfalls to reduce a number of reasons causing e2e tests flakiness and as a result reduce a number of flaky tests.\r\n\r\n## Details\r\n\r\nThe common reasons causing e2e tests flakiness for the rules tables are\r\n\r\n- Auto-refresh\r\n\r\nAuto-refresh functionality is enabled by default and the table gets auto-refreshed every 60 seconds. If a test takes more than 60 seconds the table fetches updated rules. Having rules enabled by default and sorted by `Enabled` column makes the sorting order undetermined and as rules get updated due to execution ES return them in undetermined order. This update can happen between commands working on the same element and indexed access like `eq()` would access different elements. \r\n\r\n- Missing selectors\r\n\r\nSome tests or helper functions have expectations for an element absence like `should('not.exist')` without checking an element is visible before like `should('be.visible')`. This way a referenced element may disappear from the codebase after refactoring and the expectation still fulfils.\r\n\r\n- Checking for `should('not.visible')` while an element is removed from the DOM\r\n\r\nIt most applicable to popovers as it first animates to be hidden and then removed from the DOM. Cypress first need to find an element to check its visibility. Replacing `should('not.visible')` with `should('not.exist')` and taking into concern from the account previous bullet fixes the problem.\r\n\r\n- Modifying ES data without refreshing (`_delete_by_query` in particular)\r\n\r\nDue to high performance ES architecture data isn't updated instantly. Having such behavior in tests leads to undetermined state depending on a number of environmental factors. As UI doesn't always auto-refreshes to fetch the recent updates in short period of time test fail. `_delete_by_query` may take some time to update the data but it doesn't support `refresh=wait_for` as it stated in [docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#_refreshing_shards). Adding `?refresh=true` or just `?refresh` to `_delete_by_query` ES request urls fixes the problem.\r\n\r\n### What was done to address mentioned reasons above?\r\n\r\n- Auto-refresh functionality disabled in tests where it's not necessary.\r\n- `enabled: false` field was added to rule creators to have disabled rules as the majority of tests don't need enabled rules.\r\n- `waitForRulesTableToBeLoaded` was removed and replaced with `expectManagementTableRules` at some tests.\r\n- `should('not.visible')` replaced with `should('not.exist')` in `deleteRuleFromDetailsPage()`\r\n- `?refresh` added to `_delete_by_query` ES data update requests\r\n\r\nThe other changes get rid of global constants and improve readability.\r\n\r\n## Flaky test runs\r\n\r\n[All Cypress tests under `detection_response` folder (100 runs)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2920) (value lists export is flaky but it's out of scope of this PR)","sha":"40df5219ea7165e89623afcc22bb0dbc3cc19152","branchLabelMapping":{"^v8.11.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["test","release_note:skip","Team:Detections and Resp","Team: SecuritySolution","Feature:Rule Management","Team:Detection Rule Management","v8.10.0","v8.11.0"],"number":164099,"url":"https://github.com/elastic/kibana/pull/164099","mergeCommit":{"message":"[Security Solution] Reduce Rules Management e2e flakiness (#164099)\n\n**Relates to: https://github.com/elastic/kibana/issues/161507**\r\n**Fixes: https://github.com/elastic/kibana/issues/163704**\r\n**Fixes: https://github.com/elastic/kibana/issues/163182**\r\n**Fixes: https://github.com/elastic/kibana/issues/163558**\r\n**Fixes: https://github.com/elastic/kibana/issues/163974**\r\n**Fixes: https://github.com/elastic/kibana/issues/153914**\r\n**Fixes: https://github.com/elastic/kibana/issues/164079**\r\n**Fixes: https://github.com/elastic/kibana/issues/164279**\r\n\r\n## Summary\r\n\r\nWhile working on fixing Rules Management flaky tests I've noticed similar fails in different tests. This PR addresses common pitfalls to reduce a number of reasons causing e2e tests flakiness and as a result reduce a number of flaky tests.\r\n\r\n## Details\r\n\r\nThe common reasons causing e2e tests flakiness for the rules tables are\r\n\r\n- Auto-refresh\r\n\r\nAuto-refresh functionality is enabled by default and the table gets auto-refreshed every 60 seconds. If a test takes more than 60 seconds the table fetches updated rules. Having rules enabled by default and sorted by `Enabled` column makes the sorting order undetermined and as rules get updated due to execution ES return them in undetermined order. This update can happen between commands working on the same element and indexed access like `eq()` would access different elements. \r\n\r\n- Missing selectors\r\n\r\nSome tests or helper functions have expectations for an element absence like `should('not.exist')` without checking an element is visible before like `should('be.visible')`. This way a referenced element may disappear from the codebase after refactoring and the expectation still fulfils.\r\n\r\n- Checking for `should('not.visible')` while an element is removed from the DOM\r\n\r\nIt most applicable to popovers as it first animates to be hidden and then removed from the DOM. Cypress first need to find an element to check its visibility. Replacing `should('not.visible')` with `should('not.exist')` and taking into concern from the account previous bullet fixes the problem.\r\n\r\n- Modifying ES data without refreshing (`_delete_by_query` in particular)\r\n\r\nDue to high performance ES architecture data isn't updated instantly. Having such behavior in tests leads to undetermined state depending on a number of environmental factors. As UI doesn't always auto-refreshes to fetch the recent updates in short period of time test fail. `_delete_by_query` may take some time to update the data but it doesn't support `refresh=wait_for` as it stated in [docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#_refreshing_shards). Adding `?refresh=true` or just `?refresh` to `_delete_by_query` ES request urls fixes the problem.\r\n\r\n### What was done to address mentioned reasons above?\r\n\r\n- Auto-refresh functionality disabled in tests where it's not necessary.\r\n- `enabled: false` field was added to rule creators to have disabled rules as the majority of tests don't need enabled rules.\r\n- `waitForRulesTableToBeLoaded` was removed and replaced with `expectManagementTableRules` at some tests.\r\n- `should('not.visible')` replaced with `should('not.exist')` in `deleteRuleFromDetailsPage()`\r\n- `?refresh` added to `_delete_by_query` ES data update requests\r\n\r\nThe other changes get rid of global constants and improve readability.\r\n\r\n## Flaky test runs\r\n\r\n[All Cypress tests under `detection_response` folder (100 runs)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2920) (value lists export is flaky but it's out of scope of this PR)","sha":"40df5219ea7165e89623afcc22bb0dbc3cc19152"}},"sourceBranch":"main","suggestedTargetBranches":["8.10"],"targetPullRequestStates":[{"branch":"8.10","label":"v8.10.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.11.0","labelRegex":"^v8.11.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/164099","number":164099,"mergeCommit":{"message":"[Security Solution] Reduce Rules Management e2e flakiness (#164099)\n\n**Relates to: https://github.com/elastic/kibana/issues/161507**\r\n**Fixes: https://github.com/elastic/kibana/issues/163704**\r\n**Fixes: https://github.com/elastic/kibana/issues/163182**\r\n**Fixes: https://github.com/elastic/kibana/issues/163558**\r\n**Fixes: https://github.com/elastic/kibana/issues/163974**\r\n**Fixes: https://github.com/elastic/kibana/issues/153914**\r\n**Fixes: https://github.com/elastic/kibana/issues/164079**\r\n**Fixes: https://github.com/elastic/kibana/issues/164279**\r\n\r\n## Summary\r\n\r\nWhile working on fixing Rules Management flaky tests I've noticed similar fails in different tests. This PR addresses common pitfalls to reduce a number of reasons causing e2e tests flakiness and as a result reduce a number of flaky tests.\r\n\r\n## Details\r\n\r\nThe common reasons causing e2e tests flakiness for the rules tables are\r\n\r\n- Auto-refresh\r\n\r\nAuto-refresh functionality is enabled by default and the table gets auto-refreshed every 60 seconds. If a test takes more than 60 seconds the table fetches updated rules. Having rules enabled by default and sorted by `Enabled` column makes the sorting order undetermined and as rules get updated due to execution ES return them in undetermined order. This update can happen between commands working on the same element and indexed access like `eq()` would access different elements. \r\n\r\n- Missing selectors\r\n\r\nSome tests or helper functions have expectations for an element absence like `should('not.exist')` without checking an element is visible before like `should('be.visible')`. This way a referenced element may disappear from the codebase after refactoring and the expectation still fulfils.\r\n\r\n- Checking for `should('not.visible')` while an element is removed from the DOM\r\n\r\nIt most applicable to popovers as it first animates to be hidden and then removed from the DOM. Cypress first need to find an element to check its visibility. Replacing `should('not.visible')` with `should('not.exist')` and taking into concern from the account previous bullet fixes the problem.\r\n\r\n- Modifying ES data without refreshing (`_delete_by_query` in particular)\r\n\r\nDue to high performance ES architecture data isn't updated instantly. Having such behavior in tests leads to undetermined state depending on a number of environmental factors. As UI doesn't always auto-refreshes to fetch the recent updates in short period of time test fail. `_delete_by_query` may take some time to update the data but it doesn't support `refresh=wait_for` as it stated in [docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#_refreshing_shards). Adding `?refresh=true` or just `?refresh` to `_delete_by_query` ES request urls fixes the problem.\r\n\r\n### What was done to address mentioned reasons above?\r\n\r\n- Auto-refresh functionality disabled in tests where it's not necessary.\r\n- `enabled: false` field was added to rule creators to have disabled rules as the majority of tests don't need enabled rules.\r\n- `waitForRulesTableToBeLoaded` was removed and replaced with `expectManagementTableRules` at some tests.\r\n- `should('not.visible')` replaced with `should('not.exist')` in `deleteRuleFromDetailsPage()`\r\n- `?refresh` added to `_delete_by_query` ES data update requests\r\n\r\nThe other changes get rid of global constants and improve readability.\r\n\r\n## Flaky test runs\r\n\r\n[All Cypress tests under `detection_response` folder (100 runs)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2920) (value lists export is flaky but it's out of scope of this PR)","sha":"40df5219ea7165e89623afcc22bb0dbc3cc19152"}}]}] BACKPORT-->
) **NOTE: This is a manual backport of #164099** **Original PR description:** --- **Relates to: #161507 **Fixes: #163704 **Fixes: #163182 **Fixes: #163558 **Fixes: #163974 **Fixes: #153914 **Fixes: #164079 **Fixes: #164279 ## Summary While working on fixing Rules Management flaky tests I've noticed similar fails in different tests. This PR addresses common pitfalls to reduce a number of reasons causing e2e tests flakiness and as a result reduce a number of flaky tests. ## Details The common reasons causing e2e tests flakiness for the rules tables are - Auto-refresh Auto-refresh functionality is enabled by default and the table gets auto-refreshed every 60 seconds. If a test takes more than 60 seconds the table fetches updated rules. Having rules enabled by default and sorted by `Enabled` column makes the sorting order undetermined and as rules get updated due to execution ES return them in undetermined order. This update can happen between commands working on the same element and indexed access like `eq()` would access different elements. - Missing selectors Some tests or helper functions have expectations for an element absence like `should('not.exist')` without checking an element is visible before like `should('be.visible')`. This way a referenced element may disappear from the codebase after refactoring and the expectation still fulfils. - Checking for `should('not.visible')` while an element is removed from the DOM It most applicable to popovers as it first animates to be hidden and then removed from the DOM. Cypress first need to find an element to check its visibility. Replacing `should('not.visible')` with `should('not.exist')` and taking into concern from the account previous bullet fixes the problem. - Modifying ES data without refreshing (`_delete_by_query` in particular) Due to high performance ES architecture data isn't updated instantly. Having such behavior in tests leads to undetermined state depending on a number of environmental factors. As UI doesn't always auto-refreshes to fetch the recent updates in short period of time test fail. `_delete_by_query` may take some time to update the data but it doesn't support `refresh=wait_for` as it stated in [docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#_refreshing_shards). Adding `?refresh=true` or just `?refresh` to `_delete_by_query` ES request urls fixes the problem. ### What was done to address mentioned reasons above? - Auto-refresh functionality disabled in tests where it's not necessary. - `enabled: false` field was added to rule creators to have disabled rules as the majority of tests don't need enabled rules. - `waitForRulesTableToBeLoaded` was removed and replaced with `expectManagementTableRules` at some tests. - `should('not.visible')` replaced with `should('not.exist')` in `deleteRuleFromDetailsPage()` - `?refresh` added to `_delete_by_query` ES data update requests The other changes get rid of global constants and improve readability. ## Flaky test runs [All Cypress tests under `detection_response` folder (100 runs)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2920) (value lists export is flaky but it's out of scope of this PR) --------- Co-authored-by: kibanamachine <[email protected]>
Relates to: #161507
Fixes: #163704
Fixes: #163182
Fixes: #163558
Fixes: #163974
Fixes: #153914
Fixes: #164079
Fixes: #164279
Summary
While working on fixing Rules Management flaky tests I've noticed similar fails in different tests. This PR addresses common pitfalls to reduce a number of reasons causing e2e tests flakiness and as a result reduce a number of flaky tests.
Details
The common reasons causing e2e tests flakiness for the rules tables are
Auto-refresh functionality is enabled by default and the table gets auto-refreshed every 60 seconds. If a test takes more than 60 seconds the table fetches updated rules. Having rules enabled by default and sorted by
Enabled
column makes the sorting order undetermined and as rules get updated due to execution ES return them in undetermined order. This update can happen between commands working on the same element and indexed access likeeq()
would access different elements.Some tests or helper functions have expectations for an element absence like
should('not.exist')
without checking an element is visible before likeshould('be.visible')
. This way a referenced element may disappear from the codebase after refactoring and the expectation still fulfils.should('not.visible')
while an element is removed from the DOMIt most applicable to popovers as it first animates to be hidden and then removed from the DOM. Cypress first need to find an element to check its visibility. Replacing
should('not.visible')
withshould('not.exist')
and taking into concern from the account previous bullet fixes the problem._delete_by_query
in particular)Due to high performance ES architecture data isn't updated instantly. Having such behavior in tests leads to undetermined state depending on a number of environmental factors. As UI doesn't always auto-refreshes to fetch the recent updates in short period of time test fail.
_delete_by_query
may take some time to update the data but it doesn't supportrefresh=wait_for
as it stated in docs. Adding?refresh=true
or just?refresh
to_delete_by_query
ES request urls fixes the problem.What was done to address mentioned reasons above?
enabled: false
field was added to rule creators to have disabled rules as the majority of tests don't need enabled rules.waitForRulesTableToBeLoaded
was removed and replaced withexpectManagementTableRules
at some tests.should('not.visible')
replaced withshould('not.exist')
indeleteRuleFromDetailsPage()
?refresh
added to_delete_by_query
ES data update requestsThe other changes get rid of global constants and improve readability.
Flaky test runs
All Cypress tests under
detection_response
folder (100 runs) (value lists export is flaky but it's out of scope of this PR)