[8.8] [ftr] Improve FTR error handling for NoSuchSessionError (#161025)#161206
Closed
kibanamachine wants to merge 1 commit intoelastic:8.8from
Closed
[8.8] [ftr] Improve FTR error handling for NoSuchSessionError (#161025)#161206kibanamachine wants to merge 1 commit intoelastic:8.8from
kibanamachine wants to merge 1 commit intoelastic:8.8from
Conversation
## Summary Sometimes browser/driver process dies during test run on CI and FTR fails with errors cascade, good example is [here](https://buildkite.com/elastic/kibana-pull-request/builds/138535#0188fd74-9adf-4011-8168-1bdc6e3d0f17) Current behaviour on `main`: FTR lifecycle hooks, defined in [remote](https://github.com/elastic/kibana/blob/57aea91fae4f2d0a047ff9e2339eb61352f1d73d/test/functional/services/remote/remote.ts) service, has no access to the information about test suite run and particular test failure. These hooks are related to WebDriver (browser) state management and suppose to reset it to default state. Currently we silently fail screenshot taking which means tests execution is continued even if `--bail` flag is passed. It ends with cascade of failures with the same error `NoSuchSessionError: invalid session id` <details> <summary>FTR output on failure</summary> ``` └- ✖ fail: discover/group1 discover test query should show correct time range string by timepicker │ Error: expected 'Sep 19, 2015 @ 06:31:44.000' to equal 'Sep 23, 2015 @ 18:31:44.000' │ at Assertion.assert (expect.js:100:11) │ at Assertion.apply (expect.js:227:8) │ at Assertion.be (expect.js:69:22) │ at Context.<anonymous> (_discover.ts:53:31) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Object.apply (wrap_function.js:73:16) │ │ └-> "after all" hook: afterTestSuite.trigger for "should reload the saved search with persisted query to show the initial hit count" └- ✖ fail: discover/group1 discover test query "after all" hook: afterTestSuite.trigger for "should reload the saved search with persisted query to show the initial hit count" │ NoSuchSessionError: invalid session id │ at Object.throwDecodedError (node_modules/selenium-webdriver/lib/error.js:524:15) │ at parseHttpResponse (node_modules/selenium-webdriver/lib/http.js:601:13) │ at Executor.execute (node_modules/selenium-webdriver/lib/http.js:529:28) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Task.exec (prevent_parallel_calls.ts:28:20) │ │ └-> "after all" hook in "discover test" │ debg Cleaning all saved objects { space: undefined } │ succ deleted 2 objects └-> "after all" hook: afterTestSuite.trigger in "discover test" └- ✖ fail: discover/group1 discover test "after all" hook: afterTestSuite.trigger in "discover test" │ NoSuchSessionError: invalid session id │ at Object.throwDecodedError (node_modules/selenium-webdriver/lib/error.js:524:15) │ at parseHttpResponse (node_modules/selenium-webdriver/lib/http.js:601:13) │ at Executor.execute (node_modules/selenium-webdriver/lib/http.js:529:28) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Task.exec (prevent_parallel_calls.ts:28:20) │ │ └-> "after all" hook: unloadMakelogs in "discover/group1" │ info [test/functional/fixtures/es_archiver/logstash_functional] Unloading indices from "mappings.json" │ info [test/functional/fixtures/es_archiver/logstash_functional] Deleted existing index "logstash-2015.09.22" │ info [test/functional/fixtures/es_archiver/logstash_functional] Deleted existing index "logstash-2015.09.20" │ info [test/functional/fixtures/es_archiver/logstash_functional] Deleted existing index "logstash-2015.09.21" │ info [test/functional/fixtures/es_archiver/logstash_functional] Unloading indices from "data.json.gz" └-> "after all" hook: afterTestSuite.trigger in "discover/group1" └- ✖ fail: discover/group1 "after all" hook: afterTestSuite.trigger in "discover/group1" │ NoSuchSessionError: invalid session id │ at Object.throwDecodedError (node_modules/selenium-webdriver/lib/error.js:524:15) │ at parseHttpResponse (node_modules/selenium-webdriver/lib/http.js:601:13) │ at Executor.execute (node_modules/selenium-webdriver/lib/http.js:529:28) │ at runMicrotasks (<anonymous>) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Task.exec (prevent_parallel_calls.ts:28:20) │ │ │ │0 passing (15.7s) │4 failing │ │1) discover/group1 │ discover test │ query │ should show correct time range string by timepicker: │ │ Error: expected 'Sep 19, 2015 @ 06:31:44.000' to equal 'Sep 23, 2015 @ 18:31:44.000' │ at Assertion.assert (expect.js:100:11) │ at Assertion.apply (expect.js:227:8) │ at Assertion.be (expect.js:69:22) │ at Context.<anonymous> (_discover.ts:53:31) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Object.apply (wrap_function.js:73:16) │ │ │2) discover/group1 │ discover test │ query │ "after all" hook: afterTestSuite.trigger for "should reload the saved search with persisted query to show the initial hit count": │ │ NoSuchSessionError: invalid session id │ at Object.throwDecodedError (node_modules/selenium-webdriver/lib/error.js:524:15) │ at parseHttpResponse (node_modules/selenium-webdriver/lib/http.js:601:13) │ at Executor.execute (node_modules/selenium-webdriver/lib/http.js:529:28) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Task.exec (prevent_parallel_calls.ts:28:20) │ │ │3) discover/group1 │ discover test │ "after all" hook: afterTestSuite.trigger in "discover test": │ │ NoSuchSessionError: invalid session id │ at Object.throwDecodedError (node_modules/selenium-webdriver/lib/error.js:524:15) │ at parseHttpResponse (node_modules/selenium-webdriver/lib/http.js:601:13) │ at Executor.execute (node_modules/selenium-webdriver/lib/http.js:529:28) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Task.exec (prevent_parallel_calls.ts:28:20) │ │ │4) discover/group1 │ "after all" hook: afterTestSuite.trigger in "discover/group1": │ │ NoSuchSessionError: invalid session id │ at Object.throwDecodedError (node_modules/selenium-webdriver/lib/error.js:524:15) │ at parseHttpResponse (node_modules/selenium-webdriver/lib/http.js:601:13) │ at Executor.execute (node_modules/selenium-webdriver/lib/http.js:529:28) │ at runMicrotasks (<anonymous>) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Task.exec (prevent_parallel_calls.ts:28:20) ``` </details> This PR change: I didn't find a good reason why we need to fail silently on screenshot taking. I added a check WebDriver session status with `hasOpenWindow` and take failure artefacts only if is still valid. Next change is to fail FTR after hooks related to WebDriver silently: there is no help having cascade of the repeated stacktrace so I wrap WebDriver call in hooks with `tryWebDriverCall` that catches the error and only prints it for visibility. <details> <summary>FTR new output on failure</summary> ``` │ERROR WebDriver session is no longer valid. │ Probably Chrome process crashed when it tried to use more memory than what was available. │ERROR Browser is closed, no artifacts were captured for the failure └- ✖ fail: discover/group1 discover test query should show correct time range string by timepicker │ Error: expected 'Sep 19, 2015 @ 06:31:44.000' to equal 'Sep 23, 2015 @ 18:31:44.000' │ at Assertion.assert (expect.js:100:11) │ at Assertion.apply (expect.js:227:8) │ at Assertion.be (expect.js:69:22) │ at Context.<anonymous> (_discover.ts:53:31) │ at processTicksAndRejections (node:internal/process/task_queues:96:5) │ at Object.apply (wrap_function.js:73:16) │ │ └-> "after all" hook: afterTestSuite.trigger for "should reload the saved search with persisted query to show the initial hit count" │ERROR WebDriver session is no longer valid └-> "after all" hook in "discover test" │ debg Cleaning all saved objects { space: undefined } │ warn browser[SEVERE] ERROR FETCHING BROWSR LOGS: This driver instance does not have a valid session ID (did you call WebDriver.quit()?) and may no longer be used. │ succ deleted 2 objects └-> "after all" hook: afterTestSuite.trigger in "discover test" │ERROR WebDriver session is no longer valid └-> "after all" hook: unloadMakelogs in "discover/group1" │ info [test/functional/fixtures/es_archiver/logstash_functional] Unloading indices from "mappings.json" │ info [test/functional/fixtures/es_archiver/logstash_functional] Deleted existing index "logstash-2015.09.22" │ info [test/functional/fixtures/es_archiver/logstash_functional] Deleted existing index "logstash-2015.09.20" │ info [test/functional/fixtures/es_archiver/logstash_functional] Deleted existing index "logstash-2015.09.21" │ info [test/functional/fixtures/es_archiver/logstash_functional] Unloading indices from "data.json.gz" └-> "after all" hook: afterTestSuite.trigger in "discover/group1" │ERROR WebDriver session is no longer valid 0 passing (16.2s) 1 failing 1) discover/group1 discover test query should show correct time range string by timepicker: Error: expected 'Sep 19, 2015 @ 06:31:44.000' to equal 'Sep 23, 2015 @ 18:31:44.000' at Assertion.assert (expect.js:100:11) at Assertion.apply (expect.js:227:8) at Assertion.be (expect.js:69:22) at Context.<anonymous> (_discover.ts:53:31) at processTicksAndRejections (node:internal/process/task_queues:96:5) at Object.apply (wrap_function.js:73:16) ``` </details> Flaky-test-runner verification: started 100x to hopefully catch invalid session on CI https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2539 Note: locally I was simulating it by calling `this.driver.close()` to close browser before screenshot taking (cherry picked from commit 8a95bf7)
Contributor
Author
|
CI was triggered for this PR, but this PR targets 8.8 which should not receive a future release. CI is not supported for these branches. Please consult the release schedule, or contact The following branches are currently considered to be open:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Backport
This will backport the following commits from
mainto8.8:Questions ?
Please refer to the Backport tool documentation