Restart browser if it became unresponsive #1815

Closed
AndreyBelym opened this issue Sep 25, 2017 · 33 comments
Labels: AREA: server, STATE: Auto-locked, SYSTEM: browser connection, SYSTEM: browser natives, TYPE: enhancement

Comments

@AndreyBelym
Contributor

Are you requesting a feature or reporting a bug?

Feature

What is the current behavior?

When a browser becomes unresponsive (XXX browser disconnected. This problem may appear when a browser hangs or is closed, or due to network issues.), we stop all running tests.

What is the expected behavior?

An unresponsive browser can be closed, and a new browser instance can be started for the next test or even for the same test (e.g. if quarantine mode is enabled).

How would you reproduce the current behavior (if this is a bug)?

???

Specify your

  • testcafe version: 0.17.2
  • node.js version: any
@pablorivera

Hi, we are having the same issue with testcafe 0.17.2: a couple of tests running in Bamboo CI sometimes fail because of this error. It would be great if you could work on this ticket. The log output is below.

12-Oct-2017 10:43:40 | ERROR The Chrome 53.0.2785 / Linux 0.0.0 browser disconnected. This problem may appear when a browser hangs or is closed, or due to network issues.
12-Oct-2017 10:43:40 |  
12-Oct-2017 10:43:40 | Type "testcafe -h" for help.
12-Oct-2017 10:43:40 |  
12-Oct-2017 10:43:40 | npm ERR! Linux 4.8.0-1.el7.elrepo.x86_64
12-Oct-2017 10:43:40 | npm ERR! argv "/usr/bin/node" "/usr/bin/npm" "run" "test-ci" "tests"
12-Oct-2017 10:43:40 | npm ERR! node v6.7.0
12-Oct-2017 10:43:40 | npm ERR! npm  v3.10.3
12-Oct-2017 10:43:40 | npm ERR! code ELIFECYCLE
12-Oct-2017 10:43:40 | npm ERR! [email protected] test-ci: `testcafe 'chromium --no-sandbox' -s ./screenshots --screenshots-on-fails --color "tests"`
12-Oct-2017 10:43:40 | npm ERR! Exit status 1
12-Oct-2017 10:43:40 | npm ERR!
12-Oct-2017 10:43:40 | npm ERR! Failed at the [email protected] test-ci script 'testcafe 'chromium --no-sandbox' -s ./screenshots --screenshots-on-fails --color "tests"'.
12-Oct-2017 10:43:40 | npm ERR! Make sure you have the latest version of node.js and npm installed.
12-Oct-2017 10:43:40 | npm ERR! If you do, this is most likely a problem with the testcafe package,
12-Oct-2017 10:43:40 | npm ERR! not with npm itself.
12-Oct-2017 10:43:40 | npm ERR! Tell the author that this fails on your system:
12-Oct-2017 10:43:40 | npm ERR!     testcafe 'chromium --no-sandbox' -s ./screenshots --screenshots-on-fails --color "tests"
12-Oct-2017 10:43:40 | npm ERR! You can get information on how to open an issue for this project with:
12-Oct-2017 10:43:40 | npm ERR!     npm bugs testcafe
12-Oct-2017 10:43:40 | npm ERR! Or if that isn't available, you can get their info via:
12-Oct-2017 10:43:40 | npm ERR!     npm owner ls testcafe
12-Oct-2017 10:43:40 | npm ERR! There is likely additional logging output above.
12-Oct-2017 10:43:40 |  
12-Oct-2017 10:43:40 | npm ERR! Please include the following file with any support request:
12-Oct-2017 10:43:40 | npm ERR!     /apps/testcafe/npm-debug.log
12-Oct-2017 10:43:41 | Tests failed

@pablorivera

It also happens with testcafe 0.18.1

@gtroshin

Annoying issue. Any ETA here? :) 0.18.5 still has it

@berrutti

I'm having the same problem. I thought it was related to local execution but our client experiences the same on his machine.

@AlexanderMoskovkin
Contributor

Hi,

Thanks for your interest. This issue is planned for one of the next releases.

@Automation-Geek

Automation-Geek commented Mar 23, 2018

My observation is that we only face this issue when we run a large batch of test cases in one browser. @AlexanderMoskovkin: what are your thoughts on this? I have already opened an issue: DevExpress/testcafe-browser-provider-browserstack#25

@katerynarieznik

katerynarieznik commented Apr 3, 2018

Hi 👋 @AndreyBelym @AlexanderMoskovkin
We also run into this problem, where TestCafe becomes unresponsive. We are running tests with concurrency 1 in Chrome, and it usually fails when the execution time reaches 20+ minutes.

{ Error: The Chrome 64.0.3282 / Linux 0.0.0 browser disconnected. This problem may appear when a browser hangs or is closed, or due to network issues.
    at Timeout.<anonymous> (/node_modules/testcafe/lib/browser/connection/index.js:226:34)
    at ontimeout (timers.js:475:11)
    at tryOnTimeout (timers.js:310:5)
    at Timer.listOnTimeout (timers.js:270:5) constructor: [Function: GeneralError] }

I also got this issue when I was debugging a test with the VSCode debugger for some time. It looks like, for us, some kind of timeout is reached around the 25-minute mark 🤔

I hope that can help you investigate this issue.

@sijosyn

sijosyn commented Apr 27, 2018

Can we get this feature implemented ASAP, @AndreyBelym / @AlexanderMoskovkin? This issue is very annoying and is blocking our test execution on Safari, Firefox (and maybe Edge/IE). We are still seeing it in the latest TestCafe version, 0.20.0-alpha.1. Please let me know if I can help by providing additional debug logs. I already provided some logs here: https://testcafe-discuss.devexpress.com/t/browser-disconnected-error-this-problem-may-appear-when-a-browser-hangs-or-is-closed-or-due-to-network-issues/383

@kirovboris
Collaborator

kirovboris commented Apr 27, 2018

Hi @sijosyn, we plan to implement this feature in the next sprint (version 0.21.0). We'll let you know when the dev build becomes available.

@Automation-Geek

Automation-Geek commented May 22, 2018

Any ETA for its fix? I am now using testcafe version 0.18.6.

@dmc1522

dmc1522 commented Jun 25, 2018

Any update on this? I have a couple of suites running on Jenkins and randomly stop because of this... I am really thinking about creating a daemon to keep checking the job and re-start everything in case it hangs like this, but that doesn't seem like a good solution... is this going out already?

@berrutti

berrutti commented Jul 2, 2018

Is there a way to contribute code to this issue? I would like to help in any way possible. Thanks

@AndreyBelym
Contributor Author

Hi @berrutti, any contributions are welcome. I think everything that needs to change is in https://github.com/DevExpress/testcafe/blob/master/src/browser/connection/index.js; look at the _waitForHeartbeat function. If you make a PR, I can help with tests and answer any questions that arise.
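
For readers unfamiliar with the heartbeat idea, here is a minimal sketch of a watchdog that treats a browser as unresponsive once no heartbeat has arrived within a timeout, and then asks for a restart instead of aborting the run. It is not the code behind _waitForHeartbeat; the class name, option names and callback are all assumptions made for illustration.

```javascript
// Hypothetical sketch: NOT TestCafe's actual implementation.
// All names (HeartbeatWatchdog, timeoutMs, onUnresponsive) are illustrative.
class HeartbeatWatchdog {
    constructor ({ timeoutMs, onUnresponsive, now = Date.now }) {
        this.timeoutMs      = timeoutMs;       // max allowed silence, in ms
        this.onUnresponsive = onUnresponsive;  // e.g. close and relaunch the browser
        this.now            = now;             // injectable clock, handy for tests
        this.lastBeat       = now();
        this.restarts       = 0;
    }

    // Call this whenever the browser pings the server.
    heartbeat () {
        this.lastBeat = this.now();
    }

    // Call this periodically. Returns true if a restart was requested.
    check () {
        if (this.now() - this.lastBeat <= this.timeoutMs)
            return false;

        this.restarts++;
        this.onUnresponsive();

        // Give the fresh browser a full timeout window before checking again.
        this.lastBeat = this.now();

        return true;
    }
}
```

In a real server, `check` would likely be driven by `setInterval`, and `onUnresponsive` would have to deal with closing the hung process, which this sketch leaves out.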

@need-response-app need-response-app bot added the STATE: Need response An issue that requires a response or attention from the team. label Apr 6, 2019
@AlexKamaev AlexKamaev removed the STATE: Need response An issue that requires a response or attention from the team. label Apr 8, 2019
@itdoginfo

I want to add another example of why this feature is needed.

The example involves testing 3D graphics on mobile devices. Not all mobile devices can load complex 3D graphics.

This is a simplified test to make it easy to reproduce:

npm install testcafe-browser-provider-android

test.js

import { Selector } from 'testcafe';

// Pages under test; the second one loads a heavy 3D scene.
const examples = [
    'https://devexpress.github.io/testcafe/example/',
    'https://xeokit.github.io/xeokit-sdk/examples/#sceneRepresentation_PerformanceModel_batching_benchmark',
    'google.com'
];

examples.forEach(example => {
    fixture `Test page: ${example}`
        .page(example);

    test('My test', async t => {
        // Wait up to 90 seconds for the element before giving up.
        const element     = Selector('#developer-name', { timeout: 90000 });
        const clientWidth = await element.clientWidth;
    });
});

The timeout is needed to wait until the browser fails on the 3D page.

npx testcafe android::52109133:chrome test.js

The browser cannot load the 3D model, so TestCafe restarts this test after some time, and repeats this three times. After that, TestCafe exits with the following output:

ERROR The Chrome Mobile 76.0.3809 / Android 8.0.0 browser disconnected. This problem may appear when a browser hangs or is closed, or due to network issues.

The example contains several tests, and I expected that after a test failed on one page, testing would continue on the next. Instead, the disconnect interrupts the whole run.

I use a Samsung Galaxy A5 (2017) with webdriverio + appium as the browser provider. I get the same issue on iOS devices.

@need-response-app need-response-app bot added the STATE: Need response An issue that requires a response or attention from the team. label Aug 2, 2019
@Farfurix Farfurix assigned Farfurix and unassigned Farfurix Aug 5, 2019
@AlexKamaev
Contributor

AlexKamaev commented Aug 5, 2019

@itdoginfo
The purpose of this feature is to improve the stability of tests under unexpected conditions that make the browser hang.
This means that if the browser hangs during a test, TestCafe tries to restart the browser for that specific test, because TestCafe cannot know for certain that your heavy page is what causes the hang. It's possible that your machine simply does not have enough memory because of some other application. In this case, we try to restart the browser to finish the test.
In other words, this feature is meant to help the test pass, not to fail it. If you know for certain that some of your tests are too heavy for some devices, it's better to exclude those tests from running on these devices than to rely on a mechanism that is not intended for this.
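
The behavior described above (restart the browser and retry the same test, then give up) can be sketched roughly as follows. This is a simplified, synchronous illustration and not TestCafe's internal code: `BrowserDisconnectedError`, `runWithRestarts`, `runTest` and `restartBrowser` are hypothetical names, and the default of three restarts is an assumption taken from the report earlier in this thread.

```javascript
// Hypothetical names throughout; this is not TestCafe's internal API.
class BrowserDisconnectedError extends Error {}

// Runs one test, restarting the browser and retrying the SAME test
// when the browser disconnects, up to `maxRestarts` times.
function runWithRestarts (runTest, restartBrowser, maxRestarts = 3) {
    let restarts = 0;

    for (;;) {
        try {
            return { result: runTest(), restarts };
        }
        catch (err) {
            // Ordinary test failures are not retried here.
            if (!(err instanceof BrowserDisconnectedError))
                throw err;

            // Restart budget exhausted: surface the disconnect to the caller.
            if (restarts >= maxRestarts)
                throw err;

            restarts++;
            restartBrowser();
        }
    }
}
```

Note that once the budget is exhausted the error propagates to the caller, which matches what was reported above: the run stops instead of moving on to the next test.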

@need-response-app need-response-app bot removed the STATE: Need response An issue that requires a response or attention from the team. label Aug 5, 2019
@itdoginfo

@AlexKamaev
I mostly agree with the current behavior; the only difference is that I expect TestCafe to fail the test and move on to the next one.

In other words, this feature must help the test to pass and not to fail it. If you know exactly that some of your tests is very heavy for some devices, it's better to exclude this test from executing on these devices, and not relying on the mechanism which is not intended for this.

We have a test suite with a kind of stress test that we try to run on as many devices as we can.
We expect these tests to be as heavy as possible.
The current behavior is OK in that we can see that not all of our tests passed, but if the run fails on the first test, we do not know the state of the other tests.

Maybe there is some workaround?

@need-response-app need-response-app bot added the STATE: Need response An issue that requires a response or attention from the team. label Aug 5, 2019
@AndreyBelym AndreyBelym assigned AndreyBelym and unassigned AlexKamaev Aug 6, 2019
@AndreyBelym
Contributor Author

TestCafe was not designed with stress testing support in mind, so I'm afraid it's not possible to achieve your goal without heavily modifying the TestCafe code. But I think we can make stress testing possible once we allow using TestCafe for browser automation tasks - #2501.

@need-response-app need-response-app bot removed the STATE: Need response An issue that requires a response or attention from the team. label Aug 6, 2019
@kirillgroshkov

We're not using TestCafe for stress testing, but our e2e tests still hang with a "browser disconnected" message every now and then, which makes them very unstable. We're close to dropping TestCafe because of this single issue, since it has been very annoying throughout the last year. We currently maintain some sophisticated "restarting scripts" to mitigate it, but that increases our e2e run time by 2-3 times (because a restart reruns ALL tests).

@need-response-app need-response-app bot added the STATE: Need response An issue that requires a response or attention from the team. label Aug 6, 2019
@rehael

rehael commented Aug 6, 2019

How hard would it be to add something along the lines of test.skipOnDisconnect() (or any other method of tagging the test with specific instructions for the runner how to handle the test – maybe in meta like test.meta({skipOnDisconnect: true})?), which upon browser disconnect would normally restart browser, and continue from the next test? Where should I start looking if I'd like to create a PR for it?

@kirillgroshkov

Or test.meta({restartOnDisconnect: true}), since for us it eventually works after one or a few restarts, which would be more efficient than restarting ALL tests.
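
To make the two proposals concrete, here is a rough sketch of how a runner could read such flags and choose an action when the browser disconnects. The `skipOnDisconnect` and `restartOnDisconnect` keys are only the suggestions made in this thread, not an existing TestCafe feature, and the `onDisconnect` function with its return values is hypothetical.

```javascript
// Hypothetical decision helper. The meta keys below are proposals from
// this thread, NOT options that TestCafe implements.
function onDisconnect (meta = {}, restartsSoFar = 0, maxRestarts = 3) {
    // Proposal 2: restart the browser and rerun the same test,
    // as long as the restart budget is not exhausted.
    if (meta.restartOnDisconnect && restartsSoFar < maxRestarts)
        return 'restart-and-rerun';

    // Proposal 1: restart the browser, mark the test as failed or skipped,
    // and continue with the next test.
    if (meta.skipOnDisconnect)
        return 'restart-and-skip';

    // Behavior originally reported in this issue: stop all running tests.
    return 'abort-run';
}
```

A test tagged with both flags would be rerun until the budget runs out and then skipped, so one failing page would not hide the state of the remaining tests.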

@AndreyBelym
Contributor Author

@kirillgroshkov such behavior clearly indicates that TestCafe conflicts with your pages in some way. Even if we skip a test that caused a browser failure, there is no guarantee that the next test will pass. It's better to create a separate bug report and provide an example or access to your pages. In this case, we can effectively debug and fix the reason that causes browser disconnections.

@rehael a change like this would touch several subsystems (compiler, browser connections, testing runtime), and due to its complexity, we will need to discuss whether we really want to include such a patch in our codebase.

@need-response-app need-response-app bot removed the STATE: Need response An issue that requires a response or attention from the team. label Aug 8, 2019
@AndreyBelym
Contributor Author

I'm closing this issue since restarting browsers that don't respond has been implemented in a basic way. Enhancements suggested by @miherlosev have been extracted to a separate issue: #4132. The suggestion about skipping the test that caused the DNR state in the browser has been extracted to #4133. If you have any other suggestions about browser restarting, please don't hesitate to create new feature requests.

@lock

lock bot commented Aug 18, 2019

This thread has been automatically locked since it is closed and there has not been any recent activity. Please open a new issue for related bugs or feature requests. We recommend you ask TestCafe API, usage and configuration inquiries on StackOverflow.

@lock lock bot added the STATE: Auto-locked An issue has been automatically locked by the Lock bot. label Aug 18, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Aug 18, 2019
kirovboris pushed a commit to kirovboris/testcafe-phoenix that referenced this issue Dec 18, 2019
…evExpress#2800)

* [WIP]Restart browser if it became unresponsive (closes DevExpress#1815)

* tests

* change approach

* fix server tests

* fix for headless browsers

* fix functional test

* refactor and fix tests

* refactoring and tests

* refactoring

* refactor

* refactoring

* fix and refactoring

* add listenered to connection only when testRun is started

* rename and fix typos