@kripken
Member

@kripken kripken commented Aug 15, 2019

The two tests disabled here fail on Windows with 'Access is denied' in atexit handlers. Researching this, I can't find a proper solution.

Also it is mysterious that it happens on some tests and not others, but these are the ones I keep seeing again and again, e.g.

https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket.appspot.com/8905118125082322064/+/steps/Emscripten_testsuite__upstream__other_/0/stdout
https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket.appspot.com/8905121773696247856/+/steps/Emscripten_testsuite__upstream__other_/0/stdout

There is nothing particularly platform-specific about these tests, so it doesn't seem that bad to just disable them.

@kripken kripken requested a review from dschuff August 15, 2019 02:42
@quantum5
Member

I am not convinced that this is sufficient to deal with the flakiness issue, since it is entirely possible that other tests would start failing from the same cause.

Hilariously, the only resources I could find are:

@quantum5
Member

Further investigation revealed that this is probably a bug in Python's multiprocessing module.

It appears that subprocess.Popen.terminate is written specifically to resist the spurious access denied errors from killing an already exited process:

        def terminate(self):
            """Terminates the process
            """
            try:
                _subprocess.TerminateProcess(self._handle, 1)
            except OSError as e:
                # ERROR_ACCESS_DENIED (winerror 5) is received when the
                # process already died.
                if e.winerror != 5:
                    raise
                rc = _subprocess.GetExitCodeProcess(self._handle)
                if rc == _subprocess.STILL_ACTIVE:
                    raise
                self.returncode = rc

Yet, multiprocessing calls _subprocess.TerminateProcess directly without this treatment. This strongly suggests the flake could happen with any test.

A potential fix is to monkey patch the function, since any patch submitted to Python is unlikely to reach the deployed versions anytime soon.
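A minimal sketch of what such a monkey patch could look like, mirroring the logic of subprocess.Popen.terminate quoted above. The names tolerant_terminate and has_exited are hypothetical and are not part of subprocess or multiprocessing; the caller is assumed to supply a callback that reports whether the process has already exited. Off Windows, OSError carries no winerror, so the wrapper simply reraises.

```python
import functools

ERROR_ACCESS_DENIED = 5  # winerror reported when the process already died


def tolerant_terminate(terminate, has_exited):
    """Wrap `terminate` so ERROR_ACCESS_DENIED is ignored for an exited process.

    Both arguments are hypothetical callables supplied by the caller; this is
    an illustration, not an API of subprocess or multiprocessing.
    """
    @functools.wraps(terminate)
    def wrapper(*args, **kwargs):
        try:
            return terminate(*args, **kwargs)
        except OSError as e:
            # Swallow only the specific "access denied" error, and only when
            # the process is really gone; anything else is a genuine failure.
            if getattr(e, 'winerror', None) != ERROR_ACCESS_DENIED:
                raise
            if not has_exited():
                raise
    return wrapper
```

For a subprocess.Popen object named proc, one way to apply it would be proc.terminate = tolerant_terminate(proc.terminate, lambda: proc.poll() is not None). Whether the same shape fits multiprocessing's internal process wrapper is an assumption that would need checking against the deployed Python version.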

@dschuff
Member

dschuff commented Aug 15, 2019

Monkeypatching the function seems fine if we can ensure that it only happens where we know exactly what version of Python we are using, e.g. on the Chromium infrastructure. Do you happen to know whether Python 3 has the same problem?

@quantum5
Member

Current python master branch still has the same problem:

    def terminate(self):
        if self.returncode is None:
            try:
                _winapi.TerminateProcess(int(self._handle), TERMINATE)
            except OSError:
                if self.wait(timeout=1.0) is None:
                    raise

See https://github.com/python/cpython/blob/master/Lib/multiprocessing/popen_spawn_win32.py#L120-L126

@quantum5
Member

Upon further examination, the check in the except branch (if self.wait(timeout=0.1) is None) also attempts to detect whether the process has actually exited. The error is reraised only when it hasn't. So my explanation above seems to be incorrect.

@kripken
Member Author

kripken commented Aug 19, 2019

@quantum5's insight here is that we just shouldn't compare stderr to an empty string unnecessarily. I rewrote the PR to do that. This should reduce such bot errors by a lot.
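To illustrate the idea (this is a sketch, not the actual change in the PR): the child process below writes unrelated noise to stderr, standing in for a spurious 'Access is denied' message, and the check looks only at the output the test cares about. run_child and the 'some specific warning' text are hypothetical.

```python
import subprocess
import sys


def run_child(code):
    """Run `code` in a fresh Python interpreter; return (stdout, stderr) as text."""
    proc = subprocess.run([sys.executable, '-c', code],
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.stdout, proc.stderr


# The child prints its real result to stdout and unrelated noise to stderr.
child = "import sys; print('hello'); sys.stderr.write('Access is denied\\n')"
out, err = run_child(child)

# Fragile: fails whenever anything unrelated lands on stderr.
#   assert err == ''
# More robust: assert only on what the test is actually about.
assert 'hello' in out
assert 'some specific warning' not in err  # hypothetical message under test
```

The commented-out comparison is the kind of check that breaks when the environment emits something unexpected, while the two remaining assertions stay valid.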

To make the refactoring even nicer, add self.assertContainedIf, which is like self.assertContained / self.assertNotContained but takes a parameter that selects which of the two checks to apply.
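A rough sketch of how such a helper could be shaped. The assertContained and assertNotContained methods here are simplified stand-ins built on unittest, the class name ContainmentAsserts is hypothetical, and the (value, string, condition) parameter order is an assumption rather than something taken from the PR.

```python
import unittest


class ContainmentAsserts(unittest.TestCase):
    # Simplified stand-ins for the test suite's helpers (assumed shape).
    def assertContained(self, value, string):
        self.assertIn(value, string)

    def assertNotContained(self, value, string):
        self.assertNotIn(value, string)

    def assertContainedIf(self, value, string, condition):
        # One call site covers both the "expected" and "not expected" cases.
        if condition:
            self.assertContained(value, string)
        else:
            self.assertNotContained(value, string)

    def runTest(self):
        noise = 'Access is denied\n'  # spurious output that should not matter
        for expect_warning in (True, False):
            stderr = noise
            if expect_warning:
                stderr = 'warning: something optional\n' + noise
            self.assertContainedIf('warning: something optional', stderr,
                                   expect_warning)
```

With a helper like this, a test that runs under several configurations can state "the warning appears exactly when this option is on" in a single line, without asserting anything about the rest of stderr.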

@kripken kripken changed the title from "Disable two tests in other that fail quite frequently on windows" to "Don't compare stderr to an empty string unnecessarily, which can result in random errors on windows (due to Access is denied issues)." Aug 19, 2019
@kripken kripken requested a review from sbc100 August 20, 2019 00:01
@kripken kripken merged commit 12eb93b into incoming Aug 20, 2019
@kripken kripken deleted the win branch August 20, 2019 15:03
belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020
…lt in random errors on windows (due to Access is denied issues). (emscripten-core#9240)