Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

driver: more reasonable thread wait timeout on Windows. #2339

Merged
merged 1 commit into from
Dec 13, 2019

Conversation

Jehan
Copy link

@Jehan Jehan commented Dec 13, 2019

It used to be 5ms, which might not be long enough in some cases for the
thread to exit well, but then when set to 5000 (5s), it would slow down
any program depending on OpenBlas.

Let's just set it to 50ms, which is at least 10 times longer than
originally, but still reasonable in case of failed thread termination.


This is a followup of #2314. After providing a test installer of GIMP with patched OpenBLAS, reporters could confirm crashes were gone. On the other hand, it seems GIMP got extra slower on startup on Windows (worse reports were telling of GIMP startup hanging for more than 10 minutes!). I am unsure when this code in blas_server_win32.c is called exactly, and how many threads it creates and it awaits for termination (and I'd like to not have to spend too long on this! 😛), but I figured the timeout we passed to 5 seconds is the culprit. After testing, lowering it down to 50ms (which is still 10 times longer than the original 5ms, but at least reasonable wait in case of failure), people confirmed GIMP startup is good again and crash is still gone.

Anyway I think that the other part of the previous patch (not terminating a thread which was successfully terminated by itself) was the most important part of the fix.

It used to be 5ms, which might not be long enough in some cases for the
thread to exit well, but then when set to 5000 (5s), it would slow down
any program depending on OpenBlas.

Let's just set it to 50ms, which is at least 10 times longer than
originally, but still reasonable in case of failed thread termination.
@martin-frbg
Copy link
Collaborator

martin-frbg commented Dec 13, 2019

Thanks. I had initially assumed that the proposed timeout had been tested already :)
I suspect GIMP (or actually its openblas-dependent plugin) is just loading openblas on startup to see if it is available, and unloading it immediately.

@Jehan
Copy link
Author

Jehan commented Dec 13, 2019

Thanks. I had initially assumed that the proposed timeout had been tested already :)

Sorry to disappoint! 😛

To be fair, we have nearly no Windows developers (at least no recurrent one, we just get random patches sometimes, and when I get bored, I cross-compile and test some random bugs myself in a VM). For the previous patch, it was contributed by some proprietary security software (privately) and all I did was to make sure OpenBLAS could (cross-)compile for Windows with it. But yeah, I am guilty of not having tested it thoroughly. Sorry 🙇🏻, I know it's not the best, but really just getting fixes for Windows is already a big deal in our situation!

I suspect GIMP (or actually its openblas-dependent plugin) is just loading openblas on startup to see if it is available, and unloading it immediately.

We don't use OpenBLAS directly, but as a dependency of SuiteSparse, itself a dependency of one operation of GEGL, which is a direct dep of GIMP. We check availability of various operations of GEGL at startup. I guess it may be when OpenBLAS is loaded at startup.

Once again, I have not even tried myself or tried to run a debugger (on Windows, I am not even sure what to use). This time though, we had the various users (the ones who reported the initial bug) test and confirm the long startup issue is gone.

@martin-frbg martin-frbg merged commit 445ca2f into OpenMathLib:develop Dec 13, 2019
@Jehan
Copy link
Author

Jehan commented Dec 13, 2019

Thanks @martin-frbg !

gnomesysadmins pushed a commit to GNOME/gimp that referenced this pull request Dec 13, 2019
Previous OpenBlas patch fixed the crash with Sophos (see reports #3633
and #4246) but created a huge slowdown of startup because of a timeout
change and most likely OpenBLAS being loaded at startup during the
various verifications.

A new patch has been merged upstream to lower this timeout to something
more reasonable. Reporters confirmed GIMP now runs fine (neither crashes
nor very long startups).

See: OpenMathLib/OpenBLAS#2339
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants