
[WIP] Create a bunch of build tests #1976

Status: Closed (wants to merge 1 commit)

Conversation

TiborGY
Contributor

@TiborGY TiborGY commented Jan 20, 2019

The build system is somewhat fragile. This script does builds using a bunch of different Makefile.rule settings, to help find bugs, regressions and corner cases in the build system.
The script is not intended to be run in Travis, but as a manual test, for example before creating a new release.

There are two things this script tests:

  • All builds should succeed (expected failures, i.e. builds that must fail, might be added later)

  • Builds that are run with equivalent options should yield the same binary (tested via SHA hash of libopenblas.so)

  • Ready for review
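The second check can be sketched as a small shell helper. This is a sketch only: the function name and file paths are placeholders, and in the real script the two files would be copies of libopenblas.so from two builds made with supposedly equivalent Makefile.rule settings:

```shell
#!/bin/sh
# Compare two build artifacts by SHA-256 hash; fail on mismatch.
# (Sketch: /tmp paths below are dummies standing in for libopenblas.so.)
same_binary() {
  a=$(sha256sum "$1" | cut -d' ' -f1)
  b=$(sha256sum "$2" | cut -d' ' -f1)
  if [ "$a" = "$b" ]; then
    echo "MATCH: $1 and $2 are identical"
  else
    echo "MISMATCH: $1 and $2 differ" >&2
    return 1
  fi
}

# Demonstration with throwaway files:
printf 'x' > /tmp/build_a.so
printf 'x' > /tmp/build_b.so
same_binary /tmp/build_a.so /tmp/build_b.so
```

Note that hash comparison only proves equivalence if the build is otherwise reproducible, i.e. nothing like a timestamp or absolute path gets embedded into the binary.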

@brada4
Contributor

brada4 commented Jan 21, 2019

What is missing in travis.yml?

@TiborGY
Contributor Author

TiborGY commented Jan 21, 2019

> What is missing in travis.yml?

I am not sure what you mean by that. AFAIK Travis only checks if the builds succeed, but has no way of checking if the resulting binary is the one that should have been built.
Since most of these tests require at least 2 builds, I think they are not suitable to be run on Travis.
The only one so far that should probably be added to the Travis builds is the 32-bit build.

@martin-frbg
Collaborator

I am not yet convinced that this is worth the trouble, but I won't stop you... If your impression that the build system as a whole is fragile comes from #1974, that bug was only introduced into the develop branch a few days ago as a tentative (and as yet unconfirmed) fix for #1947.

@TiborGY
Contributor Author

TiborGY commented Jan 21, 2019

I found that issue when I started working on this project.
IMO the build system is large, complicated, and has a lot of moving parts. There is also a heavy reliance on #ifdefs and on flags passed around between the various makefiles, and the CPU and feature detection logic is an interacting mix of makefiles, C and Perl.

Overall, the system gives me the impression that it is rather easy to break in unexpected ways when implementing new features. I think some time ago there was an issue where having SOME_FEATURE=<default_of_some_feature> in Makefile.rule was not equivalent to having that line commented out.

Having tests for things like this should give more confidence. Also, I want to make sure that I do not give false information regarding compile-time options in #1827

@martin-frbg
Collaborator

Ah well, that would be #1422 (though I think I got the worst offenders already).

@TiborGY
Contributor Author

TiborGY commented Jan 27, 2019

@martin-frbg I have found some build-time quirks of the threading system, can you confirm that this is the intended behavior?

  • Setting NUM_THREADS = 1 is not the same as setting USE_THREAD = 0
  • Setting USE_THREAD = 0 together with NUM_THREADS = 16 is not the same as setting USE_THREAD = 0 alone, and also not the same as NUM_THREADS = 1

PS: When I say "setting NUM_THREADS = 1", I mean that the only deviation from the defaults is NUM_THREADS = 1.
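One way a script like this can surface such (in)equalities is to record one artifact hash per option set and then compare them. A minimal sketch, where the option strings and dummy files merely stand in for real `make` runs and their libopenblas.so outputs:

```shell
#!/bin/sh
# Sketch: simulate three build artifacts. In reality each file would be
# libopenblas.so from a `make <opts>` run; the dummy contents here just
# model the reported behavior (all three configurations differ).
printf 'defaults'      > /tmp/def.bin
printf 'usethread0'    > /tmp/ut0.bin
printf 'usethread0n16' > /tmp/ut0n16.bin

for pair in "NUM_THREADS=1 /tmp/def.bin" \
            "USE_THREAD=0 /tmp/ut0.bin" \
            "USE_THREAD=0,NUM_THREADS=16 /tmp/ut0n16.bin"; do
  opts=${pair%% *}   # option set label
  file=${pair##* }   # artifact path
  echo "$(sha256sum "$file" | cut -d' ' -f1)  $opts"
done
```

Sorting that output makes identical hashes adjacent, so unexpected equalities (or inequalities) between option sets stand out immediately.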

@martin-frbg
Collaborator

Correct in the sense that this is the behaviour inherited from GotoBLAS (with the primary difference being the sizing of an internal memory buffer), though it is not entirely clear if that was Kazushige Goto's intention or a flaw in his implementation (that may not have mattered as much on the x86 systems of ten years ago).

@TiborGY
Contributor Author

TiborGY commented Jan 27, 2019

Is there any conceivable reason why anyone would want to use a build with USE_THREAD = 0 and NUM_THREADS > 1?

Because if there is none, I would probably advocate for refusing to build with such a setting, similar to how the combination of USE_THREAD = 0 and USE_OPENMP = 1 refuses to build. Or, alternatively, issue a warning and do the build with NUM_THREADS = 1.
I think the fact that NUM_THREADS affects the binary in any way when threading is explicitly disabled is counter-intuitive.
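The warn-and-downgrade alternative could look roughly like the following configure-time shell check. This is a hypothetical sketch only, not OpenBLAS's actual build logic; the function name is made up, and the real Makefiles handle these variables differently:

```shell
#!/bin/sh
# Hypothetical guard: downgrade the inconsistent combination
# USE_THREAD=0 with NUM_THREADS>1, printing a warning to stderr.
# Arguments: $1 = USE_THREAD value, $2 = NUM_THREADS value.
check_thread_opts() {
  use_thread=$1
  num_threads=$2
  if [ "$use_thread" -eq 0 ] && [ "$num_threads" -gt 1 ]; then
    echo "warning: USE_THREAD=0 makes NUM_THREADS=$num_threads meaningless; using NUM_THREADS=1" >&2
    num_threads=1
  fi
  echo "$num_threads"   # effective NUM_THREADS
}

check_thread_opts 0 16   # prints 1 (after a warning on stderr)
check_thread_opts 1 16   # prints 16
```

The hard-error variant would simply `exit 1` instead of downgrading.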

@martin-frbg
Collaborator

See #1141 please. Even single-threaded OpenBLAS may make use of the internal buffer. Yes, this is counter-intuitive, but it is how the original library was coded, and changing it now is a non-trivial task.

@brada4
Contributor

brada4 commented Jan 27, 2019

There is a fixed, compiled-in struct permitting two blas_memory_alloc()/blas_memory_free() instances at any time for each of the (NUM_THREADS) OpenBLAS threads actually occurring in the current process. It was recently fixed at a minimum of 50 buffers so as not to hurt casual users of single-threaded OpenBLAS. The change becomes worse than non-trivial because there is no uniform adherence to these built-in memory handling routines; sometimes plain malloc is used instead. It would take extreme effort to even "correctly" use the same functions at all occurrences.

@TiborGY
Contributor Author

TiborGY commented Jan 27, 2019

Wouldn't it make more sense to use NUM_PARALLEL to make sure there are enough buffers for thread safety? The current comment in Makefile.rule claims it is for parallel OpenMP builds only, but given its stated purpose of ensuring thread safety, I think single-threaded builds could also use it.
Looking at this line: https://github.com/xianyi/OpenBLAS/blob/89b60dab8ad21a0cc6320cbd9fcd603c4c4bfc81/common.h#L186
it looks like MAX_PARALLEL_NUMBER already increases the number of buffers, as desired.

@brada4
Contributor

brada4 commented Jan 27, 2019

That's the old codebase; most of this concern probably goes away once USE_TLS gets stable. The minimum of 50 buffers and MAX_PARALLEL_NUMBER just buy some time to let USE_TLS mature before general use.
