opencv: misc CUDA-related updates and fixes; add enableLto #218044
Closed
ConnorBaker wants to merge 185 commits into NixOS:master
This commit updates `buildFlags`, which is a single string with one of four possibilities:
- ""
- "profiled"
- "bootstrap"
- "profiledbootstrap"

Previously only the last two were possible. Since 2ea3482 all four are possible.
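One way to read the four values is as the concatenation of two independent switches. The following is a minimal sketch under that assumption; `mkBuildFlags`, `profiled`, and `bootstrap` are hypothetical names used for illustration and are not taken from the commit itself.

```nix
# Sketch only: `mkBuildFlags`, `profiled` and `bootstrap` are hypothetical
# names; the real gcc expression may compute the string differently.
let
  mkBuildFlags = { profiled, bootstrap }:
    (if profiled then "profiled" else "")
    + (if bootstrap then "bootstrap" else "");
in
{
  plain         = mkBuildFlags { profiled = false; bootstrap = false; }; # ""
  profiledOnly  = mkBuildFlags { profiled = true;  bootstrap = false; }; # "profiled"
  bootstrapOnly = mkBuildFlags { profiled = false; bootstrap = true;  }; # "bootstrap"
  both          = mkBuildFlags { profiled = true;  bootstrap = true;  }; # "profiledbootstrap"
}
```

Saved to a file, this can be evaluated with `nix-instantiate --eval --strict` to see all four strings.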
e3b62ad to c70cc93
The primary motivating example is openssl:

Before the change the full package build took 1m54s.
After the change the full package build takes 59s.

About a 2x speedup.

The difference is visible because openssl builds hundreds of manpages, spawning a perl process per manual in the `install` phase. Such a workload is very easy to parallelize.

Another example would be an `autotools`+`libtool` based build system where the install step requires relinking. The more binaries there are to relink, the more there is to gain from doing it in parallel.

The change enables parallel installs by default only for builds that already have parallel builds enabled. There is a high chance those build systems already handle parallelism well, but some packages will fail.

Consistently propagated the enableParallelBuilding to:
- cmake (enabled by default, similar to builds)
- ninja (set parallelism explicitly, don't rely on default)
- bmake (enable when requested)
- scons (enable when requested)
- meson (set parallelism explicitly, don't rely on default)
- waf (set parallelism explicitly, don't rely on default)
- qmake-4/5/6 (enable by default, similar to builds)
- xorg (always enable, similar to builds)
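For packages whose install targets race under `-jN`, the usual remedy is to keep the parallel build but opt out of the parallel install. A minimal sketch follows. It assumes the opt-out attribute is called `enableParallelInstalling`, a name that does not appear in the commit message above, and the package itself is hypothetical.

```nix
# Sketch only: the derivation is hypothetical, and `enableParallelInstalling`
# is assumed to be the attribute that controls the install-phase -j flag.
{ stdenv }:

stdenv.mkDerivation {
  pname = "example";
  version = "0.0.0";
  src = ./.;

  # The build phase is known to be parallel-safe.
  enableParallelBuilding = true;

  # The install targets are missing dependencies between each other,
  # so run `make install` serially.
  enableParallelInstalling = false;
}
```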
Without the change install phase fails as:
installing
install flags: -j16 ...
...
./.libs/libnetsnmpagent.so: file not recognized: file format not recognized
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:1012: libnetsnmpmibs.la] Error 1
make[1]: *** Waiting for unfinished jobs....
Without the change install phase fails as:
Installing libxfs-install
../../install-sh -o nixbld -g nixbld -m 644 ioctl_xfs_ag_geometry.2 /nix/store/chymzkiiv6c2rgl2gqrn4bqv5azhx9vf-xfsprogs-6.1.1-bin/share/man/man2/ioctl_xfs_ag_geometry.2
make[1]: *** No rule to make target '\', needed by 'kmem.lo'. Stop.
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:148: libxfs-install] Error 2
make: *** Waiting for unfinished jobs....
Without the change parallel install fails as:
install flags: -j16
...
libtool: error: error: relink '_py3sss.la' with the above command before installing it
libtool: warning: '/build/source/libsss_cert.la' has not been installed in '/nix/store/apyk9a6q7bc7d1fnn81vqrwil4waw9cd-sssd-2.8.2/lib/sssd'
make[3]: *** [Makefile:13362: install-py3execLTLIBRARIES] Error 1
now gcc isn't built
Without the change parallel install fails as:
$ install flags: -j16 ...
...
collect2: error: ld returned 1 exit status
libtool: error: error: relink 'libsvn_ra_serf-1.la' with the above command before installing it
make: *** [build-outputs.mk:1316: install-serf-lib] Error 1
make: *** Waiting for unfinished jobs....
/nix/store/1qasgqvab0xh2jcy00x9b1zh39dw7m8f-bin
Without the change parallel install fails as:
$ install flags: -j16 ...
...
install: target '...-ocaml-4.14.0/lib/ocaml/threads': No such file or directory
make[1]: *** [Makefile:140: installopt] Error 1
Without the change parallel installs fail as:
install flags: -j2
...
ln: failed to create symbolic link '...-eresi-0.83-a3-phoenix//bin/elfsh': No such file or directory
make: *** [Makefile:108: install64] Error 1
w3m: 0.5.3+git20220429 -> 0.5.3+git20230121
libpcap: 1.10.1 -> 1.10.3
- use cudaPackages instead of cudatoolkit (reduces download/closure size)
- set C/C++ compiler when building with CUDA to ensure NVCC has an appropriate backing compiler
- add flag to build with CUDNN (disabled by default due to increase in closure size)
- add flag to build with LTO (enabled by default)
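With flags like these, a consumer would typically select a variant through `override`. The sketch below assumes the arguments are named `enableCuda`, `enableCudnn`, and `enableLto`; only `enableLto` is named in this PR's title, and the other two names are assumptions.

```nix
# Sketch only: argument names other than `enableLto` are assumed, and they
# exist only in a nixpkgs revision that actually carries these changes.
let
  # CUDA packages are unfree, so they have to be allowed explicitly.
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
in
pkgs.opencv.override {
  enableCuda = true;
  enableCudnn = true; # off by default because of the larger closure
  enableLto = false;  # on by default per the description; disable for faster builds
}
```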
1bd932a to 13d80db
Member
Please reopen, we can't unping people.
Description of changes
Info on closure sizes: these are the results from before (compiling from master) and after (my PR, with and without CUDNN support).
Before:
Full closure:
Details
After (without CUDNN, which is the default):
Full closure:
Details
After (with CUDNN):
Full closure:
Details
Things done
- `sandbox = true` set in `nix.conf`? (See Nix manual)
- `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage
- (`./result/bin/`)