Add NVIDIA DCGM and DCGM-exporter (prometheus)#235024
Conversation
Kiskae
left a comment
There was a problem hiding this comment.
In addition to the comments I'd like to see a nixpkgs-review result to see if this would cause unrelated rebuilds due to jsoncpp/libevent package changes. It shouldn't because the changes are all gated by parameters but its best to check.
|
Noting that plog was added before this got merged: #238806 |
All the tensorrt failures are due to the fact that... |
|
@Kiskae Thank you for the review. What's left to get this merged? |
|
@de11n I don't actually have merge permissions, but @NixOS/cuda-maintainers do and they are probably the ones most affected by any changes so they should take a look. |
There was a problem hiding this comment.
I don't understand the comment so I didn't change it. Whatever it meant, the static build is incorrect without this change.
There was a problem hiding this comment.
yeah unfortunately I find the comment quite tricky to understand past "this is fixed and no longer necessary in 1.9.5" as well... perhaps author @tfc can shed some light here?
There was a problem hiding this comment.
Ok, i understand why the comment seems incomprehensible now, as i have a bit timely distance to it myself. :D
Looking at the latest version of jsoncpp, they are no longer giving the static version of jsoncpp a _static filename suffix. This was a very bad idea anyway, so they stopped doing that. (It's only done for MSVC builds).
I think in the current version, you don't need this cp/ln dance any longer and static linking will just work out of the box.
So please just drop these lines. I am sure we were on an earlier version of jsoncpp when i wrote them.
There was a problem hiding this comment.
You are right. We can drop the postInstall entirely. Thank you!
There was a problem hiding this comment.
| version = "1.4-3feeb7b"; | |
| version = "1.4.0-rc1"; |
There was a problem hiding this comment.
The official 1.4.0-rc1 tag doesn't have unit tests and is missing pkgconfig files. That is why I went with this slightly more recent version. If I use 1.4.0-rc1 I will have to apply patches similar to what this commit is providing. See here: https://github.com/mirror/tclap/compare/1.4.0-rc1..1.4
I will add a comment to that effect.
There was a problem hiding this comment.
| rev = "3feeb7b2499b37d9cb80890cadaf7c905a9a50c6"; | |
| rev = "refs/tags/${version}"; |
There was a problem hiding this comment.
why are these build flags necessary? if at all possible, it is preferable to just use dependencies as they are default packaged
There was a problem hiding this comment.
DCGM is very particular about its build. Its entire build system is actually deterministic, but it uses Docker to achieve that. This makes it extremely particular about the exact build flags used for every dependency. These dependencies are built with these specific flags and I ran into build failures in other configurations. While it might be theoretically possible to find a different configuration that succeeds, this one exactly matches what upstream does in its build system so it is the most likely to succeed and have matching behavior.
b7e7a0a to
bfa5bcf
Compare
|
@samuela I've made many of the suggested changes. I'm not sure why this PR is now causing a rebuild of so many packages. |
|
CUDA changes look good to me as well -- thank you for the CUDA Toolkit fix. |
|
I have one follow-up task on this, but it shouldn't block merge. Is there anything else needed to get this merged? |
|
Result of 112 packages failed to build:
486 packages built:
|
|
nixpkgs-review looks reasonable to me. I'll go ahead and merge tomorrow unless anyone objects? |
The workaround is no longer needed for static builds.
libnvrtc.so is not found in the same location in all versions of cudatoolkit.
Many Qt dependencies were added only for 12.1 and above, but 12.0.1 also needs them.
|
@samuela Pushed a change to one of my commit messages which had leftover cruft from a rebase. (The comparison shows no code differences.) |
Description of changes
Add NVIDIA DCGM and dcgm-exporter to Nixpkgs.
In order for the build to work, I had to also add
plogandtclap_1_4which is a long-lived branch oftclap. I also fixed some minor errors in a couple other packages.Things done
sandbox = trueset innix.conf? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)