cudaPackages: align with upstream #342

Merged
elliotberman merged 3 commits into anduril:master from
ConnorBaker:feat/align-with-upstream
Oct 9, 2025

@ConnorBaker ConnorBaker commented Aug 21, 2025

Description of changes

Aligns with upstream's changes in NixOS/nixpkgs#406568.

nvidia-jetpack.cudaPackages now has a pkgs attribute which functions in the same way as upstream's cudaPackages.pkgs attribute: https://nixos.org/manual/nixpkgs/stable/#cuda-using-cudapackages-pkgs.

The default version of nvidia-jetpack is now determined by the default version of the CUDA package set (cudaPackages). Changing the default version of the CUDA package set (either through something like cudaPackages_12.pkgs or an overlay) now changes the default version of nvidia-jetpack.
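A minimal sketch of the two ways to reach a non-default nvidia-jetpack, assuming `pkgs` is a Nixpkgs instance with this repository's overlay applied. The attribute names `default` and `viaCuda12` are illustrative only; the rest comes from the description above.

```nix
# Sketch only: assumes `pkgs` already has the JetPack NixOS overlay applied.
{ pkgs }:
{
  # Follows whatever `pkgs.cudaPackages` currently is.
  default = pkgs.nvidia-jetpack;

  # `cudaPackages_12.pkgs` is a variant of Nixpkgs in which cudaPackages_12
  # is the default CUDA package set, so nvidia-jetpack is re-selected to match.
  viaCuda12 = pkgs.cudaPackages_12.pkgs.nvidia-jetpack;
}
```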

Added assertions to the NixOS configuration which, when CUDA support is requested, check that the default CUDA package set can be used with the current version of JetPack NixOS. As an example, taking a JetPack 6 closure and purposefully changing the default CUDA version by applying an overlay like

final: _: { cudaPackages = final.cudaPackages_12_2; }

would provide this evaluation error:

error:
       … while calling the 'head' builtin
         at /nix/store/5n6k4jld2n6k0cs98h9xgh35xsa03jk3-source/lib/attrsets.nix:1534:13:
         1533|           if length values == 1 || pred here (elemAt values 1) (head values) then
         1534|             head values
             |             ^
         1535|           else

       … while evaluating the attribute 'value'
         at /nix/store/5n6k4jld2n6k0cs98h9xgh35xsa03jk3-source/lib/modules.nix:1083:7:
         1082|     // {
         1083|       value = addErrorContext "while evaluating the option `${showOption loc}':" value;
             |       ^
         1084|       inherit (res.defsFinal') highestPrio;

       … while evaluating the option `system.build.toplevel':

       … while evaluating definitions from `/nix/store/5n6k4jld2n6k0cs98h9xgh35xsa03jk3-source/nixos/modules/system/activation/top-level.nix':

       (stack trace truncated; use '--show-trace' to show the full, detailed trace)

       error:
       Failed assertions:
       - JetPack NixOS 6 supports CUDA 12.4 (natively) - 12.9 (with `cuda_compat`): `pkgs.cudaPackages` has version 12.2.
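For illustration, an assertion of this shape could be written roughly as follows. This is a sketch and not the module code from this PR: the range is hardcoded for JetPack 6 from the message above, and the use of `pkgs.config.cudaSupport` as the "CUDA support is requested" signal is an assumption.

```nix
# Hypothetical NixOS module sketch; the real module likely differs.
{ lib, pkgs, ... }:
let
  # Assumption: CUDA support is requested through the Nixpkgs config.
  cudaRequested = pkgs.config.cudaSupport or false;
  cudaVersion = pkgs.cudaPackages.cudaMajorMinorVersion;
  # versionAtLeast a b is true when a >= b, so this checks 12.4 <= version <= 12.9.
  inRange = lib.versionAtLeast cudaVersion "12.4" && lib.versionAtLeast "12.9" cudaVersion;
in
{
  assertions = [
    {
      # Only constrain the CUDA version when CUDA support was requested.
      assertion = cudaRequested -> inRange;
      message = "JetPack NixOS 6 supports CUDA 12.4 (natively) - 12.9 (with `cuda_compat`): `pkgs.cudaPackages` has version ${cudaVersion}.";
    }
  ];
}
```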

Important

If NVIDIA provides backward-compatibility guarantees for Tegra/Jetson, using an older version of the CUDA packages with a newer system should be okay, but I have not tested this or looked into what is necessary to make that happen.

Note

In order to get the NixOS assertions working for all of the different CUDA overlays I could think of, I did have to change the NixOS module such that it only wraps unsupported nvidia-jetpack package sets in a warning, rather than setting them to the empty attribute set. I can work around this if desired by instead turning it into a singleton which just contains cudaPackages.cudaMajorMinorVersion.

Added a new check, jetpackSelectionDependsOnCudaVersion, which uses assertions to verify that the version-changing behavior made possible by cudaPackages.pkgs and similar mechanisms functions as expected.
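As a rough idea of what an assertion-based check like this can look like (a hypothetical sketch, not the contents of jetpackSelectionDependsOnCudaVersion), assuming `pkgs` has the overlay applied and that each CUDA package set exposes `cudaMajorVersion` as upstream does:

```nix
# Hypothetical sketch; the check added in this PR may be structured differently.
{ pkgs, lib }:
let
  # Variant of Nixpkgs with cudaPackages_12 as the default CUDA package set.
  viaCuda12 = pkgs.cudaPackages_12.pkgs;
in
assert lib.assertMsg
  (viaCuda12.cudaPackages.cudaMajorVersion == "12")
  "cudaPackages_12.pkgs should default to a CUDA 12 package set";
assert lib.assertMsg
  (viaCuda12.nvidia-jetpack.cudaPackages.cudaMajorVersion
    == viaCuda12.cudaPackages.cudaMajorVersion)
  "nvidia-jetpack should follow the default CUDA package set";
# Evaluation only reaches this point if every assertion above holds.
pkgs.emptyFile
```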

Testing
  • CI (internal)
  • Manual testing in HITL (internal; in progress)
  • Manual verification

@ConnorBaker ConnorBaker force-pushed the feat/align-with-upstream branch 2 times, most recently from 052c832 to 868c446 Compare August 22, 2025 17:24
@ConnorBaker ConnorBaker force-pushed the feat/align-with-upstream branch 2 times, most recently from a3fb9ca to 5ec7c87 Compare September 11, 2025 21:59
@ConnorBaker ConnorBaker force-pushed the feat/align-with-upstream branch 4 times, most recently from 3467d03 to c4c2fc4 Compare October 7, 2025 01:02
@ConnorBaker ConnorBaker marked this pull request as ready for review October 7, 2025 01:09

@elliotberman elliotberman left a comment

I think I've understood the big picture here, left a couple minor comments. Will take another pass at it soon once it's had time to digest.


Aligns with upstream's changes in NixOS/nixpkgs#406568.

Is upstream a dependency?

The default version of nvidia-jetpack is now determined by the default version of the CUDA package set (cudaPackages).

Can you justify why? We talked about this offline, but would be good to get it in writing for our future selves. Super ideally, the comment can be done in overlay.nix and in the PR description.

…eam's cudaPackages.pkgs pattern

Signed-off-by: Connor Baker <cbaker2@anduril.com>
@ConnorBaker ConnorBaker force-pushed the feat/align-with-upstream branch from c4c2fc4 to c7056cc Compare October 7, 2025 21:25
@ConnorBaker (Contributor, Author)

Updated to address comments and force-pushed.

Added a comment to the version selection of nvidia-jetpack:

Due to the interplay between JetPack releases and supported CUDA versions, the choice of CUDA version determines
which version of nvidia-jetpack is made the default. This avoids the need to maintain tedious overlays and keeps
the two in sync by default.

Thanks to the functionality added in NixOS/nixpkgs#406568, we can build packages
using variants of Nixpkgs through the Flake CLI. See https://nixos.org/manual/nixpkgs/stable/#cuda-using-cudapackages-pkgs
for an example.

Consider a user trying to build a hypothetical package blarg which works with all versions of nvidia-jetpack. If
they build .#blarg (assuming proper configuration of Nixpkgs and use of our overlay), blarg receives
nvidia-jetpack5 as nvidia-jetpack and nvidia-jetpack5.cudaPackages (CUDA 11.4) as cudaPackages.
If they build .#cudaPackages_12.pkgs.blarg, blarg receives nvidia-jetpack6 as nvidia-jetpack and
nvidia-jetpack6.cudaPackages (CUDA 12.6) as cudaPackages. If the version of nvidia-jetpack did not depend on
the version of the CUDA package set, blarg would have received nvidia-jetpack5 as nvidia-jetpack (since it would
stay unchanged) and nvidia-jetpack6.cudaPackages (CUDA 12.6) as cudaPackages -- this is likely unintentional!
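The three cases can be reproduced with a toy model in pure Nix. No real Nixpkgs is involved: `mkPkgs`, `coupled`, and the record fields are invented for illustration, and only the package names and CUDA versions come from the comment above.

```nix
# Toy model of the scenario; not the overlay's implementation.
let
  # Build a tiny self-referential "package set" for a given default CUDA version.
  # `coupled` toggles whether nvidia-jetpack follows the default CUDA version.
  mkPkgs = { defaultCuda, coupled }:
    let
      self = {
        cudaPackages = { version = defaultCuda; };
        nvidia-jetpack5 = { name = "nvidia-jetpack5"; cuda = "11.4"; };
        nvidia-jetpack6 = { name = "nvidia-jetpack6"; cuda = "12.6"; };
        nvidia-jetpack =
          if coupled && self.cudaPackages.version == "12.6"
          then self.nvidia-jetpack6
          else self.nvidia-jetpack5;
        # blarg records which defaults it was handed.
        blarg = {
          jetpack = self.nvidia-jetpack.name;
          cuda = self.cudaPackages.version;
        };
      };
    in
    self;
in
{
  # .#blarg
  default = (mkPkgs { defaultCuda = "11.4"; coupled = true; }).blarg;
  # .#cudaPackages_12.pkgs.blarg with the behavior from this PR
  cuda12Coupled = (mkPkgs { defaultCuda = "12.6"; coupled = true; }).blarg;
  # Same, if nvidia-jetpack did not depend on the CUDA package set
  cuda12Uncoupled = (mkPkgs { defaultCuda = "12.6"; coupled = false; }).blarg;
}
```

Evaluating this strictly (for example with `nix-instantiate --eval --strict`) gives nvidia-jetpack5 with CUDA 11.4 for `default`, nvidia-jetpack6 with CUDA 12.6 for `cuda12Coupled`, and the mismatched nvidia-jetpack5 with CUDA 12.6 for `cuda12Uncoupled`.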

@elliotberman (Collaborator)

In order to get the NixOS assertions working for all of the different CUDA overlays I could think of, I did have to change the NixOS module such that it only wraps unsupported nvidia-jetpack package sets in a warning, rather than setting them to the empty attribute set. I can work around this if desired by instead turning it into a singleton which just contains cudaPackages.cudaMajorMinorVersion.

Let's do this in a follow up PR. I'd really like it to be an outright error if you try to reference the wrong JetPack version.

@elliotberman elliotberman merged commit ee949d5 into anduril:master Oct 9, 2025
1 check passed
@ConnorBaker ConnorBaker deleted the feat/align-with-upstream branch October 9, 2025 17:20