Skip to content

nixos/nvidia: move TOPOLOGY_FILE_PATH and DATABASE_PATH into nvidia-fabricmanager service definition#320830

Merged
SomeoneSerge merged 1 commit intoNixOS:masterfrom
philiptaron:nixos-nvidia-change-datacenter-defaults
Jun 27, 2024
Merged

nixos/nvidia: move TOPOLOGY_FILE_PATH and DATABASE_PATH into nvidia-fabricmanager service definition#320830
SomeoneSerge merged 1 commit intoNixOS:masterfrom
philiptaron:nixos-nvidia-change-datacenter-defaults

Conversation

@philiptaron
Copy link
Copy Markdown
Contributor

Description of changes

Move the TOPOLOGY_FILE_PATH and DATABASE_PATH keys from hardware.nvidia.datacenter.settings default into the service file derivation. As can be seen on line 9, nvidia_x11 can be null and thus is unsuitable for use in this venue.

I discovered this while testing out #313497.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

… hardware.nvidia.datacenter.settings default into the service file
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jun 18, 2024
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Jun 18, 2024
Copy link
Copy Markdown
Member

@FireyFly FireyFly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, reducing the scope to one where nvidia_x11 is known to be non-null

@wegank wegank added the 12.approvals: 1 This PR was reviewed and approved by one person. label Jun 20, 2024
@nixos-discourse
Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-already-reviewed/2617/1787

Comment on lines -49 to -50
TOPOLOGY_FILE_PATH = "${nvidia_x11.fabricmanager}/share/nvidia-fabricmanager/nvidia/nvswitch";
DATABASE_PATH = "${nvidia_x11.fabricmanager}/share/nvidia-fabricmanager/nvidia/nvswitch";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-note: I find these passthru packages in nvidia_x11 rather annoying; we recently found that e.g. libXNctrl could be moved out of nvidia_x11 entirely (#318092); do you know if is fabricmanager driver-locked?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.nvidia.com/datacenter/tesla/pdf/fabric-manager-user-guide.pdf

That diagram suggests that it is, sadly, but I don't know for sure.

# unsuitable to be mentioned in the configuration defaults, but they _can_
# be overridden in `cfg.datacenter.settings` if needed.
fabricManagerConfDefaults = {
TOPOLOGY_FILE_PATH = "${nvidia_x11.fabricmanager}/share/nvidia-fabricmanager/nvidia/nvswitch";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanity-check: here nvidia_x11 != null because it's guarded by cfg.enable?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

@SomeoneSerge SomeoneSerge merged commit 35a472d into NixOS:master Jun 27, 2024
@philiptaron philiptaron deleted the nixos-nvidia-change-datacenter-defaults branch June 27, 2024 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 12.approvals: 1 This PR was reviewed and approved by one person.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants