Skip to content

WIP nixos/opengl: move to hardware.drivers#158079

Closed
jonringer wants to merge 3 commits intoNixOS:masterfrom
jonringer:rfc-121
Closed

WIP nixos/opengl: move to hardware.drivers#158079
jonringer wants to merge 3 commits intoNixOS:masterfrom
jonringer:rfc-121

Conversation

@jonringer
Copy link
Contributor

@jonringer jonringer commented Feb 4, 2022

Motivation for this change

Implementation part of NixOS/rfcs#121

closes: #141803

Things needing to be done:
  • services.xserver.videoDrivers -> hardware.gpu.drivers
  • hardware.opengl -> hardware.drivers
  • Update nixos manual
  • Update other nixos modules
  • Update hardcoded references to /run/opengl-driver (postponed til after 22.11)
  • Rename addOpenGLRunpath -> addHardwareRunpath [staging] addHardwareRunpath: init, alias addOpenGLRunpath #196174
  • Add release notes

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Feb 4, 2022
@ofborg ofborg bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Feb 4, 2022
@github-actions github-actions bot added the 6.topic: python Python is a high-level, general-purpose programming language. label Feb 4, 2022
@jonringer
Copy link
Contributor Author

took off draft so i could see non-ofborg gates

@ofborg ofborg bot added 10.rebuild-darwin: 501+ This PR causes many rebuilds on Darwin and should normally target the staging branches. 10.rebuild-darwin: 1001-2500 This PR causes many rebuilds on Darwin and should most likely target the staging branches. 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. and removed 2.status: merge conflict This PR has merge conflicts with the target branch labels Feb 4, 2022
@github-actions github-actions bot added 8.has: changelog This PR adds or changes release notes 8.has: documentation This PR adds or changes documentation labels Feb 4, 2022
@jonringer
Copy link
Contributor Author

The old nixos manual reference should probably be made into literals. Was just trying to get the manual to build again

@ofborg ofborg bot added 8.has: clean-up This PR removes packages or removes other cruft 8.has: package (new) This PR adds a new package labels Feb 5, 2022
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/include-nixgl-in-nix-flake-app/17588/2

@jansol jansol mentioned this pull request Feb 21, 2022
1 task
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-use-nvidia-v100-a100-gpus/17754/4

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure these changes will clobber the binary, the original code contains padding to match the length of the string and the new path is longer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The legacy link will be retained. So it might be an exception

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/cuda-in-nixos-on-gcp-for-a-tesla-k80/20145/9

@SuperSandro2000 SuperSandro2000 added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 6, 2022
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then relatedPackages can be uncommented above.

Suggested change
# uses relatedPackages
meta.buildDocsInSandbox = false;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this still relevant?

@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Oct 15, 2022
@github-actions github-actions bot removed the 6.topic: python Python is a high-level, general-purpose programming language. label Oct 15, 2022
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. and removed 10.rebuild-darwin: 501+ This PR causes many rebuilds on Darwin and should normally target the staging branches. labels Oct 15, 2022
@ofborg ofborg bot removed 10.rebuild-darwin: 1001-2500 This PR causes many rebuilds on Darwin and should most likely target the staging branches. 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. labels Oct 15, 2022
@jonringer
Copy link
Contributor Author

jonringer commented Oct 15, 2022

Updated canonical driver path to be /run/current-system/drivers for the systemd tmpfile link but didn't update the nixpkgs references to /run/opengl-driver/ for smoother user transition, as mentioned in the RFC discussion.

@jonringer
Copy link
Contributor Author

This PR still needs some more testing, I will be trying to use it with my current system configuration today.

@jonringer
Copy link
Contributor Author

ran into some issues with extraPackages being removed

@jonringer
Copy link
Contributor Author

was able to get my current system configuration building. Would like for others to test

Comment on lines +28 to +38
(mkRenamedOptionModule [ "hardware" "opengl" "enable" ] [ "hardware" "drivers" "enable" ])
(mkRenamedOptionModule [ "hardware" "opengl" "package" ] [ "hardware" "drivers" "packages" ])
(mkRenamedOptionModule [ "hardware" "opengl" "extraPackage" ] [ "hardware" "drivers" "packages" ])
(mkRenamedOptionModule [ "hardware" "opengl" "driSupport32Bit" ] [ "hardware" "drivers" "enable32bit" ])
(mkRenamedOptionModule [ "services" "xserver" "vaapiDrivers" ] [ "hardware" "drivers" "packages" ])
(mkRemovedOptionModule [ "hardware" "opengl" "s3tcSupport" ] ''
S3TC support is now always enabled in Mesa.
'')
(mkRemovedOptionModule [ "hardware" "opengl" "driSupport" ] ''
dri support is now always enabled in Mesa.
'')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please only start warning when the new options are available on all supported releases. To sketch the issue, otherwise, you're putting "3rd party" repos in a spot where they have to behave differently for each version just to avoid user complaints about warnings. This "3rd party" includes nixos-hardware.

Copy link
Contributor Author

@jonringer jonringer Oct 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the desired way forward? I thought this was the canonical way to do it.

I only added the first 4 opengl lines, the vaapiDrivers and s3tc lines existed already.

(mkRenamedOptionModule [ "services" "xserver" "vaapiDrivers" ] [ "hardware" "opengl" "extraPackages" ])
(mkRemovedOptionModule [ "hardware" "opengl" "s3tcSupport" ] ''
S3TC support is now always enabled in Mesa.
'')

The driSupport I removed because all the consuming logic just was never used and implied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use mkRenamedOptionModuleWith { sinceRelease = 2211; from = ...; to = ...; }. This will start warning on master when 2205 is EOL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't realise this existed. It should be used more often, as a lot times I just reference similar modules for correctness

@LunNova
Copy link
Member

LunNova commented Oct 27, 2022

If we're taking the effort to rename these might make sense to make them easier to override at the same time? A single list means having to use mkForce when you set it if you want to remove one item from it to replace it.

hardware.drivers.packages = {
  mesa = pkgs.mesa.drivers;
  amdvlk = pkgs.amdvlk;
}

This way you could cleanly use hardware.drivers.packages.mesa = lib.mkForce myOverriddenMesa.

Otherwise this PR removing the separation between the main package and extra packages makes it a little harder to properly override your mesa version without potentially accidentally removing other entries from the list.

@SomeoneSerge SomeoneSerge added the 6.topic: cuda Parallel computing platform and API label Mar 23, 2023
@purepani
Copy link
Contributor

Is there any help needed to move this PR forward?

"Option services.xserver.useGlamor was removed because it is unnecessary. Drivers that uses Glamor will use it automatically.")
(mkRemovedOptionModule ["services" "xserver" "videoDriver" ]
"Use hardware.gpu.drivers instead")
(mkRenamedOptionModule [ "services" "xserver" "videoDrivers" ] [ "hardware" "gpu" "drivers" ])
Copy link
Member

@Atemu Atemu Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this option should either stay under xorg or be removed entirely. What's in this list are the xorg DDX drivers, not OGL/VK/whatever drivers. Deriving the latter from the former has always been a bad pattern IMHO.

Instead, we should have an option where you can enable specific drivers directly and ideally also pass driver packages as overrides.

Something like

hardware.drivers.<driver> = {
  enable = true;
  package = ...;
  driverSpecificOption = ...;
};

for mesa, amdvlk, nvidia, amdgpu-pro and whatever other significant driver packages exist.

I also think we might want to have a per-API interface. APIs would include obvious ones like OGL and VK but also OpenCL, VAAPI, oneAPI, CUDA, rocm and various other acceleration APIs for TPUs and the like; anything that facilitates talking to specialised hardware and requires a central userspace component basically.

Each API would have the option to enable/disable (a headless media server might not need OGL/VK but might want VAAPI for instance), declaring available drivers (many APIs allow multiple drivers to co-exist) and perhaps the option to set a preferred/primary driver where applicable.

Something like:

hardware.acceleration.<api> = {
  enable = ...;
  packages = { ... };
  preferredPackage = config.hardware.acceleration.packages.<driver>;
  apiSpecificOption = ...;
};

The <driver>.enable option would place its package into every API the driver implements, allowing the user to override a certain driver's package "globally".

When the desktop is enabled (that setting should also be moved out of xorg), we would enable the mesa driver and the OGL and VAAPI APIs.
If a user requires the nvidia driver, they would set hardware.drivers.nvidia.enable = true which would in-turn also enable the kernel module. Should they need an older driver, they'd set hardware.drivers.nvidia.package = nvidia_xyz;.
Enabling steam (or some other gaming related thing) would enable the VK driver.

What do you think of this design?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your design looks like it resolves my concern over hardware.drivers.packages making mesa hard to override.

What's the benefit of splitting up enabling opengl and vulkan? Bundling everything in mesa together is somewhat expected so the default shouldn't be to split that up or it will add a new confusing way to have games break.

Similarly, mesa should probably default to on if any graphics drivers are in use to match current behavior and behavior of other distros.
Enabling only nvidia libs should be possible but it's unusual and wouldn't be a good thing to have happen by default or accident.

Copy link
Member

@Atemu Atemu Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of splitting up enabling opengl and vulkan?

The reasoning is that these can be en-/disabled separately is that these are entirely separate acceleration APIs. A regular desktop user does not require Vulkan drivers, oneAPI or CUDA.

It also simplifies enabling less usual APIs as all you'd need to do is set hardware.acceleration.obscureAPI.enable = true and if there is a "mesa situation" where its the most sensible driver to use, we might opt to enable that driver when the API gets enabled.

mesa should probably default to on if any graphics drivers are in use to match current behavior and behavior of other distros.

This is what I meant with enabling mesa when the desktop is enabled.

Copy link
Member

@LunNova LunNova Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A regular desktop user does not require Vulkan drivers

I think this will change soon - wayland compositors are in various stages (eg kwin https://invent.kde.org/plasma/kwin/-/issues/169 ) of investigating or supporting vulkan and I would be surprised if this doesn't eventually become the default.

As long as the defaults are good / the behavior for someone who installs NixOS and wants to have their graphics card work as expected is good then I support your proposal for making very granular options available.

@mweinelt
Copy link
Member

@jonringer Fancy rebasing this for 24.05? The way things are set up right now are really misleading and need fixing!

@Atemu
Copy link
Member

Atemu commented Nov 23, 2023

If you don't want to continue this, I think I'll move forward with attempting to implement my proposed design this weekend.

@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 19, 2024
@jonringer
Copy link
Contributor Author

@Atemu seems to have a better long term solution to handle hardware acceleration, whereas I was just concerned with naming alignment. #158079 (comment)

@Tungsten842 Tungsten842 mentioned this pull request Jun 16, 2024
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 6.topic: cuda Parallel computing platform and API 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog This PR adds or changes release notes 8.has: clean-up This PR removes packages or removes other cruft 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux.

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

Deprecate usage of OpenGL utilties and NixOS options.