check-meta: custom remediation messages#456908
check-meta: custom remediation messages#456908SomeoneSerge wants to merge 2 commits intoNixOS:masterfrom
Conversation
Waiting for a more comprehensive RFC 127 "Problems" implementation (NixOS#177272), proactively packages to specify `meta.problems` and use `problems` of kinds `unsupported` and `broken` to display context-aware remediation messages. This change is motivated by the merging of the CUDA13 PR, which included denying in-tree support for using CUDA without enabling it Nixpkgs-wide. Implementation-wise, unsupported packages were marked as `broken` (TBD: make "unsupported"), and reasons are visible when `--trace-verbose`, but obscured by the long and unhelpful NIXPKGS_ALLOW_BROKEN message. Instead of reverting the CUDA13 changes, which are blocking for a number of updates in the SciComp ecosystem, it seems better to allow customization of remediation mesages. In `cudaPackages` (and dependend `pythonXPackages`), we used to rely on an ad hoc `brokenConditions` passthru attribute. Instead of moving that to `meta`, rewrite them in the form compatible with the future RFC127 implementation.
71b0b9a to
a9f4724
Compare
|
I’ll look at this tomorrow (hopefully), but from what I remember the original problems RFC implementation was superseded by #338267, which was blocked due to performance concerns. I’ve not looked at this PR, but that may be a concern raised here. |
That's the reason I restrained myself to |
a9f4724 to
e5e7bfa
Compare
A couple Python/SciComp and CUDA packages previously relied on making `meta.broken` and `meta.unsupported` conditions more reliable by listing the reasons in `passthru.brokenConditions` (&c), used in conjunction with `addErrorContext` or ad hoc `assert`. Now that the need is more urgent (Cf. 25.11 config.cudaSupport requirement), use an RFC127-compatible scheme to list problems and display meaningful errors. cudaPackages: moved assertions from `backendStdenv` (keep it a tiny shim, hopefully eventually remove) to `cuda_nvcc`.
e5e7bfa to
c24aef6
Compare
|
Although I fully support proper |
Thanks, wasn't aware of the rework and didn't go too far searching. The [Accidentally clicked C-Return] That said, I agree about the timing, and also appreciate that Connor's hope was to merge the PR before October... |
As someone who uses this on her main system, this is an awful recommendation. This triggers massive rebuilds for packages that the user probably doesn't even want to use with CUDA. This might be too big to add to the error message, but it's probably about time the wiki page about CUDA teaches users to make an overlay for using with the packages they want (instead of only mentioning nix shells), such as: nixpkgs.overlays = [
(final: prev: {
pkgsCu = import inputs.nixpkgs { config.allowUnfree = config.nixpkgs.config.allowUnfree; localSystem.system = final.stdenv.hostPlatform.system; localSystem.config = "${final.stdenv.hostPlatform.config}"; config.cudaSupport = true; config.cudaVersion = "12";};
})
];I could even make a PR adding something like that to the wiki page. Not sure what the guidelines are for this kind of error messages, but maybe it could then link to the wiki page on the last line ("For more information check the wiki page: https://wiki.nixos.org/wiki/CUDA" or something)? While I do have it the other way around in my config, this is honestly an awful experience and the only reason I have |
|
@LuNeder hi, yes the idea precisely is to have two different instances of Nixpkgs, one CPU-only and one CUDA-only, this way achieving consistency while skipping most of the false-positive rebuilds triggered by early binding. You're probably right that the documentation around this sucks, and in particular that the suggestion about the NixOS option should be removed (I copy-pasted it from the original remediation template) |
Have you checked the Nixpkgs docs for CUDA? Seems like what you want is covered by |
I think I found the one sentence that reference this, but are there any examples one can read? I think I'm only finding 5 examples at the moment in the wild: |
|
I'ma close this for now: we're postponing the "mandatory consistency" change, which was the reason for urgency. @LuNeder @ruffsl you bring up important points, but they were offtopic for this PR; I encourage you to open a tracking issue with the documentation and cuda labels, and/or prepare respective PRs. I hope to revisit this at the start of the next release cycle, because I feel quite skeptical about the idea of rolling out a full-blown and performance-optimized RFC implementation in a single step, which is where #338267 might have been throttling |
Things done
Allows packages to define
meta.problemsahead of a complete RFC 127 "Problems" implementation (#177272). Usesmeta.problemsto customize remediation messages forbrokenandunsupportedpackages. MigratescudaPackagesassertions tometa.problems.Cf. discussion in #437723 (comment) and #456510 (comment). The TLDR is that it's highly undesirable to revert the changes from CUDA 13 PR, and, in fact, a high time to rip the band aid off on the local/unsafe
cudaSupport. Ripping it off while displaying theNIXPKGS_ALLOW_BROKENmessage would be an absolute disaster for SEO and documentation, as in totally and hopelessly ruinous. The sought compromise is to rip off, but to add the possible minimum extra data tometato make guiding the user possible.Example:
CC @NixOS/nixpkgs-core because
check-meta.nixand trying to sneak in a relatively ad hoc fix into the release.CC @piegamesde because RFC 127.
CC @ConnorBaker.
Add a 👍 reaction to pull requests you find important.