[RFC 0089] Collect non-source package meta attribute #89
Merged
Changes from 1 commit
6 commits
90f214d create 0089-collect-non-source-package-meta.md (risicle)
a60c2f4 Add shepherd metadata (lheckemann)
a2ad743 [RFC 0089] adopt `sourceProvenance` "multi-valued" approach (risicle)
0387adc [RFC 0089] change switched -> changed (risicle)
32e773d [RFC 0089] typo fix (risicle)
b292ab6 [RFC 0089] remove unclear "opaque type" clause (risicle)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| --- | ||
| feature: collect-non-source-package-meta | ||
| start-date: 2021-03-14 | ||
| author: Robert Scott | ||
| co-authors: (find a buddy later to help out with the RFC) | ||
| shepherd-team: (names, to be nominated and accepted by RFC steering committee) | ||
| shepherd-leader: (name to be appointed by RFC steering committee) | ||
| related-issues: (will contain links to implementation PRs) | ||
| --- | ||
|
|
||
| # Summary | ||
| [summary]: #summary | ||
|
|
||
| Collect and maintain a new `meta` attribute in packages allowing users to easily | ||
| identify and manage their preference for binary (more broadly "non-source") | ||
| packages. | ||
|
|
||
| # Motivation | ||
| [motivation]: #motivation | ||
|
|
||
| Different users have different expectations from a software distribution. We | ||
| acknowledge that much with the collection of license information and the | ||
| existence of the `allowUnfree` nixpkgs option, much as Debian maintains a | ||
| separate `-nonfree` repository. | ||
|
|
||
| Similarly, there are a number of different reasons users may have to disfavour | ||
| those packages not built-from-source: | ||
|
|
||
| - Transparency: an ever-growing concern with more focus than ever on | ||
| supply-chain attacks. | ||
| - Malleability: being able to conveniently override packages with patches or an | ||
| altered build process is a key advantage of Nix, and for nixpkgs maintainers | ||
| it's not generally possible to backport security fixes to binary packages. | ||
|
|
||
| For some users, these concerns are enough to deter them from using Nix entirely. | ||
|
|
||
| # Detailed design | ||
| [design]: #detailed-design | ||
|
|
||
| Add a new `meta` attribute to non-source-built packages, `fromSource = false`. | ||
| Leave other packages as-is with the assumption of a missing attribute meaning | ||
| `true`. | ||
|
|
||
| Add a mechanism to allow `.nixpkgs/config.nix` to specify | ||
| `allowNonSource = false` to prevent use of these packages in a similar manner | ||
| to `allowUnfree`. | ||
|
Member
There should also be
||
|
|
||
| # Alternatives | ||
| [alternatives]: #alternatives | ||
|
|
||
| I might have been tempted to collect the inverse, i.e. `isBinary = true` but | ||
| this runs into problems with clunky terminology. In my mind, the kind of package | ||
| that fails the transparency/malleability tests goes beyond what many people | ||
| would argue is "a binary". For instance, many (most?) Java packages in nixpkgs | ||
| simply pull opaque `.jar`s - if not for their own app, they pull `.jar` | ||
| dependencies from Maven. These are not transparent or malleable, but it's quite | ||
| an obtuse and disputable use of the term "binary" to describe them as such. | ||
|
|
||
| I decided that those packages which _did_ pass these transparency/malleability | ||
| tests had more in common than those that don't: that they are "from source", a | ||
| form where users have as much ability to inspect and alter the result as the | ||
| original author did. | ||
|
|
||
| There already exists a rather informally-applied convention of adding a `-bin` | ||
| suffix to the package names of "binary packages". This is non-ideal because: | ||
|
|
||
| - It doesn't allow a user to filter the use of these packages in a better way | ||
| than simply not requesting a package with a `-bin` suffix. Binary-package | ||
| _dependencies_ of non-`-bin` packages will still be installed regardless. | ||
| - It falls into the terminology trap over the term "binary", and if we expanded | ||
| the definition of what a "binary" package is, *very many* packages in nixpkgs | ||
| would have to be renamed, causing not only visual clutter but possible | ||
| breakage and churn. | ||
|
|
||
| If we _don't_ do anything about this, then I think we continue to signal to | ||
| users who have such concerns over the source of their software that | ||
| nixpkgs/NixOS isn't for them. Far from being a concern just for obscure | ||
| extremists, most Debian users would probably balk at our appetite for binary | ||
| packages. | ||
|
|
||
| # Drawbacks | ||
| [drawbacks]: #drawbacks | ||
|
|
||
| - Some maintainers may be upset by having their packages marked as | ||
| `fromSource = false`. | ||
| - It could spur us to disappear into endless navel-gazing conversations about | ||
| what really counts as "from source" and what doesn't. | ||
| - On the other hand, _not_ discussing where the line stands thoroughly enough | ||
| could cause the flag to be over-applied and thus become useless. Should we be | ||
| compiling all our fonts where e.g. FontForge files are available? If all of | ||
| these got marked as `fromSource = false`, all of a sudden users with | ||
| `allowNonSource = false` set may end up with no installable desktop. | ||
|
|
||
| # Unresolved questions | ||
| [unresolved]: #unresolved-questions | ||
|
|
||
| Exact attribute names are open for debate. | ||
|
|
||
| # Future work | ||
| [future]: #future-work | ||
|
|
||
| The author is willing to spend a significant amount of time finding and marking | ||
| non-source packages in nixpkgs. | ||
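The detailed design in the diff above can be sketched as a small Nix expression. This is only an illustration under stated assumptions, not nixpkgs' actual checking code: the two packages are hypothetical, and the default of `allowNonSource` (taken here to be `true` when unset) is an assumption the RFC text does not spell out. Only the names `fromSource` and `allowNonSource` come from the RFC.

```nix
let
  # Hypothetical package metadata; only `fromSource` comes from the RFC text.
  prebuilt = { pname = "example-bin"; meta.fromSource = false; };
  compiled = { pname = "example"; meta = { }; };

  # User configuration as it might appear in config.nix.
  config = { allowNonSource = false; };

  # A missing attribute is taken to mean `true`, as the RFC specifies.
  isFromSource = pkg: pkg.meta.fromSource or true;

  # Assumption: an unset `allowNonSource` permits everything, like today.
  isAllowed = pkg: (config.allowNonSource or true) || isFromSource pkg;
in
map (pkg: { inherit (pkg) pname; allowed = isAllowed pkg; }) [ prebuilt compiled ]
```

Evaluating this (for example with `nix-instantiate --eval --strict`) marks `example-bin` as not allowed and `example` as allowed, which is the behaviour the RFC describes by analogy with `allowUnfree`.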
This doesn't seem flexible enough:
`unfreeRedistributableFirmware` license, there should be a way to label binary packages used in bootstrapping as such. Most users who don't want binary packages will be okay with their use in bootstrapping.
Do you think `allowNonSourcePredicate` is enough for the latter?
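For reference, a predicate of the kind asked about here might look like the following config fragment. It is a sketch modelled on the existing `allowUnfreePredicate` option; the semantics assumed for `allowNonSourcePredicate` (called with the package, returning `true` to permit it) and the package name `bootstrap-tools` are illustrative assumptions, not something this thread settles.

```nix
{
  # Reject non-source packages in general...
  allowNonSource = false;

  # ...but (assumed semantics) permit those the predicate accepts,
  # here a hand-maintained list of bootstrap binaries.
  allowNonSourcePredicate = pkg:
    builtins.elem (pkg.pname or "") [ "bootstrap-tools" ];
}
```

A name-based allow-list like this works without any extra labelling of bootstrap packages, at the cost of every user having to maintain the list themselves.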
Well, this hints slightly towards making it non-boolean, but if so, what should the options be? I think I considered finer classifications but saw it as something this could evolve into if/once we come to understand the problem better.
A list of "types"? Because a package might contain different types!
For a hypothetical driver with a code part, and a binary firmware part.
For a pre-built font.
(The naming here is clunky, I hope it gets the point across.)
EDIT: as a bonus, collecting what provenances a closure uses is a matter of adding all the lists together, and then getting the unique elements. And it is closer to the mechanisms used for licenses.
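The "add all the lists together, then get the unique elements" step can be sketched as below. The attribute name `sourceProvenance` is borrowed from the commit list of this PR; the tag strings and both packages are hypothetical stand-ins, since the original examples in this comment were not preserved.

```nix
let
  # Hypothetical packages and tag names, for illustration only.
  driver = { meta.sourceProvenance = [ "source" "binaryFirmware" ]; };
  font = { meta.sourceProvenance = [ "binaryData" ]; };

  # Order-preserving de-duplication using only builtins.
  unique = builtins.foldl'
    (acc: x: if builtins.elem x acc then acc else acc ++ [ x ])
    [ ];

  # Provenances used by a set of packages: concatenate, then de-duplicate.
  closureProvenance = pkgs:
    unique (builtins.concatMap (p: p.meta.sourceProvenance) pkgs);
in
closureProvenance [ driver font driver ]
```

In nixpkgs itself, `lib.unique` could replace the hand-written helper; it is spelled out here only to keep the snippet free of imports.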
I think a list of types is usually too complex. Hence I suggest usually not using them, even though some packages might require them.
We sometimes have lists of licenses, though without clear semantics. For this attribute, a list should be interpreted as being built from all of those types combined.
I would call the attribute `meta.builtFrom`.
A normal `xyz-bin` package would, even if parts of it are built from source, have
A binary font could have
And a bootstrapping package would have
The default would be
While I agree that they are more complex, I don't think it's an issue.
Otherwise we're pushing the complexity into pre-declaring all possible combinations as part of the library representing the different types.
We need granularity here. Otherwise we'll have issues deciding which descriptor to use, and end up using the wrong generic "it's a binary", which will make filtering against or for a specific descriptor needlessly harder.
We can do something "weird" and have `lib.builtFrom.source = ["source"]` so that you can easily do one or `++` two together. Or maybe better is `lib.builtFrom.source = {source = true;}` and then you can `//` them together but it is a bit easier for a predicate to filter the ones that you are okay with (and removes the irrelevant ordering).
Also I think instead of `builtFrom` it probably makes sense to call it `nonSourceComponents` and list the types of things that were not built from source where "types" could be things like `code`, `assets`, `docs` and similar.
I see this is somewhat against the "Why not isBinary?" below but I think it kinda agrees. `nonSourceComponents = {}` is the "purest" form and well understood. Then you just have to document the deviations.
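The two shapes suggested in this comment can be compared side by side. In this sketch the `listStyle` and `attrStyle` sets stand in for the proposed `lib.builtFrom` values; `binaryFirmware` is an invented second tag, and `accepted` is a hypothetical user preference.

```nix
let
  # Hypothetical library constants in the two suggested shapes.
  listStyle = {
    source = [ "source" ];
    binaryFirmware = [ "binaryFirmware" ];
  };
  attrStyle = {
    source = { source = true; };
    binaryFirmware = { binaryFirmware = true; };
  };

  # Lists combine with `++`, attribute sets with `//`.
  combinedList = listStyle.source ++ listStyle.binaryFirmware;
  combinedAttrs = attrStyle.source // attrStyle.binaryFirmware;

  # Tags this (hypothetical) user is willing to accept.
  accepted = [ "source" ];

  # With the attribute-set form, a predicate only needs the attribute names.
  onlyAccepted = tags:
    builtins.all (t: builtins.elem t accepted) (builtins.attrNames tags);
in
{
  inherit combinedList combinedAttrs;
  sourceOnlyOk = onlyAccepted attrStyle.source; # true
  mixedOk = onlyAccepted combinedAttrs;         # false
}
```

The attribute-set form also de-duplicates for free, since merging the same tag twice with `//` leaves a single attribute.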
I really like samueldr's idea here. Another benefit of it would be that I'd be able to say "I don't want proprietary software on my computer, but it's okay if data files are CC-BY-ND" or something.
It's nice, my worry is just what we'd actually end up with given humans being humans, most people probably not that invested in understanding a complex tagging scheme well enough to perfectly represent the situation for a particular package.
This concern probably comes from my background in openstreetmap and the long, heated discussions that take place on the "tagging" mailing list debating a scheme that can perfectly represent most every situation, yet which ends up bearing little relation to what people actually map with in reality, because it's too complex and verbose.
I certainly think it's important to make the common cases have very concise representations, and also allow both coarse and fine granularity. If only fine granularities are allowed, it will deter most people from bothering to add an annotation at all. Coarse data is better than nothing and can always be refined.
I tend to agree with @risicle and @dotlambda here, in that I think we should at least start with a rougher schema, and wait until we've seen if we actually need a more refined one. I resonate with the risk @risicle sees, otherwise.
@dotlambda's proposal matches what I would do:
`allowNonSourcePredicate`
I think this would be a great first step, and it would address the 2 points raised in the RFC:
Other concerns, like wanting to only run free software + special casing some data files based on their license, I would leave out of this RFC, and potentially address in a follow-up one.