feat: Rework amend command, add optional operations #160
fed482f to 0734ed2
@mmarseu I did some tests, and it seems that most of the commands work as expected, so once the requested changes are done, we should be able to merge this one (the comments are mostly about grammar and code comments).
However, some more remarks:
- Maybe you should rename this MR, as it does more than described and is more of a rework than just a feat.
- I'm fine with "3. Breaking change: InferCopyright no longer runs by default"; now it is up to anybody to decide whether they want to do this or not. Maybe a print to stdout with an `INFO` would be useful, as is already done in `class LicenseNameToId` and `AddLicenseText`. It would even add some consistency, and then the changes are traceable 😉.
- In "4. Breaking change: Broken up ProcessLicense" you mention that "It no longer adds an empty string as text to licenses where no text has been found. That intentionally makes the SBOM worse than it was before". However, if I only have **one** ambiguous license in the `licenses` array, then `--operation delete-ambiguous-licenses` leads to an empty array, i.e. `licenses: []`. Shouldn't we, in this case, remove the key `licenses` completely, as an empty array does not add any (information) value?
  Why the "one" is bold: It seems that `--operation delete-ambiguous-licenses` only deletes the first license with `license.name`, e.g.
      "licenses": [
      {"license": {"id": "Apache-2.0"}},
      {"license": {"name": "Some license"}},
      {"licence": {"name": "blub"}}
      ],
  leads to
      "licenses": [
      {"license": {"id": "Apache-2.0"}},
      {"licence": {"name": "blub"}}
      ],
  This is something you have not included in your tests; there you only have the case of one id and one name.
- Do we need a test for the case that `license.name` is similar to an `SPDX-ID` and the `delete-ambiguous-licenses` operation is used? Just to make sure that it maps it to a `license.id` and does not delete it.
- The reason why we wanted `supplier`, which you now changed with "5. Breaking change: InferSupplier is now more selective", was the current tool landscape: most of the tools (e.g. DependencyTrack and BlackDuck) do not understand e.g. `publisher`; mostly they only support `supplier` or, additionally, `author`.
Fair enough.
Done
We can. I'll add it.
I've added a test for this. Weirdly, it works perfectly fine for me 😕
I don't think so. The user is free to make sure that they run
I understand. I'll revert this particular change.
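For reference, the filter discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `delete_ambiguous_licenses` is hypothetical, treating `text` and `url` as "other context" is an assumption, and removing the `licenses` key when nothing remains is the behavior proposed in the review, not necessarily what was merged.

```python
def delete_ambiguous_licenses(component: dict) -> None:
    """Drop every license that is identified by name only (hypothetical helper)."""
    licenses = component.get("licenses")
    if licenses is None:
        return

    def is_ambiguous(entry: dict) -> bool:
        lic = entry.get("license")
        # Only entries with a "license" object are inspected; anything else is kept.
        if not isinstance(lic, dict):
            return False
        # Assumption: "text" and "url" count as context that makes a name usable.
        return "name" in lic and not any(k in lic for k in ("id", "text", "url"))

    remaining = [entry for entry in licenses if not is_ambiguous(entry)]
    if remaining:
        component["licenses"] = remaining
    else:
        # Assumption: an empty array carries no information, so the key is removed.
        del component["licenses"]
```

Note that the third entry in the example above is spelled `licence`, so a filter keyed on `license` leaves it alone. That may explain why the case could not be reproduced; with the key spelled `license`, this sketch removes both name-only entries.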
89d5669 to 14e5942
Overview
This PR originally only intended to add a new operation to the amend command. This addition triggered others and snowballed into a rather large overhaul of the command. I'll try to go into all changes in adequate detail.
1. New operation `DeleteAmbigiousLicenses`, which deletes licenses identified only by name, with no other context that provides more insight about the license's content. This is not very contentious and requires little further explanation. See "Add `amend` operator to delete ambigious licenses" #144 for more.
2. New command-line option to select operations to run.
3. Breaking change: `InferCopyright` no longer runs by default.
4. Breaking change: Broken up `ProcessLicense`.
5. Breaking change: `InferSupplier` no longer adds supplier information to components which already have comparable fields.
6. Breaking change: `Compositions` now sets aggregate unknown instead of incomplete.
7. Improved documentation of design constraints in `cdxev/amend/operations.py`.
8. Remove semi-integration tests from `tests/test_amend.py`.
2. New command-line option to select operations to run
Because the added `DeleteAmbigiousLicenses` command can be dangerous, it shouldn't run by default. This necessitates a new mechanism for the user to select the operations to run.
This change introduces the `--operations` command-line option. It takes a list of operation names to run. If the option is not provided, a default set of operations is run.
3. Breaking change: `InferCopyright` no longer runs by default
The set of default operations includes all pre-existing operations with one exception: `InferCopyright` is extremely dangerous and its outright removal from the tool is currently under debate. Pending any decision, this PR makes it an opt-in operation.
4. Breaking change: Broken up `ProcessLicense`
4.1 Removed feature to delete licenses with name "unknown"
The `ProcessLicense` operation previously included a feature that deleted any licenses whose `name` includes the word "unknown" and that don't also specify the full text.
This has been removed for the following reasons:
If the feature is nevertheless desired, it can always be re-added as a separate operation. In that case, it shouldn't be added to the default set.
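To make the default/opt-in split concrete, here is a minimal sketch of how a default set and explicitly selected operations might interact. It is not the implementation in this PR: the registry, the `Operation` tuple, the placeholder bodies, and every operation name except `delete-ambiguous-licenses` are hypothetical, and the use of `nargs="+"` for `--operations` is an assumption.

```python
import argparse
from typing import Callable, NamedTuple


class Operation(NamedTuple):
    run: Callable[[dict], None]
    default: bool  # False means the operation is opt-in only


def _noop(sbom: dict) -> None:
    """Placeholder body; a real operation would modify the SBOM."""


# Hypothetical registry, for illustration only.
OPERATIONS = {
    "license-name-to-id": Operation(_noop, default=True),
    "infer-supplier": Operation(_noop, default=True),
    "infer-copyright": Operation(_noop, default=False),
    "delete-ambiguous-licenses": Operation(_noop, default=False),
}


def select_operations(argv: list[str]) -> list[str]:
    """Return the names of the operations to run for the given arguments."""
    parser = argparse.ArgumentParser(prog="amend-sketch")
    parser.add_argument(
        "--operations", nargs="+", choices=sorted(OPERATIONS), metavar="NAME"
    )
    args = parser.parse_args(argv)
    if args.operations is None:
        # Option not given: fall back to the default set.
        return [name for name, op in OPERATIONS.items() if op.default]
    return args.operations
```

With this shape, a dangerous operation only ever runs when the user names it explicitly, and re-adding a removed feature as a separate operation amounts to one more registry entry with `default=False`.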
4.2 New operation `LicenseNameToId`
This operation runs by default and takes over the part of `ProcessLicense` that translates well-known names to SPDX ids.
4.3 New operation `AddLicenseText`
This operation takes over the part of `ProcessLicense` that adds full text to licenses with only a name.
It doesn't run by default because it needs a parameter to be specified. In the interest of usability, we shouldn't force users to provide a command-line option if they don't want to use the feature.
There are a few important changes:
If this feature is nevertheless desired, we can add it as a separate, non-default operation.
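As a rough illustration of the behavior described for `AddLicenseText`, the sketch below adds text only where a match is found and otherwise leaves the license untouched, instead of adding an empty string. The function name `add_license_text` and the `texts` mapping are hypothetical; the mapping merely stands in for whatever source the required parameter points to, which this description does not specify.

```python
def add_license_text(component: dict, texts: dict[str, str]) -> None:
    """Attach full text to name-only licenses when a matching text is known.

    `texts` maps license names to their full text (an assumption of this sketch).
    """
    for entry in component.get("licenses", []):
        lic = entry.get("license")
        if not isinstance(lic, dict):
            continue
        # Only licenses identified by name alone are candidates.
        if "name" not in lic or "id" in lic or "text" in lic:
            continue
        content = texts.get(lic["name"])
        if content:
            lic["text"] = {"content": content}
        # No match: the license stays as it is; no empty text is added.
```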
5. Breaking change: `InferSupplier` is now more selective
In my effort to make the operation documentation fit for printing in the CLI, I discovered that `InferSupplier` was unnecessarily zealous. Given the inherent uncertainty in the information it generates (we can't really be sure that the generated supplier is valid), it probably should only operate on the components where it is really required. That's why it now no longer adds supplier information to components that already contain comparable fields (e.g., `author`, `manufacturer`, etc.).
6. Breaking change: `Compositions` now sets aggregate unknown instead of incomplete
Discussion: #161
Besides the aggregate, this PR also changes the behavior regarding the metadata component. Previously, this component was treated the same as any other. With this change, the metadata component keeps its original aggregate: it is set to "unknown" only if it was already "unknown" in the input.
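The intended outcome can be sketched like this. It is a simplified illustration under several assumptions, not the code in this PR: the function name `mark_compositions_unknown` is hypothetical, only top-level components are considered, and a metadata component that appears in no input composition is simply left out.

```python
def mark_compositions_unknown(sbom: dict) -> None:
    """Declare all components as aggregate "unknown", sparing the metadata component."""
    meta_ref = sbom.get("metadata", {}).get("component", {}).get("bom-ref")
    # Assumption: only top-level components with a bom-ref are listed.
    unknown_refs = [c["bom-ref"] for c in sbom.get("components", []) if "bom-ref" in c]

    preserved = []
    if meta_ref is not None:
        for entry in sbom.get("compositions", []):
            aggregate = entry.get("aggregate")
            if aggregate is None or meta_ref not in entry.get("assemblies", []):
                continue
            if aggregate == "unknown":
                # It was "unknown" in the input, so it joins the others.
                unknown_refs.insert(0, meta_ref)
            else:
                # Keep the original aggregate for the metadata component only.
                preserved.append({"aggregate": aggregate, "assemblies": [meta_ref]})
            break

    sbom["compositions"] = preserved + [
        {"aggregate": "unknown", "assemblies": unknown_refs}
    ]
```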
7. Improved documentation of design constraints
The fact that individual operations are now exposed on the CLI for manual selection by the user leads to a few new design considerations for operations, which have been documented in the module docstring in `cdxev/amend/operations.py`.
The upshot is: keep it simple, keep it small, and be damn sure you don't run dangerous operations by default.
8. Remove semi-integration tests from `tests/test_amend.py`
Some tests have been removed in preparation for #157, where they will be grouped together in one module.