Skip to content

bug: spdx checksum empty array; allow syft to generate SHA1 for spdx-tag-value documents#1404

Merged
spiffcs merged 31 commits intomainfrom
spdx-checksum-update
Dec 20, 2022
Merged

bug: spdx checksum empty array; allow syft to generate SHA1 for spdx-tag-value documents#1404
spiffcs merged 31 commits intomainfrom
spdx-checksum-update

Conversation

@spiffcs
Copy link
Copy Markdown
Contributor

@spiffcs spiffcs commented Dec 14, 2022

Summary

Syft currently generates SPDX documents that do not contain SHA1 digests for the files by default.

For a user to obtain a document that meets the current specification the following has to be true:

  • app.FileMetadata.Cataloger.Enabled
  • app.FileMetadata.Digests must add sha1 to the list of desired digests (currently defaults to sha256)

Here is a link to the reference that shows this requirement:
https://spdx.github.io/spdx-spec/v2.3/file-information/#84-file-checksum-field

How to test ---

I've provided a docker image that wraps the spdx-java-tools: caphill4/java-spdx-tools

Three images are supplied by #1401

  • alpine:latest
  • debian:latest
  • photon:3.0

You can follow these steps to see the newly valid spdx documents. Make sure to replace <IMAGE> with one of the above.

SYFT_FILE_METADATA_CATALOGER_ENABLED=true SYFT_FILE_METADATA_DIGESTS=sha1 go run cmd/syft/main.go <IMAGE> -o spdx > test.spdx

docker run -v (pwd)/test.spdx:/test.spdx caphill4/java-spdx-tools Verify /test.spdx 

Screenshot 2022-12-19 at 4 08 59 PM

Context

Running the following command syft -o spdx-json alpine:latest > spdx.json generates an SPDX json document.

  1. Using https://github.com/spdx/tools-python a user can run:
    pyspdxtools_parser --file spdx.spdx

This results in a python error:
TypeError: 'NoneType' object is not iterable

The fix in this PR updates the checksum array from null ==> []

  1. After an empty array is replaced the following errors are seen:
alpine:latest: /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-616ac3bc.rsa.pub: At least one file checksum algorithm must be SHA1
alpine:latest: /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-616adfeb.rsa.pub: At least one file checksum algorithm must be SHA1
alpine:latest: /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-616ae350.rsa.pub: At least one file checksum algorithm must be SHA1
alpine:latest: /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-616db30d.rsa.pub: At least one file checksum algorithm must be SHA1
alpine:latest: /usr/share/udhcpc/default.script: At least one file checksum algorithm must be SHA1
Errors while parsing:  True

Given that a user has enabled the file digest cataloger:
SYFT_FILE_METADATA_CATALOGER_ENABLED=true SYFT_FILE_METADATA_DIGESTS=sha1

This PR adds the ability to generate a valid SPDX document by enhancing the package entity with a PackageVerificationCode.

The PackageVerificationCode is generated by obtaining all of the Contains relationships for a package and performing the following algorithm:

verificationcode = 0
filelist = templist = ""
for all files in the package {
    if file is an "excludes" file, skip it /* exclude SPDX analysis file(s) */

        append templist with "SHA1(file)/n"
    }
sort templist in ascending order by SHA1 value
filelist = templist with "/n"s removed. /* ordered sequence of SHA1 values with no separators */
verificationcode = SHA1(filelist)

https://spdx.github.io/spdx-spec/v2.3/package-information/#79-package-verification-code-field

Note:

We use the Contains relationship here since this is the value the SPDX validator will key off of if a package verification code is missing.

Example:
Validator warns: At least one file checksum algorithm must be SHA1 for package x files

After providing a SHA1 digest for the file:

Validator warns: Found analyzed files for package x when analyzedFiles is set to false

After settting analyzedFiles to true:

Validator warns: Missing Package Verification Code for package x

Note

Empty-size checks were removed from the digest cataloger in order to produce a valid document where empty files like /lib/apk/db/lock were not being summed.

If this is incorrect or needs to be reworked so we're not missing some edge case regarding the removed tar file note I'm happy to refactor.

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
example of empty file being ignored /lib/apk/db/lock

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs force-pushed the spdx-checksum-update branch from 650935b to e7948cc Compare December 14, 2022 16:09
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Dec 14, 2022

Benchmark Test Results

Benchmark results from the latest changes vs base branch
name                                                       old time/op    new time/op    delta
ImagePackageCatalogers/alpmdb-cataloger-2                    11.4ms ± 1%    11.7ms ± 0%  +2.36%  (p=0.029 n=4+4)
ImagePackageCatalogers/ruby-gemspec-cataloger-2              1.31ms ± 1%    1.28ms ± 1%  -2.38%  (p=0.016 n=4+5)
ImagePackageCatalogers/python-package-cataloger-2            3.34ms ± 1%    3.34ms ± 1%    ~     (p=1.000 n=5+5)
ImagePackageCatalogers/php-composer-installed-cataloger-2    1.08ms ± 1%    1.06ms ± 2%    ~     (p=0.056 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2         768µs ± 2%     744µs ± 1%  -3.11%  (p=0.008 n=5+5)
ImagePackageCatalogers/dpkgdb-cataloger-2                     886µs ± 1%     845µs ± 0%  -4.65%  (p=0.008 n=5+5)
ImagePackageCatalogers/rpm-db-cataloger-2                    1.30ms ± 1%    1.25ms ± 0%  -4.05%  (p=0.008 n=5+5)
ImagePackageCatalogers/java-cataloger-2                      14.8ms ± 1%    14.7ms ± 1%  -0.98%  (p=0.016 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                      891µs ± 2%     858µs ± 0%  -3.66%  (p=0.008 n=5+5)
ImagePackageCatalogers/go-module-binary-cataloger-2          6.41µs ± 1%    6.44µs ± 2%    ~     (p=0.690 n=5+5)
ImagePackageCatalogers/dotnet-deps-cataloger-2               1.37ms ± 2%    1.37ms ± 0%    ~     (p=0.690 n=5+5)
ImagePackageCatalogers/portage-cataloger-2                    715µs ± 1%     684µs ± 0%  -4.24%  (p=0.008 n=5+5)
ImagePackageCatalogers/sbom-cataloger-2                      4.46ms ± 0%    4.34ms ± 0%  -2.82%  (p=0.008 n=5+5)
ImagePackageCatalogers/binary-cataloger-2                    3.90ms ± 1%    3.77ms ± 0%  -3.39%  (p=0.008 n=5+5)

name                                                       old alloc/op   new alloc/op   delta
ImagePackageCatalogers/alpmdb-cataloger-2                    5.26MB ± 0%    5.27MB ± 0%    ~     (p=0.222 n=5+5)
ImagePackageCatalogers/ruby-gemspec-cataloger-2               205kB ± 0%     205kB ± 0%    ~     (p=0.421 n=5+5)
ImagePackageCatalogers/python-package-cataloger-2             963kB ± 0%     963kB ± 0%  +0.10%  (p=0.016 n=5+5)
ImagePackageCatalogers/php-composer-installed-cataloger-2     218kB ± 0%     218kB ± 0%    ~     (p=0.222 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2         159kB ± 0%     159kB ± 0%  +0.12%  (p=0.016 n=5+5)
ImagePackageCatalogers/dpkgdb-cataloger-2                     200kB ± 0%     200kB ± 0%    ~     (p=0.548 n=5+5)
ImagePackageCatalogers/rpm-db-cataloger-2                     303kB ± 0%     303kB ± 0%    ~     (p=0.056 n=5+5)
ImagePackageCatalogers/java-cataloger-2                      3.49MB ± 0%    3.49MB ± 0%    ~     (p=0.421 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                      182kB ± 0%     182kB ± 0%    ~     (p=1.000 n=5+5)
ImagePackageCatalogers/go-module-binary-cataloger-2          1.12kB ± 0%    1.12kB ± 0%    ~     (all equal)
ImagePackageCatalogers/dotnet-deps-cataloger-2                374kB ± 0%     374kB ± 0%  +0.03%  (p=0.016 n=5+5)
ImagePackageCatalogers/portage-cataloger-2                    139kB ± 0%     139kB ± 0%    ~     (p=0.421 n=5+5)
ImagePackageCatalogers/sbom-cataloger-2                       722kB ± 0%     722kB ± 0%  +0.04%  (p=0.008 n=5+5)
ImagePackageCatalogers/binary-cataloger-2                     656kB ± 0%     656kB ± 0%  +0.01%  (p=0.008 n=5+5)

name                                                       old allocs/op  new allocs/op  delta
ImagePackageCatalogers/alpmdb-cataloger-2                     85.7k ± 0%     85.7k ± 0%    ~     (p=0.365 n=5+5)
ImagePackageCatalogers/ruby-gemspec-cataloger-2               4.25k ± 0%     4.25k ± 0%    ~     (all equal)
ImagePackageCatalogers/python-package-cataloger-2             16.5k ± 0%     16.5k ± 0%    ~     (p=0.897 n=5+5)
ImagePackageCatalogers/php-composer-installed-cataloger-2     5.50k ± 0%     5.50k ± 0%    ~     (p=0.794 n=4+5)
ImagePackageCatalogers/javascript-package-cataloger-2         3.33k ± 0%     3.33k ± 0%    ~     (all equal)
ImagePackageCatalogers/dpkgdb-cataloger-2                     4.47k ± 0%     4.47k ± 0%    ~     (all equal)
ImagePackageCatalogers/rpm-db-cataloger-2                     8.12k ± 0%     8.12k ± 0%    ~     (all equal)
ImagePackageCatalogers/java-cataloger-2                       57.5k ± 0%     57.5k ± 0%    ~     (p=0.460 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                      5.23k ± 0%     5.23k ± 0%    ~     (all equal)
ImagePackageCatalogers/go-module-binary-cataloger-2            38.0 ± 0%      38.0 ± 0%    ~     (all equal)
ImagePackageCatalogers/dotnet-deps-cataloger-2                7.12k ± 0%     7.12k ± 0%    ~     (all equal)
ImagePackageCatalogers/portage-cataloger-2                    3.58k ± 0%     3.58k ± 0%    ~     (p=1.000 n=5+4)
ImagePackageCatalogers/sbom-cataloger-2                       24.4k ± 0%     24.4k ± 0%    ~     (all equal)
ImagePackageCatalogers/binary-cataloger-2                     22.2k ± 0%     22.2k ± 0%    ~     (all equal)

Comment thread syft/file/digest_cataloger.go
Comment thread test/cli/spdx_json_schema_test.go
Comment thread syft/formats/common/spdxhelpers/to_format_model.go
@spiffcs spiffcs changed the title chore: update schema to latest 2.3 chore: update schema to latest 2.3 - add valid checksum by default Dec 14, 2022
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs changed the title chore: update schema to latest 2.3 - add valid checksum by default bug: add empty array field; allow syft to generate SHA1 for all spdx documents Dec 14, 2022
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@kzantow kzantow linked an issue Dec 14, 2022 that may be closed by this pull request
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs
Copy link
Copy Markdown
Contributor Author

spiffcs commented Dec 19, 2022

--

@spiffcs spiffcs added the blocked Progress is being stopped by something label Dec 19, 2022
previously we were using filesOwned to interpret
files needed for the packageVerificationCode. This commit
simplifies this logic by defaulting to using the ContainsRelationship

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs removed the blocked Progress is being stopped by something label Dec 19, 2022
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Comment thread syft/sbom/sbom.go
Copy link
Copy Markdown
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great -- I can't wait to be able to have completely valid SPDX files!

Only real question is about image-java-spdx-tools/Dockerfile, primarily because I believe I saw that you might like to add some tests for this.

Comment thread test/cli/test-fixtures/image-java-spdx-tools/Dockerfile Outdated
Comment thread syft/sbom/sbom.go Outdated
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs changed the title bug: add empty array field; allow syft to generate SHA1 for all spdx documents bug: checksum empty array; allow syft to generate SHA1 for all spdx documents Dec 19, 2022
@spiffcs spiffcs changed the title bug: checksum empty array; allow syft to generate SHA1 for all spdx documents bug: checksum empty array; allow syft to generate SHA1 for spdx-tag-value Dec 19, 2022
Comment thread test/cli/spdx_tooling_validation_test.go Outdated
Copy link
Copy Markdown
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Comment thread test/cli/test-fixtures/image-java-spdx-tools/Makefile
Comment thread test/cli/spdx_tooling_validation_test.go Outdated
Copy link
Copy Markdown
Contributor

@wagoodman wagoodman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple of nit comments on the new test (feel free to take it or leave it). Overall a quite-worthy improvement to the SPDX validation --thanks for your time on this 🚀

Comment thread test/cli/spdx_tooling_validation_test.go Outdated
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs changed the title bug: checksum empty array; allow syft to generate SHA1 for spdx-tag-value bug: spdx checksum empty array; allow syft to generate SHA1 for spdx-tag-value documents Dec 19, 2022
@spiffcs spiffcs enabled auto-merge (squash) December 19, 2022 22:27
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
@spiffcs spiffcs merged commit 0f1e8fc into main Dec 20, 2022
@spiffcs spiffcs deleted the spdx-checksum-update branch December 20, 2022 00:10
spiffcs added a commit to raboof/syft that referenced this pull request Dec 20, 2022
* main: (87 commits)
  feat: Add license parsing for java (anchore#1385)
  fix: cyclonedx component type for binaries (anchore#1406)
  fix: openjdk detection pattern (anchore#1415)
  bug: spdx checksum empty array; allow syft to generate SHA1 for spdx-tag-value documents (anchore#1404)
  Add NetBSD support. (anchore#1412)
  feat: add catalog delete (anchore#1377)
  docs: remove file classifier (anchore#1397)
  chore: update latest cyclonedx library (anchore#1390)
  feat: Add Java binary catalogers (anchore#1392)
  chore: Update SPDX license list to 3.19 (anchore#1389)
  fix: add manual vendor/product removal to fix false flags (anchore#1070)
  Update Stereoscope to c5ff155d72f166e2332e160a75c3ff2b8e9c7e2e (anchore#1395)
  chore: fix test busybox image sha (anchore#1393)
  fix: go version not properly identified in binary (anchore#1384)
  Update Stereoscope to 3b80d983223f6e6fc2d33b0ffa003d30268418e9 (anchore#1376)
  fix: Update node binary package name (anchore#1375)
  feat: Generic Binary Cataloger (anchore#1336)
  recover from bad parsing of golang binary (anchore#1371)
  Fix parsing of apk databases with large entries (anchore#1365)
  Update syft bootstrap tools to latest versions. (anchore#1369)
  ...
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants