perf: switch to kubescape/syft v1.32.0-ks.2 + disable file catalogers#355

Merged
matthyx merged 3 commits into main from feature/syft-memory-improvements
Apr 29, 2026

Conversation

@slashben
Contributor

@slashben slashben commented Apr 28, 2026

Memory-reduction rollout (NAUT-1283)

Reduces node-agent + kubevuln scan peak RSS by 30.7% on gitlab-ee
(1,621 MB → 1,123 MB), fitting a 1.5 GB cgroup with 377 MB margin.

Measured deltas (gitlab-ee, 113,836 files; kernel peak RSS via /usr/bin/time -v)

| Variant | Peak RSS | Δ vs main+all-cats |
| --- | --- | --- |
| main + all catalogers | 1,621 MB | baseline |
| main + file-cats off | 1,419 MB | −202 MB |
| selective + file-cats off | 1,184 MB | −437 MB |
| combined + file-cats off | 1,123 MB | −498 MB (−30.7%) |
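
The deltas, the percentage, and the cgroup margin quoted above can be re-derived from the peak-RSS figures alone. The following is a minimal sketch of that arithmetic, not code from this PR; the names `variant`, `deltaMB`, and `reductionPct` are illustrative, and it assumes the 1.5 GB cgroup is counted as 1,500 MB, which is what the 377 MB margin implies.

```go
package main

import "fmt"

// variant holds one measured row: a label and its peak RSS in MB.
type variant struct {
	name   string
	peakMB int
}

// deltaMB returns the peak-RSS saving relative to the baseline, in MB.
func deltaMB(baselineMB, peakMB int) int {
	return baselineMB - peakMB
}

// reductionPct returns the saving as a percentage of the baseline.
func reductionPct(baselineMB, peakMB int) float64 {
	return 100 * float64(baselineMB-peakMB) / float64(baselineMB)
}

func main() {
	const baselineMB = 1621 // main + all catalogers
	// Assumption: 1.5 GB is treated as 1,500 MB (not 1,536 MiB).
	const cgroupMB = 1500

	rows := []variant{
		{"main + file-cats off", 1419},
		{"selective + file-cats off", 1184},
		{"combined + file-cats off", 1123},
	}
	for _, r := range rows {
		fmt.Printf("%-26s -%d MB (-%.1f%%), cgroup margin %d MB\n",
			r.name,
			deltaMB(baselineMB, r.peakMB),
			reductionPct(baselineMB, r.peakMB),
			cgroupMB-r.peakMB)
	}
}
```

If the cgroup limit is instead 1.5 GiB (1,536 MiB), the margin for the combined variant would be larger than the quoted 377 MB, so the figure in the description is the conservative one.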

Initiative status

  • Initiative 1 — disable file catalogers (this PR for node-agent / kubevuln)
  • Initiative 2 — binary-cataloger prefilter (in kubescape/syft v1.32.0-ks.2)
  • Initiative 3 — selective indexing (in kubescape/syft v1.32.0-ks.2)
  • Initiative 4 — parallelism = 1 (already in place: node-agent uses workerpool.New(1); kubevuln scanConcurrency defaults to 1)
  • Initiative 5 — GOMEMLIMIT at 80% of cgroup (this PR for helm-charts)
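
Initiative 5 is delivered through the Helm chart (the GOMEMLIMIT environment variable), so no Go code is needed for it. Purely as an illustration of the "80% of cgroup" calculation, here is a hedged sketch of the in-process equivalent using `runtime/debug.SetMemoryLimit`. It assumes a cgroup v2 host where the limit is exposed at `/sys/fs/cgroup/memory.max`; the helper names `parseCgroupLimit` and `softLimit` are hypothetical and do not appear in node-agent or kubevuln.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseCgroupLimit parses the contents of a cgroup v2 memory.max file.
// It returns ok=false when the limit is "max" (unlimited) or unparsable.
func parseCgroupLimit(raw string) (limit int64, ok bool) {
	s := strings.TrimSpace(raw)
	if s == "" || s == "max" {
		return 0, false
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return 0, false
	}
	return n, true
}

// softLimit returns the given percentage of the cgroup limit, in bytes.
func softLimit(cgroupBytes, percent int64) int64 {
	return cgroupBytes * percent / 100
}

func main() {
	// Assumption: cgroup v2 unified hierarchy mounted at /sys/fs/cgroup.
	raw, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		fmt.Println("no cgroup v2 limit found; leaving the Go memory limit unchanged")
		return
	}
	limit, ok := parseCgroupLimit(string(raw))
	if !ok {
		fmt.Println("cgroup memory is unlimited; leaving the Go memory limit unchanged")
		return
	}
	target := softLimit(limit, 80)
	// SetMemoryLimit sets the runtime's soft memory limit, the same knob
	// that the GOMEMLIMIT environment variable controls at startup.
	debug.SetMemoryLimit(target)
	fmt.Printf("cgroup limit %d bytes, Go soft memory limit set to %d bytes\n", limit, target)
}
```

The limit is soft: the Go runtime collects more aggressively as the heap approaches it but can still exceed it, which is why headroom below the cgroup limit matters.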

Cross-repo PRs

  • helm-charts: kubescape/helm-charts#PENDING_HELM
  • node-agent: kubescape/node-agent#PENDING_NA
  • kubevuln: kubescape/kubevuln#PENDING_KV

Audit

Pre-merge audit confirmed no production-path consumer reads
sbom.Files[*].Digests or sbom.Files[*].Metadata in node-agent,
kubevuln, or kubescape/storage. The two storage consumers
(containerprofile_processor.go:172, applicationprofile_processor.go:67)
only read f.Location.RealPath, which the directory walker still
populates even when the file catalogers are disabled. Selective indexing also
keeps 99.9% of the file-path coverage on gitlab-ee
(113,265 of 113,382 paths).

Reference: shared-designs-and-docs/syft-memory-improvement/2026-04-28-rollout-design.md

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Updated SBOM generation configuration to refine the cataloging pipeline.
  • Chores

    • Updated test data to reflect new formatting standards.
    • Updated underlying dependencies to improve tool functionality.

These three catalogers iterate every file in the scan tree and dominate
transient allocation, but their outputs are not consumed downstream in the
vulnerability scan pipeline. Disabling them saves ~200 MB peak RSS on
gitlab-ee and stacks with upstream selective-indexing + binary-prefilter
improvements.
Signed-off-by: Ben <ben@armosec.io>

Routes anchore/syft imports to the kubescape fork via replace directive.
The fork carries selective indexing + binary-cataloger pre-filtering on
top of v1.32.0; combined with the file-cataloger disable in the parent
commit, this reduces gitlab-ee scan peak RSS from 1,621 MB to 1,123 MB.

Refs: NAUT-1283
Signed-off-by: Ben <ben@armosec.io>
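
For reference, the replace directive described in the commit message above would look roughly like the following go.mod fragment. The module paths and the version come from this PR's description; the rest of go.mod is omitted, and the exact placement within the file is an assumption.

```go.mod
// Route every import of anchore/syft to the kubescape fork.
replace github.com/anchore/syft => github.com/kubescape/syft v1.32.0-ks.2
```

Go only honors replace directives in the main module's go.mod, so each consuming repository has to carry its own copy of this line rather than inheriting it from a dependency.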
@coderabbitai

coderabbitai Bot commented Apr 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d326421c-5d50-401e-9296-49ad4776fd0b

📥 Commits

Reviewing files that changed from the base of the PR and between 3b420fb and e905737.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • adapters/v1/syft.go
  • adapters/v1/testdata/alpine-embedded-sbom.json
  • adapters/v1/testdata/alpine-sbom.format.json
  • go.mod

Walkthrough

The pull request modifies SBOM generation configuration to exclude specific file catalogers (digest, metadata, executable) from Syft's pipeline, updates test data JSON files to reflect the new SBOM format with unescaped ampersands and removed file metadata, and adds a Go module replacement pointing Syft to a Kubescape fork.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| SBOM Generation Configuration (adapters/v1/syft.go) | Adds explicit cataloger selection removal to exclude file-digest, file-metadata, and file-executable catalogers from the SBOM pipeline before calling syft.CreateSBOM. |
| Test Data Updates (adapters/v1/testdata/alpine-embedded-sbom.json, adapters/v1/testdata/alpine-sbom.format.json) | Updates SBOM JSON test fixtures to use unescaped & separators in PURLs and removes metadata and digests objects from file entries, reflecting the new cataloger configuration. |
| Dependency Management (go.mod) | Adds module replacement mapping github.com/anchore/syft to github.com/kubescape/syft fork at version v1.32.0-ks.2. |

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

Suggested Labels

release

Poem

🐰 Catalogers trimmed with surgical care,
File digests and metadata stripped from the air,
Lighter SBOMs dance through the fork,
Kubescape's Syft does cleaner work!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and specifically summarizes the two main changes: switching to a kubescape/syft fork version and disabling file catalogers for performance improvements. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@slashben
Contributor Author

@matthyx — these three PRs implement the syft memory-reduction work we discussed (NAUT-1283). The combined effect is -498 MB peak RSS on gitlab-ee (1,621 → 1,123 MB), fitting the 1.5 GB cgroup with margin. The syft side is a clean two-commit branch on kubescape/syft tagged v1.32.0-ks.2 (rebased on anchore v1.32.0 per your "two commits" request). Cross-linked PRs:

Whenever you have a moment.

@matthyx
Contributor

matthyx commented Apr 28, 2026

@slashben fix unit tests pls

@matthyx matthyx moved this to Waiting on Author in KS PRs tracking Apr 28, 2026
file-digest-cataloger and file-metadata-cataloger are now disabled, so
$.files[i] no longer carries digests or metadata keys; update fixtures to
match the slimmer output
Signed-off-by: Ben <ben@armosec.io>
@github-actions

Summary:

  • License scan: failure
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure

Contributor

@matthyx matthyx left a comment

pending system tests

@matthyx matthyx added the release Create release label Apr 29, 2026
@matthyx matthyx merged commit 8e51783 into main Apr 29, 2026
11 checks passed
@matthyx matthyx deleted the feature/syft-memory-improvements branch April 29, 2026 13:42
@matthyx matthyx moved this from Waiting on Author to To Archive in KS PRs tracking Apr 29, 2026