Skip to content

Add PE binary cataloger#3911

Merged
wagoodman merged 1 commit into
mainfrom
add-pe-binary-cataloger
May 19, 2025
Merged

Add PE binary cataloger#3911
wagoodman merged 1 commit into
mainfrom
add-pe-binary-cataloger

Conversation

@wagoodman
Copy link
Copy Markdown
Contributor

This adds (back) the ability to detect packages from arbitrary dll and exe files. Before #3563 and #3768 were merged the dotnet-cataloger would claim that all dlls/exes found were .NET applications, which was not ideal -- these PRs (correctly) removed these findings so that only binaries with CLR evidence were raised up.

This new pe-binary-cataloger ignores any files that have CLR evidence as these files are already being considered in the .NET-based catalogers.

In terms of refactors, it made sense to take much of the PE parsing logic and centralize it under an internal package to use it across the .NET and binary catalogers.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

@github-actions github-actions Bot added the json-schema Changes the json schema label May 16, 2025
@wagoodman wagoodman force-pushed the add-pe-binary-cataloger branch from 8ecb91b to 09dd47d Compare May 16, 2025 21:15
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file was moved from syft/pkg/cataloger/dotnet/pe.go and some types and functions were exported -- nothing more.

Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
@wagoodman wagoodman force-pushed the add-pe-binary-cataloger branch from 09dd47d to 1f398d0 Compare May 16, 2025 21:19
@wagoodman wagoodman requested a review from a team May 16, 2025 21:21
Copy link
Copy Markdown
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

"purl"
]
},
"PeBinary": {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should this match the struct case of PEBinary?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the struct itself is already called PEBinary, we're looking at the name in the JSON schema (which has pretty simple rules for coming up with definition names)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was just about PE vs Pe


var (
// spaceRegex includes nbsp (#160) considered to be a space character
spaceRegex = regexp.MustCompile(`[\s\xa0]+`)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smallest nit: I'm sure we have this elsewhere and probably should be using this elsewhere -- it could be useful to have some standard functions/regexes/etc. that catalogers could share rather than re-writing this multiple times; and #160 isn't always obvious to include in whitespace, but we probably want to almost everywhere we are trimming or otherwise dealing with whitespace. Maybe a shared spaceNormalize is what I'm thinking of

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any other places in the code where we're doing ReplaceAll(subject, <space-like-things>, " ") (or similar) --if you find some examples of this happy to refactor in a follow up PR.

@wagoodman wagoodman merged commit e23ca43 into main May 19, 2025
12 checks passed
@wagoodman wagoodman deleted the add-pe-binary-cataloger branch May 19, 2025 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

json-schema Changes the json schema

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Read version resources from non-.NET DLLs and executables

2 participants