-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Alpine Linux /lib/apk/db/installed (Resolves #72) #107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start, thanks!
In addition to my specific comments on the code, could you also:
- run formatting
- add a test for when the file doesn't exist
- add a test that has the lines shuffled (with at least one
V:
before `P:) to confirm that we're not order-dependent - update
parse_test.go
to reflect the new parser
pkg/lockfile/parse.go
Outdated
@@ -32,6 +32,7 @@ var parsers = map[string]PackageDetailsParser{ | |||
"yarn.lock": ParseYarnLock, | |||
"gradle.lockfile": ParseGradleLock, | |||
"buildscript-gradle.lockfile": ParseGradleLock, | |||
"installed": ParseApkInstalled, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably not worth the extra complexity to try and solve right now, but I'm interested in how others feel about this - while I know installed
is the name of the file, it feels a little too generic; I wouldn't be surprised if there's another tool out there that creates a file with the same name that we might one day want to support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's too generic, it is not unlikely someone might just have a installed
file in their repository completely unrelated to alpine, which will spit out potentially confusing warnings.
A quick fix could be to check whether installed
is in /lib/apk/db/installed
and only scan that, though that is quite limited, since you might be copying it out, or mounting a docker image ...etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should be handled similar to git, sboms and csvs in osv-detector
(which are a parser located in lockfile
, but are never assumed when scanning directories - you have to explicitly say "this is a CSV I want you to check") - because /lib/apk/db/installed
is very specific, so you're only ever going to get it checked if you run the scanner from anywhere along that path with -r
anyway
So what if instead it had a dedicated flag, like --check-apk-installed
?
#94 could then be used as a way of telling the scanner to treat arbitrary files as installed
, but the scanner would never automatically assume a file called installed
should be parsed as-such
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, I made an issue here #124 to track it since as you said it's beyond the scope of this PR. We probably should solve this before the next release to avoid changing behavior.
…nd rename of fixtures to better match standard naming
…the case of missing file
…d commit parsing.
Hello @G-Rath, thank you for your precious feedback, I'm learning a lot in contributing to this project. Thank you very much. |
|
||
var curPkg PackageDetails = PackageDetails{} | ||
|
||
for scanner.Scan() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should take a similar approach to what I've done in the yarn.lock
parser, which was to do the parsing in two phases: 1. group the lines for each package, 2. parse each collection of lines in PackageDetails
.
While technically it's doing more loops, it should make it easier to reduce the amount of conditional branches (especially the duplicate ones outside the loop) which in my experience tends to be worth it - what do you think?
(if you do agree, then I think you should be able to use pretty much the exact same logic/structure as the yarn.lock
parser, then add an extra check to handle if the name couldn't be found, which could go in a couple of different places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problems for me, I'll try and get back to you soon.
I think that additional loops are not an issue since lockfiles are generally small.
There will be more code, but aligned to other parsers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @G-Rath ,
I have a question:
it should make it easier to reduce the amount of conditional branches
(especially the duplicate ones outside the loop)
maybe I'm missing something but with your solution, aren't you performing the same pattern and number of conditional branches?
I'm referring to parse-yarn-lock.go:
here and here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @G-Rath ,
committed changes. Tests ok.
Thank you again for your advice and support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I'm missing something but with your solution, aren't you performing the same pattern and number of conditional branches?
Correct, but the footprint is smaller since it's just a length check + an append; the idea of that function is effectively to try and get us out of that raw unstructured state as fast as possible, so that we can do the more critical "business logic" stuff against a good structure which usually lets us avoid more complex versions of that pattern (which is where bugs can easily get introduced).
I view it as a lesser evil type situation - I won't say it's impossible to avoid having a duplicated check like that somewhere in this code, but I would expect it to have a significant tradeoff in at least one area (performance, readability, size, etc) that would make it worse than duplicating these three lines (I'm happy to be proven wrong though, if anyone has some ideas).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is coming along really well, thanks for sticking with it! I think this should be the last main change we should make before this is good to go.
Also, I'm thinking we should rename the groupPackageLines
and parsePackageGroup
functions in the yarn.lock
parser to groupYarnPackageLines
and parseYarnPackageGroup
respectively - are you ok doing that in this PR? otherwise doing it in another is fine.
|
||
var curPkg PackageDetails = PackageDetails{} | ||
|
||
for scanner.Scan() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I'm missing something but with your solution, aren't you performing the same pattern and number of conditional branches?
Correct, but the footprint is smaller since it's just a length check + an append; the idea of that function is effectively to try and get us out of that raw unstructured state as fast as possible, so that we can do the more critical "business logic" stuff against a good structure which usually lets us avoid more complex versions of that pattern (which is where bugs can easily get introduced).
I view it as a lesser evil type situation - I won't say it's impossible to avoid having a duplicated check like that somewhere in this code, but I would expect it to have a significant tradeoff in at least one area (performance, readability, size, etc) that would make it worse than duplicating these three lines (I'm happy to be proven wrong though, if anyone has some ideas).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few nit-y suggestions, but it's looking pretty solid overall, great stuff!
/gcbrun |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this! LGTM.
Fix indentation I messed up in the merge
/gcbrun |
/gcbrun |
… (google#107) * Added Alpine APK filelock support * Alpine APK Installed Database V2 - Added some basic tests * Updated README with /lib/apk/db/installed support * Alpine installed file - reformat and remove unwanted comments * TrimPrefix and missing formatting * Alpine APK support - Update parse_test.go to reflect the new parser and rename of fixtures to better match standard naming * Alpine APK support - Update tests adding shuffled installed file and the case of missing file * fix: APK installed - support for missing pkg and version fields. Added commit parsing. * APK installed support: refactor to minimize duplicate loops * Refactor: rename Yarn parse package functions to follow apk installed naming * APK installed support: refactor to match with similar Yarn parser * APK installed support: Add package name in warning message. Small additional refactoring * Update pkg/lockfile/parse.go Fix indentation I messed up in the merge Co-authored-by: Rex P <[email protected]>
… (google#107) * Added Alpine APK filelock support * Alpine APK Installed Database V2 - Added some basic tests * Updated README with /lib/apk/db/installed support * Alpine installed file - reformat and remove unwanted comments * TrimPrefix and missing formatting * Alpine APK support - Update parse_test.go to reflect the new parser and rename of fixtures to better match standard naming * Alpine APK support - Update tests adding shuffled installed file and the case of missing file * fix: APK installed - support for missing pkg and version fields. Added commit parsing. * APK installed support: refactor to minimize duplicate loops * Refactor: rename Yarn parse package functions to follow apk installed naming * APK installed support: refactor to match with similar Yarn parser * APK installed support: Add package name in warning message. Small additional refactoring * Update pkg/lockfile/parse.go Fix indentation I messed up in the merge Co-authored-by: Rex P <[email protected]>
Add support for Alpine APK Installed Database V2 ( resolves #72 ):
/lib/apk/db/installed
Reference: https://wiki.alpinelinux.org/wiki/Apk_spec
I tried to follow similar lockfile parsers (e.g. requirements.txt).
Added some tests, updated README.