feat: add "vertical" output format #889
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #889      +/-   ##
==========================================
+ Coverage   66.11%   66.55%   +0.44%
==========================================
  Files         160      162       +2
  Lines       12792    13002     +210
==========================================
+ Hits         8457     8654     +197
- Misses       3877     3889      +12
- Partials      458      459       +1
☔ View full report in Codecov by Sentry.
This introduces a set of crafted scanner results that each supported `output` format is run through, to showcase how they look across all the different results possible from a scanner run - it originally started life as the tests for #889, but I realised they could be used more generally for testing and reviewing all the outputters, so here we are. ~It looks like this has also revealed the SARIF output is unstable in its ordering, which I'll aim to address in a dedicated PR~
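As a rough illustration of the idea above (one set of crafted results run through every output format), here is a minimal sketch. The `result` type, the two formatters, and `renderAll` are all hypothetical and heavily simplified; they are not the scanner's real types, nor what its outputs actually look like.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// result is a hypothetical, heavily simplified stand-in for a scanner result.
type result struct {
	Source string
	Vulns  []string
}

// formatter renders one result as text; the real outputters are more involved.
type formatter func(r result) string

// formatVertical is an illustrative formatter that lists one ID per line.
func formatVertical(r result) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%s: found %d known vulnerabilities\n", r.Source, len(r.Vulns))
	for _, id := range r.Vulns {
		fmt.Fprintf(&b, "  %s\n", id)
	}
	return b.String()
}

// formatLine is a second illustrative formatter that puts everything on one line.
func formatLine(r result) string {
	return fmt.Sprintf("%s | %s\n", r.Source, strings.Join(r.Vulns, ", "))
}

// renderAll runs every fixture through every formatter and returns the
// outputs keyed as "<format>/<fixture>", ready to compare against snapshots.
func renderAll(fixtures map[string]result, formats map[string]formatter) map[string]string {
	out := make(map[string]string)
	for formatName, format := range formats {
		for fixtureName, r := range fixtures {
			out[formatName+"/"+fixtureName] = format(r)
		}
	}
	return out
}

func main() {
	fixtures := map[string]result{
		"none": {Source: "package-lock.json"},
		"two":  {Source: "go.mod", Vulns: []string{"EXAMPLE-1", "EXAMPLE-2"}}, // made-up IDs
	}
	formats := map[string]formatter{"vertical": formatVertical, "line": formatLine}

	rendered := renderAll(fixtures, formats)
	keys := make([]string, 0, len(rendered))
	for k := range rendered {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for stable output
	for _, k := range keys {
		fmt.Printf("--- %s ---\n%s", k, rendered[k])
	}
}
```

In a real test each rendered string would be compared against a stored snapshot rather than printed.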
LGTM - however, I am not sure if we want the output in this color scheme. @another-rex is there any discussion/agreement on this?
Thanks @G-Rath! It looks really nice! I tested it on container scanning, and the vertical results are much clearer than table ones. Adding some suggestions to further improve the display of container scanning results:
Example (Feel free to modify):
@hogo6002 thanks for the review and detailed writeup! It would be really useful if you could provide some reproductions for each of your points, as I'm not across all of the different aspects of the scanner like I think you are, so I'm less set up to e.g. produce a result that has an uncalled vulnerability, multiple ecosystems, image scanner results, etc. Alternatively, feel free to provide test cases, since then you can just craft the situation you're thinking of rather than having to track down a real-world image/lockfile, and we'll need those anyway to reflect the changes in the snapshots.
I think the main concern I have (if I'm understanding things right) is that your proposals would likely greatly increase the output in both directions, which is potentially a bad thing, e.g.
In my experience most advisories have at least three aliases: a CVE, a GHSA, and an ID from the original source (Go DB, Rust sec DB, Ruby sec DB, Python sec DB, etc) - so for every advisory you'd be looking at four lines minimum, and each of those will have a summary that is probably very similar but not always the same, which the user has to look through.
We'll also want to include a link to make it easy to access, so that's a link per advisory; and the representation of a "fix version" is typically different and multiple per advisory - some have commits, some have version ranges, and some have both. Plus there's determining which fixed version to show (either we show just the latest fix version, which might not be in the same major version, or we try to determine the best fit, in which case this outputter is suddenly doing a lot more complex work...).
In doing this, we'd have also added at least 2-3 levels of indenting, pushing the content further right, and this gets even worse in ecosystems like NPM that support multiple versions of the same package 😅
I definitely think there's more exploring to do with the outputting, but it feels like it could be too much to try and tackle in this initial PR - maybe also because currently the outputters don't have any real concept of context, and that might play a role? (i.e. they don't know when you're scanning a container vs a host vs an sbom, and I think things like summaries and os versions have different usages depending on what scanning mode you're actually in)
re your other comments:
I can't reproduce this, and it is very unexpected as we're explicitly printing to stdout + matching the table reporter - can you confirm the exact command you're running and in what terminal?
I'll skip this until we've discussed my opener, but just fyi in case you missed it, we do already put the summary count at the end of each section, which I had there as a form of block-end-marker.
I don't think we should do this, at least for now, because it makes an assumption about the classification and priority of ecosystems that afaik isn't captured in the osv spec
@G-Rath you are right, this is just an initial PR, we should probably not add too many new features to this. I will address the container scanning output format in a separate PR with some screenshots for better understanding. Adding some clarification for other comments:
I am using the command
Another comment about uncalled vulnerabilities: when I use the command to scan a Go project, the vertical output doesn't show which vulnerabilities are uncalled.
@hogo6002 that output is correct - the weirdness you're seeing is the color characters. It seems that the color library we're using doesn't do any special checks on when to include colors - I don't know the full details, but my understanding is that there are ways libraries can detect that they're being piped, and so often these libraries have that built in. I'll dig further into that and see what we can do
FYI we do a similar check with the table reporter to avoid adding styling when stdout isn't the terminal by passing in a terminalWidth of 0. (cmd/osv-scanner/main.go:155)
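For reference, one common way to detect that output is being piped uses only the Go standard library: check whether the file is a character device. This is a minimal sketch under that assumption, not what the scanner or its color library does; `isCharDevice`, `isTerminal`, and `colorize` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
)

// isCharDevice reports whether a file mode describes a character device,
// which is how an interactive terminal usually appears to a process.
func isCharDevice(mode os.FileMode) bool {
	return mode&os.ModeCharDevice != 0
}

// isTerminal is a best-effort check: pipes and regular files are not
// character devices, so styling can be skipped for them.
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false // when unsure, prefer plain output
	}
	return isCharDevice(info.Mode())
}

// colorize wraps text in ANSI red only when styling is enabled.
func colorize(text string, enabled bool) string {
	if !enabled {
		return text
	}
	return "\x1b[31m" + text + "\x1b[0m"
}

func main() {
	// Piping this program's output should print the text without escape codes.
	fmt.Println(colorize("3 known vulnerabilities", isTerminal(os.Stdout)))
}
```

The check is only a heuristic: other character devices such as `/dev/null` also pass it, which is one reason dedicated terminal-detection packages exist.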
@hogo6002 color outputting should now be fixed - while looking into that, I discovered that we could simplify the other outputters a bit too, so I have opened #1087 doing just that; weirdly, calling
Per our catch-up today, that just leaves call analysis to implement, as the rest we'll tackle as followups - I should have mentioned it at the time, but the fact that call analysis isn't in any of our output snapshots means we're missing it entirely from the output fixtures; it'd be great if someone could do a PR adding even just a basic sample of what that looks like in the scanner results struct, to save me some time - but if everyone's busy I can try to figure it out from the existing fixtures
@G-Rath Is this the call analysis output that you are looking for:
The first step will be making the vertical output default. I'm still not sure about whether we want to remove table output or not yet.
@another-rex @hogo6002 called vs uncalled is now being separated in the vertical output, so this should be good for re-review - I've intentionally stuck with a minimal presentation for now rather than introduce a new layer of indenting and headers that more explicitly call out "there are uncalled" like in @hogo6002's original sample, as I think it's serviceable and figure that'd be better to start with than over-implementing something very wordy that we might not like anyway
Thanks! LGTM with some minor comments.
continue
}

for _, id := range group.IDs {
nit: it's a bit indented, maybe we can refactor this into a separate function.
I feel like the awkward name of such a function would offset the gain in reducing indenting here
for _, id := range group.IDs {
	for _, v := range pkg.Vulnerabilities {
		if v.ID == id {
			if group.IsCalled() {
it's also kind of indented. Should we use some extra data structures to track vulnerability count?
I'd love to have this less indented, but I don't really think there's a way to do that which wouldn't just be smearing the indenting around (/ replacing it with different complexity).
Not saying we should do this everywhere, but personally I view it as a natural downside to the currently complex-but-necessary shape of the data structure, and something to review as/when there's more need for this kind of stuff
i.e. v2 might involve some reshaping to this structure that lets us improve this, or a new property might come along that makes it useful to have some functions that could also be used here
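To make the trade-off in this thread concrete, here is a sketch of the "extra data structure" variant: a set of known IDs replaces the inner loop over `pkg.Vulnerabilities`. The `vulnerability`, `group`, and `pkg` types are simplified, hypothetical stand-ins that only mirror the field names visible in the diff above, and the sketch assumes each ID appears at most once per package.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins mirroring the names in the diff.
type vulnerability struct{ ID string }

type group struct {
	IDs    []string
	Called bool
}

// IsCalled reports whether call analysis found the group's code reachable.
func (g group) IsCalled() bool { return g.Called }

type pkg struct {
	Vulnerabilities []vulnerability
	Groups          []group
}

// countCalled tallies vulnerabilities in called and uncalled groups.
// It assumes IDs are unique within pkg.Vulnerabilities.
func countCalled(p pkg) (called, uncalled int) {
	// Build a set of known IDs once so each lookup is O(1).
	known := make(map[string]struct{}, len(p.Vulnerabilities))
	for _, v := range p.Vulnerabilities {
		known[v.ID] = struct{}{}
	}

	for _, g := range p.Groups {
		for _, id := range g.IDs {
			if _, ok := known[id]; !ok {
				continue // the group references an ID this package does not list
			}
			if g.IsCalled() {
				called++
			} else {
				uncalled++
			}
		}
	}
	return called, uncalled
}

func main() {
	p := pkg{
		Vulnerabilities: []vulnerability{{ID: "EXAMPLE-1"}, {ID: "EXAMPLE-2"}}, // made-up IDs
		Groups: []group{
			{IDs: []string{"EXAMPLE-1"}, Called: true},
			{IDs: []string{"EXAMPLE-2"}, Called: false},
		},
	}
	called, uncalled := countCalled(p)
	fmt.Printf("called=%d uncalled=%d\n", called, uncalled)
}
```

This removes one level of nesting at the cost of an extra map and a setup loop, which is arguably the "different complexity" the reply above describes.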
LGTM!
This adds a new "vertical" output format that is designed for humans and based on the output of `osv-detector`, which effectively aims to group the output relating to each entity being scanned in vertical slices:

Unfortunately I think it suffers significantly due to the assumptions made by the rest of the codebase for outputting that made sense when the final output was a table, i.e. we dump a lot of information as we go about scanning, config files, vulnerability filtering, and so on that really should be grouped but currently cannot be, because they're all outputted at different stages.

I think a way to address that could be using some sort of event-emitter type pattern so that the reporters could be responsible for deciding what they actually do (e.g. `r.Emit("filtered-vulnerability", ...)`, and then most reporters could choose to just print immediately, and ones like "vertical" could choose to add it to an internal struct), but I think that'll involve a lot more work; for now I'm just going to ignore the pre-results output.

Resolves #85
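The event-emitter idea from the description could look something like the sketch below. Only the `Emit("filtered-vulnerability", ...)` call shape comes from the description; the `reporter` interface, both implementations, and the message format are assumptions made for illustration and are not part of the scanner.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// reporter is a hypothetical interface: scanning code emits named events
// and each reporter decides what to do with them.
type reporter interface {
	Emit(event string, message string)
}

// immediateReporter prints every event as soon as it arrives.
type immediateReporter struct{ w io.Writer }

func (r *immediateReporter) Emit(event, message string) {
	fmt.Fprintf(r.w, "[%s] %s\n", event, message)
}

// bufferedReporter collects events so they can be grouped at the end,
// as a "vertical"-style reporter might want to.
type bufferedReporter struct {
	order  []string            // event names in first-seen order
	events map[string][]string // messages per event name
}

func newBufferedReporter() *bufferedReporter {
	return &bufferedReporter{events: make(map[string][]string)}
}

func (r *bufferedReporter) Emit(event, message string) {
	if _, seen := r.events[event]; !seen {
		r.order = append(r.order, event)
	}
	r.events[event] = append(r.events[event], message)
}

// String renders the buffered events grouped by name.
func (r *bufferedReporter) String() string {
	var b strings.Builder
	for _, event := range r.order {
		fmt.Fprintf(&b, "%s:\n", event)
		for _, m := range r.events[event] {
			fmt.Fprintf(&b, "  %s\n", m)
		}
	}
	return b.String()
}

func main() {
	buffered := newBufferedReporter()
	reporters := []reporter{&immediateReporter{w: os.Stdout}, buffered}

	for _, r := range reporters {
		r.Emit("filtered-vulnerability", "EXAMPLE-1") // made-up ID
		r.Emit("config", "loaded example config")
		r.Emit("filtered-vulnerability", "EXAMPLE-2")
	}
	fmt.Print(buffered.String())
}
```

The scanning code stays unaware of how output is laid out: the immediate reporter interleaves events in arrival order, while the buffered one can group related messages before printing.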