Automated/guided remediation #352

Open
oliverchang opened this issue Apr 18, 2023 · 16 comments
Labels: enhancement (New feature or request), guided remediation (Related to guided remediation / osv-scanner fix), priority

Comments

@oliverchang
Collaborator

oliverchang commented Apr 18, 2023

Tracking issue for building a guided remediation feature as part of OSV-Scanner.

Some ideas:

  • Suggesting direct dependency updates to remediate transitive vulns.
  • Ways to prioritize vulnerabilities based on things like dependency depth, severity, whether it's dev-only, etc.
  • Minimal re-locks to avoid known vulnerabilities in dependencies.
  • Automating upgrades with unit tests in a feedback loop.
  • Graph visualisations.

Current roadmap:

  • Q1 2024 for release of feature for npm.

Check out #352 (comment) for a walkthrough of what we've been building.

@oliverchang oliverchang added enhancement New feature or request priority labels Apr 18, 2023
@oliverchang oliverchang changed the title Guided remediation Automated/guided remediation Apr 18, 2023
@abhisek
Contributor

abhisek commented Apr 27, 2023

@oliverchang This will be a VERY useful feature. I have some thoughts here:

Suggest the minimal number of direct dependencies to upgrade to specific versions to remediate the maximum risk.

There is of course a lot of detail to consider here, such as how to measure aggregated risk. But it will be very useful in driving remediation, which is often not feasible because of too many findings, or because of version conflicts in transitive dependencies when one or more direct dependencies are updated.

Another relatively simple but valuable feature would be the ability to identify a direct dependency upgrade to a specific version that remediates vulnerabilities in transitive dependencies. I think you already suggested that. I am not sure if it is possible to build a graph by parsing flat lockfiles like gradle.lockfile or requirements.txt.

In fact, just having a dependency graph representation instead of a list will enable a lot of mitigation-related analysis. I am not sure if it's already done, but it may be feasible to build a usable graph based on data from the deps.dev API.

Is there any plan to combine OSV and deps.dev data in future to perform both vulnerability detection and effective remediation?

@oliverchang
Collaborator Author

Thanks for the feedback @abhisek! We are definitely looking at ways to make the mechanisms flexible enough for users to decide what they want to prioritise. Please stay tuned here for when we have something ready to try.

We are working closely with the deps.dev team here.

@agmond

agmond commented Jun 13, 2023

Hi @oliverchang,
Will the remediation feature include a command that automatically updates the manifest file and the lock file (like npm audit fix), or will it only have the guidelines for a manual upgrade?

@oliverchang
Collaborator Author

@agmond yes: this tool will give you the changed manifest and lockfile at the end of the guided remediation.

@agmond

agmond commented Jul 27, 2023

Hi @oliverchang, I have another question here.
Will I be able to remediate only a single vulnerability at a time?
For example, let's say I have several issues in my dependencies file, but I want to fix them one by one (and not all of them at once).
Will this be possible?

@abhisek
Contributor

abhisek commented Aug 7, 2023

@oliverchang Just checking if someone is already working on this. I need the ability to build a dependency graph (instead of a list) by parsing lockfiles. I am willing to work on this feature and contribute to osv-scanner, but I am guessing it's not a trivial change, so I would like some input on a possible approach and on the challenges the team foresees.

My approach for this would be in following iterations:

  1. Refactor PackageDetailParser to support graph APIs
  2. Use deps.dev API to resolve dependencies for a package and add relationship to the graph
  3. Use code analysis to identify direct dependencies

What do you think?

@oliverchang
Collaborator Author

> Hi @oliverchang, I have another question here. Will I be able to remediate only a single vulnerability at a time? For example, let's say I have several issues in my dependencies file, but I want to fix them one by one (and not all of them at once). Will this be possible?

Yes, this tool will be completely configurable.

@abhisek: which ecosystems are you interested in? I suspect the amount of effort would vary largely depending on the ecosystem.

@michaelkedar is already working on this for npm. I believe it's possible to recreate the graph from the existing package-lock.json without any additional API calls -- @michaelkedar can you please confirm?

@abhisek
Contributor

abhisek commented Aug 7, 2023

@oliverchang Thanks for the info. I am primarily looking at the maven ecosystem, particularly pom.xml and gradle.lockfile lockfiles. I am also interested in the PyPI ecosystem, particularly requirements.txt lockfile. I think these are mostly a flat (list oriented) data structure and resolving them into a graph would need some data source for relationship between the packages.

I am yet to explore if there are package manager specific options / plugins that dump the dependency graph. I think it would be possible with Gradle / Maven but at the cost of having the scanner depend on these package managers at runtime which is probably not desirable.

@oliverchang
Collaborator Author

@abhisek we'd welcome contributions for Maven and PyPI here, given that we're focusing on npm at the moment.

That said, do you have any details on how you would leverage deps.dev to resolve graphs from a non-lockfile? There may be a fair bit of complexity involved in implementing the ecosystem-specific resolution algorithms.

@abhisek
Contributor

abhisek commented Aug 17, 2023

@oliverchang I need to do a POC to confirm, but here is a tentative approach for building a dependency graph from requirements.txt based on dependency relationship data from deps.dev

  • For each package in requirements.txt
  • Use deps.dev API to fetch dependencies for the package version
  • Add relations to other packages in requirements.txt as dependencies based on data from deps.dev
  • Any package (node) without an incoming edge is considered a direct dependency until we have code scanning capability (maybe in a future iteration) to accurately identify direct dependencies

Here the assumption is, requirements.txt contains a list of all dependencies, including transitive dependencies. While this need not be true given the requirements.txt spec, I don't think we can handle the cases where all transitive dependencies are not included in requirements.txt because there is no way for us to identify the version of the dependency package which is resolved at runtime by the package manager based on different constraints, including the latest available package from the registry.
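The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the package names are hypothetical, and `DECLARED` is a hand-written stand-in for per-version dependency data that would come from deps.dev, so no real API call is made here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical pinned packages, standing in for a fully pinned requirements.txt.
LOCKED: Dict[str, str] = {
    "webfw": "2.3.2",
    "tmpl": "3.1.2",
    "escape": "2.1.3",
    "httpc": "2.31.0",
}

# Hand-written stand-in for per-version dependency data (assumption: in
# practice this would be looked up from deps.dev for each package version).
DECLARED: Dict[Tuple[str, str], List[str]] = {
    ("webfw", "2.3.2"): ["tmpl", "cli-helper"],  # cli-helper is not pinned
    ("tmpl", "3.1.2"): ["escape"],
}


def build_graph(
    locked: Dict[str, str], declared: Dict[Tuple[str, str], List[str]]
) -> Dict[str, Set[str]]:
    """Add an edge only when the dependency is itself pinned in the lockfile."""
    edges: Dict[str, Set[str]] = {name: set() for name in locked}
    for name, version in locked.items():
        for dep in declared.get((name, version), []):
            if dep in locked and dep != name:
                edges[name].add(dep)
    return edges


def direct_dependencies(edges: Dict[str, Set[str]]) -> List[str]:
    """Treat every node without an incoming edge as a direct dependency."""
    has_incoming = {dep for deps in edges.values() for dep in deps}
    return sorted(name for name in edges if name not in has_incoming)


GRAPH = build_graph(LOCKED, DECLARED)
print(direct_dependencies(GRAPH))
```

Dependencies that are declared but not pinned (like `cli-helper` here) are silently dropped, which is exactly the limitation described above.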

@sarnesjo

Hi @abhisek, I work on the deps.dev team. The problem of reconstructing a dependency graph is indeed quite tricky for some ecosystems, and unfortunately the approach you propose won't work in the general case.

If A depends on B and C (and B and C have some other dependencies), then:

  • resolving the dependencies of A
  • separately resolving the dependencies of B and C, and then combining them

... won't, in general, produce the same result. The reason for this is that dependency resolvers have rules for how to pick versions of packages that show up multiple times in a dependency graph. (Different dependency resolvers have different rules.)
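A toy illustration of this point, assuming a made-up resolver rule ("one version per package, pick the highest version allowed by every constraint seen"). Real resolvers use different and more involved rules; the packages, versions and ranges below are hypothetical.

```python
from typing import Dict, List, Tuple

# Versions available for each (hypothetical) package.
AVAILABLE: Dict[str, List[int]] = {"B": [1], "C": [1], "D": [1, 2, 3]}

# package -> {dependency: (min_version, max_version)}, both bounds inclusive.
REQUIRES: Dict[str, Dict[str, Tuple[int, int]]] = {
    "A": {"B": (1, 1), "C": (1, 1)},
    "B": {"D": (1, 3)},
    "C": {"D": (1, 2)},
}


def resolve(roots: List[str]) -> Dict[str, int]:
    """Collect every constraint reachable from the roots, then pick the
    highest available version that satisfies all of them."""
    constraints: Dict[str, List[Tuple[int, int]]] = {}
    queue, seen = list(roots), set()
    while queue:
        pkg = queue.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        for dep, version_range in REQUIRES.get(pkg, {}).items():
            constraints.setdefault(dep, []).append(version_range)
            queue.append(dep)

    chosen: Dict[str, int] = {}
    for dep, ranges in constraints.items():
        low = max(r[0] for r in ranges)
        high = min(r[1] for r in ranges)
        candidates = [v for v in AVAILABLE[dep] if low <= v <= high]
        if not candidates:
            raise ValueError(f"no version of {dep} satisfies all constraints")
        chosen[dep] = max(candidates)
    return chosen


print(resolve(["A"])["D"])  # resolved together
print(resolve(["B"])["D"], resolve(["C"])["D"])  # resolved separately
```

Resolving A picks D=2, because C caps D at 2. Resolving B on its own picks D=3, so stitching together the separate results for B and C would put a version of D in the graph that the combined resolution never selects.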

What you can do with the deps.dev data is, if the package version you're inspecting is in our corpus (i.e. it's published to pypi.org in one of the packaging formats we understand) you can use the GetDependencies API endpoint to look up a dependency graph for it. Note that this graph may still be different from ones you see elsewhere, as dependency resolution depends on many environmental factors such as pip version, python version, OS and architecture, time, etc.

@oliverchang
Collaborator Author

oliverchang commented Nov 21, 2023

For folks following, here's a preview of what we've been working on. We're hoping to release this in Q1 next year for npm, with more ecosystems to come later.

Goal

The goal of guided remediation is to help developers who are flooded with vulnerability reports in their project dependencies. These projects often do not keep up to date with their dependencies, leading to a lot of toil when they need to upgrade them to fix known vulnerabilities. This is hard because upgrades often cause breakages, and they can’t be blindly applied all at the same time.

We're working on a tool as part of OSV-Scanner (leveraging https://deps.dev/) to help with prioritizing upgrades based on both impact as well as return on investment. This tool also enables fully automated remediation workflows.

Walkthrough of modes

Interactive mode

Let's jump right into what this looks like. Our tool works in two modes: an interactive mode and a scriptable automatic mode.

image

This is our interactive mode. Here we have a popular but unmaintained JavaScript project we found on GitHub called keystone-classic. When we scan it, we find that there are a whopping 169 vulnerabilities. We show which ones affect direct dependencies (31), which ones are transitive (138), and which ones are in dev-only dependencies (55).

We also provide a number of prioritization mechanisms to help users focus on the vulnerabilities that matter. One heuristic we have as a measure of exploitability is dependency depth. If you have a vulnerability in your dependency tree that's 10 layers deep, it's most likely less exploitable than a vulnerability in a direct dependency. We also let you set thresholds on severity, as well as whether or not you care about dev dependencies. As an example, let's set the maximum dependency depth to 4, the min severity to 6, and ignore dev dependencies.

image

When we apply these criteria, we instantly bring down the number of vulnerabilities from 169 to only 75.
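The filtering described above amounts to a simple predicate over the findings. A minimal sketch, assuming a hypothetical `Finding` record and made-up sample data (this is not the tool's actual data model):

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    vuln_id: str      # hypothetical identifier
    package: str
    depth: int        # shortest path from the project to the package
    severity: float   # assumed to be a 0-10 score
    dev_only: bool


def prioritise(
    findings: List[Finding], max_depth: int, min_severity: float, ignore_dev: bool
) -> List[Finding]:
    """Keep findings that are shallow enough, severe enough, and (optionally)
    not limited to dev dependencies."""
    return [
        f
        for f in findings
        if f.depth <= max_depth
        and f.severity >= min_severity
        and not (ignore_dev and f.dev_only)
    ]


FINDINGS = [
    Finding("VULN-1", "pkg-a", depth=1, severity=9.8, dev_only=False),
    Finding("VULN-2", "pkg-b", depth=6, severity=9.0, dev_only=False),  # too deep
    Finding("VULN-3", "pkg-c", depth=2, severity=4.3, dev_only=False),  # too mild
    Finding("VULN-4", "pkg-d", depth=2, severity=7.5, dev_only=True),   # dev-only
    Finding("VULN-5", "pkg-e", depth=4, severity=6.0, dev_only=False),
]

# Same thresholds as the walkthrough: depth <= 4, severity >= 6, no dev deps.
KEPT = prioritise(FINDINGS, max_depth=4, min_severity=6.0, ignore_dev=True)
print([f.vuln_id for f in KEPT])
```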

There are two broad categories of actions a user can do to resolve these.

In-place lockfile modification

image

One is something we call "in-place" lockfile modification. This is where we patch the dependency graph in-place to replace vulnerable versions with non-vulnerable versions, while still respecting all the version constraints in the graph. This is often the least risky approach, but also the approach that fixes the fewest vulnerabilities. On the right, we show how many vulnerabilities every individual upgrade fixes. For instance, upgrading ua-parser-js from 0.7.19 to 0.7.36 fixes 4 vulnerabilities. This list is ordered by the number of vulnerabilities fixed by each upgrade, and applying all of them will fix 28 vulnerabilities.
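As a rough sketch of the idea (not the tool's actual algorithm), an in-place upgrade for one package can be thought of as finding a newer version that no known vulnerability affects and that every dependent's version constraint still allows. Picking the lowest such version is an assumption made here to keep the change small; versions, ranges and vulnerability IDs are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Version = Tuple[int, int, int]
Range = Tuple[Version, Version]  # (inclusive lower bound, exclusive upper bound)


def in_place_upgrade(
    current: Version,
    available: List[Version],
    constraints: List[Range],
    affected: Dict[str, Set[Version]],
) -> Optional[Tuple[Version, int]]:
    """Return (new_version, number_of_vulns_fixed), or None if no version
    both avoids the vulnerabilities and satisfies every constraint."""
    fixed = [vid for vid, versions in affected.items() if current in versions]
    if not fixed:
        return None  # nothing to fix for this package
    for candidate in sorted(available):
        allowed = all(low <= candidate < high for low, high in constraints)
        clean = all(candidate not in versions for versions in affected.values())
        if candidate > current and allowed and clean:
            return candidate, len(fixed)
    return None


AVAILABLE = [(0, 7, 19), (0, 7, 28), (0, 7, 36), (1, 0, 2)]
AFFECTED = {
    "VULN-1": {(0, 7, 19), (0, 7, 28)},
    "VULN-2": {(0, 7, 19)},
}

# A dependent that allows any 0.7.x release.
print(in_place_upgrade((0, 7, 19), AVAILABLE, [((0, 7, 0), (0, 8, 0))], AFFECTED))
# A dependent that only allows versions below 0.7.30: no in-place fix exists.
print(in_place_upgrade((0, 7, 19), AVAILABLE, [((0, 7, 0), (0, 7, 30))], AFFECTED))
```

Sorting the resulting upgrades by their fixed count in descending order would give the kind of ordering described above.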

We can also show you the dependency graph of the vulnerable dependencies.
image

The dependencies highlighted in blue are direct dependencies. Here, for ua-parser-js, we see that there are at least 4 different paths leading up to it, and the dependency depth (shortest path) is 3.
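Dependency depth as a shortest path can be computed with a breadth-first search. A minimal sketch, assuming the project itself is depth 0 and direct dependencies are depth 1 (the exact counting convention is an assumption), on a hypothetical graph:

```python
from collections import deque
from typing import Dict, List


def dependency_depths(graph: Dict[str, List[str]], root: str) -> Dict[str, int]:
    """Breadth-first search: the first time a package is reached is along a
    shortest path, so that distance is its dependency depth."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in depths:
                depths[dep] = depths[node] + 1
                queue.append(dep)
    return depths


# Hypothetical graph with two paths to "vulnerable-pkg": one of length 3
# (via a -> c) and one of length 4 (via b -> d -> e).
GRAPH = {
    "project": ["a", "b"],
    "a": ["c"],
    "c": ["vulnerable-pkg"],
    "b": ["d"],
    "d": ["e"],
    "e": ["vulnerable-pkg"],
}

DEPTHS = dependency_depths(GRAPH, "project")
print(DEPTHS["vulnerable-pkg"])
```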

Now let's look at the actual in-place upgrade. In this screen you can choose which of the in-place upgrades you want to apply. It's almost like a shopping cart of upgrades. Once you have selected the ones you want, you can write the results out to the project’s package-lock.json, run tests and CI/CD, and see which ones don’t break.

image
image

Relock and direct dependency bumps

The other strategy for fixing vulnerabilities is relocking and direct dependency bumps. Relocking recomputes your entire dependency graph, taking the newest possible versions of all packages in your graph, while still respecting your graph's version constraints. This causes a larger number of changes to your graph, which potentially carries a larger risk of breakages.

image

When we relock, we fix 48 vulnerabilities instantly, and are left with 27 vulnerabilities. Of these, 11 are actually impossible to resolve, because they are in transitive dependencies where there is a lack of fix paths for them. That is, one or more intermediate dependencies’ version constraints force the vulnerable packages to be in your graph. There are no possible dependency upgrades that would get rid of any of these 11 vulnerabilities. For these, the only ways to remediate them would be to mitigate the vulnerabilities some other way, or investigate if they are false positives and create a VEX statement.

For the 16 that we can fix, our tool provides direct dependency upgrade options, ordered by the number of vulnerabilities resolved. These correspond to changes to the users’ package.json to change the versions of their direct dependencies. These are often major version upgrades that carry a bit of risk, so users can interactively try applying these to see what works for them.

Automatic mode / CLI usage

So that was the interactive mode, which enables users to understand their vulnerabilities and prioritise them. We also offer all of the functionality we just saw through our command line flags in a way that enables this to be scripted.

image

For instance, we can set the maximum dependency depth via --max-depth, the minimum severity via --min-severity, whether we want to relock via --relock, and more.

We also wrote a PoC script to show how we can automate the process of determining non-breaking dependency upgrades to achieve the best possible upgrade result with zero human interaction, by using unit tests in a feedback loop.

Guided remediation demo

Our script continuously tries to perform each suggested upgrade in progressively riskier ways, and runs tests to see if any breakages are caused. If they do, bad upgrades are added to a blocklist. The upgrades that do work are combined to produce the optimal set of available upgrades that don’t cause breakages.
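In simplified form, the feedback loop might look like the sketch below. It is not the PoC script itself: the "progressively riskier" ordering is reduced to a pre-sorted candidate list, `tests_pass` is a hypothetical callback that would normally apply the upgrades and run the project's test suite, and the fake implementation and `dep-*` names exist only for illustration.

```python
from typing import Callable, List, Tuple


def find_safe_upgrades(
    candidates: List[str], tests_pass: Callable[[List[str]], bool]
) -> Tuple[List[str], List[str]]:
    """Try each upgrade on top of the ones already accepted. Keep it if the
    tests still pass, otherwise add it to the blocklist."""
    applied: List[str] = []
    blocklist: List[str] = []
    for upgrade in candidates:  # assumed to be ordered least risky first
        trial = applied + [upgrade]
        if tests_pass(trial):
            applied = trial
        else:
            blocklist.append(upgrade)
    return applied, blocklist


def fake_tests_pass(upgrades: List[str]) -> bool:
    """Stand-in for running the real test suite: pretend that two upgrades
    break the build."""
    return not ({"mongoose", "marked"} & set(upgrades))


CANDIDATES = ["dep-a", "dep-b", "mongoose", "dep-c", "marked", "dep-d"]
APPLIED, BLOCKED = find_safe_upgrades(CANDIDATES, fake_tests_pass)
print(APPLIED, BLOCKED)
```

Each candidate costs one test run, so the loop is linear in the number of suggested upgrades. A real script would also need to restore the manifest and lockfile after a failed trial.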

image

By running this script against keystone-classic on all 169 vulnerabilities without any filtering, we were able to fully automatically find that relocking plus upgrading 4 of the 6 possible direct dependencies resulted in zero breakages for the project. This results in 114 vulnerabilities fixed. Of the remaining 55, 44 cannot be fixed due to a lack of fix paths. The remaining 11 fixable vulnerabilities require the bad packages mongoose and marked to be upgraded, which likely have breaking changes that need human involvement.

image

Thanks for reading this far :) let us know if anybody has any feedback on this.

@seelder

seelder commented Jan 8, 2024

@oliverchang Will OSV-scanner still rely on lockfiles for now? Or will it pull data from deps.dev, or have some other way of pulling other dependency information, e.g. based on package.json? (Sorry to randomly ask a question - I'm a PhD student interested in OSV-scanner as part of a research project)

@oliverchang
Collaborator Author

> @oliverchang Will OSV-scanner still rely on lockfiles for now? Or will it pull data from deps.dev, or have some other way of pulling other dependency information, e.g. based on package.json? (Sorry to randomly ask a question - I'm a PhD student interested in OSV-scanner as part of a research project)

Guided remediation will have a mode where we resolve manifests into full transitive graphs (leveraging deps.dev).

(And not at all! Very glad to see research interest in this project).

@abhisek
Contributor

abhisek commented Jan 13, 2024

@sarnesjo @oliverchang May not be relevant anymore, but closing the loop on what we discussed in #352 (comment)

I did some work on reconstructing a dependency graph from gradle.lockfile and data from deps.dev. The approach I used was:

  1. Consider gradle.lockfile as the source of truth for all nodes (package version) in the graph
  2. Use deps.dev to identify dependencies for a given node (package version)
  3. Find target nodes already existing in the graph from step [1] and add relations based on data discovered from [2]

This may not be very reliable and may not entirely match the dependency graph actually generated by gradle but it has the necessary information to suggest remediation for a top level (found in build.gradle) dependency instead of recommending the user to update a transitive dependency, which a user can't really do in normal workflow.
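Once such a graph exists, pointing a vulnerable transitive package back at the top-level dependencies that pull it in is a reachability query. A minimal sketch with hypothetical package names and a hand-written graph (not derived from a real gradle.lockfile):

```python
from typing import Dict, List


def top_level_owners(
    graph: Dict[str, List[str]], top_level: List[str], target: str
) -> List[str]:
    """Return the top-level dependencies from which `target` is reachable."""
    owners: List[str] = []
    for top in top_level:
        stack, seen = [top], set()
        while stack:
            node = stack.pop()
            if node == target:
                owners.append(top)
                break
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph.get(node, []))
    return owners


# Hypothetical reconstructed graph: lib-a and lib-b are declared in
# build.gradle, everything else only appears in the lockfile.
GRAPH = {
    "lib-a": ["lib-c"],
    "lib-b": ["lib-d"],
    "lib-c": ["lib-vuln"],
    "lib-d": [],
}

OWNERS = top_level_owners(GRAPH, ["lib-a", "lib-b"], "lib-vuln")
print(OWNERS)
```

Here only `lib-a` brings in `lib-vuln`, so a remediation suggestion would target `lib-a` rather than asking the user to change the transitive package directly.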


michaelkedar added a commit that referenced this issue Jan 22, 2024
Starting to make guided remediation public #352 🎉  

This PR has the code used to resolve the dependency graph of a manifest
(i.e. a `package.json`) using the deps.dev resolvers and find
vulnerabilities within it.

Doesn't include the code to actually parse/write `package.json` files -
will probably add that in the next PR.

Much of this has been reviewed internally already, but I've made some
significant changes/refactoring to `dependency_chain.go` and the
`computeVulns()` function in `resolve.go`, so please take a more careful
look at those.
michaelkedar added a commit that referenced this issue Mar 5, 2024
Link:
https://michaelkedar.github.io/osv-scanner/experimental/guided-remediation/

Doc page for guided remediation. Will appreciate feedback if things
aren't clear or if something's missing.

#352
@oliverchang
Collaborator Author

@michaelkedar michaelkedar added the guided remediation Related to guided remediation / osv-scanner fix label May 7, 2024
6 participants