Offer alias-based de-duplication of vulnerabilities #1994

Open
nscuro opened this issue Oct 2, 2022 · 5 comments · May be fixed by DependencyTrack/frontend#838 or #3685
Labels
enhancement New feature or request vuln-aliases Issues related to vulnerability aliases

Comments

@nscuro
Member

nscuro commented Oct 2, 2022

Current Behavior:

#1642 introduced tracking of vulnerability aliases. We now know which vulnerabilities describe the same issue, but we don't yet use this data to reduce the overall noise of findings. Clients may perform de-duplication based on their specific needs (e.g., always preferring GHSA over CVE), but we should offer a canonical solution from the server-side.

Proposed Behavior:

There should be a mechanism to de-duplicate vulnerabilities. In order to stay backwards-compatible, new API endpoints or opt-in parameters should be introduced.

Note that the intention is not to de-duplicate during vulnerability data ingestion! We still want to keep the data from all sources.

There are multiple constraints that need to be considered. Steve mentioned a few of them in #1912 (comment):

  • The most desired approach would be to favor CVEs over any of the alternative identifiers. This assumption should not be hard-coded.
  • There will be occasions where a CVE does not exist, yet there are aliases between, say, GHSA and OSSINDEX. Need to figure out how to handle this case.
  • There will be occasions where a CVE does not exist initially, but an OSSINDEX finding does. That OSSINDEX finding could be audited. At a later time, a CVE may be created and now there's a mapping. This happened with log4j and is likely the norm for high-profile vulnerabilities. I don't think we want to de-dup any finding that has an existing audit.

(There will be more constraints than this)

@ShuP1

ShuP1 commented May 26, 2023

In our case, this issue is the reason we have not enabled GitHub Advisories.
The additional vulnerabilities and better descriptions are awesome.

But the noise induced by ~80% of duplicated findings makes it unusable.

Ignoring findings that have an existing alias would be good enough as a first solution.
Even if the surviving identifier is non-deterministic (first found wins), it greatly reduces the human pain of checking each vulnerability twice.
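
To make the "first found wins" idea concrete, here is a minimal sketch. It is not Dependency-Track code: the dictionary keys `vuln_id` and `aliases` are hypothetical, the identifiers are made up, and the function assumes it receives the findings of a single component in the order they were discovered.

```python
from typing import Iterable


def dedupe_first_found(findings: Iterable[dict]) -> list[dict]:
    """Keep a finding only if none of its identifiers was seen before.

    Assumption: each finding is a dict with a "vuln_id" string and an
    optional "aliases" list (hypothetical field names).
    """
    seen_ids: set[str] = set()
    kept: list[dict] = []
    for finding in findings:
        ids = {finding["vuln_id"], *finding.get("aliases", [])}
        is_duplicate = bool(seen_ids & ids)
        # Remember every identifier, even for skipped findings, so that
        # alias chains (A -> B, B -> C) collapse into a single finding.
        seen_ids |= ids
        if not is_duplicate:
            kept.append(finding)
    return kept


findings = [
    {"vuln_id": "GHSA-aaaa-bbbb-cccc", "aliases": ["CVE-2024-1234"]},
    {"vuln_id": "CVE-2024-1234", "aliases": ["GHSA-aaaa-bbbb-cccc"]},
    {"vuln_id": "CVE-2024-9999", "aliases": []},
]
kept = dedupe_first_found(findings)
```

Which identifier survives depends purely on input order, which is exactly the non-determinism accepted above.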

@LaVibeX
Contributor

LaVibeX commented Mar 19, 2024

Hi @nscuro, we would like to discuss our approach to tackling this issue.


The most desired approach would be to favor CVEs over any of the alternative identifiers. This assumption should not be hard-coded.

  1. We want to create a section where the admin can prioritize the vulnerability sources.
  2. De-duplication will not be applied to vulnerabilities that are already attributed to components.
  3. A toggle button to hide duplicates: the view hides duplicates that have not been audited. If none of the duplicates has been audited, it shows the one whose source ranks highest in the prioritization from step 1.

There will be occasions where a CVE does not exist, yet there are aliases between, say, GHSA and OSSINDEX. Need to figure out how to handle this case.

  1. Cascade priority: If for instance CVE does not exist but GHSA and OSSINDEX do, consider the vulnerability source with the highest priority as defined in step 1.

There will be occasions where a CVE does not exist initially, but an OSSINDEX finding does. That OSSINDEX finding could be audited. At a later time, a CVE may be created and now there's a mapping. This happened with log4j and is likely the norm for high-profile vulnerabilities. I don't think we want to de-dup any finding that has an existing audit.

  1. In this instance, first come, first served. If OSSINDEX was attributed first, the CVE will only be shown as an alias in the future, not attributed.
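
The selection rules above (admin-defined priority, cascading when the preferred source is missing, never hiding audited findings) could be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: the `Finding` class, the `visible_findings` function, and the example priority order are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical admin-configured order, highest priority first.
SOURCE_PRIORITY = ("NVD", "GITHUB", "OSSINDEX")


@dataclass(frozen=True)
class Finding:
    vuln_id: str
    source: str
    audited: bool = False


def visible_findings(alias_group, priority=SOURCE_PRIORITY):
    """Return the findings to show for one group of aliased findings."""
    if not alias_group:
        return []
    audited = [f for f in alias_group if f.audited]
    if audited:
        # Audited findings are never hidden; unaudited duplicates are.
        return audited
    # No audits yet: cascade through the priority list. Sources that are
    # not listed rank last.
    rank = {source: i for i, source in enumerate(priority)}
    return [min(alias_group, key=lambda f: rank.get(f.source, len(rank)))]


# No CVE exists, so the next-highest source in the list is shown.
no_cve = [Finding("sonatype-0001", "OSSINDEX"), Finding("GHSA-aaaa-bbbb-cccc", "GITHUB")]
shown = visible_findings(no_cve)

# The OSSINDEX finding was audited before the CVE appeared, so it stays.
late_cve = [Finding("sonatype-0001", "OSSINDEX", audited=True), Finding("CVE-2024-1234", "NVD")]
shown_late = visible_findings(late_cve)
```

Treating this as a view over stored findings, rather than a filter at ingestion time, keeps the data from all sources intact.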

@fatcatnoregret

fatcatnoregret commented Apr 16, 2024

Why not use an internal Dependency-Track ID (e.g. INT-1234) as the main identifier for vulnerabilities and put identifiers from public vulnerability databases in the alias section from the very beginning?
Example:
A new vulnerability is identified, CVE-2024-1234. In DT we can see it as INT-1234, with CVE-2024-1234 (or GHSA, or VulnDB) as an alias. As soon as a GHSA exists, we can add its ID to the aliases and also attach some additional information.
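
A rough sketch of that data model, purely for illustration: the `Registry` and `InternalVulnerability` names are hypothetical, the numbering simply starts at 1234 to match the example, and merging two internal records that later turn out to be aliases of each other is deliberately left out.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class InternalVulnerability:
    internal_id: str
    aliases: set = field(default_factory=set)


class Registry:
    """Maps every public identifier to one internal vulnerability record."""

    def __init__(self, first_number: int = 1234):
        self._by_alias = {}
        self._numbers = count(first_number)

    def record(self, external_id, known_aliases=()):
        ids = {external_id, *known_aliases}
        # Reuse the internal record if any of the identifiers is known.
        vuln = next((self._by_alias[i] for i in ids if i in self._by_alias), None)
        if vuln is None:
            vuln = InternalVulnerability(f"INT-{next(self._numbers)}")
        vuln.aliases |= ids
        for i in ids:
            self._by_alias[i] = vuln
        return vuln


registry = Registry()
first = registry.record("CVE-2024-1234")
# Later a GHSA (made-up ID) is published and declares the CVE as an alias.
second = registry.record("GHSA-aaaa-bbbb-cccc", ["CVE-2024-1234"])
```

Both calls return the same record, so findings keyed on the internal ID would not be duplicated when a new public identifier shows up.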

@LaVibeX
Contributor

LaVibeX commented May 6, 2024

Why not use an internal Dependency-Track ID (e.g. INT-1234) as the main identifier for vulnerabilities and put identifiers from public vulnerability databases in the alias section from the very beginning? Example: A new vulnerability is identified, CVE-2024-1234. In DT we can see it as INT-1234, with CVE-2024-1234 (or GHSA, or VulnDB) as an alias. As soon as a GHSA exists, we can add its ID to the aliases and also attach some additional information.

Hey @fatcatnoregret,

Thank you for your suggestion about using internal IDs as the main identifier for vulnerabilities and adding other identifiers in the alias section. It's a great idea! However, we might still face the challenge of deciding which information to display when users click on a vulnerability. This could be similar to the issue we're trying to address.

@ad8-adriant

Is there a way this problem can be solved without discarding the duplicated findings? One of my use cases for DT is generating VEX documents that can be distributed to customers, and while I agree that it's frustrating having to analyse the same vulnerability three times, I still want a response for all three identifiers to appear in the exported VEX.

I'd like to propose an approach where, instead of de-duplicating findings by blocking or removing them, there is a "de-duplicated view" in which related findings are presented as a group, so they can be analysed as a single item and a response can be set for all of them in a single action. That would simplify the handling of duplicate findings without any loss of information.
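
One way such a grouped view could work is sketched below, assuming findings are plain dicts with hypothetical `vuln_id`, `aliases`, and `analysis` keys. Findings connected through aliases, directly or transitively, are grouped with a small union-find, and a single analysis decision is then written to every member so that each identifier still appears in an export.

```python
from collections import defaultdict


def group_by_alias(findings):
    """Group findings whose identifiers are linked through aliases."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for f in findings:
        find(f["vuln_id"])
        for alias in f.get("aliases", []):
            union(f["vuln_id"], alias)

    groups = defaultdict(list)
    for f in findings:
        groups[find(f["vuln_id"])].append(f)
    return list(groups.values())


def analyse_group(group, state):
    """Apply one analysis decision to every finding in the group."""
    for f in group:
        f["analysis"] = state


findings = [
    {"vuln_id": "CVE-2024-1234", "aliases": []},
    {"vuln_id": "GHSA-aaaa-bbbb-cccc", "aliases": ["CVE-2024-1234"]},
    {"vuln_id": "sonatype-0001", "aliases": ["GHSA-aaaa-bbbb-cccc"]},
    {"vuln_id": "CVE-2024-9999", "aliases": []},
]
groups = group_by_alias(findings)
largest = max(groups, key=len)
analyse_group(largest, "NOT_AFFECTED")  # illustrative state name
```

Because nothing is removed, all three identifiers keep their own finding and their own response; only the triage step is shared.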
