Offer alias-based de-duplication of vulnerabilities #1994

nscuro · 2022-10-02T20:07:48Z

Current Behavior:

#1642 introduced tracking of vulnerability aliases. We now know which vulnerabilities describe the same issue, but we don't yet use this data to reduce the overall noise of findings. Clients may perform de-duplication based on their specific needs (e.g., always preferring GHSA over CVE), but we should offer a canonical server-side solution.

Proposed Behavior:

There should be a mechanism to de-duplicate vulnerabilities. In order to stay backwards-compatible, new API endpoints or opt-in parameters should be introduced.

Note that the intention is not to de-duplicate during vulnerability data ingestion! We still want to keep the data from all sources.

There are multiple constraints that need to be considered. Steve mentioned a few of them in #1912 (comment):

> The most desired approach would be to favor CVEs over any of the alternative identifiers. This assumption should not be hard-coded.

> There will be occasions where a CVE does not exist, yet there are aliases between, say, GHSA and OSSINDEX. Need to figure out how to handle this case.

> There will be occasions where a CVE does not exist initially, but an OSSINDEX finding does. That OSSINDEX finding could be audited. At a later time, a CVE may be created and now there's a mapping. This happened with log4j and is likely the norm for high-profile vulnerabilities. I don't think we want to de-dup any finding that has an existing audit.

(There will be more constraints than this)

ShuP1 · 2023-05-26T10:08:57Z

In our case, this issue is the reason we have not enabled GitHub Advisories.
The additional vulnerabilities and better descriptions are awesome.

But the noise from ~80% duplicated findings makes it unusable.

Ignoring findings that have an existing alias would be good enough as a first solution.
Even if the resulting identifiers are non-deterministic (first found wins), it greatly reduces the human pain of checking each vulnerability twice.
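
A minimal sketch of the "first found wins" idea, not Dependency-Track code. The function name `drop_aliased_findings`, the plain string IDs, and the alias pairs are all hypothetical, and the identifiers in the example are made up. It only considers direct aliases, not transitive ones.

```python
from typing import Iterable


def drop_aliased_findings(
    findings: Iterable[str],
    alias_pairs: Iterable[tuple[str, str]],
) -> list[str]:
    """Keep the first finding seen; skip later findings that are a direct
    alias of one already kept (hypothetical helper, first found wins)."""
    # Symmetric lookup: vulnerability ID -> IDs known to describe the same issue.
    aliases: dict[str, set[str]] = {}
    for a, b in alias_pairs:
        aliases.setdefault(a, set()).add(b)
        aliases.setdefault(b, set()).add(a)

    kept: list[str] = []
    for vuln_id in findings:
        known = aliases.get(vuln_id, set())
        # Skip exact repeats and anything aliased to an already kept finding.
        if vuln_id in kept or any(k in known for k in kept):
            continue
        kept.append(vuln_id)
    return kept


# Made-up identifiers, in the order the findings were discovered.
findings = ["OSSINDEX-0001", "CVE-2024-0001", "GHSA-0002"]
alias_pairs = [("CVE-2024-0001", "OSSINDEX-0001")]
print(drop_aliased_findings(findings, alias_pairs))
# ['OSSINDEX-0001', 'GHSA-0002']
```

Which identifier survives depends purely on discovery order, which is the non-determinism mentioned above.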

LaVibeX · 2024-03-19T16:01:42Z

Hi @nscuro we would like to discuss our approach to tackle this issue.

> The most desired approach would be to favor CVEs over any of the alternative identifiers. This assumption should not be hard-coded.

1. We want to create a section where the admin can prioritize the vulnerability sources.
2. De-duplication will not be applied to vulnerabilities that are already attributed to components.
3. Add a toggle button to hide duplicates: the view hides non-audited duplicates, and if none of the duplicates are audited, it shows the one from the highest-priority source as defined in step 1.

> There will be occasions where a CVE does not exist, yet there are aliases between, say, GHSA and OSSINDEX. Need to figure out how to handle this case.

Cascade priority: if, for instance, a CVE does not exist but GHSA and OSSINDEX entries do, use the vulnerability source with the highest priority as defined in step 1.

> There will be occasions where a CVE does not exist initially, but an OSSINDEX finding does. That OSSINDEX finding could be audited. At a later time, a CVE may be created and now there's a mapping. This happened with log4j and is likely the norm for high-profile vulnerabilities. I don't think we want to de-dup any finding that has an existing audit.

In this instance, it is first come, first served. If the OSSINDEX finding was attributed first, the CVE will only be shown as an alias in the future, not attributed.
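
The rules above could be sketched roughly as follows. This is an illustration under assumptions, not an implementation: the `Finding` class, its fields, the source names, and `visible_findings` are all hypothetical, and the sketch does not cover the migration question for vulnerabilities that are already attributed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    vuln_id: str
    source: str          # e.g. "NVD", "GITHUB", "OSSINDEX" (illustrative names)
    attributed_at: int   # order in which the finding was attributed
    audited: bool = False


def visible_findings(group: list[Finding], source_priority: list[str]) -> list[Finding]:
    """Choose what to show for one group of findings that alias each other."""
    if not group:
        return []

    # Audited findings are never hidden, whatever their source.
    audited = [f for f in group if f.audited]
    if audited:
        return sorted(audited, key=lambda f: f.attributed_at)

    # Otherwise cascade through the admin-defined priority. Sources missing
    # from the list rank last; ties go to the finding attributed first.
    def rank(f: Finding) -> tuple[int, int]:
        try:
            position = source_priority.index(f.source)
        except ValueError:
            position = len(source_priority)
        return (position, f.attributed_at)

    return [min(group, key=rank)]


priority = ["NVD", "GITHUB", "OSSINDEX"]  # assumed admin configuration
ghsa = Finding("GHSA-0001", "GITHUB", attributed_at=2)
oss = Finding("OSSINDEX-0001", "OSSINDEX", attributed_at=1)
oss_audited = Finding("OSSINDEX-0001", "OSSINDEX", attributed_at=1, audited=True)

print([f.vuln_id for f in visible_findings([oss, ghsa], priority)])          # ['GHSA-0001']
print([f.vuln_id for f in visible_findings([oss_audited, ghsa], priority)])  # ['OSSINDEX-0001']
```

With no CVE in the group, the GHSA finding wins by priority; once the OSSINDEX finding carries an audit, it stays visible and the others are hidden.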

fatcatnoregret · 2024-04-16T14:53:47Z

Why not use an internal Dependency-Track ID (e.g. INT-1234) as the main identifier for vulnerabilities and put identifiers from public vulnerability databases in the alias section from the very beginning?
Example:
A new vulnerability is identified, CVE-2024-1234. In DT we see it as INT-1234 with CVE-2024-1234 (or a GHSA or VulnDB ID) as an alias. As soon as a GHSA exists, we can add its ID to the aliases and also attach additional information.
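
A rough sketch of what such an internal-ID registry might look like. The `VulnRegistry` class and its methods are hypothetical and do not exist in Dependency-Track, and the GHSA identifier is made up. The sketch does not handle merging two records that were registered separately before an alias between them became known.

```python
from typing import Optional


class VulnRegistry:
    """Maps external identifiers (CVE, GHSA, ...) to one internal ID."""

    def __init__(self, prefix: str = "INT") -> None:
        self._prefix = prefix
        self._next_number = 1
        self._internal_by_external: dict[str, str] = {}
        self._aliases_by_internal: dict[str, list[str]] = {}

    def register(self, external_id: str, alias_of: Optional[str] = None) -> str:
        """Return the internal ID for external_id, creating one if needed.

        If alias_of names an identifier that is already registered, the new
        identifier is attached to that existing internal record instead.
        """
        if external_id in self._internal_by_external:
            return self._internal_by_external[external_id]

        internal = self._internal_by_external.get(alias_of) if alias_of else None
        if internal is None:
            internal = f"{self._prefix}-{self._next_number}"
            self._next_number += 1
            self._aliases_by_internal[internal] = []

        self._internal_by_external[external_id] = internal
        self._aliases_by_internal[internal].append(external_id)
        return internal

    def aliases(self, internal_id: str) -> list[str]:
        return list(self._aliases_by_internal.get(internal_id, []))


registry = VulnRegistry()
first = registry.register("CVE-2024-1234")
second = registry.register("GHSA-xxxx-yyyy-zzzz", alias_of="CVE-2024-1234")
print(first, second, registry.aliases(first))
# INT-1 INT-1 ['CVE-2024-1234', 'GHSA-xxxx-yyyy-zzzz']
```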

LaVibeX · 2024-05-06T08:07:32Z

> Why not use an internal Dependency-Track ID (e.g. INT-1234) as the main identifier for vulnerabilities and put identifiers from public vulnerability databases in the alias section from the very beginning? Example: A new vulnerability is identified, CVE-2024-1234. In DT we see it as INT-1234 with CVE-2024-1234 (or a GHSA or VulnDB ID) as an alias. As soon as a GHSA exists, we can add its ID to the aliases and also attach additional information.

Hey @fatcatnoregret,

Thank you for your suggestion about using internal IDs as the main identifier for vulnerabilities and adding other identifiers in the alias section. It's a great idea! However, we might still face the challenge of deciding which information to display when users click on a vulnerability. This could be similar to the issue we're trying to address.

ad8-adriant · 2024-11-05T03:24:06Z

Is there a way this problem can be solved without discarding the duplicated findings? One of my use cases for DT is generating VEX documents that can be distributed to customers, and while I agree that it's frustrating having to analyse the same vulnerability three times, I still want a response for all three identifiers to appear in the exported VEX.

I'd like to propose an approach where, instead of de-duplicating findings by blocking or removing them, there is a "de-duplicated view" in which related findings are presented as a group, so they can be analysed as a single item and a response set in a single action. That would simplify the process of handling duplicate findings without any loss of information.
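
One way to picture the grouped view is to treat alias pairs as edges and group findings by connected component, then fan a single analysis decision out to every member. The sketch below uses a small union-find for that. The function names, the made-up identifiers, and the `"NOT_AFFECTED"` state string are illustrative assumptions, not an existing API.

```python
def group_by_alias(finding_ids: list[str], alias_pairs: list[tuple[str, str]]) -> list[list[str]]:
    """Group findings that are linked, directly or transitively, by aliases."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union every alias pair, including IDs that have no finding of their own,
    # so that two findings linked only through a third identifier still group.
    for a, b in alias_pairs:
        parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for finding_id in finding_ids:
        groups.setdefault(find(finding_id), []).append(finding_id)
    return list(groups.values())


def apply_analysis(group: list[str], state: str, analyses: dict[str, str]) -> None:
    """Record the same analysis state for every finding in the group."""
    for finding_id in group:
        analyses[finding_id] = state


findings = ["CVE-A", "GHSA-A", "OSSINDEX-A", "CVE-B"]
alias_pairs = [("CVE-A", "GHSA-A"), ("GHSA-A", "OSSINDEX-A")]
groups = group_by_alias(findings, alias_pairs)
print(groups)
# [['CVE-A', 'GHSA-A', 'OSSINDEX-A'], ['CVE-B']]

analyses: dict[str, str] = {}
apply_analysis(groups[0], "NOT_AFFECTED", analyses)
print(analyses)
# {'CVE-A': 'NOT_AFFECTED', 'GHSA-A': 'NOT_AFFECTED', 'OSSINDEX-A': 'NOT_AFFECTED'}
```

Because every identifier keeps its own analysis entry, an exported VEX document could still carry a response for each of the three identifiers.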

nscuro added the labels enhancement (New feature or request) and vuln-aliases (Issues related to vulnerability aliases) on Oct 2, 2022

syalioune mentioned this issue on Jan 8, 2023: Vulnerability with Unassigned Severity #2293 (Closed)

lme-nca mentioned this issue on Feb 8, 2023: Dependency-track parser should include Vulnerability Alias information DefectDojo/django-DefectDojo#7582 (Closed)

This was referenced on May 6, 2024:

Avoid duplicates with alias DependencyTrack/frontend#838 (Open)

Avoid duplicates with alias BACKEND #3685 (Open)
