OSV.dev can't match deleted versions of packages #2407

Open
maaaaz opened this issue Jul 21, 2024 · 14 comments
Labels
backlog Important but currently unprioritized bug Something isn't working stale The issue or PR is stale and pending automated closure

Comments

@maaaaz

maaaaz commented Jul 21, 2024

Hello there,

Thanks for this amazing work but I am reporting here a crucial bug: known malicious packages are not detected when scanned.

How to reproduce:

  • Take the following "requirements.txt" file, which includes the known malicious package pymocks:

argcomplete==3.4.0
pymocks==0.0.1
attrs==21.2.0
Automat==20.2.0
Babel==2.8.0
bcrypt==3.2.0
blinker==1.4
certifi==2020.6.20
chardet==4.0.0
click==8.0.3
cloud-init==24.1.3
colorama==0.4.4
command-not-found==0.3
configobj==5.0.6
constantly==15.1.0
cryptography==3.4.8
dbus-python==1.2.18
distlib==0.3.4
distro==1.7.0
distro-info==1.1+ubuntu0.2
filelock==3.6.0
httplib2==0.20.2
hyperlink==21.0.0
idna==3.3
importlib-metadata==4.6.4
incremental==21.3.0
jeepney==0.7.1
Jinja2==3.0.3
jsonpatch==1.32
jsonpointer==2.0
jsonschema==3.2.0
keyring==23.5.0
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
MarkupSafe==2.0.1
mercurial==6.1.1
more-itertools==8.10.0
netifaces==0.11.0
oauthlib==3.2.0
packaging==24.1
pbr==5.8.0
pexpect==4.8.0
pipx==1.6.0
platformdirs==4.2.2
ptyprocess==0.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.1
Pygments==2.11.2
PyGObject==3.42.1
PyHamcrest==2.0.2
PyJWT==2.3.0
pyOpenSSL==21.0.0
pyparsing==2.4.7
pyparted==3.11.7
pyrsistent==0.18.1
pyserial==3.5
python-apt==2.4.0+ubuntu3
python-debian==0.1.43+ubuntu1.1
python-magic==0.4.24
pytz==2022.1
PyYAML==5.4.1
requests==2.25.1
SecretStorage==3.3.1
service-identity==18.1.0
six==1.16.0
sos==4.5.6
ssh-import-id==5.11
stevedore==3.5.0
systemd-python==234
tomli==2.0.1
Twisted==22.1.0
ubuntu-pro-client==8001
ufw==0.36.1
urllib3==1.26.5
userpath==1.9.2
virtualenv==20.13.0+ds
virtualenv-clone==0.3.0
virtualenvwrapper==4.8.4
wadllib==1.3.6
WALinuxAgent==2.2.46
zipp==1.0.0
zope.interface==5.4.0
  • Scan this "requirements.txt" file:
$ ./osv-scanner_linux_amd64 scan -L requirements.txt
Scanned /home/runner/requirements.txt file and found 83 packages
╭─────────────────────────────────────┬──────┬───────────┬──────────────┬───────────┬──────────────────╮
│ OSV URL                             │ CVSS │ ECOSYSTEM │ PACKAGE      │ VERSION   │ SOURCE           │
├─────────────────────────────────────┼──────┼───────────┼──────────────┼───────────┼──────────────────┤
│ https://osv.dev/GHSA-h4m5-qpfp-3mpv │ 7.8  │ PyPI      │ babel        │ 2.8.0     │ requirements.txt │
│ https://osv.dev/PYSEC-2021-421      │      │           │              │           │                  │
│ https://osv.dev/GHSA-43fp-rhv2-5gv8 │ 6.8  │ PyPI      │ certifi      │ 2020.6.20 │ requirements.txt │
│ https://osv.dev/PYSEC-2022-42986    │      │           │              │           │                  │
│ https://osv.dev/GHSA-xqr8-7jwr-rhp7 │ 7.5  │ PyPI      │ certifi      │ 2020.6.20 │ requirements.txt │
│ https://osv.dev/PYSEC-2023-135      │      │           │              │           │                  │
│ https://osv.dev/GHSA-c33w-24p9-8m24 │ 3.7  │ PyPI      │ configobj    │ 5.0.6     │ requirements.txt │
│ https://osv.dev/GHSA-3ww4-gg4f-jr7f │ 7.5  │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-5cpq-8wj7-hf2v │      │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-9v9h-cgj8-h64p │ 5.5  │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-jfhm-5ghh-2f97 │ 7.5  │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/PYSEC-2023-254      │      │           │              │           │                  │
│ https://osv.dev/GHSA-jm77-qphf-c4w8 │      │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-v8gr-m533-ghj9 │      │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-w7pp-m8wf-vj6r │ 6.5  │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-x4qr-2fvf-3mr5 │ 7.4  │ PyPI      │ cryptography │ 3.4.8     │ requirements.txt │
│ https://osv.dev/GHSA-jjg7-2v4v-x38h │ 7.5  │ PyPI      │ idna         │ 3.3       │ requirements.txt │
│ https://osv.dev/PYSEC-2024-60       │      │           │              │           │                  │
│ https://osv.dev/GHSA-h5c8-rqwp-cp95 │ 5.4  │ PyPI      │ jinja2       │ 3.0.3     │ requirements.txt │
│ https://osv.dev/GHSA-h75v-3vvj-5mfj │ 5.4  │ PyPI      │ jinja2       │ 3.0.3     │ requirements.txt │
│ https://osv.dev/GHSA-3pgj-pg6c-r5p7 │ 5.7  │ PyPI      │ oauthlib     │ 3.2.0     │ requirements.txt │
│ https://osv.dev/PYSEC-2022-269      │      │           │              │           │                  │
│ https://osv.dev/GHSA-mrwq-x4v8-fh7p │ 5.5  │ PyPI      │ pygments     │ 2.11.2    │ requirements.txt │
│ https://osv.dev/PYSEC-2023-117      │      │           │              │           │                  │
│ https://osv.dev/GHSA-ffqj-6fqr-9h24 │ 7.4  │ PyPI      │ pyjwt        │ 2.3.0     │ requirements.txt │
│ https://osv.dev/PYSEC-2022-202      │      │           │              │           │                  │
│ https://osv.dev/GHSA-9wx4-h78v-vm56 │ 5.6  │ PyPI      │ requests     │ 2.25.1    │ requirements.txt │
│ https://osv.dev/GHSA-j8r2-6x86-q33q │ 6.1  │ PyPI      │ requests     │ 2.25.1    │ requirements.txt │
│ https://osv.dev/PYSEC-2023-74       │      │           │              │           │                  │
│ https://osv.dev/GHSA-c2jg-hw38-jrqq │ 8.1  │ PyPI      │ twisted      │ 22.1.0    │ requirements.txt │
│ https://osv.dev/PYSEC-2022-195      │      │           │              │           │                  │
│ https://osv.dev/GHSA-rv6r-3f5q-9rgx │ 7.5  │ PyPI      │ twisted      │ 22.1.0    │ requirements.txt │
│ https://osv.dev/PYSEC-2022-160      │      │           │              │           │                  │
│ https://osv.dev/GHSA-vg46-2rrj-3647 │ 5.4  │ PyPI      │ twisted      │ 22.1.0    │ requirements.txt │
│ https://osv.dev/GHSA-xc8x-vp79-p3wm │ 5.3  │ PyPI      │ twisted      │ 22.1.0    │ requirements.txt │
│ https://osv.dev/PYSEC-2023-224      │      │           │              │           │                  │
│ https://osv.dev/GHSA-34jh-p97f-mpxf │ 4.4  │ PyPI      │ urllib3      │ 1.26.5    │ requirements.txt │
│ https://osv.dev/GHSA-g4mx-q9vg-27p4 │ 4.2  │ PyPI      │ urllib3      │ 1.26.5    │ requirements.txt │
│ https://osv.dev/PYSEC-2023-212      │      │           │              │           │                  │
│ https://osv.dev/GHSA-v845-jxx5-vc9f │ 8.1  │ PyPI      │ urllib3      │ 1.26.5    │ requirements.txt │
│ https://osv.dev/PYSEC-2023-192      │      │           │              │           │                  │
│ https://osv.dev/GHSA-jfmj-5v4g-7637 │ 6.9  │ PyPI      │ zipp         │ 1.0.0     │ requirements.txt │
╰─────────────────────────────────────┴──────┴───────────┴──────────────┴───────────┴──────────────────╯

The scan reports nothing about the pymocks package.

I tried different specifiers (pymocks==0.0.1, a bare pymocks, etc.), but it was never detected.
Since this package is malicious as a whole, detecting it should not require a version string: the mere presence of the package name in a lockfile should be enough!

Cheers!
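
As a rough illustration of the name-only detection requested above, here is a minimal sketch. The `KNOWN_MALICIOUS` set and the helper names are hypothetical, and the parser only handles simple `name==version` style lines; it is not how osv-scanner parses requirements files.

```python
import re
from typing import Iterable, List, Optional

# Hypothetical deny-list for illustration; a real tool would derive it
# from malicious-package advisories rather than hard-code it.
KNOWN_MALICIOUS = {"pymocks"}


def normalize(name: str) -> str:
    """Normalize a PyPI project name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line: str) -> Optional[str]:
    """Return the normalized project name of a simple requirements.txt line."""
    line = line.split("#", 1)[0].strip()
    if not line or line.startswith("-"):
        return None  # blank line, comment, or pip option such as -r / -e
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return normalize(match.group(0)) if match else None


def find_malicious(lines: Iterable[str]) -> List[str]:
    """List deny-listed package names found in the given lines, ignoring versions."""
    names = (requirement_name(line) for line in lines)
    return sorted({name for name in names if name in KNOWN_MALICIOUS})
```

With this sketch, both `pymocks==0.0.1` and a bare `pymocks` line are flagged, because the version specifier is never consulted.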

@maaaaz maaaaz changed the title Crucial bug: osv-scanner does not detect malicious package in lockfiles Crucial bug: osv-scanner does not detect known malicious package in lockfiles Jul 21, 2024
@G-Rath
Collaborator

G-Rath commented Jul 21, 2024

This looks like an osv.dev API bug, as both osv-detector and osv-scanner --experimental-offline report this vulnerability.

@G-Rath G-Rath added the bug Something isn't working label Jul 21, 2024
@andrewpollock
Contributor

I can confirm that

$ curl -d \
          '{"version": "0.0.1", "package": {"name": "pymocks", "ecosystem": "PyPI"}}' \
          "https://api.osv.dev/v1/query"
{}

is not matching https://api.osv.dev/v1/vulns/MAL-2022-7426 so this is an OSV.dev API bug not an OSV-Scanner one. I'll move this over.
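
For anyone who prefers to reproduce the query from a script, the curl call above can be sketched in Python with only the standard library. The helper names (`build_query`, `query_osv`, `vuln_ids`) are made up for this example; the endpoint and request body are the ones from the curl command.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(name, ecosystem, version):
    """Build the same JSON body as the curl command above."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def query_osv(name, ecosystem, version, timeout=10):
    """POST the query to the OSV.dev API and return the decoded JSON response."""
    body = json.dumps(build_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def vuln_ids(response):
    """Extract advisory IDs; an empty response ({}) yields an empty list."""
    return [vuln["id"] for vuln in response.get("vulns", [])]
```

Calling `vuln_ids(query_osv("pymocks", "PyPI", "0.0.1"))` performs a network request. Given the `{}` response shown above, it would return an empty list rather than including MAL-2022-7426.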

@andrewpollock andrewpollock transferred this issue from google/osv-scanner Jul 22, 2024
@andrewpollock
Contributor

Looking at https://api.osv.dev/v1/vulns/MAL-2022-7426, I can see what the problem is, and it's somewhat systemic to malicious package records: because such packages typically get removed from the package registry, there are no versions to enumerate. OSV.dev's API today relies on all known vulnerable (in this case, "malicious") versions being enumerated and present in the affected.versions[] field for detection.

This deficiency is being worked on in #2401
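
The failure mode described above can be sketched as follows. `EXAMPLE_RECORD` is a simplified, hypothetical record (not a copy of MAL-2022-7426), and `matches_enumerated` is an illustrative stand-in for enumeration-based matching, not the actual OSV.dev code.

```python
# Simplified, hypothetical record for a package that was removed from the
# registry: a range is present, but no concrete versions could be enumerated.
EXAMPLE_RECORD = {
    "id": "MAL-EXAMPLE",
    "affected": [
        {
            "package": {"name": "pymocks", "ecosystem": "PyPI"},
            "ranges": [{"type": "ECOSYSTEM", "events": [{"introduced": "0"}]}],
            "versions": [],
        }
    ],
}


def matches_enumerated(record, name, ecosystem, version):
    """Match only if the queried version is listed in affected[].versions."""
    for affected in record.get("affected", []):
        package = affected.get("package", {})
        if package.get("name") != name or package.get("ecosystem") != ecosystem:
            continue
        if version in affected.get("versions", []):
            return True
    return False
```

Because `versions` is empty, `matches_enumerated(EXAMPLE_RECORD, "pymocks", "PyPI", "0.0.1")` is `False` even though the range covers every version.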

@G-Rath
Collaborator

G-Rath commented Jul 22, 2024

@andrewpollock would it be straightforward and worthwhile to introduce a (hopefully) hotpath for advisories that are marked as impacting all versions, since shouldn't that be a case of matching the ecosystem + name?

I'd say that technically it's an optimization which, as a bonus, would let these advisories be matched without requiring a fuller API change.
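
One way to picture the suggested hot path is the sketch below. It assumes an advisory "impacts all versions" when it has an ECOSYSTEM or SEMVER range introduced at "0" with no closing event; the function names are hypothetical and this is not the OSV.dev implementation.

```python
CLOSING_EVENTS = ("fixed", "last_affected", "limit")


def range_is_unbounded(version_range):
    """True if the range starts at "0" and is never closed."""
    events = version_range.get("events", [])
    introduced_at_zero = any(event.get("introduced") == "0" for event in events)
    has_closing = any(key in event for event in events for key in CLOSING_EVENTS)
    return introduced_at_zero and not has_closing


def affects_all_versions(affected):
    """True if any version-based range of this entry covers every version."""
    return any(
        range_is_unbounded(version_range)
        for version_range in affected.get("ranges", [])
        if version_range.get("type") in ("ECOSYSTEM", "SEMVER")
    )


def hot_path_match(record, name, ecosystem):
    """Match on ecosystem + name alone, but only for all-version advisories."""
    for affected in record.get("affected", []):
        package = affected.get("package", {})
        same_package = (package.get("name"), package.get("ecosystem")) == (name, ecosystem)
        if same_package and affects_all_versions(affected):
            return True
    return False
```

Records with a bounded range return `False` here, so they would still need the regular version-aware path.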

@maaaaz
Author

maaaaz commented Jul 22, 2024

Hello and thanks for taking care of that issue.

would it be straightforward and worthwhile to introduce a (hopefully) hotpath for advisories that are marked as impacting all versions, since shouldn't that be a case of matching the ecosystem + name?

I totally agree: the version is irrelevant for known malicious packages.

@another-rex
Contributor

There are cases where versions do matter for malicious packages, e.g. when a legitimate package's repository is taken over by a malicious actor who then publishes a new release containing malicious code. In that case, all previous versions are still valid, non-malicious packages (though not always: if the registry is not immutable, the attacker might be able to swap out old versions as well).
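
To illustrate why versions can matter, here is a minimal sketch of range evaluation for the takeover case. It assumes plain dotted numeric versions (real ecosystems need their own version parsers) and handles only `introduced`, `fixed` and `last_affected` events; the function names are hypothetical.

```python
def parse(version):
    """Parse a dotted numeric version such as "2.0.1" into a comparable tuple.

    Simplification: "1.0" and "1.0.0" compare as different versions here.
    """
    return tuple(int(part) for part in version.split("."))


def event_version(event):
    """Return the parsed version carried by a single-key range event."""
    return parse(next(iter(event.values())))


def is_affected(version, events):
    """Walk the events in version order and track whether `version` is affected."""
    queried = parse(version)
    affected = False
    for event in sorted(events, key=event_version):
        if "introduced" in event and queried >= parse(event["introduced"]):
            affected = True
        elif "fixed" in event and queried >= parse(event["fixed"]):
            affected = False
        elif "last_affected" in event and queried > parse(event["last_affected"]):
            affected = False
    return affected
```

For a hijacked package whose first malicious release is a hypothetical 2.0.0, `is_affected("1.9.0", [{"introduced": "2.0.0"}])` stays `False`, while 2.0.0 and later match. A name-only match would wrongly flag the earlier, legitimate releases.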

@maaaaz
Author

maaaaz commented Jul 23, 2024

I don't know if and how this issue is linked to this one: github/advisory-database#4612

@andrewpollock
Contributor

I don't know if and how this issue is linked to this one: github/advisory-database#4612

They're related only in that they both concern malware (as opposed to security vulnerabilities).


This issue has not had any activity for 60 days and will be automatically closed in two weeks

See https://github.com/google/osv.dev/blob/master/CONTRIBUTING.md for how to contribute a PR if you're interested in helping out.

@github-actions github-actions bot added the stale The issue or PR is stale and pending automated closure label Sep 24, 2024
@another-rex
Contributor

Looking at this again, I think a solution if a package is removed (therefore cannot be enumerated) is to assume all versions are vulnerable?

If it were the case I described above, and the original maintainer regained control, I would guess that the entire package would not be removed from the repository; only the latest version would be yanked.

We would need a way to determine whether our failure to enumerate versions is caused by a temporary/transient error or by the package having been removed completely.
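
One possible way to draw that distinction is sketched below. It assumes the registry answers with HTTP 404 or 410 for a removed package and treats everything else that is not a success as transient; the names, the status-code mapping and the "three consecutive checks" threshold are all assumptions made for this example.

```python
from enum import Enum


class EnumerationOutcome(Enum):
    OK = "ok"
    REMOVED = "removed"
    TRANSIENT = "transient"


def classify(status_code):
    """Classify one enumeration attempt by its HTTP status.

    `status_code` is None when no response arrived (timeout, DNS failure).
    """
    if status_code is None:
        return EnumerationOutcome.TRANSIENT
    if 200 <= status_code < 300:
        return EnumerationOutcome.OK
    if status_code in (404, 410):
        return EnumerationOutcome.REMOVED
    return EnumerationOutcome.TRANSIENT


def confirmed_removed(outcomes, required=3):
    """Treat a package as removed only after `required` consecutive REMOVED results."""
    recent = outcomes[-required:]
    return len(recent) == required and all(
        outcome is EnumerationOutcome.REMOVED for outcome in recent
    )
```

Requiring several consecutive "removed" results is a deliberately conservative choice: a single 404 during a registry incident would not be enough to mark all versions as affected.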

@github-actions github-actions bot removed the stale The issue or PR is stale and pending automated closure label Sep 25, 2024
@maaaaz
Author

maaaaz commented Sep 30, 2024

if a package is removed (therefore cannot be enumerated) is to assume all versions are vulnerable?

I agree.

When a non-legitimate-package-spoofing malicious package is found, whether it has been removed or not, it should simply be marked as malicious by package name.


This issue has not had any activity for 60 days and will be automatically closed in two weeks

See https://github.com/google/osv.dev/blob/master/CONTRIBUTING.md for how to contribute a PR if you're interested in helping out.

@github-actions github-actions bot added the stale The issue or PR is stale and pending automated closure label Nov 30, 2024
@andrewpollock andrewpollock added the backlog Important but currently unprioritized label Nov 30, 2024
@rohitcoder

Hey guys, just checking in on this again. It looks like this has been open for a long time. Let me know if there is already a PR or WIP; I can also contribute to this feature if someone can point me to the responsible code (if the OSV API is also open source).

@another-rex
Contributor

We are still currently working out the approach we want to take to resolve this, as there are a few different ways to solve this issue. Definitely still WIP here, but hopefully we'll have a solution for this soon.

The API is open source; you can see the code under the gcp/api/ folder.

@oliverchang oliverchang changed the title Crucial bug: osv-scanner does not detect known malicious package in lockfiles OSV.dev can't match deleted versions of packages Dec 5, 2024
hogo6002 added a commit that referenced this issue Dec 16, 2024
The [Malicious Packages](https://github.com/ossf/malicious-packages/tree/main/osv/malicious)
repository publishes OSV records for crates.io, npm, NuGet, PyPI, and
RubyGems. Queries for NuGet, PyPI, and RubyGems match vulnerabilities
only against the specific versions listed under `affected versions`
(npm and crates.io use semantic versioning, so their matching process is
different). However, in some cases malicious package records provide
only `affected ranges` instead of individual versions (e.g.
https://api.osv.dev/v1/vulns/MAL-2022-7426). OSV also can't enumerate
affected versions for a malicious package, because those versions have
been deleted. This causes issues like
#2407

Switching the API query version matching from
`_query_by_generic_version()` to
[`_query_by_comparing_versions()`](https://github.com/google/osv.dev/blob/4d981692feb5e088d12177d97a44fe30701b3854/gcp/api/server.py#L1186)
can address this issue. The `_query_by_comparing_versions()` function
matches [both affected versions and affected
ranges](https://github.com/google/osv.dev/blob/4d981692feb5e088d12177d97a44fe30701b3854/gcp/api/server.py#L1381),
but might slow down queries somewhat.

Adding this PR after the end-of-year release to give more time to verify
performance on the test instance before rolling out to prod.