Pants lockfile generation includes un-used dists and thus un-vetted dists. #12458
Comments
(From Slack):
@benjyw went on to point out:
-- I suspect the vast majority of users will prefer the flexibility of the resolve working on multiple platforms over the risk of worse security/stability. We know many Pants users use macOS for desktop use and Linux for CI, for example. I'm assuming this "issue"/"feature" has continued to work well for Pipenv/Poetry/pip-compile, as implied by their continued usage. FWIW, organizations who take supply chain attacks really seriously can hand edit the requirements.txt-style lockfile Pants generates to remove hashes for release files they do not trust. If that ends up being important to an org, we could perhaps look into helping to automate that as a follow-up.
I think you're right, but I'm also not convinced the current state of the art is the best trade-off. Another way would be to be able to generate partial lockfiles and then merge them. Not much less convenient, but more secure. This all assumes the current state of the art is not doing the wrong thing of just assuming that if a resolve computes `req1==X`, then all dists for X should be included. Environment markers can make that assumption invalid.
The lockfile does at least limit the window in which you can be exposed to an attack, namely the time you generate the lockfile. Assuming that happened when PyPI was uncompromised, your resolves will continue to be safe until the next lockfile change. That is not nothing.
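A minimal sketch of what automating the hand-editing Benjy mentions might look like, assuming the requirements.txt-style lockfile that pip-compile emits (one `--hash=sha256:...` continuation line per dist); the function name and the idea of feeding it a deny-list of digests are illustrative, not an existing Pants API:

```python
def strip_untrusted_hashes(lockfile_text: str, untrusted_hashes: set[str]) -> str:
    """Drop --hash lines whose digest an org has decided not to trust."""
    kept = []
    for line in lockfile_text.splitlines():
        stripped = line.strip().rstrip("\\").strip()
        # pip-compile emits hashes as "--hash=sha256:<digest> \" continuation lines.
        if stripped.startswith("--hash=sha256:"):
            digest = stripped.removeprefix("--hash=sha256:")
            if digest in untrusted_hashes:
                continue
        kept.append(line)
    # NOTE: a real implementation would also repair the trailing "\" on the last
    # kept line of each requirement, which this sketch does not bother with.
    return "\n".join(kept) + "\n"
```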
@chrisjrn proposed an idea that could allow us to do this if we go with Poetry 🙌: post-process the lockfile in between running Poetry and consuming the result.

--

For hashes, `[metadata.files]`:

```toml
[metadata.files]
attrs = [
    {file = "attrs-21.2.0-py2.py3-none-any.whl", hash = "sha256:149e90d6d8ac20db7a955ad60cf0e6881a3f20d37096140088356da6c716b0b1"},
    {file = "attrs-21.2.0.tar.gz", hash = "sha256:ef6aaac3ca6cd92904cdd0d83f629a15f18053ec84e6432106f7a4d04ae4f5fb"},
]
colorama = [
    {file = "colorama-0.4.4-py2.py3-none-any.whl", hash = "sha256:9f47eda37229f68eee03b24b9748937c7dc3868f906e8ba69fbcbdd3bc5dc3e2"},
    {file = "colorama-0.4.4.tar.gz", hash = "sha256:5941b2b48a20143d2267e95b1c2a7603ce057ee39fd88e7329b0c292aa16869b"},
]
```

I've confirmed that if you delete an entry from the list for a particular dep and then run the resolve again, the deletion is respected.

Presumably, we could inspect the file names to determine if any are unused?

--

You can simply delete a dep from the lockfile. For example, Pytest depends on `atomicwrites` only on Windows, but the `atomicwrites` entry itself carries no platform information:

```toml
[[package]]
name = "atomicwrites"
version = "1.4.0"
description = "Atomic file writes."
category = "dev"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
```

Instead, the platform information comes from Pytest's entry:

```toml
[[package]]
name = "pytest"
version = "5.4.3"
description = "pytest: simple powerful testing with Python"
category = "dev"
optional = false
python-versions = ">=3.5"
[package.dependencies]
atomicwrites = {version = ">=1.0", markers = "sys_platform == \"win32\""}
attrs = ">=17.4.0"
colorama = {version = "*", markers = "sys_platform == \"win32\""}
more-itertools = ">=4.0.0"
packaging = "*"
pluggy = ">=0.12,<1.0"
py = ">=1.5.0"
wcwidth = "*"
```

We'd need to know how to evaluate the marker and then follow the dep chain to recursively remove all unused dists, IIUC. Another possibility could be modifying …
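As a rough sketch of the two pieces discussed above, marker evaluation and file-name inspection, using the `packaging` library (the target environment and the supported-tag set are illustrative; a real implementation would derive them from the configured interpreter constraints and platforms, and the recursive dep-chain walk is elided here):

```python
from packaging.markers import Marker
from packaging.tags import Tag
from packaging.utils import parse_wheel_filename

# The environment we are locking for (illustrative values).
linux_py37 = {"sys_platform": "linux", "python_version": "3.7"}

# Evaluate Pytest's dependency marker for atomicwrites against that environment.
# It only applies on Windows, so atomicwrites (and its dists) could be pruned.
print(Marker('sys_platform == "win32"').evaluate(environment=linux_py37))  # False

# Inspect a locked file name to see which interpreters/ABIs/platforms it targets.
name, version, build, tags = parse_wheel_filename("attrs-21.2.0-py2.py3-none-any.whl")

# A wheel stays in the lockfile only if one of its tags is supported by some
# target environment; here the supported set is hard-coded for brevity.
supported = {Tag("py3", "none", "any"), Tag("cp37", "cp37m", "manylinux2014_x86_64")}
print(bool(tags & supported))  # True, so this wheel would be kept
```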
-- Of course, we would also need to have figured out which platforms/environments are going to be used. This is easy when …
…-compile (#12549)

**Disclaimer**: This is not a formal commitment to Poetry, as we still need a more rigorous assessment that it can handle everything we need. Instead, this is an incremental improvement in that Poetry handles things much better than pip-compile. It gets us closer to the final result we want, and makes it much more ergonomic to use the experimental feature; for example, `generate_all_lockfiles.sh` no longer needs any manual editing. But we may decide to switch from Poetry to something like PDM or Pex.

--

See #12470 for why we are not using pip-compile. One of the major motivations is that Poetry generates lockfiles compatible with all requested Python interpreter versions, along with Linux, macOS, and Windows. Meaning, you no longer need to generate the lockfile in each requested environment and manually merge them like we used to. This solves #12200 and obviates the need for #12463.

--

This PR adds only basic initial support. If we do decide to stick with Poetry, some of the remaining TODOs:

- Handle PEP 440-style requirements.
- Hook up to `[python-setup]` and `[python-repos]` options.
- Hook up to caching.
- Support `--platform` via post-processing `poetry.lock`: #12557
- Possibly remove un-used deps/hashes to reduce supply chain attack risk: #12458

--

Poetry is more rigorous than pip-compile in ensuring that all interpreter constraints are valid, which prompted needing to tweak a few of our tools' constraints.
Long overdue follow-up, but sloppy thinking here:

> Assuming that happened when PyPI was uncompromised, your resolves will continue to be safe until the next lockfile change.

Definitely not true. If an sdist is locked and the sdist's build system has any floating direct or transitive deps (extremely common), you are not safe from a future injection if the unlocked build system or its dependencies release new versions.

Towards that end though, and also re: …

You can use …
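To make the floating-build-deps point concrete, here is a small sketch that pulls the build requirements out of an sdist (assuming Python 3.11+ for `tomllib` and an sdist that ships a `pyproject.toml`; the function name is illustrative):

```python
import tarfile
import tomllib  # stdlib in Python 3.11+


def sdist_build_requires(sdist_path: str) -> list[str]:
    """Return the [build-system] requires of an sdist, e.g. foo-1.0.0.tar.gz."""
    with tarfile.open(sdist_path) as tf:
        member = next(m for m in tf.getmembers() if m.name.endswith("/pyproject.toml"))
        with tf.extractfile(member) as fp:
            data = tomllib.load(fp)
    # Typically something like ["setuptools>=40.8.0", "wheel"]: these are resolved
    # fresh every time the sdist is built, so the lockfile's hashes never cover them.
    return data.get("build-system", {}).get("requires", [])
```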
Pants uses pip-compile to generate lockfiles, and pip-compile apparently includes all dists for a given version of a requirement whether they were actually used in the resolve or not. This is a security and stability problem.
For example, if the original requirement is `foo>=1.0.0` and the IC is `CPython==3.7.*`, the lockfile might contain `foo==1.0.0` with hashes for the sdist and the cp37m wheels. Say the lockfile was generated, and later tested, with a CPython 3.7 interpreter built with the pymalloc extension; the wheel is then what is actually resolved and tested, and the sdist goes unused and untested. A lockfile consumer on a CPython 3.7 without the pymalloc extension will instead resolve and build the sdist, potentially years later. In normal cases the sdist will be faithful to the cp37m wheel and generate a cp37 wheel that is roughly equivalent. Even so, there is no guarantee this wheel behaves the same: there may be code that is conditional on the presence of the pymalloc extension and is buggy in the non-pymalloc branch. Worse, the wheels could be a honeypot and the sdist the trap (see https://docs.google.com/document/d/17Y6_YjLqv6hY5APWLaRw5CMvr_FVDBIWK6WS57mioNg/edit?usp=sharing for related concerns).

It appears that poetry.lock and Pipfile.lock have the same issue.
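To make the pymalloc example concrete, here is a small illustration using `packaging.tags` (the tag sets are written out by hand rather than taken from a real resolve):

```python
from packaging.tags import Tag

# The dists the lockfile pinned for foo==1.0.0: a cp37m wheel (plus the sdist).
locked_wheel_tags = {Tag("cp37", "cp37m", "manylinux2014_x86_64")}

# CPython 3.7 built *with* pymalloc advertises cp37m ABI tags, so the wheel
# matches and the sdist is never exercised when the lockfile is generated and tested.
with_pymalloc = {Tag("cp37", "cp37m", "manylinux2014_x86_64")}

# CPython 3.7 built *without* pymalloc only advertises cp37 ABI tags, so the
# wheel does not match and the consumer falls back to building the untested sdist.
without_pymalloc = {Tag("cp37", "cp37", "manylinux2014_x86_64")}

print(bool(locked_wheel_tags & with_pymalloc))     # True  -> wheel is used
print(bool(locked_wheel_tags & without_pymalloc))  # False -> sdist gets built
```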