Allow subsetting a PEX lockfile in PEX format, not just pip #2411
Comments
I'm interested to try to implement this, and I imagine the implementation of |
Why does the existing export-subset, fed back in to |
A side note: you often reference commands like |
Ah, if you think it works sufficiently well/captures everything it needs to, I'm happy to use that as official guidance. I have a confirmation question: I note that I can see this potentially not mattering for the Pants/caching use case, but I imagine it may matter for the "reduce a lockfile for a bug report" case?
Ah, sorry. I like the all-in-one pex a lot: I have plopped it into my I acknowledge that it's a little confusing to have two slightly different meanings of the
Just brainstorming an additional option:
If this was supported, I'd personally just use |
Yup, should work just fine and give all the same guarantees.
They don't matter in-practice for Pants, unless things have changed. Against my advice, Pants opted to use universal locks which include just 1 lock; so I embarked on a huge Pex code effort to support those. The case See pantsbuild/pants#12458 for some background on picking the popular way of doing things (Poetry-style lock files at the time -> Pex Perhaps a better entry point / discussion: pantsbuild/pants#12200 or pantsbuild/pants#12568 Basically, there was a bit of a war waged around this and I lost and implemented the most complex, least secure thing - |
@huonw if you find issues with the |
I use a user-local venv for this myself. It's a bit nicer than the Pex PEX IMO since I can change its version easily:
And later:
Or:
I think the Pex PEX is only more convenient for Pants itself, which tries not to know anything about Python, and largely succeeds by being able to download a Pex "binary". |
Yup. That has been exactly the plan. |
Okay, I've had a chance to experiment a bit. Some observations:
That first one seems unresolvable with |
Ah, yeah. That's right. I went through some hoops to support locks of both VCS requirements and local project directories, neither of which Pip's Ok then. My comment above (#2411 (comment)) applies then. Let me know if you need more guidance or cry uncle. |
Thanks, the prompt to should avoid following |
Ok, I assigned you to help me keep track not to touch this. |
A lockfile pins a potentially huge universe of dependencies, and there are several use cases where efficiently cutting that down to a smaller applicable set would be handy:
This might be able to be implemented as a new `pex` value (or similar) for `pex3 lock export-subset --format=...`.
To make this more concrete, a workflow might be:

1. Create a lockfile containing `cowsay` and `tensorflow`, e.g. `PEX_SCRIPT=pex3 pex lock create cowsay tensorflow -o test.lock` (contents at the end).
2. Build a `pex` that uses only `tensorflow`: `pex tensorflow --lock test.lock -o test.pex`, within some system that does process-based caching like Pants or Bazel. (Even with a warm PEX cache, this takes ~30s on my machine. I'm aware of the various settings that can improve this, but that's orthogonal to this feature request, I think.)
3. Update the lockfile to change `cowsay` without changing `tensorflow` or any of its dependent libraries.
4. Rerun the build from step 2 against the updated lockfile.

Currently, naive (aka reliable) entire-file-based caching of the process execution in step 2 means step 4's rerun has to execute and cannot be served from cache.
If we had the requested feature, the process runners could instead do two steps to build this PEX:

1. `pex3 lock export-subset --format=pex --lock=test.lock tensorflow -o reduced.lock` (the `--format=pip` version takes about 0.5s on my machine)
2. `pex tensorflow --lock reduced.lock -o test.pex`

Under this scheme, when `cowsay` changes, the `pex3 lock export-subset --format=pex` invocation is invalidated (its input has changed), and has to rerun... but the `reduced.lock` output will be identical, and thus `pex tensorflow --lock reduced.lock -o test.pex` can be served from cache. The `export-subset` invocation is very fast in comparison to the full build, and thus this feature would unlock more efficient use of PEX.