Multi Platform support for Pipfile.lock #5130
Comments
Could the lock file just be per-platform/version/etc?
What would be the drawback to that type of approach, aside from legacy support for the existing |
What would be the advantage of maintaining multiple lock files per platform and environment, @dijital20? I am not a big fan of platform-specific lock files, since you would then have to know which one to use during sync.
Not for sync -- that would be very bad, because the hashes would regenerate and you would lose the trust that the dependencies you verified in dev/test are the same ones you are deploying to production. The current design can already be extended to account for multiple platforms: for multi-platform dependencies, all of the associated hashes are included in the lock. So this issue is really about dependencies that target a different platform than the one you are locking on; those should also be included in the lock file with the relevant platform marker. Today this only happens if you lock on an environment that the requirement is compatible with. Pipenv's own lock file has this problem -- if we do not lock on Windows, we lose a Windows-specific dependency from the lock file. EDIT: Just re-read the issue report and I am very surprised that the list of hashes for cryptography would be different between those two environments -- this part deserves some additional triage with the latest version of pipenv. |
Yeah, platform-specific and Python-version-specific dependencies are the issue to solve here. I had similar issues though when I moved a project from Python 3.8 to Python 3.10. My Pipfile and lock files both had Python 3.8 in them, and the standard
I am not sure if platform-specific lock files are the solution... just thinking through it as we ran into this today: a colleague is developing a new Python module that he wants to support on both Windows and Linux, but some of its dependencies are platform-specific. Some alternatives to consider:
|
@dijital20 or @xyxz-web -- were either of you by chance using private Python repositories instead of PyPI? I just locked on the latest pipenv
|
For https://pypi.org/project/cryptography/#files The lock you posted @matteius looks like it got all of them. What @xyxz-web is seeing are probably the ones appropriate for the platform and interpreter they're using. Like I said, I have a colleague who is trying to author a new package and wants to support both Windows and Linux, so he's developing out of both Windows and WSL2 Ubuntu. At least one of the dependencies (I suspect lxml) is different between the platforms, so if he locks on one platform, the lock file doesn't work on the other and vice versa. We have changed to using I hope this is helpful. :) Thank you for the discussion on this. |
@dijital20 You are right that I locked that on Windows, but I just tried on Ubuntu and got the same result:
I am going to need y'all to provide the output from |
I also get the long list of hashes when I lock from within WSL - seems to include both Windows and Linux hashes. |
@xyxz-web hash collection of all packages is a specific feature of pypi. There is no API into private pypi servers for collecting the set of hashes, and I suspect that is where the problem comes in. For reference, here is the code that collects the hashes: https://github.com/pypa/pipenv/blob/main/pipenv/utils/resolver.py#L757-L788 Specifically look for where it collects hashes from pypi: https://github.com/pypa/pipenv/blob/main/pipenv/utils/resolver.py#L726-L755 I would recommend if possible to use pypi as a primary index in the |
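For readers unfamiliar with where those hashes come from, here is a minimal sketch (not pipenv's actual resolver code) of reading the per-file sha256 digests out of a PyPI JSON API response. It assumes the public `https://pypi.org/pypi/<name>/<version>/json` endpoint, whose `urls` list carries a `digests` mapping for each release file. The function names and the sample payload are made up for illustration.

```python
import json
from urllib.request import urlopen


def hashes_from_payload(payload):
    """Return sorted 'sha256:<digest>' strings for every file in a release payload."""
    digests = set()
    for release_file in payload.get("urls", []):
        sha256 = release_file.get("digests", {}).get("sha256")
        if sha256:
            digests.add(f"sha256:{sha256}")
    return sorted(digests)


def fetch_hashes(name, version, index="https://pypi.org/pypi"):
    """Fetch the release JSON from the index (needs network access)."""
    with urlopen(f"{index}/{name}/{version}/json") as response:
        return hashes_from_payload(json.load(response))


# Hypothetical payload: one Windows wheel and one Linux wheel of the same release.
sample_payload = {
    "urls": [
        {"filename": "example-1.0-cp310-cp310-win_amd64.whl",
         "digests": {"sha256": "bbb"}},
        {"filename": "example-1.0-cp310-cp310-manylinux2014_x86_64.whl",
         "digests": {"sha256": "aaa"}},
    ]
}
print(hashes_from_payload(sample_payload))
```

Because the index lists every file of the release, the hashes for all platforms come back in one request, regardless of the machine doing the locking. An index without such an endpoint offers no equivalent shortcut, which matches the behaviour described above.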
@matteius perhaps with:
import platform
# somewhere inside the `sync` method definition
open(f"Pipfile.lock.{platform.machine()}")
I would not include the Python version in the lockfile name: the project should be developed for the version running on the production servers, to avoid surprises and to provide a more or less stable development experience with the features of that specific language version. |
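A slightly fuller sketch of that idea, purely illustrative: pipenv has no such lookup, and `lockfile_name` and `lockfile_for_machine` are hypothetical helpers. It falls back to the plain `Pipfile.lock` when no machine-specific file exists.

```python
import platform
from pathlib import Path


def lockfile_name(machine=None):
    """Build the hypothetical per-machine lock file name."""
    return f"Pipfile.lock.{machine or platform.machine()}"


def lockfile_for_machine(project_dir=".", machine=None):
    """Prefer Pipfile.lock.<machine>; fall back to the shared Pipfile.lock."""
    project = Path(project_dir)
    specific = project / lockfile_name(machine)
    return specific if specific.exists() else project / "Pipfile.lock"


print(lockfile_for_machine())
```

One caveat with keying on `platform.machine()` alone: the same CPU family is reported differently across operating systems (for example `AMD64` on Windows versus `x86_64` on Linux, and `arm64` on macOS versus `aarch64` on Linux), so the file name would implicitly encode the OS as well.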
I think this is a really important problem to solve right now. For example, if we have a project that uses
in
(the above allows me to work on M1 Mac and run CI on x86_64 Github Action + deploy to amd64 Linux server, but it's not sustainable as next time someone adds or updates a package it'll be lost) The Ruby Bundler solves this problem by adding separate lock file entries for each platform, e.g.
|
I agree with you, it is becoming higher on the list of priorities and we have tackled a number of other bugs and technical debt items that set us up better to work on this. I am inclined to try and find a solution that will work with the single @januszm Can you explain more about your workaround Another thing I have been thinking more about as it relates to pre-built wheels and private pypi's, is it would be great if we could get the private pypi servers to support a similar json API to pypi for fetching the available package hashes. Of the data that API returns, pipenv specifically leverages the part of the payload that is: That wouldn't help for packages that need to be built on system architectures to get those hashes (because no pre-built wheels exist) but it would help in general with this issue and with performance, though a platform specific build solution I believe needs to exist as well to manage hashes generated from different lock architectures. I think the numpy issue though, there are pre-built wheels for most architectures and pipenv will not pull all of the hashes from a private pypi because it would have to download all of the wheels because such an API is not supported. Perhaps part of the solution could be an option to override and tell it to download all platform eligible wheels from a private pypi in order to collect all the hashes, but again I would need to study that code some more to find out. |
Thanks for the detailed response @matteius. I don't know if I will be able to professionally explain what exactly this workaround is because I have never studied I am not sure how hashes work in pip and pipenv, but IMO it seems unnecessary to calculate hashes for extensions that are built locally. The hashes for the source code, which is common to all architectures, should provide the necessary minimum security, so they can be computed once. That is probably why my workaround works: the hashes of files that are built locally don't matter, because those files come from source code that already has hashes. UPDATE:
[package.dependencies]
numpy = [
{version = ">=1.18.5", markers = "platform_machine != \"aarch64\" and platform_machine != \"arm64\" and python_version < \"3.10\""},
{version = ">=1.19.2", markers = "platform_machine == \"aarch64\" and python_version < \"3.10\""},
{version = ">=1.20.0", markers = "platform_machine == \"arm64\" and python_version < \"3.10\""},
{version = ">=1.21.0", markers = "python_version >= \"3.10\""},
]
python-dateutil = ">=2.8.1"
# ... |
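To make the marker expressions above concrete, here is a small sketch of how the marker variables map to standard-library values (as defined for environment markers in PEP 508), plus a hand-written mirror of the numpy entry. `current_marker_environment` and `numpy_minimum` are illustrative names, and the branching is a plain-Python restatement of the TOML above, not a real marker evaluator.

```python
import os
import platform
import sys


def current_marker_environment():
    """Collect a few PEP 508 marker values for the running interpreter."""
    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_system": platform.system(),
        "platform_machine": platform.machine(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
    }


def numpy_minimum(env):
    """Return the numpy constraint that the entry above selects for `env`."""
    major, minor = (int(part) for part in env["python_version"].split("."))
    if (major, minor) >= (3, 10):
        return ">=1.21.0"
    if env["platform_machine"] == "arm64":
        return ">=1.20.0"
    if env["platform_machine"] == "aarch64":
        return ">=1.19.2"
    return ">=1.18.5"


env = current_marker_environment()
print(env, numpy_minimum(env))
```

The point of the sketch is that every branch depends only on values of the installing machine, so a resolver that evaluates markers solely against the locking machine sees just one of the four branches.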
@januszm Thanks I will take a closer look at that later, fwiw, I think you may be actually installing pre-built wheels for the platform. For example, there are prebuilt arm64 wheels for macos: https://pypi.org/project/numpy/1.23.2/#files The issue for pipenv though for pre-built wheels is when going through a private pypi, I believe it does not download all of the wheels to get the hashes, only the ones that match the system you are locking on. |
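As a rough illustration of how pre-built wheels advertise their target platform, the sketch below reads the platform tag from wheel filenames, which follow the `name-version[-build]-python-abi-platform.whl` convention. The filenames are illustrative examples in that format, and `wheel_platform_tags` and `group_by_platform` are hypothetical helpers, not pipenv or pip functions.

```python
def wheel_platform_tags(filename):
    """Return the platform tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are python tag, abi tag, platform tag.
    platform_field = stem.rsplit("-", 3)[-1]
    # A compressed tag set joins several platform tags with dots.
    return platform_field.split(".")


def group_by_platform(filenames):
    """Map each platform tag to the wheel filenames that carry it."""
    groups = {}
    for filename in filenames:
        for tag in wheel_platform_tags(filename):
            groups.setdefault(tag, []).append(filename)
    return groups


wheels = [
    "numpy-1.23.2-cp310-cp310-macosx_11_0_arm64.whl",
    "numpy-1.23.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "numpy-1.23.2-cp310-cp310-win_amd64.whl",
]
for tag, files in group_by_platform(wheels).items():
    print(tag, len(files))
```

A locker that only downloads the wheels matching the current machine would see just one of these groups, and therefore only that group's hashes.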
Ah true, if we could make pipenv work with all platforms, not only the one we're locking on, it would already be a great improvement |
TL;DR another case where this issue has a negative impact is when trying to publish a Jupyter notebook using Details: I have a notebook that I develop locally (M1 mac) and I'm trying to serve it using https://mybinder.org/. During the build it seems like Possible workaround: In drorata/candy-analysis@ed2c6eb I removed the |
One addition, FWIW: I have now run into a related issue. I'm trying to dockerize some work which was developed on an M1 Mac. To that end, in the
COPY ./Pipfile* .
RUN python -m pip install --upgrade pip
RUN pip install pipenv && pipenv install --dev --system --deploy
It comes as no surprise that I get the following warning: |
@drorata I am not sure how the markers are making it into your Pipfile for numpy ... I just locked numpy on several different systems (Windows, a Linux VM, and an M1 Mac), and none of them had markers restricting numpy. What version of |
Also noting to the larger group that I am unable to reproduce getting less than the full set of hashes for some of these packages on |
@matteius Thanks for your answer! I used version |
Does it mean this issue was fixed in 24.08? EDIT: looks like it does, after the pipenv upgrade I see that the platform specific markers are gone.
|
I've tried to find a way to detect if this could be an issue in my
pipenv run grep "Root-Is-Purelib: false" $(dirname $(dirname $(which python3)))/lib/python3*/site-packages * -R --include=WHEEL |
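A portable variant of the same check, as a sketch: instead of grepping `site-packages`, it asks `importlib.metadata` for each installed distribution's `WHEEL` file and looks for `Root-Is-Purelib: false`. The helper names are made up, and distributions installed without a `WHEEL` file are simply skipped.

```python
from importlib import metadata


def is_platform_specific(wheel_text):
    """True if WHEEL metadata declares the distribution as not pure-Python."""
    for line in wheel_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "root-is-purelib":
            return value.strip().lower() == "false"
    return False


def platform_specific_distributions():
    """List installed distributions whose wheels were platform-specific."""
    names = set()
    for dist in metadata.distributions():
        wheel_text = dist.read_text("WHEEL")  # None when the file is absent
        name = dist.metadata["Name"]
        if wheel_text and name and is_platform_specific(wheel_text):
            names.add(name)
    return sorted(names)


if __name__ == "__main__":
    print("\n".join(platform_specific_distributions()))
```

Run inside the project environment (for example with `pipenv run python`), it lists the packages most likely to have per-platform hashes in the lock file.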
Just wanted to update this thread that named package categories were released in October and some users have found it a good way to manage multi-platform dependencies for a single project (such as developers on different operating systems all using pytorch) -- https://pipenv.pypa.io/en/latest/basics/#specifying-package-categories I would suggest trying to make use of package categories to solve these edge cases, as there are not current plans to support lock file per platform. |
@matteius thanks for the update, however I think this section of the doc should contain more details about how this should actually be used to support multiple platforms
Eg. How to define groups for arm64 and x86_64 and how to use them to install packages |
@januszm That is a good suggestion to improve the docs around this -- I think in general the docs stand to be improved. I'd also like inputs about other options or configuration variables that would help make using the named categories easier -- for example right now |
Thanks, I'm not sure if I understand correctly how this will work, but does it mean that people working on different architectures will generate different sections of the Pipfile.lock but will not interfere with each other's groups? So if developer A adds numpy on x86_64, she changes Pipfile.lock; then developer B on a MacBook M1 also adds numpy, an arm64 group is added, and Pipfile.lock contains both variants? (From then on, all developers can pipenv sync and get numpy for their platform.) I just hope that this way of installing an architecture-dependent package via |
@matteius Thanks for sharing your release including the named categories all over the place. It's a nice feature, but it does not solve the "multi-platform" problem. It's now possible to define packages per category, but that is all. I'm still facing problems quite similar to those of @januszm. I have a setup with some tools in the dev-section of the Pipfile. It works great as long as those tools do not have platform-dependent dependencies. You might think that's quite rare but think about To keep all those packages in the lock-file, the The only way to go I currently see would be a complete duplication of the whole "dev"-section per platform. Is this really what you propose with the "named categories"? |
Can I get some more examples of cross-platform locking problems that don't involve wanting different versions of libraries (which technically can be handled by categories)? I've been playing around with our resolver implementation, and I see a way to specify the platforms to the finder we create, in a new method for the finder I've worked on, but I don't have great examples to test with. @kolibri91 Can you explain more about the |
Hi, of course. Please take a look at the following minimal example. I created a minimalistic Pipfile and executed
|
I believe I'm running into this as well: #5723 I have a package which only works on Linux, but development is happening on an M1 Mac. Because pipenv is skipping the package entirely, there are no hashes in the lockfile. When I attempt to |
For Why do those work differently? |
X-posting this for visibility: a recent issue led me to work on a path forward for cross-platform dependencies, and I think I have something that could stick. #5892 It probably only considers sys_platform markers right now, but I think it makes sense to apply it to all marker types. Please have a look ... here is an example:
|
There are on-going actions of |
Locking on one platform may cause dependencies needed on a different platform to be dropped. So currently we have to `pipenv lock --keep-outdated` or `pipenv upgrade` on platforms having issues. Pipenv may get support for multi-platform lock files soon: pypa/pipenv#5130
The started off using pipenv 2022.9.21 and used the |
I'd love to hear more about this being solved by pipenv itself. In the meantime, see the last approach here with “packages categories” which may help you work around the problem. |
It is a challenging problem to solve--would love to support a reasonable solution without just going back to extra-index-urls which was the source of the disputed SEV vulnerability regarding package confusion attacks, the very thing I spent a lot of time considering how to prevent in pipenv which is how we end up here. I don't have a ton of excess time currently to devote to a multi-platform solution but I am supportive of finding one with the community. I think it would take some serious contributions from someone to get it across the finish line. |
My current lazy plan is to NOT commit the lock files into source control and execute the lock during the CI build. I guess that is essentially the same as using --skip-lock. Then I will pin certain versions of dependencies to block them from auto-updating within the Pipfile itself as opposed to relying on the lock file to pin everything. I realize this is not best practice, but it is no worse than what I was doing with version 2022.9.21 and --skip-lock. Reading through this thread, my favorite option was the idea to specify which platforms the Pipfile should support. This would tell pipenv which platforms to lock. I don't have a lot of internal knowledge on how Pypi works, but docker is able to pull and run images for a platform different than the host by specifying a --platform switch. I am assuming this user specified platform is used in the API queries made to docker hub to pull down available images. I would have assumed that the code that reads all the PyPI indices (public and private) includes the platform allowing pipenv to run the dependency mechanism multiple times for each platform listed in the Pipfile. |
I think that is what it does, except for packages that don't provide pre-built wheels. That is when multi-platform is really problematic, because you typically cannot build the sdist for the alternative OS to obtain the hashes when locking. I believe that is why some have proposed lock file sections, overrides, or even separate files specific to platforms. In theory something like that may be tenable, but it's a complex landscape that someone needs to dig deeply into to improve the conditions. |
Are the hashes just to prevent tampering? If so, I would be happy with a lock file that does not include hashes and instead just records the versions of everything. This would at least prevent a dependency from auto-upgrading when a new release comes out, like NumPy 2.x did very recently and broke our builds. Would a simplified lock file with just versions and no hashes make the cross platform problem easier to solve? |
@scastria that is correct -- one thing to maybe try in the interim is to have the pipeline generate the requirements file from the lock file you use locally (possibly with hashes) and see if your CI can install with just |
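A minimal sketch of that interim idea, assuming the usual Pipfile.lock layout (`default` and `develop` sections mapping package names to entries with `version`, optional `markers`, and `hashes`). `requirements_from_lock` is a hypothetical helper written for this illustration, not pipenv's own export code, and the lock contents are invented.

```python
import json


def requirements_from_lock(lock_data, category="default", include_hashes=False):
    """Turn one Pipfile.lock section into pinned requirement lines."""
    lines = []
    for name, entry in sorted(lock_data.get(category, {}).items()):
        version = entry.get("version")
        if not version:
            continue  # e.g. VCS or path entries carry no pinned version
        line = f"{name}{version}"
        if entry.get("markers"):
            line += f"; {entry['markers']}"
        if include_hashes:
            line += "".join(f" --hash={h}" for h in entry.get("hashes", []))
        lines.append(line)
    return lines


# Invented lock contents, just to exercise the function.
sample_lock = json.loads("""
{
  "default": {
    "numpy": {"version": "==1.26.4", "hashes": ["sha256:aaa", "sha256:bbb"]},
    "pywin32": {"version": "==306", "markers": "sys_platform == 'win32'",
                "hashes": ["sha256:ccc"]}
  },
  "develop": {}
}
""")
print("\n".join(requirements_from_lock(sample_lock)))
```

With `include_hashes=False` the output pins versions only, which is the "versions without hashes" trade-off discussed above: it blocks surprise upgrades but no longer verifies the downloaded files.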
Is your feature request related to a problem? Please describe.
Currently it is not possible to create a Pipfile.lock for multiple platforms automatically.
Only hashes for the current platform are locked.
This is problematic when you want to use Pipfile.lock to share locked dependency versions in a multi-platform setup, e.g. development on Windows or Mac and production on Linux.
E.g. cryptography generates the following Pipfile.lock entry on Windows 10 64bit:
and the following one on a Linux VM:
Describe the solution you'd like
My preferred solution would be a new entry in the Pipfile that defines which platforms' hashes must always be included when locking.
If the current system's platform isn't in the list, it should still be added to the lock list to retain backwards compatibility.
The Pipfile entry could look like this:
If possible it would be great if the values could be eagerly validated and an error message printed on invalid values.
Note: Ticket #210 is similar to this one, but I was asked to open a new ticket.