Speculative idea - Cache metadata for sdists? #7980

Open
pfmoore opened this issue Apr 4, 2020 · 3 comments
Labels
C: cache Dealing with cache and files in it state: needs discussion This needs some more discussion type: enhancement Improvements to functionality

Comments

@pfmoore
Member

pfmoore commented Apr 4, 2020

What's the problem this feature will solve?
In this discussion, @uranusjr said:

One of the most resource-consuming (both for development and runtime) part of dependency resolution is the need to download a package matching the host environment, extract it, and potentially build it to get dependencies.

Describe the solution you'd like
Thinking about this, pip could cache metadata for sdists, much as it already caches built wheels. When we prepare a sdist to get its metadata, we could save that metadata to a local cache, keyed by project name and version. Future runs could then read the cache rather than downloading and building the sdist again.
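To make the proposal concrete, here is a minimal sketch of such a cache, assuming metadata is stored as one JSON file per (name, version) pair. The class `SdistMetadataCache`, the helper `get_metadata`, and the on-disk layout are all hypothetical and are not part of pip; the `prepare` callback stands in for whatever expensive download-and-build step produces the metadata.

```python
import json
import re
from pathlib import Path
from typing import Callable, Optional


def normalize_name(name: str) -> str:
    # PEP 503 normalization, so "Foo_Bar" and "foo-bar" share one entry.
    return re.sub(r"[-_.]+", "-", name).lower()


class SdistMetadataCache:
    """Hypothetical on-disk cache laid out as <root>/<name>/<version>.json."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def _path(self, name: str, version: str) -> Path:
        return self.root / normalize_name(name) / f"{version}.json"

    def get(self, name: str, version: str) -> Optional[dict]:
        try:
            text = self._path(name, version).read_text(encoding="utf-8")
            return json.loads(text)
        except (FileNotFoundError, json.JSONDecodeError):
            # A missing or corrupt entry is treated as a cache miss.
            return None

    def set(self, name: str, version: str, metadata: dict) -> None:
        path = self._path(name, version)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file and rename it into place, so a reader
        # never sees a half-written entry.
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(metadata), encoding="utf-8")
        tmp.replace(path)


def get_metadata(
    cache: SdistMetadataCache,
    name: str,
    version: str,
    prepare: Callable[[str, str], dict],
) -> dict:
    """Return cached metadata, calling the expensive `prepare` only on a miss."""
    cached = cache.get(name, version)
    if cached is None:
        cached = prepare(name, version)
        cache.set(name, version, cached)
    return cached
```

Keying on name and version alone follows the wording of the proposal. Since sdist metadata isn't guaranteed to be identical across platforms, a real implementation might need a richer key; that is left open here.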

Alternative Solutions
The wheel cache provides similar data, so we could check the wheel cache when looking for sdist metadata. However:

  1. We'd have to make sure we picked the right wheel. Wheels for other platforms/architectures could be in the cache, and (much as we'd prefer it if this wasn't the case) metadata isn't currently guaranteed to be consistent across platforms.
  2. Not all sdists get built to wheels. Some will potentially get prepared, the metadata checked, and then discarded. So caching sdist metadata directly will help more cases.
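For the first point, "picking the right wheel" comes down to comparing the tags in a wheel's filename with the tags the current environment supports. The sketch below expands the compressed tag set from a filename and checks it against a caller-supplied set of supported tags. The function names are hypothetical, and the supported set is passed in by hand; in practice it would come from something like `packaging.tags.sys_tags()`. This says nothing about how pip's wheel cache is actually organised.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Tag = Tuple[str, str, str]  # (python tag, abi tag, platform tag)


def wheel_tags(filename: str) -> Set[Tag]:
    """Expand the compressed tag set encoded in a wheel filename.

    Wheel filenames look like
    name-version[-build]-python-abi-platform.whl, and each of the last
    three fields may hold several "."-separated values.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python, abi, platform = parts[-3:]
    return set(product(python.split("."), abi.split("."), platform.split(".")))


def is_usable(filename: str, supported: Iterable[Tag]) -> bool:
    """True if any tag of the wheel is in the supported set."""
    return not wheel_tags(filename).isdisjoint(supported)
```

A tag match only says the wheel is installable on this platform. It does not tell us whether the cached wheel was built from the same sdist we are about to prepare.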

Additional context
As @uranusjr pointed out, the biggest benefits will be to platforms like musl, which don't have wheels. However, a common musl platform is Docker Alpine images, and since Docker containers are throwaway, a local cache won't help much there. So we should check that the benefits would actually be achieved in practice before investing time in this solution.

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Apr 4, 2020
@sbidoul
Member

sbidoul commented Apr 4, 2020

We'd have to make sure we picked the right wheel. Wheels for other platforms/architectures could be in the cache

Does the wheel cache adequately cope with that via supported PEP 425 tags?

Could an alternative approach be to leverage the wheel cache and extend it to support "partial" wheels containing metadata only? These would basically be wheels containing only the outcome of the PEP 517 prepare_metadata_for_build_wheel hook. For legacy non-PEP 517 sdists, it might be possible to generate such partial wheels easily from the outcome of generate_metadata.
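As a rough illustration of what a "partial" wheel could look like: the PEP 517 hook `prepare_metadata_for_build_wheel` leaves a `.dist-info` directory behind, and that directory could be zipped up on its own. The sketch below assumes the directory already exists on disk; the function names and the idea of storing it as a plain zip archive are hypothetical, since no "partial wheel" format is defined anywhere.

```python
import zipfile
from pathlib import Path


def pack_metadata_only(dist_info_dir: Path, dest: Path) -> Path:
    """Zip a .dist-info directory into `dest`, keeping the directory
    name as the path prefix, as it would appear inside a real wheel."""
    dist_info_dir = Path(dist_info_dir)
    if not dist_info_dir.name.endswith(".dist-info"):
        raise ValueError(f"not a .dist-info directory: {dist_info_dir}")
    with zipfile.ZipFile(dest, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(dist_info_dir.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(dist_info_dir.parent).as_posix()
                archive.write(path, arcname)
    return Path(dest)


def read_metadata(archive_path: Path) -> str:
    """Return the text of the METADATA file from such an archive."""
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith(".dist-info/METADATA"):
                return archive.read(name).decode("utf-8")
    raise KeyError("no .dist-info/METADATA entry found")
```

Because the layout matches the `.dist-info` part of a normal wheel, the same reading code could in principle serve both full and metadata-only entries, which is the appeal of reusing the wheel cache.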

@pfmoore
Member Author

pfmoore commented Apr 4, 2020

Maybe...? Those certainly sound like good things to look at when we look at how to implement this.

@pradyunsg
Member

I'm on board in principle. :)

Here's a significantly outdated and poorly worded variant of the same idea: https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb#stage-2

@pradyunsg pradyunsg added C: cache Dealing with cache and files in it state: needs discussion This needs some more discussion type: enhancement Improvements to functionality labels Apr 8, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Apr 8, 2020