Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mount pip_cache_path in Docker container #3556

Merged
merged 3 commits into from
Feb 15, 2018
Merged

Conversation

humitos
Copy link
Member

@humitos humitos commented Jan 26, 2018

This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.

Closes #3553

This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.
@safwanrahman
Copy link
Member

I actually dont like having pip cache in build.
One may put unvalidate package in the cache and that may invalid everything!
Most of the project do not include hash in their requirements.
So I think it may make build fails more often!

@RichardLitt RichardLitt added the Improvement Minor improvement to code label Jan 26, 2018
@humitos
Copy link
Member Author

humitos commented Jan 26, 2018

@safwanrahman we were discussing this in the chat to "fix the problem" of working under intermitent/slow internet connections. This would be only for development and not a setting for production actually.

I think the examples you propose (the "why is not a good idea") are valid. So, maybe we should limit this line https://github.com/rtfd/readthedocs.org/pull/3299/files#diff-d61f6aff9736141f30b581471d5d55ceR444 only when DEBUG=True. What do you think?

@safwanrahman
Copy link
Member

@humitos Yeah. I think putting it in development mode is totally a great idea!

@@ -449,7 +449,7 @@ def checkout_path(self, version=LATEST):
@property
def pip_cache_path(self):
"""Path to pip cache."""
if getattr(settings, 'GLOBAL_PIP_CACHE', False):
if getattr(settings, 'GLOBAL_PIP_CACHE', False) and settings.DEBUG:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a default GLOBAL_PIP_CACHE in dev.py?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. By default this feature is deactivated. It could be something like:

GLOBAL_PIP_CACHE = os.path.join(SITE_ROOT, '.cache', 'pip')

what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say leave deactivated so our development environments are close to production use.

@@ -728,6 +728,10 @@ def create_container(self):
'bind': self.project.doc_path,
'mode': 'rw'
},
self.project.pip_cache_path: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly redundant, as the project pip cache path is under doc_path already.

Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip. To use underlying storage like this, but then repeated remount this path to different locations through docker might cause problems if pip stores and metadata that includes the full path. I don't know enough about pip internals to answer this.

It seems like the most correct option would be to explicitly set CACHE_DIR on the pip commands if using a global cache, and to mount the global cache conditionally on the binds argument here if GLOBAL_PIP_CACHE is True.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly redundant, as the project pip cache path is under doc_path already.

Mmm... Yes. This is redundant when the GLOBAL_PIP_CACHE is not set. So, maybe we should add this path only when it's set (as you said, conditionally).

Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip

We are using the --cache-dir in the command at python_environments.py file, and that dir points to the project.pip_cache_path

I prefer to use the argument option instead of the env variable (since it's less explicit and more complicated to debug later)

I don't know enough about pip internals to answer this

As far as I know, pip doesn't save metadata, but creates a hash of the package downloaded and use it as filename --but use the real one for wheels:

├── http
│   ├── 0
│   │   ├── 0
│   │   │   ├── 3
│   │   │   │   └── f
│   │   │   │       └── 8
│   │   │   │           └── 003f82596103cb4b5d71b69aeb8119f406b1bc78110a190848c4fb24
│   │   │   ├── 6
│   │   │   │   └── 6
│   │   │   │       └── b
│   │   │   │           └── 0066b726b642ce2657f32a7c7b9f63ae698a2a16695cda32f18eb806
└── wheels
    ├── 02
    │   └── 2e
    │       └── 0d
    │           └── 30ef9919ce3a07990f191c164d07b083761535fb721a1af3cf
    │               ├── ip_associations_python_novaclient_ext-0.2-cp27-none-any.whl
    │               └── ip_associations_python_novaclient_ext-0.2-cp36-none-any.whl
    ├── 06
    │   └── 3d
    │       └── 74
    │           └── f64c3ce3266df2598031818ecc86bc25c8ccede0723973f119
    │               └── anyjson-0.3.3-cp27-none-any.whl

I'd say that mounting conditionally under the binds depending on this setting is the missing piece here and the rest is safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good. I don't think there is any metadata dependent on the path either, so this is most likely safe I'm sure.

Refactor, by creating a new method, where the binds are created to be
easier to read.
@humitos
Copy link
Member Author

humitos commented Jan 30, 2018

I just refactor the creation of bindings by creating a new method in that class which will mount the global cache conditionally.

Copy link
Contributor

@agjohnson agjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look 👍

Copy link
Member

@safwanrahman safwanrahman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
r+

@safwanrahman
Copy link
Member

@ericholscher Can I merge this Pull request?

@safwanrahman safwanrahman merged commit 1376860 into master Feb 15, 2018
@stsewd stsewd deleted the humitos/cache/pip-docker branch July 23, 2018 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Improvement Minor improvement to code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants