-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MNT] CI add pip dependency caching #1352
base: main
Are you sure you want to change the base?
Conversation
Thank you for contributing to
|
One key change worth documenting is in the cache script there is this step:
I found the default pip install torch to be multiple gbs in size because it installs all the GPU dependencies (which we dont use in the CI). I instead opt to only install the CPU dependencies for the cache which is only 200mbs of size. This allowed us to stay under the 10gb limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR caches the dependencies and the wheels built each time a PR is merged to main (if pyproject.toml is updated).
The majority of dependencies (even more for patch versions) will install new versions without any update to pyproject.toml
. Our main indication that there is a breakage on fresh installations is the general CI failing due to an incompatible dependency update.
This change currently could potentially delay a response. Even if we catch this on the periodic tests and make an update, there is a potential for a mismatch of dependency versions used in that test and PR tests. Someone could probably pick at some security issues on being behind on versions, but I'm not going to do that.
tl;dr: I think we should update the cache more frequently 🙂.
# GNU tar on windows runs much faster than the default BSD tar | ||
- name: Use GNU tar instead BSD tar if Windows | ||
if: ${{ inputs.runner_os == 'Windows' }} | ||
shell: cmd | ||
run: echo C:\Program Files\Git\usr\bin>>"%GITHUB_PATH%" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why? Is this the only action which does this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed this completely now since it doesn't make a difference now we're using the windows D drive (it seems to have fixed all the cache restoring issues)
- name: Set optimal environment variables for pip cache and restore | ||
uses: ./.github/actions/pip_cache | ||
with: | ||
runner_os: ${{ runner.os }} | ||
python_version: ${{ matrix.python-version }} | ||
restore_cache: "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for these two jobs only?
- name: Install CPU version of pytorch on linux | ||
uses: nick-fields/retry@v3 | ||
if: steps.cache.outputs.cache-hit != 'true' && runner.os == 'Linux' | ||
with: | ||
timeout_minutes: 30 | ||
max_attempts: 3 | ||
command: python -m pip install torch --index-url https://download.pytorch.org/whl/cpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only Linux, are there no benefits for other OS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So pytorch by default on macos and windows will only install the CPU version unless you specify otherwise. Linux on the otherhand it installs everything (GPU version). This is why the Linux cache sizes were over 2GBs whereas windows and macos was like that 600 mbs.
name: Update pip cache on merge to main | ||
on: | ||
push: | ||
branches: | ||
- main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe set it so this can be run manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it so it runs at the same time as periodic. The reason they are separate actions is because I don't want to use the cpu version of pytorch in periodic just in case something breaks because of using the cpu vs regular install (it should be identical)
Even with the comments and questions in the review, I really like the idea of this and hope it works out. Thanks for the work. |
Reference Issues/PRs
What does this implement/fix? Explain your changes.
This PR adds further caching to the CI allowing all dependencies and wheels to be cache'd and restored locally greatly speeding up the CI.
One of the main bottle necks for the CI currently is installing dependencies and building wheels. Currently every time the CI is ran dependencies are installed from pip and built. In the 'perfect' scenario this is fine and installs and building of wheels takes around 5 minutes or less. However, due to external factors such as network issues, a bad CI node and other factors sometimes this install and building step can take upwards of 20 minutes. Normally in every CI run, one runner takes 15+ minutes to install and build the dependencies. As the CI is only as fast as its slowest runner this presents a bottleneck.
This PR caches the dependencies and the wheels built each time a PR is merged to main (if pyproject.toml is updated). As a result the CI can the restore the dependencies and wheel on all branches and runs skipping the download and build phase. This reduces the time to install down to around 3 minutes in the best case and in the worse case (due to bad runner) 7 minutes. This greatly speeds up the CI as the slowest runner should be much faster.
Does your contribution introduce a new dependency? If yes, which one?
Any other comments?
PR checklist
For all contributions
For new estimators and functions
__maintainer__
at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.For developers with write access