get-pip.py is downloaded even if it's not used #999

Closed
edmorley opened this issue Jul 22, 2020 · 0 comments · Fixed by #1007

edmorley commented Jul 22, 2020

During the Python setup step, get-pip.py is unconditionally downloaded for every build:

# Heroku uses the get-pip utility maintained by the Python community to vendor Pip.
# https://github.com/pypa/get-pip
GETPIP_URL="https://lang-python.s3.amazonaws.com/etc/get-pip.py"
GETPIP_PY="${TMPDIR:-/tmp}/get-pip.py"
if ! curl -s "${GETPIP_URL}" -o "$GETPIP_PY" &> /dev/null; then
  mcount "failure.python.get-pip"
  echo "Failed to pull down get-pip"
  exit 1
fi

However, it's only used when performing a fresh Python install (e.g. new app, manually cleared cache, or Python/stack version upgrade) or when the installed pip version differs from the expected one:

# If a new Python has been installed or Pip isn't up to date:
if [ "$FRESH_PYTHON" ] || [[ ! $(pip --version) == *$PIP_UPDATE* ]]; then
  puts-step "Installing pip"
  # Remove old installations.
  rm -fr /app/.heroku/python/lib/python*/site-packages/pip-*
  rm -fr /app/.heroku/python/lib/python*/site-packages/setuptools-*
  /app/.heroku/python/bin/python "$GETPIP_PY" pip=="$PIP_UPDATE" &> /dev/null

The get-pip.py download step should be inside the later conditional, so that the download can be skipped.

Doing this would save time downloading the file and remove another potential failure mode for builds.
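
A minimal sketch of the proposed change, reusing the variable names from the snippets above. The helper names `needs_pip_install`, `fetch_get_pip` and `install_pip_if_needed` are hypothetical and not part of the buildpack, and the install step itself is left as a placeholder. The version check deliberately mirrors the existing substring match against `pip --version`.

```shell
GETPIP_URL="https://lang-python.s3.amazonaws.com/etc/get-pip.py"
GETPIP_PY="${TMPDIR:-/tmp}/get-pip.py"

# Hypothetical helper: succeeds when pip has to be (re)installed.
#   $1: non-empty if Python was freshly installed (FRESH_PYTHON)
#   $2: expected pip version (PIP_UPDATE)
#   $3: output of `pip --version`
needs_pip_install() {
  if [ -n "$1" ]; then
    return 0
  fi
  case "$3" in
    *"$2"*) return 1 ;;  # expected version already present
    *) return 0 ;;
  esac
}

# Hypothetical helper: the download, now only called on demand.
fetch_get_pip() {
  curl -sSf "${GETPIP_URL}" -o "${GETPIP_PY}"
}

install_pip_if_needed() {
  if needs_pip_install "$1" "$2" "$3"; then
    if ! fetch_get_pip; then
      echo "Failed to pull down get-pip"
      return 1
    fi
    echo "Installing pip"
    # Placeholder: run "$GETPIP_PY" with the buildpack's Python here.
  else
    echo "pip is up to date, skipping get-pip.py download"
  fi
}
```

With this shape, a repeat build whose pip version already matches never reaches `curl`, so a transient S3 failure cannot break it.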

edmorley self-assigned this Jul 22, 2020
edmorley added a commit that referenced this issue Jul 23, 2020
The versions installed by the buildpack have been updated as follows:
* pip:
  - If using Python 3.4: No change (already using the last to support 3.4)
  - If using pipenv: No change (need to update to a newer pipenv first)
  - For everything else: `20.0.2` -> `20.1.1`
* setuptools:
  - If using Python 3.4: `39.0.1` -> `43.0.0` (latest for 3.4)
  - If using Python 2.7: `39.0.1` -> `44.1.1` (latest for 2.7)
  - For everything else: `39.0.1` -> `47.1.1` (until #1006 fixed)
* wheel:
  - If using Python 3.4: `unpinned` -> `0.33.6`
  - For everything else: `unpinned` -> `0.34.2`
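
The version matrix above could be expressed roughly as follows. This is an illustrative sketch, not the buildpack's code: the function name, the `python-X.Y.Z` runtime string format, and the choice of `19.1.1` as the pip release kept for Python 3.4 are assumptions, and the pipenv case is left out because its pinned pip version is not stated here.

```shell
# Hypothetical helper: sets PIP_VERSION, SETUPTOOLS_VERSION and WHEEL_VERSION
# for a runtime string such as "python-3.8.5".
select_packaging_versions() {
  case "$1" in
    python-3.4.*)
      # Assumption: 19.1.1 is the pip release retained for Python 3.4.
      PIP_VERSION="19.1.1"
      SETUPTOOLS_VERSION="43.0.0"
      WHEEL_VERSION="0.33.6"
      ;;
    python-2.7.*)
      PIP_VERSION="20.1.1"
      SETUPTOOLS_VERSION="44.1.1"
      WHEEL_VERSION="0.34.2"
      ;;
    *)
      PIP_VERSION="20.1.1"
      SETUPTOOLS_VERSION="47.1.1"
      WHEEL_VERSION="0.34.2"
      ;;
  esac
}
```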

This fixes #949 and fixes #1005, and means packages that rely on newer
setuptools will now install successfully.

Changelogs:
https://pip.pypa.io/en/stable/news/
https://setuptools.readthedocs.io/en/latest/history.html#v47-1-1
https://wheel.readthedocs.io/en/latest/news.html

In addition:
* Installed versions are now deterministic (fixes #1000, fixes #1003)
* The build output now includes the versions used, making it easier to
  debug future upgrades (closes #939)
* Errors during pip/setuptools/wheel install now correctly fail the
  build, and stderr is no longer sent to `/dev/null` (fixes #1002)
* Setuptools is no longer installed twice (fixes #1001)
* Everything that is downloaded is now used (fixes #999)
* `--no-cache` and `--disable-pip-version-check` are now used, saving
  unnecessary work and preventing creation of unwanted files in `/app`
* The `PIP_UPDATE` env var no longer leaks into subprocesses.

As part of fixing version pinning, we now use pip itself to determine
whether the installed packages are up to date, since parsing pip's
output is fragile (eg #1003).

This means `pip install` is now called every time. However, this is a
no-op for repeat builds where the versions have not changed, since pip
does not hit the index (PyPI) when the requirements are already
satisfied, unless `--upgrade` is specified.
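
As a rough illustration of the "always call `pip install` with exact pins" approach, assuming a wrapper along these lines (the function name and the `PIP_CMD` override are hypothetical, added only so the sketch can be exercised without a real pip; `--no-cache-dir` is pip's documented spelling of the cache-disabling flag):

```shell
# Hypothetical override: which command to run as pip.
PIP_CMD="${PIP_CMD:-pip}"

# Installs exact versions of the three packaging tools.
#   $1: pip version, $2: setuptools version, $3: wheel version
# Without --upgrade, pip leaves already-satisfied pins untouched.
install_packaging_tools() {
  "$PIP_CMD" install --quiet --disable-pip-version-check --no-cache-dir \
    "pip==$1" "setuptools==$2" "wheel==$3"
}
```

Because every requirement is an exact `==` pin, pip can decide from the installed metadata alone whether anything needs to change, and stderr is left visible so that a failed install fails the build.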

For the installation itself `get-pip.py` is no longer used, since:
- It uses `--force-reinstall`, which is unnecessary here and would slow
  down repeat builds (given we call pip install every time now).
  Trying to work around this by using `get-pip.py` only for the initial
  install, and real pip for subsequent updates would mean we lose
  protection against cached broken installs, plus significantly
  increase the version combinations test matrix.
- It means downloading pip twice (once embedded in `get-pip.py`, and
  again during the install, since `get-pip.py` can't install the
  embedded version directly)
- We would still have to manage several versions of get-pip.py, to
  support older Pythons.

We don't use `ensurepip` since:
- Not all of the previously generated Python runtimes on S3 include it
- We would still have to upgrade pip afterwards
- The versions of pip/setuptools bundled with ensurepip differ greatly
  depending on Python version, and we could easily start using a CLI
  flag for the first pip install before upgrade that isn't supported
  on all versions, without even knowing it (unless we test against
  hundreds of Python archives).

The new pip wheel assets on S3 were generated using:

```
$ pip download --no-cache pip==19.1.1
Collecting pip==19.1.1
  Downloading pip-19.1.1-py2.py3-none-any.whl (1.4 MB)
  Saved ./pip-19.1.1-py2.py3-none-any.whl
Successfully downloaded pip

$ pip download --no-cache pip==20.1.1
Collecting pip==20.1.1
  Downloading pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
  Saved ./pip-20.1.1-py2.py3-none-any.whl
Successfully downloaded pip

$ aws s3 sync . s3://lang-python/common/ --exclude "*" --include "*.whl" --acl public-read --dryrun
(dryrun) upload: ./pip-19.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-19.1.1-py2.py3-none-any.whl
(dryrun) upload: ./pip-20.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-20.1.1-py2.py3-none-any.whl

$ aws s3 sync . s3://lang-python/common/ --exclude "*" --include "*.whl" --acl public-read
upload: ./pip-19.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-19.1.1-py2.py3-none-any.whl
upload: ./pip-20.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-20.1.1-py2.py3-none-any.whl
```
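
One way such wheel assets can be used to bootstrap pip without `get-pip.py` is sketched below. This is an assumption about usage, not a copy of the buildpack's implementation: the HTTPS base URL is inferred from the bucket path above, the function names are hypothetical, and the technique relies on a wheel being a zip archive from which Python can run the bundled `pip` package.

```shell
# Assumption: objects under s3://lang-python/common/ are served from this base URL.
PIP_WHEEL_BASE_URL="https://lang-python.s3.amazonaws.com/common"

# Hypothetical helpers: wheel filename and URL for a given pip version.
pip_wheel_filename() {
  printf 'pip-%s-py2.py3-none-any.whl\n' "$1"
}

pip_wheel_url() {
  printf '%s/%s\n' "$PIP_WHEEL_BASE_URL" "$(pip_wheel_filename "$1")"
}

# Hypothetical helper: download the wheel, then use the pip inside it to
# install that same pip version into the target Python.
#   $1: pip version, $2: Python executable (defaults to "python")
bootstrap_pip_from_wheel() {
  wheel_path="${TMPDIR:-/tmp}/$(pip_wheel_filename "$1")"
  curl -sSf "$(pip_wheel_url "$1")" -o "$wheel_path" || return 1
  "${2:-python}" "${wheel_path}/pip" install --quiet \
    --disable-pip-version-check --no-cache-dir "pip==$1"
}
```

Here the wheel is downloaded once and pip installs itself from the index without `--force-reinstall`, in contrast to the `get-pip.py` behaviour described above.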
edmorley added a commit that referenced this issue Jul 29, 2020
Previously the pip/setuptools/wheel install step was skipped so long
as Python hadn't just been clean installed (ie so long as not a new app,
emptied cache, Python upgrade, stack change) and pip was the expected
version.

This meant that setuptools/wheel could be the wrong version (or even
not installed at all), and this would not be corrected.

We now use pip itself to determine whether the installed packages
are up to date, since parsing pip's output is fragile (eg #1003) and
would be tedious given there would be three packages to check.

Unfortunately `get-pip.py` uses `--force-reinstall` which means
performing this step every time is not the no-op it would otherwise be,
but this will be resolved by switching away from `get-pip.py` in the
next commit.

Fixes #1000.
Fixes #1003.
Closes #999.
dryan pushed a commit to dryan/heroku-buildpack-python that referenced this issue Nov 19, 2020
…u#1007)

Previously the pip/setuptools/wheel install step was skipped so long
as Python hadn't just been clean installed (ie so long as not a new app,
emptied cache, Python upgrade, stack change) and pip was the expected
version.

This meant that setuptools/wheel could be the wrong version (or even
not installed at all), and this would not be corrected.

We now use pip itself to determine whether the installed packages
are up to date, since parsing pip's output is fragile (eg heroku#1003) and
would be tedious given there would be three packages to check.

Unfortunately `get-pip.py` uses `--force-reinstall` which means
performing this step every time is not the no-op it would otherwise be,
but this will be resolved by switching away from `get-pip.py` in the
next commit.

Fixes heroku#1000.
Fixes heroku#1003.
Closes heroku#999.