Clarify use of index-url vs. extra-index-url #76

Open
matthewfeickert opened this issue Mar 21, 2024 · 16 comments
@matthewfeickert
Member

matthewfeickert commented Mar 21, 2024

Clarify the need(?) and use of --index-url and --extra-index-url. I might have overcomplicated some of the instructions given a misunderstanding of priority, but there seems to be no priority and no way to enforce priority. This, I think, led to some of the confusion in #41.

References:

@matthewfeickert matthewfeickert added the documentation Improvements or additions to documentation label Mar 21, 2024
@matthewfeickert matthewfeickert self-assigned this Mar 21, 2024
@matthewfeickert
Member Author

A relevant thing to point to (cf. pypa/pip#8606 (comment)) is PEP 708 – Extending the Repository API to Mitigate Dependency Confusion Attacks (currently unimplemented).

@matthewfeickert
Member Author

Note that uv differs from pip here in that it does give higher priority to --extra-index-url (which is different than I would have assumed!).

From uv pip install --help:

...
  -i, --index-url <INDEX_URL>
          The URL of the Python package index (by default: <https://pypi.org/simple>).
          
          The index given by this flag is given lower priority than all other indexes specified via the `--extra-index-url` flag.
          
          Unlike `pip`, `uv` will stop looking for versions of a package as soon as it finds it in an index. That is, it isn't possible for `uv` to consider versions of the same package across multiple indexes.
          
          [env: UV_INDEX_URL=]

      --extra-index-url <EXTRA_INDEX_URL>
          Extra URLs of package indexes to use, in addition to `--index-url`.
          
          All indexes given via this flag take priority over the index in `--index-url` (which defaults to PyPI). And when multiple `--extra-index-url` flags are given, earlier values take priority.
          
          Unlike `pip`, `uv` will stop looking for versions of a package as soon as it finds it in an index. That is, it isn't possible for `uv` to consider versions of the same package across multiple indexes.
          
          [env: UV_EXTRA_INDEX_URL=]

...
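
To make the difference concrete, here is a toy model of the two strategies. It is not either tool's actual resolver: the index contents, the version tuples, and the function names are all made up for illustration. The only behaviour taken from the help text above is that uv consults the `--extra-index-url` indexes first and stops at the first index that has the package, while pip pools candidates from every index.

```python
# Toy model only: index contents and helper names are hypothetical.
# Versions are plain tuples so they compare without extra dependencies.

def pick_pip_style(package, index_url, extra_index_urls, indexes):
    """Pool candidates from all indexes and take the highest version."""
    candidates = []
    for url in [index_url, *extra_index_urls]:
        for version in indexes.get(url, {}).get(package, []):
            candidates.append((version, url))
    return max(candidates) if candidates else None


def pick_uv_style(package, index_url, extra_index_urls, indexes):
    """Check extra indexes first (in order), then index_url; stop at first hit."""
    for url in [*extra_index_urls, index_url]:
        versions = indexes.get(url, {}).get(package, [])
        if versions:
            return (max(versions), url)
    return None


# Invented index contents: PyPI has a newer release than the extra index.
indexes = {
    "https://pypi.org/simple": {"numpy": [(1, 26, 4), (2, 0, 0)]},
    "https://example.invalid/nightly/simple": {"numpy": [(1, 27, 0)]},
}
args = ("numpy", "https://pypi.org/simple",
        ["https://example.invalid/nightly/simple"], indexes)

print(pick_pip_style(*args))  # highest version across both indexes
print(pick_uv_style(*args))   # best version from the first index that has numpy
```

With this made-up data the two strategies disagree: the pooled approach selects 2.0.0 from PyPI, while the first-match approach never looks past the extra index and selects 1.27.0.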

@martinfleis
Member

I'm curious if that's intended. I thought that their aim was to be a drop-in replacement.

@matthewfeickert
Member Author

I'm curious if that's intended.

I think it is, though we could easily check (and ask).

I thought that their aim was to be a drop-in replacement.

I believe the goal is for functionality to be achieved, not for design choices to be replicated. In the limitations section of the README they note

Limitations

While uv supports a large subset of the pip interface, it does not support the entire feature set.
In some cases, those differences are intentional; in others, they're a result of uv's early stage of development.

For details, see our pip compatibility guide.

@matthewfeickert
Member Author

Yeah, it is intentional given astral-sh/uv#2083.

@Carreau
Collaborator

Carreau commented Mar 25, 2024

If there are issues with the order of indexes, one alternative would be for Scientific Python to have a "proxy index" that merges the JSON of multiple upstream indexes.

I'm wondering if a proxy index like that would even need a permanent server or could be hosted purely on an edge cloud as it is likely stateless.

@matthewfeickert
Copy link
Member Author

@Carreau do you have examples of these proxy indexes? I haven't heard of this before, so it would be interesting to see how they work.

@Carreau
Collaborator

Carreau commented Mar 26, 2024

Hum, it's theoretical, but basically you fetch multiple <repos>/simple/<package> pages and merge them. I believe I might have talked about this with @ivanov last June in Seattle.

Think:

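import flask
import requests
import bs4

app = flask.Flask(__name__)

# Upstream indexes to merge (placeholder URLs, no trailing slash).
repos = ['https://pypi.org', 'https://nightly.com']


@app.route('/simple/<package>/')
def simple(package):
    links = []
    for r in repos:
        page = requests.get(f'{r}/simple/{package}/', timeout=10)
        if page.ok:
            # Collect every file link (<a>) from this index's project page.
            # Assumes the hrefs are absolute; relative ones would need rewriting.
            links += bs4.BeautifulSoup(page.text, 'html.parser').find_all('a')
    body = '<br/>\n'.join(str(a) for a in links)
    return f'<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>'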

That's the legacy API (I think anaconda.org only has that one), but there is a new JSON API as well, so we might need some work to figure out the details.

We should only need to serve indexes, as the downloads come from somewhere else.
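
For the JSON side, a minimal sketch of the same merge, assuming "the new JSON API" refers to the PEP 691 JSON form of the simple API, where a project page carries a `files` list whose entries have `filename`, `url`, and `hashes`. The helper name `merge_project_pages` and the sample payloads are hypothetical. Only index metadata is combined; the file URLs still point at the upstream hosts.

```python
# Sketch of merging PEP 691-style JSON project pages. merge_project_pages is
# a hypothetical helper and the sample payloads below are invented.

def merge_project_pages(name, pages):
    """Combine the "files" lists of several project pages; first index wins."""
    seen = set()
    files = []
    for page in pages:
        for entry in page.get("files", []):
            if entry["filename"] not in seen:
                seen.add(entry["filename"])
                files.append(entry)
    return {"meta": {"api-version": "1.0"}, "name": name, "files": files}


pypi_page = {
    "meta": {"api-version": "1.0"},
    "name": "demo",
    "files": [
        {"filename": "demo-1.0-py3-none-any.whl",
         "url": "https://files.example.invalid/demo-1.0-py3-none-any.whl",
         "hashes": {}},
    ],
}
nightly_page = {
    "meta": {"api-version": "1.0"},
    "name": "demo",
    "files": [
        # Same filename as on the first index: skipped as a duplicate.
        {"filename": "demo-1.0-py3-none-any.whl",
         "url": "https://nightly.example.invalid/demo-1.0-py3-none-any.whl",
         "hashes": {}},
        {"filename": "demo-1.1.dev0-py3-none-any.whl",
         "url": "https://nightly.example.invalid/demo-1.1.dev0-py3-none-any.whl",
         "hashes": {}},
    ],
}

merged = merge_project_pages("demo", [pypi_page, nightly_page])
print([f["filename"] for f in merged["files"]])
```

Letting the first index win on duplicate filenames is a design choice of this sketch; a real proxy might prefer to compare hashes or reject the conflict.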

@Carreau
Collaborator

Carreau commented Mar 28, 2024

Here is a PoC using Flask: https://github.com/Carreau/multi-index, which does work locally.

It's sync, but it should not be hard to make it async with various caching.

@Carreau
Collaborator

Carreau commented Jun 6, 2024

See https://github.com/Carreau/cloudflare-pypi-multi-index, deployed on Cloudflare Workers at https://nightly.carreau.workers.dev/nightly

$ pip install --index-url https://nightly.carreau.workers.dev/nightly --pre --upgrade ipython

This should now just work and "merge" PyPI and https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/. We should be able to add any other PyPI mirror, or do crazy things like have /random return only a subset of the wheels, or have /binary strip all the tgz files when whl files are present, for example.
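
As an illustration of the /binary idea, here is a hypothetical filter that drops source distributions for any version that also has at least one wheel. It is not taken from the deployed worker; the function names are invented and the filename parsing is deliberately simplistic.

```python
# Hypothetical sketch of a "/binary" view: hide sdists for versions that
# already have a wheel. Filename parsing is simplified for illustration.

SDIST_SUFFIXES = (".tar.gz", ".tgz", ".zip")


def version_of(filename):
    """Return the version field of a wheel or sdist filename."""
    if filename.endswith(".whl"):
        # Wheels are named {name}-{version}-...-{platform}.whl
        return filename.split("-")[1]
    for suffix in SDIST_SUFFIXES:
        if filename.endswith(suffix):
            # sdists are named {name}-{version}{suffix}
            return filename[: -len(suffix)].rsplit("-", 1)[1]
    raise ValueError(f"unrecognised distribution file: {filename}")


def strip_sdists_with_wheels(filenames):
    """Keep all wheels, and keep an sdist only if its version has no wheel."""
    versions_with_wheels = {
        version_of(f) for f in filenames if f.endswith(".whl")
    }
    return [
        f for f in filenames
        if f.endswith(".whl") or version_of(f) not in versions_with_wheels
    ]


files = ["demo-1.0.tar.gz", "demo-1.0-py3-none-any.whl", "demo-0.9.tar.gz"]
print(strip_sdists_with_wheels(files))
```

In this example the 1.0 sdist is hidden because a 1.0 wheel exists, while the 0.9 sdist stays because it is the only way to get that version.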

If it's of interest we could have that owned by the tools team, which would simplify the above (and we could have usage metrics...)

@matthewfeickert
Member Author

This is pretty great @Carreau! Thank you for building this.

$ docker run --rm -ti python:3.12 /bin/bash
root@8cf753ac7bdf:/# python -m venv venv && . venv/bin/activate
(venv) root@8cf753ac7bdf:/# python -m pip --quiet install --upgrade pip wheel
(venv) root@8cf753ac7bdf:/# python -m pip install --index-url https://nightly.carreau.workers.dev/nightly --pre --upgrade matplotlib
Looking in indexes: https://nightly.carreau.workers.dev/nightly
Collecting matplotlib
  Downloading https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/matplotlib/3.10.0.dev252%2Bg7ccfd3813b/matplotlib-3.10.0.dev252%2Bg7ccfd3813b-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.3/8.3 MB 17.8 MB/s eta 0:00:00
Collecting contourpy>=1.0.1 (from matplotlib)
  Downloading https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/contourpy/1.3.0.dev1/contourpy-1.3.0.dev1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (320 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 320.2/320.2 kB 15.1 MB/s eta 0:00:00
Collecting cycler>=0.10 (from matplotlib)
  Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
  Downloading fonttools-4.53.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (162 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 162.2/162.2 kB 1.7 MB/s eta 0:00:00
Collecting kiwisolver>=1.3.1 (from matplotlib)
  Downloading kiwisolver-1.4.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.4 kB)
Collecting numpy>=1.23 (from matplotlib)
  Downloading https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/numpy/2.1.0.dev0/numpy-2.1.0.dev0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.3/19.3 MB 25.2 MB/s eta 0:00:00
Collecting packaging>=20.0 (from matplotlib)
  Downloading packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting pillow>=8 (from matplotlib)
  Downloading pillow-10.3.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (9.2 kB)
Collecting pyparsing>=2.3.1 (from matplotlib)
  Downloading pyparsing-3.1.2-py3-none-any.whl.metadata (5.1 kB)
Collecting python-dateutil>=2.7 (from matplotlib)
  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting six>=1.5 (from python-dateutil>=2.7->matplotlib)
  Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.53.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 28.5 MB/s eta 0:00:00
Downloading kiwisolver-1.4.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 63.5 MB/s eta 0:00:00
Downloading packaging-24.0-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.5/53.5 kB 6.0 MB/s eta 0:00:00
Downloading pillow-10.3.0-cp312-cp312-manylinux_2_28_x86_64.whl (4.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 40.0 MB/s eta 0:00:00
Downloading pyparsing-3.1.2-py3-none-any.whl (103 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.2/103.2 kB 16.0 MB/s eta 0:00:00
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 kB 27.6 MB/s eta 0:00:00
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, pyparsing, pillow, packaging, numpy, kiwisolver, fonttools, cycler, python-dateutil, contourpy, matplotlib
Successfully installed contourpy-1.3.0.dev1 cycler-0.12.1 fonttools-4.53.0 kiwisolver-1.4.5 matplotlib-3.10.0.dev252+g7ccfd3813b numpy-2.1.0.dev0 packaging-24.0 pillow-10.3.0 pyparsing-3.1.2 python-dateutil-2.9.0.post0 six-1.16.0
(venv) root@8cf753ac7bdf:/# python -m pip list
Package         Version
--------------- -------------------------
contourpy       1.3.0.dev1
cycler          0.12.1
fonttools       4.53.0
kiwisolver      1.4.5
matplotlib      3.10.0.dev252+g7ccfd3813b
numpy           2.1.0.dev0
packaging       24.0
pillow          10.3.0
pip             24.0
pyparsing       3.1.2
python-dateutil 2.9.0.post0
six             1.16.0
wheel           0.43.0
(venv) root@8cf753ac7bdf:/#

If it's of interest we could have that owned by the tools team, which would simplify the above (and we could have usage metrics...)

SGTM. Does @scientific-python/tools-team agree?

The only question I have is that I assume you're currently paying for any Cloudflare expenses for this, and while it seems like currently

the first 100,000 requests each day are free

for Cloudflare, which is good, can we make it so that your payment details don't ever get charged?

@tupui
Member

tupui commented Jun 8, 2024

Yeah, careful with Cloudflare pricing. There have been some articles where people complained about sudden changes, and in the end they had to rewrite their whole integration because they could not pay...

As for the index thing, I also had that issue/concern at work. In the end I ended up checking that the SHA in my lockfile was present on the index I wanted (and being strict about not following redirects). Something I hope I can ditch with uv once it's stable.
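
A minimal sketch of that kind of check, under assumptions: the lockfile pins sha256 digests, and the index's file entries follow the PEP 691 shape (`filename`, `url`, `hashes`). The function name and the data are invented, and this is not the actual script described above.

```python
# Hypothetical sketch: confirm that every sha256 pinned in a lockfile is
# offered by the chosen index. The entries below are invented sample data.

def missing_from_index(locked_hashes, index_files):
    """Return the locked sha256 digests the index does not serve, sorted."""
    served = {
        entry["hashes"]["sha256"]
        for entry in index_files
        if "sha256" in entry.get("hashes", {})
    }
    return sorted(set(locked_hashes) - served)


index_files = [
    {"filename": "demo-1.0-py3-none-any.whl",
     "url": "https://index.example.invalid/demo-1.0-py3-none-any.whl",
     "hashes": {"sha256": "aaa"}},
    {"filename": "demo-1.0.tar.gz",
     "url": "https://index.example.invalid/demo-1.0.tar.gz",
     "hashes": {}},
]

print(missing_from_index(["aaa", "bbb"], index_files))
```

If the index page is fetched with `requests`, passing `allow_redirects=False` to `requests.get` makes a redirect show up as a 3xx response instead of being followed silently, which is one way to be strict about redirects.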

@Carreau
Collaborator

Carreau commented Jun 10, 2024

can we make it so that your payment details don't ever get charged

I don't think I have any payment methods set up on Cloudflare.

I also think that scientific-python should have a Cloudflare account anyway, with delegated access for the tools team, and that we should make sure delegation works well and that we know how to debug, deploy, and track metrics.

I also think we can/should play with this a bit before recommending it. In particular, my PoC does not support /json because the Anaconda nightly upload channel does not.

@bsipocz
Member

bsipocz commented Jun 10, 2024

What happens when the free requests run out? Automatic charges, or denied requests? If the latter, then I would say do the migration as soon as you can.

@Carreau
Collaborator

Carreau commented Jun 11, 2024

https://developers.cloudflare.com/workers/platform/pricing/ and https://developers.cloudflare.com/workers/platform/limits/ suggest that the free limit is 100,000 requests per day, reset at midnight every day, and that requests stop working once the limit is passed.

I think there are also 2 things:

  • migrating to a scientific-python managed cloudflare (that can still be free).
  • moving from free to paid plan.

I think we should do the first anyway, as URLs do reflect the org, and I would prefer to use scientific-python.workers.dev instead of carreau.workers.dev

We can discuss the paid plan later, as there is a flat rate of $5/month that bumps us from 100k to 10M, regardless of whether we are under or above 100k requests.

@bsipocz
Member

bsipocz commented Jun 11, 2024

I fully support the plan above.
