redirects from non-hash containing pypi.io and files.pythonhosted.org missing for newer package versions (actual problem: underscores _ instead of hyphens - in file name) #13240

Closed
dlaehnemann opened this issue Mar 20, 2023 · 6 comments

Comments

@dlaehnemann

Describe the bug

For the package snakemake-wrapper-utils, the versions 0.5.1 and 0.5.2 do not have the redirects from the simple and readable URLs without the hash value.

These are available for the earlier versions, for example: https://pypi.io/packages/source/s/snakemake-wrapper-utils/snakemake-wrapper-utils-0.5.0.tar.gz

Expected behavior
Have the following redirects for version 0.5.2 (and 0.5.1 accordingly):

To Reproduce

Open the links listed under Expected behavior; ideally they should resolve.

My Platform

I think that this is independent of platform.

Additional context

We use these consistent URLs in the bioconda project to keep the YAML files for conda recipes readable and easy to maintain. For the package reported here, see the last working version:

https://github.com/bioconda/bioconda-recipes/blob/a5c1155190cc811934ff681dcd1276d84e160130/recipes/snakemake-wrapper-utils/meta.yaml#L8-L10

I can still find the respective readable link pattern (which is for example described in this stackoverflow question and its responses) for newly published versions of other packages, for example:
https://files.pythonhosted.org/packages/source/s/snakemake/snakemake-7.24.2.tar.gz

Thus, and because I didn't find any deprecation info for this in the docs or changelogs, I am assuming that this is not some deprecation of these readable URLs, but rather some kind of temporary hiccup with setting up these redirects. Probably something similar to (although probably with another root cause than) issue #11940.

Interestingly, the publication of the latter of the two package versions that currently don't have the redirects (0.5.2) coincides with a pypi incident report from January 10, 2023 (PyPI File Hosting Latencies and Timeouts). Thus, maybe snakemake-wrapper-utils is not the only package with this issue.

Generally, I guess that querying the JSON API for the download URL would be the official guidance (instead of using the somewhat legacy? readable URL redirects), but this guidance could do with a clear example on how to do this. I have found a code example on how to query pypi's JSON API on stackoverflow, so maybe something along those lines could be added to the docs.

As for the bioconda project, we could possibly automatically insert those into the conda recipes (along with the expected sha256 checksum) and this should be more robust and future-proof, right?
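A minimal sketch of such a JSON API lookup, assuming the release metadata served at `https://pypi.org/pypi/<project>/<version>/json` contains a `urls` list whose entries carry `packagetype`, `url` and `digests` fields. The function names `fetch_release_metadata` and `pick_sdist` are made up for this illustration and are not part of any PyPI client library.

```python
import json
from urllib.request import urlopen


def fetch_release_metadata(project: str, version: str) -> dict:
    """Download the JSON metadata for one release from PyPI (needs network access)."""
    with urlopen(f"https://pypi.org/pypi/{project}/{version}/json") as response:
        return json.load(response)


def pick_sdist(metadata: dict) -> tuple[str, str]:
    """Return (download URL, sha256 digest) of the first sdist listed in the metadata."""
    for entry in metadata["urls"]:
        if entry["packagetype"] == "sdist":
            return entry["url"], entry["digests"]["sha256"]
    raise LookupError("release has no sdist")


# Example usage (performs a network request, so it is left commented out):
# url, sha256 = pick_sdist(fetch_release_metadata("snakemake-wrapper-utils", "0.5.2"))
```

Keeping the download separate from the parsing means a recipe tool could cache or mock the metadata and still reuse `pick_sdist` to fill in both the source URL and the expected checksum.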

@dlaehnemann dlaehnemann added bug 🐛 requires triaging maintainers need to do initial inspection of issue labels Mar 20, 2023
@ewdurbin
Member

The source file formatting changed:

snakemake-wrapper-utils-0.5.0.tar.gz

vs

snakemake_wrapper_utils-0.5.1.tar.gz

So you can get the redirect you expect by querying https://files.pythonhosted.org/packages/source/s/snakemake-wrapper-utils/snakemake_wrapper_utils-0.5.2.tar.gz

$ curl -s -I https://files.pythonhosted.org/packages/source/s/snakemake-wrapper-utils/snakemake_wrapper_utils-0.5.2.tar.gz | grep location
location: https://files.pythonhosted.org/packages/52/e1/449dcfcba437f9bda6cd232664182dcf7acee78c3fe77a238078c84ec7b7/snakemake_wrapper_utils-0.5.2.tar.gz

@ewdurbin ewdurbin added APIs/feeds and removed requires triaging maintainers need to do initial inspection of issue bug 🐛 labels Mar 20, 2023
@dlaehnemann
Author

Many thanks for the swift response!

Just to confirm that this is intentional: the link will now contain the package name once in kebab-case and once in snake_case, right? Like this:
https://files.pythonhosted.org/packages/source/{ first_letter_of_package_name }/{ package-name-kebab-case }/{ package_name_snake_case }-{ version }.tar.gz

Also, would you consider using the JSON API to get the URL more future-proof than relying on these standardized URL redirects?

@ewdurbin
Member

The url format is as documented at https://warehouse.pypa.io/api-reference/integration-guide.html#querying-pypi-for-package-urls:

/packages/{python_version}/{project_l}/{project_name}/{filename}

We do not do any handling of the filename; that is the uploader's decision.

Yes. Querying the JSON API is the official guidance
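The documented pattern can be sketched as a small helper. This assumes that `python_version` is `source` for sdists and that `project_l` is the first character of the project name, as the URLs in this thread suggest; `redirect_url` is a hypothetical name. The filename has to be passed exactly as the uploader named it.

```python
def redirect_url(project_name: str, filename: str, python_version: str = "source") -> str:
    """Build the hash-free redirect URL for a file exactly as it was uploaded."""
    # Assumption: project_l is the first character of the project name.
    project_l = project_name[0]
    return (
        "https://files.pythonhosted.org/packages/"
        f"{python_version}/{project_l}/{project_name}/{filename}"
    )


# The project name keeps its hyphens, the filename keeps whatever the uploader chose.
print(redirect_url("snakemake-wrapper-utils", "snakemake_wrapper_utils-0.5.2.tar.gz"))
```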

@ewdurbin
Member

The redirect endpoint queries the JSON API, so this might serve as helpful reference:

https://github.com/pypi/conveyor/blob/230ebcdcda0f15072d2ed2ee0ce94613244cbc07/conveyor/views.py#L33-L104

@dlaehnemann
Author

> The url format is as documented at https://warehouse.pypa.io/api-reference/integration-guide.html#querying-pypi-for-package-urls:
>
> /packages/{python_version}/{project_l}/{project_name}/{filename}
>
> We do not do any handling of the filename; that is the uploader's decision.

Thanks so much, this was really useful information!

I guess the new file names are then probably down to some change in poetry, which we use for the publishing of this package. I'll report here and cross-reference, if I get to the bottom of this.

> Yes. Querying the JSON API is the official guidance

I'll bring this up in the conda context.

@dlaehnemann dlaehnemann changed the title redirects from non-hash containing pypi.io and files.pythonhosted.org missing for newer package versions redirects from non-hash containing pypi.io and files.pythonhosted.org missing for newer package versions (actual problem: underscores _ instead of hyphens - in file name) Mar 20, 2023
@dlaehnemann
Author

The following info basically has nothing to do with pypi or its warehouse. But for others coming here with a similar problem, I want to nevertheless document the root cause of this problem in my case: It was a change in poetry build behaviour, which means that sdist files built by poetry will have all hyphens / dashes (-) in package names (where this is the standard) substituted by underscores (_):

And this is actually a change to conform to the specs for sdist file names:
https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
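As a rough illustration of that naming rule, the sketch below normalizes the project name (lowercase, runs of `-`, `_` and `.` collapsed) and then swaps hyphens for underscores. The helper name `sdist_filename` is hypothetical, and version normalization is deliberately left out, so treat this as an approximation of the spec rather than a full implementation.

```python
import re


def sdist_filename(project_name: str, version: str) -> str:
    """Approximate the spec-conforming sdist filename for a project and version."""
    # Collapse runs of '-', '_' and '.' into a single '-' and lowercase the name.
    normalized = re.sub(r"[-_.]+", "-", project_name).lower()
    # The filename uses underscores where the normalized name has hyphens.
    # Assumption: the version string is already in its normalized form.
    return f"{normalized.replace('-', '_')}-{version}.tar.gz"


print(sdist_filename("snakemake-wrapper-utils", "0.5.2"))
```

For a name that contains only hyphens as separators, this gives the same result as a plain `replace("-", "_")`.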

dlaehnemann added a commit to bioconda/bioconda-recipes that referenced this issue Mar 20, 2023
… underscores)

The official specs for sdist file names require that the package name is in snake case, see these places:
* https://peps.python.org/pep-0625/#specification
* https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
* https://peps.python.org/pep-0491/#escaping-and-unicode

[`poetry` switched to conforming to these specs with version `1.2.2`](https://github.com/python-poetry/poetry/releases/tag/1.2.2), see:
* poetry-core pull request with the actual code changes: python-poetry/poetry-core#484
* poetry pull request pulling them in: python-poetry/poetry#6621

Thus, all (bio-)conda recipes whose sources are built with `poetry` and uploaded to pypi are likely to face the same issue.

As `snakemake-wrapper-utils` only contains hyphens (`-`) as non-word characters, the suggested `replace("-","_")` should be safe and enough.

But a thorough solution should be implemented in `grayskull` (that does conda recipe templating for pypi packages), and in the tooling of `conda-forge` and `bioconda`. One solution to make such recipes more future-proof would be to make tooling respect the [official guidance to query the pypi JSON API to get download urls](https://warehouse.pypa.io/api-reference/integration-guide.html#official-guidance). Examples of how to do this are here:
* [stackoverflow answer with example code of querying the pypi JSON API for a download URL](https://stackoverflow.com/a/48327216)
* [link to the pypi code for generating the currently used standardized redirect links at `pypi.io`, that also uses the JSON API](pypi/warehouse#13240 (comment))
dlaehnemann added a commit to bioconda/bioconda-recipes that referenced this issue Mar 20, 2023
* update snakemake-wrapper-utils to 0.5.2

* make package name in sdist `.tar.gz` filename conform to specs (using underscores)

The official specs for sdist file names require that the package name is in snake case, see these places:
* https://peps.python.org/pep-0625/#specification
* https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
* https://peps.python.org/pep-0491/#escaping-and-unicode

[`poetry` switched to conforming to these specs with version `1.2.2`](https://github.com/python-poetry/poetry/releases/tag/1.2.2), see:
* poetry-core pull request with the actual code changes: python-poetry/poetry-core#484
* poetry pull request pulling them in: python-poetry/poetry#6621

Thus, all (bio-)conda recipes whose sources are built with `poetry` and uploaded to pypi are likely to face the same issue.

As `snakemake-wrapper-utils` only contains hyphens (`-`) as non-word characters, the suggested `replace("-","_")` should be safe and enough.

But a thorough solution should be implemented in `grayskull` (that does conda recipe templating for pypi packages), and in the tooling of `conda-forge` and `bioconda`. One solution to make such recipes more future-proof would be to make tooling respect the [official guidance to query the pypi JSON API to get download urls](https://warehouse.pypa.io/api-reference/integration-guide.html#official-guidance). Examples of how to do this are here:
* [stackoverflow answer with example code of querying the pypi JSON API for a download URL](https://stackoverflow.com/a/48327216)
* [link to the pypi code for generating the currently used standardized redirect links at `pypi.io`, that also uses the JSON API](pypi/warehouse#13240 (comment))

* closing bracket; use jinja2 `replace()` filter instead of python `str.replace()`

* update sha256 according to https://pypi.org/pypi/snakemake-wrapper-utils/0.5.2/json
cokelaer pushed a commit to cokelaer/bioconda-recipes that referenced this issue Apr 28, 2023