Support PEP 625 (Filename of a Source Distribution) #12245

Open
di opened this issue Sep 21, 2022 · 18 comments
Labels
blocked, feature request

Comments

@di
Member

di commented Sep 21, 2022

What's the problem this feature will solve?
PEP 625 has been accepted, so PyPI should be updated to support it.

Describe the solution you'd like
PyPI needs to implement some changes to support the PEP:

  • a restriction on the filenames considered valid for a source distribution (and a corresponding deprecation/notification)
  • validation of the distribution and version sections of the filename, including normalization.
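
A minimal sketch of the structural check these two points imply, assuming the filename form `{name}-{version}.tar.gz` from PEP 625 and the underscore-based name normalization quoted later in this thread. The function names are hypothetical and this is not Warehouse's implementation. The version part is only checked for being non-empty; verifying that it is a normalized PEP 440 version would need a real version parser and is left out here.

```python
import re


def normalize_sdist_name(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into one underscore, then lowercase."""
    return re.sub(r"[-_.]+", "_", name).lower()


def check_sdist_filename(filename: str) -> list[str]:
    """Return the problems found; an empty list means the structure looks valid."""
    problems = []
    if not filename.endswith(".tar.gz"):
        problems.append("extension is not .tar.gz")
        return problems

    stem = filename[: -len(".tar.gz")]
    # A normalized name and a normalized version contain no hyphens, so a
    # compliant stem has exactly one hyphen separating the two parts.
    if stem.count("-") != 1:
        problems.append("expected exactly one hyphen between name and version")
        return problems

    name, version = stem.split("-")
    if not name or not version:
        problems.append("empty name or version")
    elif name != normalize_sdist_name(name):
        problems.append(f"name {name!r} is not normalized")
    return problems


if __name__ == "__main__":
    for candidate in ("my_package-1.0.tar.gz", "My.Package-1.0.tar.gz", "my-package-1.0.tar.gz"):
        print(candidate, check_sdist_filename(candidate))
```

Returning a list of problems rather than a boolean is a design choice for the sketch: it would let a deprecation notice say what is wrong with the filename.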
@di
Member Author

di commented Sep 21, 2022

This is likely blocked on pypa/packaging#527.

@di
Member Author

di commented Jun 26, 2023

This is probably also blocked on finding a good migration path that isn't too obtrusive for users. Right now, many of the files being uploaded would be rejected once this change is implemented, and IMO that would break too many users for us to enable it yet:

warehouse=> SELECT DATE_TRUNC('day', upload_time) AS day, COUNT(filename)
FROM release_files
WHERE packagetype = 'sdist'
    AND filename ILIKE '%-%-%'
GROUP BY DATE_TRUNC('day', upload_time)
ORDER BY day DESC
LIMIT 30;
         day         | count
---------------------+-------
 2023-06-26 00:00:00 |  1028
 2023-06-25 00:00:00 |   381
 2023-06-24 00:00:00 |   687
 2023-06-23 00:00:00 |  1093
 2023-06-22 00:00:00 |  1453
 2023-06-21 00:00:00 |  1486
 2023-06-20 00:00:00 |  1606
 2023-06-19 00:00:00 |  1200
 2023-06-18 00:00:00 |   354
 2023-06-17 00:00:00 |   723
 2023-06-16 00:00:00 |  1455
 2023-06-15 00:00:00 |  1161
 2023-06-14 00:00:00 |  1567
 2023-06-13 00:00:00 |  1557
 2023-06-12 00:00:00 |  1157
 2023-06-11 00:00:00 |   358
 2023-06-10 00:00:00 |   693
 2023-06-09 00:00:00 |  1327
 2023-06-08 00:00:00 |  1958
 2023-06-07 00:00:00 |  1631
 2023-06-06 00:00:00 |  1430
 2023-06-05 00:00:00 |  1116
 2023-06-04 00:00:00 |   325
 2023-06-03 00:00:00 |   783
 2023-06-02 00:00:00 |  1327
 2023-06-01 00:00:00 |  1710
 2023-05-31 00:00:00 |  1693
 2023-05-30 00:00:00 |  1109
 2023-05-29 00:00:00 |   959
 2023-05-28 00:00:00 |   387
(30 rows)
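
For readers less familiar with SQL patterns, here is a rough Python equivalent of the `ILIKE '%-%-%'` filter used in the query above. The helper name is hypothetical. The pattern matches any filename containing at least two hyphens, which appears to serve as a proxy for "the name part was not normalized to underscores". It is a heuristic: it would not, for instance, flag a filename whose name differs from the normalized form only in capitalization.

```python
from fnmatch import fnmatchcase


def counted_by_query(filename: str) -> bool:
    """Mimic `filename ILIKE '%-%-%'`: true when there are at least two hyphens."""
    # SQL '%' matches any run of characters, like '*' in a glob pattern.
    # ILIKE's case-insensitivity is irrelevant here because the pattern
    # contains no letters.
    return fnmatchcase(filename, "*-*-*")


if __name__ == "__main__":
    for candidate in ("my-package-1.0.tar.gz", "my_package-1.0.tar.gz", "Django-5.1.tar.gz"):
        print(candidate, counted_by_query(candidate))
```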

I think a good migration path would be:

  • ensuring the most popular build tools have supported outputting PEP 625-compliant filenames for some sufficiently long period of time
  • perhaps making upload tools like twine silently normalize this at upload time, possibly with a warning?

@dstufft
Member

dstufft commented Jun 26, 2023

The other thing we could do is forcibly normalize the filenames ourselves, though we haven't done that in the past, and I know it would break at least twine's check for whether a file has already been uploaded.

@di
Member Author

di commented Jul 18, 2023

Blocked on #14156 as well.

di added the blocked label on Jul 18, 2023
@stiankri

Until the version in the sdist filename is verified as described in this issue, it's possible to create multiple sdists per release (as seen from the filename point of view).

Example: upload foo-1.tar.gz with release 1 and foo-1.zip with release 1.1. From the metadata point of view it's still just one sdist per release, but from the Simple API point of view (which the package managers use) there are two sdists for release 1.

This does not seem to be in the spirit of PEP 527:

[T]his PEP proposes to allow one, and only one, sdist per release of a project.

Which is currently verified on upload.
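
The example above can be sketched in a few lines, assuming a simplified parser that takes everything after the last hyphen as the version. All helper names are hypothetical, and versions are compared as plain strings, whereas a real check would compare normalized PEP 440 versions.

```python
from collections import defaultdict

SDIST_EXTENSIONS = (".tar.gz", ".zip")


def version_from_filename(filename):
    """Return the version implied by the filename (text after the last hyphen)."""
    for ext in SDIST_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            _name, _sep, version = stem.rpartition("-")
            return version
    raise ValueError(f"not an sdist filename: {filename!r}")


def sdists_per_filename_version(uploads):
    """Group filenames by the version a filename-based consumer would infer."""
    grouped = defaultdict(list)
    for filename, _release_version in uploads:
        grouped[version_from_filename(filename)].append(filename)
    return dict(grouped)


def mismatched_uploads(uploads):
    """Uploads whose filename version differs from the release they were uploaded to."""
    return [(f, release) for f, release in uploads if version_from_filename(f) != release]


if __name__ == "__main__":
    # (filename, release version the file was uploaded to)
    uploads = [("foo-1.tar.gz", "1"), ("foo-1.zip", "1.1")]
    print(sdists_per_filename_version(uploads))
    print(mismatched_uploads(uploads))
```

Rejecting every upload that `mismatched_uploads` would report is one way the filename view and the metadata view could be kept in agreement.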

@di
Member Author

di commented Mar 20, 2024

Probably also blocked on pypa/setuptools#3593 as the predominant builder of source distributions.

@di
Member Author

di commented Mar 20, 2024

There doesn't seem to be any real progress here towards builders producing normalized source distribution filenames:

warehouse=> SELECT DATE_TRUNC('month', upload_time) AS month, COUNT(filename)
FROM release_files
WHERE packagetype = 'sdist'
    AND filename ILIKE '%-%-%'
    AND upload_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '30 months'
GROUP BY DATE_TRUNC('month', upload_time)
ORDER BY month DESC;
        month        | count
---------------------+-------
 2024-03-01 00:00:00 | 23683
 2024-02-01 00:00:00 | 31742
 2024-01-01 00:00:00 | 33589
 2023-12-01 00:00:00 | 33818
 2023-11-01 00:00:00 | 35584
 2023-10-01 00:00:00 | 32092
 2023-09-01 00:00:00 | 33117
 2023-08-01 00:00:00 | 38100
 2023-07-01 00:00:00 | 34178
 2023-06-01 00:00:00 | 35241
 2023-05-01 00:00:00 | 35136
 2023-04-01 00:00:00 | 32816
 2023-03-01 00:00:00 | 39726
 2023-02-01 00:00:00 | 34714
 2023-01-01 00:00:00 | 32340
 2022-12-01 00:00:00 | 26588
 2022-11-01 00:00:00 | 29160
 2022-10-01 00:00:00 | 27748
 2022-09-01 00:00:00 | 30693
 2022-08-01 00:00:00 | 35739
 2022-07-01 00:00:00 | 30297
 2022-06-01 00:00:00 | 31412
 2022-05-01 00:00:00 | 35092
 2022-04-01 00:00:00 | 29901
 2022-03-01 00:00:00 | 33199
 2022-02-01 00:00:00 | 27257
 2022-01-01 00:00:00 | 28129
 2021-12-01 00:00:00 | 27028
 2021-11-01 00:00:00 | 30112
 2021-10-01 00:00:00 | 30402
 2021-09-01 00:00:00 | 28612
(31 rows)

chart

@dimbleby

pypa/setuptools#3593 has been closed (implemented) for a little while now; it would be interesting to see whether it is making a dent in that graph yet.

@di
Member Author

di commented Aug 7, 2024

Indeed, quite a nice drop:

chart

At this rate, we should be low enough in another month or two to start emitting warnings about a deprecation, and probably by EOY we could fully support PEP 625.

@di
Member Author

di commented Nov 6, 2024

And here's the latest:

Image

Less than 10K uploads in October.

@di
Member Author

di commented Nov 18, 2024

PR to start warning uploaders of non-PEP 625 compliant filenames is here: #17110

@bmispelon

Hi,

This change is creating some confusion in the Django project (unsurprisingly, considering the non-standard capitalization of our package and build files historically), especially around the timeline and extent of the deprecation: https://code.djangoproject.com/ticket/35980.
Our current solution has been to pin setuptools but that's not a solution we're keen to keep in the long term.

Could you clarify what you meant in the comment above with "[...] probably by EOY we could fully support PEP 625."? Does that mean that non-pep625 build files could start being rejected, or is that just a prediction about the progress towards the goal of all newly uploaded build files being pep625-compliant?

The warning email from PyPI that Django received with our latest release also mentions that "In the future, PyPI will require all newly uploaded source distribution filenames to comply with PEP 625.". Is there a clearer timeline for this? Will there be a mechanism where maintainers will be given advance notice of that change, or is there a specific GH issue one could subscribe to for example?
Django has a major version set to be released in April, and it would be helpful for our maintainers to know whether we need to audit/fix our various release scripts and documentation ahead of that date.

Thanks for all the work you do on PyPI, we're big fans 🎉

@di
Member Author

di commented Dec 9, 2024

Hi @bmispelon, thanks for your comment here, sorry this is causing confusion for the Django project.

Could you clarify what you meant in the #12245 (comment) with "[...] probably by EOY we could fully support PEP 625."? Does that mean that non-pep625 build files could start being rejected, or is that just a prediction about the progress towards the goal of all newly uploaded build files being pep625-compliant?

By "fully support PEP 625" I meant that PyPI would be in compliance with PEP 625, and reject uploads of filenames that are invalid per PEP 625. I think it's unlikely that we actually will do this by EOY, though.

Is there a clearer timeline for this? Will there be a mechanism where maintainers will be given advance notice of that change, or is there a specific GH issue one could subscribe to for example?

We don't currently have a timeline other than "eventually". If it's helpful, I can commit to having the warning emails include a clear deadline at a minimum of 6 months before said deadline, and share that deadline on this issue as well. Would that suffice? That also means this wouldn't be enforced before Django's release in April.

FWIW, I think this change should be unnecessary -- the actual project name shouldn't need to change, just the underlying build tooling that is producing the source distribution. In this case, upgrading setuptools should be all that is necessary (aside from project-specific changes to handle the new filename, of course).

@konstin
Contributor

konstin commented Dec 9, 2024

To add more context to this, the project name is used in different places with different normalizations: There is the human-readable name, and there is the dist-info name (re.sub(r"[-_.]+", "_", name).lower()). Note that the dist-info normalization is different from the regular package name normalization (re.sub(r"[-_.]+", "-", name).lower()) to avoid a dash in a name that is delimited by a dash.

  • pyproject.toml project.name: human-readable name ("Tools SHOULD normalize this name, as soon as it is read for internal consistency.")
  • METADATA Name: Same as pyproject.toml project.name
  • Source distribution filenames: dist-info name
  • Wheel filename: dist-info name
  • The top level <name>-<version> directory inside a source distribution: dist-info name. This isn't explicitly called out in the spec, but it uses {name}-{version} for both filename and top level directory. It doesn't impact tooling as there is only one top level directory anyway.
  • <name>-<version>.dist-info and <name>-<version>.data directories inside the wheel: dist-info name. This isn't explicitly called out in the spec, but it uses {distribution}-{version} for both filename and .dist-info names and unnormalized names would cause ambiguity after installation.
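
The two normalizations above can be compared side by side. This is a small sketch using the two regular expressions quoted in this comment; the function names and the `sdist_filename` helper are hypothetical, and the version is assumed to be normalized already.

```python
import re


def normalize_project_name(name: str) -> str:
    """Regular package name normalization: separators become a single dash."""
    return re.sub(r"[-_.]+", "-", name).lower()


def normalize_filename_component(name: str) -> str:
    """Filename / dist-info normalization: separators become a single underscore."""
    return re.sub(r"[-_.]+", "_", name).lower()


def sdist_filename(name: str, version: str) -> str:
    """Build '{name}-{version}.tar.gz' from a human-readable project name."""
    return f"{normalize_filename_component(name)}-{version}.tar.gz"


if __name__ == "__main__":
    for raw in ("Django", "zope.interface", "My--Cool_Package"):
        print(raw, normalize_project_name(raw), sdist_filename(raw, "1.0"))
```

Because the name component never contains a dash, the single remaining dash in the filename separates name from version unambiguously.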

Afaik the only blocker here is pypa/setuptools#3777

@dimbleby

I do not think that even pypa/setuptools#3777 is a blocker here, though it might take a slightly pedantic reading to agree with this.

As I understand it, this issue is specifically about PEP 625 and source distribution names: so though it seems to be true that setuptools is producing wheels whose naming is not PEP-491 compliant - that would be out of scope.

There probably should be an analogous issue here for wheels: "support [enforce] PEP491". But I think that is not what this issue says, and I guess it is unlikely that warehouse would insist on PEP-491 while setuptools produces non-compliant wheels.

@nessita

nessita commented Dec 10, 2024

I do not think that even pypa/setuptools#3777 is a blocker here, though it might take a slightly pedantic reading to agree with this.

As I understand it, this issue is specifically about PEP 625 and source distribution names: so though it seems to be true that setuptools is producing wheels whose naming is not PEP-491 compliant - that would be out of scope.

There probably should be an analogous issue here for wheels: "support [enforce] PEP491". But I think that is not what this issue says, and I guess it is unlikely that warehouse would insist on PEP-491 while setuptools produces non-compliant wheels.

Hi! This is Natalia, one of the current Django Fellows. Thank you David for your feedback!

For Django, the release process uses a unified workflow that relies on setuptools to generate both the tarball and the wheel. However, the issue with setuptools is that it produces a lowercase django tarball and a capitalized Django wheel. This inconsistency complicates our tooling, which must handle both naming formats. Ideally, we would have a consistent naming convention across both formats to avoid this problem.

As a workaround, we’ve been pinning setuptools>=61.0.0,<69.3.0 to produce consistently named Django tarballs and wheels. However, this triggers the current deprecation warning when uploading to PyPI, and for Django, the inconsistent wheel naming is a blocker to releasing tarballs with the expected naming format.

@dimbleby

Ah, then I guess pypa/setuptools#3777 is a blocker specifically for django - but still not, strictly speaking, a blocker here...

setuptools sounds open to pull requests fixing this and I assume that it must be not very difficult once the relevant bit of code is identified, perhaps you are motivated to make that happen.

@nessita

nessita commented Dec 10, 2024

Ah, then I guess pypa/setuptools#3777 is a blocker specifically for django - but still not, strictly speaking, a blocker here...

Ish :-)

I believe this inconsistency (which goes beyond just name casing) could affect more packages than just Django, for similar reasons. It’s problematic and likely complicates tooling to have release artifacts with varying casing or punctuation.

setuptools sounds open to pull requests fixing this and I assume that it must be not very difficult once the relevant bit of code is identified, perhaps you are motivated to make that happen.

👍
