Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New index schema and registry API for extended feature syntax. #9111

Closed
wants to merge 6 commits into from

Conversation

ehuss
Copy link
Contributor

@ehuss ehuss commented Jan 29, 2021

This introduces a new features2 field in the index to segregate the new feature syntax for namespaced features and weak dependency features. The intent here is to prevent versions of Cargo older than 1.19 from breaking, since they will fail to run even with a Cargo.lock file.

To prevent publishing from breaking when talking to a registry that does not understand this new field, this switches the publish API to a new URL (api/v2/crates/new).

This also introduces a new schema field to the index and publish API, so that in the future changes can be made, and versions of cargo after this is introduced will silently ignore entries it doesn't understand.

This is split into a number of different commits, with more information in each commit.

The intent here is to make it more flexible to create different registry
setups, and to reuse code a little more easily.
The intent here is to make it easier to work with API errors.

This also includes some new tests for checking network errors.
This moves the new syntax for namespaced features and weak dependency
features into a new field ("features2") in the index and API. This
is necessary to avoid breaking Cargo versions older than 1.19,
which fail to parse the index even if there is a Cargo.lock file.
This switches the API url for publishing from `api/v1/crates/new` to
`api/v2/crates/new` if the package contains the new features syntax.

This is necessary because if the registry does not know about the new
feature syntax, it will (likely) silently accept the uploaded package
without the `features2` field, which will end up creating an index
with an invalid entry (missing features).
This adds a `v` field which indicates a format version for an index
entry. If Cargo encounters a version newer than it understands,
it will ignore those entries. This makes it safer to make changes
to the index entries (such as adding new things), and not need to
worry about how older cargos will react to it.

This version field is also part of the publish API.

Version `2` indicates that it uses the new features field.
@rust-highfive
Copy link

r? @alexcrichton

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 29, 2021
@ehuss
Copy link
Contributor Author

ehuss commented Jan 29, 2021

I'm not too happy with this. So...which seems worse?

  • Method A (existing behavior): Leave the new features in the features field, and let 1.18 and older break completely (even if they have a Cargo.lock) if they use a package that starts using the new syntax. 1.19 and newer will ignore any package published with the new features.
    • The workaround for this is to use a local index with a snapshot in time before the new features are introduced.
  • Method B (this PR): Add the new features2 field, which allows all versions with a pre-existing Cargo.lock to work correctly. However, updating the lock file in versions 1.51 and older may potentially select entries with the new features, which will probably result in something that won't build correctly (because they ignore the features2 field).
    • The workaround for this is to use the --precise flag of cargo update to force it to select older versions before features2 was introduced if you need to generate a Cargo.lock with a package that starts using features2 in a newer version.

@ehuss
Copy link
Contributor Author

ehuss commented Jan 29, 2021

This includes a test which is ignored by default which can test the behavior of every toolchain. Here's what its output looks like:

Version Unlocked Locked
1.0.0-x86_64-unknown-linux-gnu failed failed
1.1.0-x86_64-unknown-linux-gnu failed failed
1.2.0-x86_64-unknown-linux-gnu failed failed
1.3.0-x86_64-unknown-linux-gnu failed failed
1.4.0-x86_64-unknown-linux-gnu failed failed
1.5.0-x86_64-unknown-linux-gnu failed failed
1.6.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.7.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.8.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.9.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.10.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.11.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.12.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.13.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.14.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.15.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.16.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.17.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.18.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.19.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.20.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.21.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.22.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.23.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.24.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.25.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.26.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.27.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.28.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.29.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.30.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.31.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.32.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.33.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.34.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.35.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.36.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.37.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.38.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.39.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.40.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.41.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.42.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.43.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.44.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.45.2-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.46.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.47.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
1.48.0-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
stable-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
beta-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1
nightly-x86_64-unknown-linux-gnu success bar=1.0.2 success bar=1.0.1

This shows that without a Cargo.lock, every version will select a dependency with the new feature syntax (and will essentially build them as-if the features aren't activated). With a Cargo.lock, they will stick to whichever version they are locked to.

I'm unable to get test results for 1.0 through 1.5. For some reason they are unable to extract the tar files from the testsuite. I suspect it is something related to #2327, but my energy for investigating has run low.

@alexcrichton
Copy link
Member

Thanks for putting this all together! I think this all seems pretty reasonable for "let's keep 1.18 and prior building with a lockfile", but I've still got a few questions.

  • How come the publish API path is being altered here? Presumably you'd get a worse error if you publish the new features to a registry that doesn't support it, but nothing fundamental is changing, right?
  • Do you know if crates.io is ready to receive some updates for this PR, primarily in writing new index entries?
  • Thinking for a moment, perhaps a better publish API would be for Cargo to send the index entry to crates.io itself? Registries then wouldn't have to synthesize a new entry and we wouldn't have to update all registries every time our index format updates (which while presumably rare would be nice to not have to worry about too).

I personally continue to not really be all that interested in worrying about older Cargo's picking up newer versions of crates. I feel like misinterpreting Cargo features is the least of our worries in that respect. I suspect almost no crates published in the last year even compile with Rust 1.0 due to feature usage of libstd or the language. Basically what I mean to say is that we never really designed Cargo in mind with cargo update on Rust 1.0 continuing to work years into the future. So much of that is broken I don't think it's really worth worrying about.

I do think that it seems reasonable to have a version field in the index entry though. It's not harming anyone and it could help tooling later on down the road too.

@ehuss
Copy link
Contributor Author

ehuss commented Jan 29, 2021

How come the publish API path is being altered here? Presumably you'd get a worse error if you publish the new features to a registry that doesn't support it, but nothing fundamental is changing, right?

If you publish the new features2 field to the api/v1 path, the new field is ignored (at least by crates.io, I haven't checked other registry implementations). The registry would end up creating an entry in the index with incorrect features (all of the features2 entries would be missing). This would then result in broken behavior when you actually try to use the newly published package. I wanted to be more cautious here because there are increasingly more registry implementations, and we can't guarantee they will all be updated before this PR is merged (in fact, can pretty much guarantee none of them will).

Do you know if crates.io is ready to receive some updates for this PR, primarily in writing new index entries?

It is not. I have not started that work, I wanted to see how this goes first. I'm hoping it is not too difficult. I'm also uncertain when we should do that, since that is permanently committing to the new format, and I'm not sure how to decide when that's ready.

Thinking for a moment, perhaps a better publish API would be for Cargo to send the index entry to crates.io itself? Registries then wouldn't have to synthesize a new entry and we wouldn't have to update all registries every time our index format updates (which while presumably rare would be nice to not have to worry about too).

That sounds reasonable. I'll need to take a look at the crates.io code to see how well that will fit in.

older Cargo's picking up newer versions of crates

Yea, that seems like a reasonable stance. However, I was thinking that for a stabilization timeline, I'd like to see this PR in at least 3 versions back before stabilizing, to avoid too much frustration for people. That is, if this goes in 1.51, then the earliest it should be stabilized is 1.54. That allows people using stable-1 to continue to safely run cargo update. (Considering our speed at stabilizing things, I'm not too worried. 😄 )

I'm uncertain how quickly projects will jump on using this new syntax. Since it is an MSRV bump, I assume most will delay for a while, but I'd like to be a little cautious.

@alexcrichton
Copy link
Member

Oh I was imagining that if we don't change the publish path that we wouldn't send features2, we'd send everything in features. That means registries would either reject it due to being malformed (e.g. what I think crates.io would do) or pass it through to the index un-blemished. If we do add a features2 on publish to the registry I agree we need a new path, I just wasn't sure if we needed to do that.

Thinking for a moment, perhaps a better publish API would be for Cargo to send the index entry to crates.io itself? Registries then wouldn't have to synthesize a new entry and we wouldn't have to update all registries every time our index format updates (which while presumably rare would be nice to not have to worry about too).

That sounds reasonable. I'll need to take a look at the crates.io code to see how well that will fit in.

Actually on second though this may not be a great idea. I wouldn't want a malicious actor being able to place whatever they want into the index, so crates.io I think will want to maintain tight control over what goes into the index. That does mean, though, that if we bump the version and change the syntax we'll need to always update crates.io as well.

That is, if this goes in 1.51, then the earliest it should be stabilized is 1.54.

While I don't mind being conservative this does seem overly conservative to me. I don't feel this is really all that much different than a new language feature being stabilized. If a crate wants to work with stable-1 then everything it depends on has to have the same policy and no one would use this feature until it hits stable-1 anyway.

In any case though if you'd prefer to be more conservative I don't mind!

@ehuss
Copy link
Contributor Author

ehuss commented Feb 1, 2021

Actually on second though this may not be a great idea.

Oh, good point!

Let me know if you have any other questions. This does seem like a semi-major change, but I'm uncertain of what else to consider.

@alexcrichton
Copy link
Member

Here's a few more possible suggestions:

  • Perhaps a config.json field in the index could indicate whether the registry supports the v2 endpoint or not? That way if you publish to a custom registry (or crates.io, really) without it supporting it we'd have a first-class error rather than handling the 404.
  • For the v2 endpoint I'd probably drop the features2 field, and have the registry split it into the v2 format index entry since registries are writing the index entry anyway.

Other than that though I don't have much else on this either, so while I agree it's somewhat major it seems fine to go ahead and start rolling this out. This is unfortunately more work than I thought it would be to handle 1.19....

One option is to add the index entry versioning and we could add that ASAP to Cargo and even backport it to ensure at least that part hits stable as soon as it can?

@alexcrichton
Copy link
Member

I dunno I'm also a little wishy-washy on the versioning in the index entry. I'm not really sure it buys us anything unless we actually have a plan to drop Rust versions at some point. If we don't ever drop Rust versions then we're always stuck with 1.18 and prior. Similarly we're always stuck with Cargo versions that don't look for an index entry version field. Given our current trajectory it seems like we'll never make use of this version field since we'll always be handling those older versions?

@Eh2406
Copy link
Contributor

Eh2406 commented Feb 1, 2021

A version field lets us change the meaning of fields that are newer then when the version field was added.
Let's say we add a new sintaxe to features2 in cargo 1.99 but cargo 1.55 will crash or misunderstand that sintaxe. Then by bumping the version field we can tell that version to not try.

It is limited. It can't change the presence of existing fields (while we support 1.19). It can't change the meaning of existing fields (while we support version before it was added). But even the little slice of flexibility seems useful.

@alexcrichton
Copy link
Member

That's a good point yeah! While I'm not too sure how useful the ability will be in the future that I think convinces me that we may as well go ahead and implement it.

bors added a commit that referenced this pull request Feb 3, 2021
Add some documentation for index and registry stuff.

This adds some internal docs for index and registry things.

Split out of #9111.
bors added a commit that referenced this pull request Feb 3, 2021
Add RegistryBuilder for tests, and update crates-io error handling.

This adds `RegistryBuilder` to the test suite to make it more flexible to create different registry setups, and to reuse code a little more easily.

This also makes a small adjustment to the registry API to add a `ResponseError` type to make it easier to work with API errors.  As part of this, some tests were added to validate the API behavior for response errors.  There are only a few very small changes here:
* Extra newlines are removed from the headers printed in the error message.
* The UTF-8 error now also includes the text "invalid response from server".
* The "file too large" crates.io publish error now displays the tarball size.  (There is no test for this because it is only issued for talking to `crates.io`.)

Split from #9111.
@bors
Copy link
Contributor

bors commented Feb 3, 2021

☔ The latest upstream changes (presumably #9125) made this pull request unmergeable. Please resolve the merge conflicts.

@ehuss ehuss removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 9, 2021
@ehuss
Copy link
Contributor Author

ehuss commented Feb 10, 2021

This has been split into several other PRs: #9125, #9126, #9161

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants