Skip to content
This repository has been archived by the owner on Nov 5, 2022. It is now read-only.

Request: Align validations between Cargo and Crates.io #41

Open
shepmaster opened this issue May 19, 2019 · 4 comments
Open

Request: Align validations between Cargo and Crates.io #41

shepmaster opened this issue May 19, 2019 · 4 comments

Comments

@shepmaster
Copy link
Member

After extensive testing locally and in CI, I finally went to publish to crates.io and received the error:

error: api errors (status 200 OK): invalid upload request: invalid value: string "futures-0.1", expected a valid feature name containing only letters, numbers, hyphens, or underscores at line 1 column 966

This is unfortunate as my multi-crate deploy is now in an unfinished state and I have to burn through a version number.

(@carols10cents told me to file this here)

@shepmaster
Copy link
Member Author

my multi-crate deploy is now in an unfinished state

Incidentally, it would be really nice if cargo publish --dry-run could somehow work for multi-crate publishes without publishing the dependencies.

@ehuss
Copy link

ehuss commented May 20, 2019

There are a number of open issues for better validation: rust-lang/cargo#4300 rust-lang/cargo#5554 rust-lang/cargo#4377 rust-lang/crates.io#7250.

I think there has been some previous talk about sharing some validation code. I'm of the opinion that it would be a good idea to create a new library for this purpose, which contains only very basic validation functions. I think it is unlikely Cargo and crates.io could share something like the Manifest definition, so it would probably just need to contain a few free functions and constants. It would also only validate very basic elements. For example, it would not validate category names, or crates.io's list of reserved crate names (which lives in a database).

Perhaps those limitations means it would not have much value. I would like to hear from others what they think.

@ehuss
Copy link

ehuss commented May 28, 2019

I did a survey of some of the validation done between the two. Of course cargo does lots of other validation that isn't applicable here (profiles, dependencies, etc.).

I think it is feasible to create a new package that the two projects share, that would have about 5 to 10 free functions defining some validation. I'm still not sure if it will provide significant maintenance improvements (weighed against the maintenance burden of publishing and coordinating a new package). There are also a lot of judgement calls listed below, as to how Cargo would treat the extra restrictions. Adding a warning to the publish/package step doesn't really help much (just makes --dry-run a little better, and feedback a little sooner). Happy to hear any thoughts here.


cargo

crates.io

Shareability

Package name

  • Not empty.
  • All characters `is_alphanumeric` or `_` or `-`.
  • NOTE: init/new has its own blacklist (rust keywords, reserved names) and cannot start with a digit
  • Not empty.
  • First character `is_alphabetic`.
  • All characters `is_alphanumeric` or `_` or `-`.
  • All characters are ASCII.
  • Max 64 characters.
  • Rejects reserved crate names (stored in database).

Difficult to share the extra crates.io validations.  Not sure why restrictions are in database and not in code. This needs careful consideration.

version

  • Must parse with `semver`
  • Must parse with `semver`

Not likely shareable, part of type definition.

authors

No restrictions.

  • Must have at least one non-empty string.

Possibly shareable as a publish/package warning?

description

  • Warns if not set.
  • Must be set.

Keep as-is? See https://github.com/rust-lang/cargo/issues/4840

homepage, documentation, repository

  • Warns if all 3 not set.
  • Optional, but if set must pass:
  • parse with `url` crate
  • must be `http` or `https`
  • must be a “base” url

Possibly shareable as a publish/package warning or error?

readme

  • If set, must point to an existing file.


N/A

keyword

No restrictions.

  • Not empty
  • First character `is_alphabetic`
  • All characters `is_alphanumeric` or `_` or `-`
  • All characters are ASCII
  • At most 20 characters.

Possibly shareable as a publish/package warning or error?

keywords

No restrictions.

  • At most 5 entries.

Possibly shareable as a publish/package warning or error?

category

No restrictions.

  • Unknown categories result in a warning (and are ignored).

Unlikely shareable. Maybe address with future “dry run” publish API?

categories

No restrictions.

  • At most 5 entries.

Possibly shareable as a publish/package warning or error?

license

  • Publish warns if neither license nor license_file is set.
  • Either license or license_file should be set.
  • If license is set, must validate with `license_exprs` crate.

Possibly shareable as a publish/package warning or error? See also https://github.com/rust-lang/cargo/issues/2039 for discussion on license parsing. Also https://github.com/rust-lang/cargo/issues/3540

Feature name

  • Not empty
  • Not empty
  • All characters `is_alphanumeric` or `_` or `+` or `-`
  • All characters are ASCII

Possibly shareable, as a Cargo.toml hard error?  Depends on if packages must be ASCII?

Feature list entry

  • Entries must be other features or dependencies.
  • Slash parsing, namespaced crates, etc.
  • "feature_name" or "feature_name/feature_name"

Same as above.

Dependencies

  • Path dependencies must also specify `version`.
  • Version requirement must parse with `semver`.
  • `registry` not allowed
  • `*` not allowed
  • Must match an existing package name.
  • Version requirement must parse with `semver`

Unlikely to be shareable.

Badges

  • Each entry must be a table.
  • Invalid badge formats result in a warning (and are ignored).
  • See `Badge` enum definition for complete list.

Maybe shareable?  Warn on publish/package?  Wouldn’t really improve things except for `—dry-run`. See also https://github.com/rust-lang/cargo/issues/3576

@MaulingMonkey
Copy link

Sharing some thoughts from rust-lang/cargo#7256 :

  • Feature names, in particular, seem to be a reoccuring footgun, causing burned version numbers and long histories of unpublishable feature names that must be reworked - even if you've been using cargo publish --dry-run before a more public release.
  • alexcrichton had some concerns about validating directly in cargo - namely, that it could lag behind what crates.io accepts. This isn't 100% solved by a shared library either.
  • Adding a crates.io endpoint that cargo publish --dry-run can do a test upload to, would solve that problem, ensuring we're always validating against the latest rules, and moot the need for refactoring logic out into a new library to boot.

@ehuss ehuss mentioned this issue May 8, 2020
bors added a commit to rust-lang/crates.io that referenced this issue May 15, 2020
Permit '+' in feature name, as in "c++20"

Up front: I am familiar with the several discussions linked by rust-lang/crates-io-cargo-teams#41 (comment), and the conversation beginning at #1331 (comment). This PR is not motivated by attempting to match whatever behavior Cargo currently has. Instead, it's a small thing I think we can decide now whether to allow. But it's necessary to say no corresponding Cargo change is required to accommodate this crates.io change.

This PR updates feature validation during publish to accept e.g. "c++20" and "dependency/c++20". We continue to not accept "c++20/feature" as the prefix before the slash would normally refer to a crate name of a dependency and a '+' would not be allowed in those.

I am interested in using such feature names in https://github.com/dtolnay/cxx.

In a Cargo.toml plusses appear as:

```toml
[features]
default = ["c++20"]
"c++20" = ["my-dependency/c++20"]
```

To give some slight further justification for why this particular ascii character above other possible characters we might allow: `+` is pretty common in OS package names in a way that no other currently disallowed character is. Some examples pulled arbitrarily from `apt-cache pkgnames | rg '\+'`:

- https://packages.debian.org/buster/dvd+rw-tools
- https://packages.debian.org/buster/elpa-ghub+
- https://packages.debian.org/buster/libarpack++2-dev
- https://packages.debian.org/buster/libdwarf++0
- https://packages.debian.org/buster/libelf++0
- https://packages.debian.org/buster/magics++
- https://packages.debian.org/buster/memtest86+
- https://packages.debian.org/buster/minisat+
- https://packages.debian.org/buster/paw++
- https://packages.debian.org/buster/swish++
- https://packages.debian.org/buster/vera++
- https://packages.debian.org/buster/voro++

The actual names of the projects contain `+`; various ones in the descriptions in the above links are styled as ARPACK++, Memtest86+, Magics++, Paw++, MinSat+, SWISH++, Vera++, Voro++. When binding to something like this behind a feature, using `+` in the feature name is the most intuitive.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants