Merged
56 changes: 36 additions & 20 deletions PURL-SPECIFICATION.rst
@@ -132,7 +132,7 @@ The rules for each component are:
- **type**:

- The package ``type`` MUST be composed only of ASCII letters and numbers,
'.', '+' and '-' (period, plus, and dash).
period '.', plus '+', and dash '-'.
- The ``type`` MUST start with an ASCII letter.
- The ``type`` MUST NOT be percent-encoded.
- The ``type`` is case insensitive. The canonical form is lowercase.
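
The ``type`` rules in this hunk can be sketched as a small validator. This is a minimal illustration, not code from the PR: the names ``_TYPE_RE`` and ``normalize_type`` are hypothetical, and raising ``ValueError`` on a violation is an assumption about how an implementation would report the error.

```python
import re

# Hypothetical helper, not part of the spec or of this PR.
# First character: an ASCII letter. Remaining characters: ASCII letters,
# digits, period '.', plus '+', or dash '-'. A percent sign is not in the
# allowed set, so a percent-encoded type is rejected as well.
_TYPE_RE = re.compile(r"[A-Za-z][A-Za-z0-9.+-]*")


def normalize_type(raw: str) -> str:
    """Validate a purl ``type`` and return its canonical (lowercase) form."""
    if not _TYPE_RE.fullmatch(raw):
        raise ValueError(f"invalid purl type: {raw!r}")
    # The type is case insensitive; the canonical form is lowercase.
    return raw.lower()
```

Under these assumptions, ``normalize_type("NPM")`` yields ``"npm"``, while a ``type`` such as ``"1abc"`` or ``"np/m"`` is rejected.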
@@ -176,25 +176,30 @@ The rules for each component are:

- **qualifiers**:

- The ``qualifiers`` string is prefixed by a '?' separator when not empty
- This '?' is not part of the ``qualifiers``
- This is a query string composed of zero or more ``key=value`` pairs each
separated by a '&' ampersand. A ``key`` and ``value`` are separated by the equal
'=' character
- These '&' are not part of the ``key=value`` pairs.
- ``key`` must be unique within the keys of the ``qualifiers`` string
- ``value`` cannot be an empty string: a ``key=value`` pair with an empty ``value``
is the same as no key/value at all for this key
- For each pair of ``key`` = ``value``:

- The ``key`` must be composed only of ASCII letters and numbers, '.', '-' and
'_' (period, dash and underscore)
- A ``key`` cannot start with a number
- A ``key`` must NOT be percent-encoded
- A ``key`` is case insensitive. The canonical form is lowercase
- A ``key`` cannot contain spaces
- A ``value`` must be a percent-encoded string
- The '=' separator is neither part of the ``key`` nor of the ``value``
- The ``qualifiers`` component MUST be prefixed by an unencoded question
mark '?' separator when not empty. This '?' separator is not part of the
``qualifiers`` component.
- The ``qualifiers`` component is composed of one or more ``key=value``
pairs. Multiple ``key=value`` pairs MUST be separated by an
unencoded ampersand '&'. This '&' separator is not part of an
individual ``qualifier``.

- A ``key`` and ``value`` MUST be separated by the unencoded equal sign '='
character. This '=' separator is not part of the ``key`` or ``value``.
- A ``value`` MUST NOT be an empty string: a ``key=value`` pair with an
empty ``value`` is the same as if no ``key=value`` pair exists for this
``key``.

- For each ``key=value`` pair:

- The ``key`` MUST be composed only of lowercase ASCII letters and numbers,
period '.', dash '-' and underscore '_'.
- A ``key`` MUST start with an ASCII letter.
- A ``key`` MUST NOT be percent-encoded.
- Each ``key`` MUST be unique among all the keys of the ``qualifiers``
component.
- A ``value`` MAY be composed of any character and all characters MUST be
encoded as described in the "Character encoding" section.
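
As an illustration of the revised ``qualifiers`` rules, here is a minimal parsing sketch. It is not taken from the PR. The name ``parse_qualifiers`` is hypothetical, and three choices are assumptions: the "Character encoding" section (not shown in this diff) is taken to mean percent-encoding, a pair with an empty ``value`` is dropped silently, and a ``key`` containing uppercase letters is rejected rather than lowercased.

```python
import re
from urllib.parse import unquote

# Hypothetical helper, not part of the spec or of this PR.
# A key starts with a lowercase ASCII letter, followed by lowercase ASCII
# letters, digits, period '.', dash '-', or underscore '_'.
_KEY_RE = re.compile(r"[a-z][a-z0-9._-]*")


def parse_qualifiers(component: str) -> dict:
    """Parse a ``qualifiers`` component given without its leading '?'."""
    result = {}
    if not component:
        return result
    for pair in component.split("&"):
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in qualifier: {pair!r}")
        if not _KEY_RE.fullmatch(key):
            raise ValueError(f"invalid qualifier key: {key!r}")
        if value == "":
            # Assumption: an empty value is treated as if the pair were absent.
            continue
        if key in result:
            raise ValueError(f"duplicate qualifier key: {key!r}")
        # Assumption: values are percent-encoded; decode them here.
        result[key] = unquote(value)
    return result
```

For example, ``parse_qualifiers("arch=x86_64&os=")`` returns a mapping with only the ``arch`` key, because the empty ``os`` value is treated as absent.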


- **subpath**:
@@ -206,9 +211,11 @@ The rules for each component are:
in the canonical form
- Each ``subpath`` segment MUST be a percent-encoded string
- When percent-decoded, a segment:

- MUST NOT contain a '/'
- MUST NOT be any of '..' or '.'
- MUST NOT be empty

- The ``subpath`` MUST be interpreted as relative to the root of the package
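
The per-segment ``subpath`` checks listed in this hunk could look like the following. This is a sketch under stated assumptions: ``check_subpath_segment`` is a hypothetical name, the caller is assumed to have already split the ``subpath`` on unencoded '/' characters, and ``ValueError`` is an assumed way to report a violation.

```python
from urllib.parse import unquote


def check_subpath_segment(encoded_segment: str) -> str:
    """Decode one ``subpath`` segment and apply the rules quoted above.

    Hypothetical helper, not part of the spec or of this PR.
    """
    decoded = unquote(encoded_segment)
    if decoded == "":
        raise ValueError("subpath segment must not be empty")
    if "/" in decoded:
        raise ValueError(f"decoded segment contains '/': {decoded!r}")
    if decoded in (".", ".."):
        raise ValueError(f"segment must not be '.' or '..': {decoded!r}")
    return decoded
```

Decoding before checking matters: a segment such as ``%2E%2E`` looks harmless in its encoded form but decodes to ``..`` and is rejected.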


@@ -486,3 +493,12 @@ License
~~~~~~~

This document is licensed under the MIT license

Definitions
~~~~~~~~~~~

[ASCII] See, e.g.,

- American National Standards Institute, "Coded Character Set -- 7-bit
American Standard Code for Information Interchange", ANSI X3.4, 1986.
- https://en.wikipedia.org/wiki/ASCII.
12 changes: 7 additions & 5 deletions faq.rst
@@ -6,7 +6,7 @@ Scheme

**QUESTION**: Can the ``scheme`` component be followed by a colon and two slashes, like a URI?

No. Since a ``purl`` never contains a URL Authority, its ``scheme`` should not be suffixed with double slash as in 'pkg://' and should use 'pkg:' instead. Otherwise this would be an invalid URI per RFC 3986 at https://tools.ietf.org/html/rfc3986#section-3.3::
**ANSWER**: No. Since a ``purl`` never contains a URL Authority, its ``scheme`` should not be suffixed with double slash as in 'pkg://' and should use 'pkg:' instead. Otherwise this would be an invalid URI per RFC 3986 at https://tools.ietf.org/html/rfc3986#section-3.3::

If a URI does not contain an authority component, then the path
cannot begin with two slash characters ("//").
@@ -24,9 +24,10 @@ For example, although these two purls are strictly equivalent, the first is in c

pkg://gem/[email protected]


**QUESTION**: Is the colon between ``scheme`` and ``type`` encoded? Can it be encoded? If yes, how?

The "Rules for each ``purl`` component" section provides that "[t]he ``scheme`` MUST be followed by an unencoded colon ':'.
**ANSWER**: The "Rules for each ``purl`` component" section provides that the ``scheme`` MUST be followed by an unencoded colon ':'.

In this case, the colon ':' between ``scheme`` and ``type`` is being used as a separator, and consequently should be used as-is, never encoded and never requiring any decoding. Moreover, it should be a parsing error if the colon ':' does not come directly after 'pkg'. Tools are welcome to recover from this error to help with malformed purls, but that's not a requirement.
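
A minimal sketch of the separator check described in this answer follows. The function name ``strip_scheme`` is hypothetical. Matching ``pkg`` exactly (case-sensitively) and discarding any slashes that follow the colon are both assumptions; the second one is based only on the FAQ's statement that the 'pkg://' form is equivalent to the canonical 'pkg:' form.

```python
def strip_scheme(purl: str) -> str:
    """Return the part of a purl that follows the 'pkg:' scheme prefix.

    Hypothetical helper, not part of the spec or of this PR.
    """
    prefix = "pkg:"
    # The colon is a separator and must appear unencoded directly after 'pkg'.
    if not purl.startswith(prefix):
        raise ValueError(f"purl must start with {prefix!r}: {purl!r}")
    remainder = purl[len(prefix):]
    # Assumption: tolerate the non-canonical 'pkg://' form by dropping slashes.
    return remainder.lstrip("/")
```

With this sketch, an input whose colon is percent-encoded (for example one starting with ``pkg%3A``) is a parsing error, which matches the behavior the answer asks for. Recovering from such input is left to tools that choose to do so.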

@@ -37,10 +38,11 @@ Type
**QUESTION**: What behavior is expected from a purl spec implementation if a
``type`` contains a character like a slash '/' or a colon ':'?

The "Rules for each purl component" section provides that
**ANSWER**: The "Rules for each purl component" section provides that the
package ``type``

[t]he package ``type`` MUST be composed only of ASCII letters and numbers,
'.', '+' and '-' (period, plus, and dash)
MUST be composed only of ASCII letters and numbers, period '.', plus '+',
and dash '-'.

As a result, a purl spec implementation must return an error when encountering
a ``type`` that contains a prohibited character.