Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions doc/manual/rl-next/store-object-info.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
synopsis: Store object info JSON format now uses `null` rather than omitting fields.
prs: 9995
---

The [store object info JSON format](@docroot@/protocols/json/store-object-info.md), used for e.g. `nix path-info`, no longer omits fields to indicate absent information, but instead includes the fields with a `null` value.
For example, `"ca": null` is used to to indicate a store object that isn't content-addressed rather than omitting the `ca` field entirely.
This makes records of this sort more self-describing, and easier to consume programmatically.

We will follow this design principle going forward;
the [JSON guidelines](@docroot@/contributing/json-guideline.md) in the contributing section have been updated accordingly.
1 change: 1 addition & 0 deletions doc/manual/src/SUMMARY.md.in
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,7 @@
- [Documentation](contributing/documentation.md)
- [Experimental Features](contributing/experimental-features.md)
- [CLI guideline](contributing/cli-guideline.md)
- [JSON guideline](contributing/json-guideline.md)
- [C++ style guide](contributing/cxx.md)
- [Releases](release-notes/index.md)
{{#include ./SUMMARY-rl-next.md}}
Expand Down
82 changes: 0 additions & 82 deletions doc/manual/src/contributing/cli-guideline.md
Original file line number Diff line number Diff line change
Expand Up @@ -389,88 +389,6 @@ colors, no emojis and using ASCII instead of Unicode symbols). The same should
happen when TTY is not detected on STDERR. We should not display progress /
status section, but only print warnings and errors.

## Returning future proof JSON

The schema of JSON output should allow for backwards compatible extension. This section explains how to achieve this.

Two definitions are helpful here, because while JSON only defines one "key-value"
object type, we use it to cover two use cases:

- **dictionary**: a map from names to value that all have the same type. In
C++ this would be a `std::map` with string keys.
- **record**: a fixed set of attributes each with their own type. In C++, this
would be represented by a `struct`.

It is best not to mix these use cases, as that may lead to incompatibilities when the schema changes. For example, adding a record field to a dictionary breaks consumers that assume all JSON object fields to have the same meaning and type.

This leads to the following guidelines:

- The top-level (root) value must be a record.

Otherwise, one can not change the structure of a command's output.

- The value of a dictionary item must be a record.

Otherwise, the item type can not be extended.

- List items should be records.

Otherwise, one can not change the structure of the list items.

If the order of the items does not matter, and each item has a unique key that is a string, consider representing the list as a dictionary instead. If the order of the items needs to be preserved, return a list of records.

- Streaming JSON should return records.

An example of a streaming JSON format is [JSON lines](https://jsonlines.org/), where each line represents a JSON value. These JSON values can be considered top-level values or list items, and they must be records.

### Examples


This is bad, because all keys must be assumed to be store types:

```json
{
"local": { ... },
"remote": { ... },
"http": { ... }
}
```

This is good, because the it is extensible at the root, and is somewhat self-documenting:

```json
{
"storeTypes": { "local": { ... }, ... },
"pluginSupport": true
}
```

While the dictionary of store types seems like a very complete response at first, a use case may arise that warrants returning additional information.
For example, the presence of plugin support may be crucial information for a client to proceed when their desired store type is missing.



The following representation is bad because it is not extensible:

```json
{ "outputs": [ "out" "bin" ] }
```

However, simply converting everything to records is not enough, because the order of outputs must be preserved:

```json
{ "outputs": { "bin": {}, "out": {} } }
```

The first item is the default output. Deriving this information from the outputs ordering is not great, but this is how Nix currently happens to work.
While it is possible for a JSON parser to preserve the order of fields, we can not rely on this capability to be present in all JSON libraries.

This representation is extensible and preserves the ordering:

```json
{ "outputs": [ { "outputName": "out" }, { "outputName": "bin" } ] }
```

## Dialog with the user

CLIs don't always make it clear when an action has taken place. For every
Expand Down
128 changes: 128 additions & 0 deletions doc/manual/src/contributing/json-guideline.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts: we should recommend that consumers ignore any fields they don't know.
I also like the GraphQL-like idea of specifying which attrs to expect, so that the consumer can safely use the whole value without anything unexpected occurring in it.
However, this seems less relevant (and unlikely to see much adoption) in the CLI; rather applies more to a fetchTree meta return attr (#8940 (comment)), where use of the whole object is actually relevant for reproducibility. Or as a solution for in-sandbox path info JSONs that isn't quite as capable as versioning, but does prevent over-fetching.

Original file line number Diff line number Diff line change
@@ -0,0 +1,128 @@
# JSON guideline

Nix consumes and produces JSON in a variety of contexts.
These guidelines ensure consistent practices for all our JSON interfaces, for ease of use, and so that experience in one part carries over to another.

## Extensibility

The schema of JSON input and output should allow for backwards compatible extension.
This section explains how to achieve this.

Two definitions are helpful here, because while JSON only defines one "key-value" object type, we use it to cover two use cases:

- **dictionary**: a map from names to value that all have the same type.
In C++ this would be a `std::map` with string keys.

- **record**: a fixed set of attributes each with their own type.
In C++, this would be represented by a `struct`.

It is best not to mix these use cases, as that may lead to incompatibilities when the schema changes.
For example, adding a record field to a dictionary breaks consumers that assume all JSON object fields to have the same meaning and type, and dictionary items with a colliding name can not be represented anymore.

This leads to the following guidelines:

- The top-level (root) value must be a record.

Otherwise, one can not change the structure of a command's output.

- The value of a dictionary item must be a record.

Otherwise, the item type can not be extended.

- List items should be records.

Otherwise, one can not change the structure of the list items.

If the order of the items does not matter, and each item has a unique key that is a string, consider representing the list as a dictionary instead.
If the order of the items needs to be preserved, return a list of records.

- Streaming JSON should return records.

An example of a streaming JSON format is [JSON lines](https://jsonlines.org/), where each line represents a JSON value.
These JSON values can be considered top-level values or list items, and they must be records.

### Examples

This is bad, because all keys must be assumed to be store types:

```json
{
"local": { ... },
"remote": { ... },
"http": { ... }
}
```

This is good, because the it is extensible at the root, and is somewhat self-documenting:

```json
{
"storeTypes": { "local": { ... }, ... },
"pluginSupport": true
}
```

While the dictionary of store types seems like a very complete response at first, a use case may arise that warrants returning additional information.
For example, the presence of plugin support may be crucial information for a client to proceed when their desired store type is missing.



The following representation is bad because it is not extensible:

```json
{ "outputs": [ "out" "bin" ] }
```

However, simply converting everything to records is not enough, because the order of outputs must be preserved:

```json
{ "outputs": { "bin": {}, "out": {} } }
```

The first item is the default output. Deriving this information from the outputs ordering is not great, but this is how Nix currently happens to work.
While it is possible for a JSON parser to preserve the order of fields, we can not rely on this capability to be present in all JSON libraries.

This representation is extensible and preserves the ordering:

```json
{ "outputs": [ { "outputName": "out" }, { "outputName": "bin" } ] }
```

## Self-describing values

As described in the previous section, it's crucial that schemas can be extended with with new fields without breaking compatibility.
However, that should *not* mean we use the presence/absence of fields to indicate optional information *within* a version of the schema.
Instead, always include the field, and use `null` to indicate the "nothing" case.

### Examples

Here are two JSON objects:

```json
{
"foo": {}
}
```
```json
{
"foo": {},
"bar": {}
}
```

Since they differ in which fields they contain, they should *not* both be valid values of the same schema.
At most, they can match two different schemas where the second (with `foo` and `bar`) is considered a newer version of the first (with just `foo`).
Within each version, all fields are mandatory (always `foo`, and always `foo` and `bar`).
Only *between* each version, `bar` gets added as a new mandatory field.

Here are another two JSON objects:

```json
{ "foo": null }
```
```json
{ "foo": { "bar": 1 } }
```

Since they both contain a `foo` field, they could be valid values of the same schema.
The schema would have `foo` has an optional field, which is either `null` or an object where `bar` is an integer.
7 changes: 5 additions & 2 deletions doc/manual/src/glossary.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
For how Nix uses content addresses, see:

- [Content-Addressing File System Objects](@docroot@/store/file-system-object/content-address.md)
- [content-addressed store object](#gloss-content-addressed-store-object)
- [Content-Addressing Store Objects](@docroot@/store/store-object/content-address.md)
- [content-addressed derivation](#gloss-content-addressed-derivation)

Software Heritage's writing on [*Intrinsic and Extrinsic identifiers*](https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers) is also a good introduction to the value of content-addressing over other referencing schemes.
Expand Down Expand Up @@ -137,9 +137,12 @@

- [content-addressed store object]{#gloss-content-addressed-store-object}

A [store object] whose [store path] is determined by its contents.
A [store object] which is [content-addressed](#gloss-content-address),
i.e. whose [store path] is determined by its contents.
This includes derivations, the outputs of [content-addressed derivations](#gloss-content-addressed-derivation), and the outputs of [fixed-output derivations](#gloss-fixed-output-derivation).

See [Content-Addressing Store Objects](@docroot@/store/store-object/content-address.md) for details.

- [substitute]{#gloss-substitute}

A substitute is a command invocation stored in the [Nix database] that
Expand Down
22 changes: 13 additions & 9 deletions doc/manual/src/protocols/json/store-object-info.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,9 +24,11 @@ Info about a [store object].

An array of [store paths][store path], possibly including this one.

* `ca` (optional):
* `ca`:

Content address of this store object's file system object, used to compute its store path.
If the store object is [content-addressed],
this is the content address of this store object's file system object, used to compute its store path.
Otherwise (i.e. if it is [input-addressed]), this is `null`.

[store path]: @docroot@/store/store-path.md
[file system object]: @docroot@/store/file-system-object.md
Expand All @@ -37,28 +39,30 @@ Info about a [store object].
These are not intrinsic properties of the store object.
In other words, the same store object residing in different store could have different values for these properties.

* `deriver` (optional):
* `deriver`:

The path to the [derivation] from which this store object is produced.
If known, the path to the [derivation] from which this store object was produced.
Otherwise `null`.

[derivation]: @docroot@/glossary.md#gloss-store-derivation

* `registrationTime` (optional):

When this derivation was added to the store.
If known, when this derivation was added to the store.
Otherwise `null`.

* `ultimate` (optional):
* `ultimate`:

Whether this store object is trusted because we built it ourselves, rather than substituted a build product from elsewhere.

* `signatures` (optional):
* `signatures`:

Signatures claiming that this store object is what it claims to be.
Not relevant for [content-addressed] store objects,
but useful for [input-addressed] store objects.

[content-addressed]: @docroot@/glossary.md#gloss-content-addressed-store-object
[input-addressed]: @docroot@/glossary.md#gloss-input-addressed-store-object
[content-addressed]: @docroot@/store/store-object/content-address.md
[input-addressed]: @docroot@/glossary.md#gloss-input-addressed-store-object

### `.narinfo` extra fields

Expand Down
25 changes: 22 additions & 3 deletions src/libstore/parsed-derivations.cc
Original file line number Diff line number Diff line change
Expand Up @@ -135,18 +135,37 @@ static std::regex shVarName("[A-Za-z_][A-Za-z0-9_]*");
/**
* Write a JSON representation of store object metadata, such as the
* hash and the references.
*
* @note Do *not* use `ValidPathInfo::toJSON` because this function is
* subject to stronger stability requirements since it is used to
* prepare build environments. Perhaps someday we'll have a versionining
* mechanism to allow this to evolve again and get back in sync, but for
* now we must not change - not even extend - the behavior.
*/
static nlohmann::json pathInfoToJSON(
Store & store,
const StorePathSet & storePaths)
{
nlohmann::json::array_t jsonList = nlohmann::json::array();
using nlohmann::json;

nlohmann::json::array_t jsonList = json::array();

for (auto & storePath : storePaths) {
auto info = store.queryPathInfo(storePath);

auto & jsonPath = jsonList.emplace_back(
info->toJSON(store, false, HashFormat::Nix32));
auto & jsonPath = jsonList.emplace_back(json::object());

jsonPath["narHash"] = info->narHash.to_string(HashFormat::Nix32, true);
jsonPath["narSize"] = info->narSize;

{
auto & jsonRefs = jsonPath["references"] = json::array();
for (auto & ref : info->references)
jsonRefs.emplace_back(store.printStorePath(ref));
}

if (info->ca)
jsonPath["ca"] = renderContentAddress(info->ca);

// Add the path to the object whose metadata we are including.
jsonPath["path"] = store.printStorePath(storePath);
Expand Down
Loading