NixOS · Ericson2314 · Jun 3, 2024 · May 31, 2024 · May 31, 2024 · May 31, 2024
@@ -0,0 +1,11 @@
+---
+synopsis: Store object info JSON format now uses `null` rather than omitting fields.
+prs: 9995
+---
+
+The [store object info JSON format](@docroot@/protocols/json/store-object-info.md), used for e.g. `nix path-info`, no longer omits fields to indicate absent information, but instead includes the fields with a `null` value.
+For example, `"ca": null` is used to to indicate a store object that isn't content-addressed rather than omitting the `ca` field entirely.
+This makes records of this sort more self-describing, and easier to consume programmatically.
+
+We will follow this design principle going forward;
+the [JSON guidelines](@docroot@/contributing/json-guideline.md) in the contributing section have been updated accordingly.
@@ -121,6 +121,7 @@
   - [Documentation](contributing/documentation.md)
   - [Experimental Features](contributing/experimental-features.md)
   - [CLI guideline](contributing/cli-guideline.md)
+  - [JSON guideline](contributing/json-guideline.md)
   - [C++ style guide](contributing/cxx.md)
 - [Releases](release-notes/index.md)
 {{#include ./SUMMARY-rl-next.md}}

@@ -389,88 +389,6 @@ colors, no emojis and using ASCII instead of Unicode symbols). The same should
 happen when TTY is not detected on STDERR. We should not display progress /
 status section, but only print warnings and errors.
 
-## Returning future proof JSON
-
-The schema of JSON output should allow for backwards compatible extension. This section explains how to achieve this.
-
-Two definitions are helpful here, because while JSON only defines one "key-value"
-object type, we use it to cover two use cases:
-
- - **dictionary**: a map from names to value that all have the same type. In
-   C++ this would be a `std::map` with string keys.
- - **record**: a fixed set of attributes each with their own type. In C++, this
-   would be represented by a `struct`.
-
-It is best not to mix these use cases, as that may lead to incompatibilities when the schema changes. For example, adding a record field to a dictionary breaks consumers that assume all JSON object fields to have the same meaning and type.
-
-This leads to the following guidelines:
-
- - The top-level (root) value must be a record.
-
-   Otherwise, one can not change the structure of a command's output.
-
- - The value of a dictionary item must be a record.
-
-   Otherwise, the item type can not be extended.
-
- - List items should be records.
-
-   Otherwise, one can not change the structure of the list items.
-
-   If the order of the items does not matter, and each item has a unique key that is a string, consider representing the list as a dictionary instead. If the order of the items needs to be preserved, return a list of records.
-
- - Streaming JSON should return records.
-
-   An example of a streaming JSON format is [JSON lines](https://jsonlines.org/), where each line represents a JSON value. These JSON values can be considered top-level values or list items, and they must be records.
-
-### Examples
-
-
-This is bad, because all keys must be assumed to be store types:
-
-```json
-{
-  "local": { ... },
-  "remote": { ... },
-  "http": { ... }
-}
-```
-
-This is good, because the it is extensible at the root, and is somewhat self-documenting:
-
-```json
-{
-  "storeTypes": { "local": { ... }, ... },
-  "pluginSupport": true
-}
-```
-
-While the dictionary of store types seems like a very complete response at first, a use case may arise that warrants returning additional information.
-For example, the presence of plugin support may be crucial information for a client to proceed when their desired store type is missing.
-
-
-
-The following representation is bad because it is not extensible:
-
-```json
-{ "outputs": [ "out" "bin" ] }
-```
-
-However, simply converting everything to records is not enough, because the order of outputs must be preserved:
-
-```json
-{ "outputs": { "bin": {}, "out": {} } }
-```
-
-The first item is the default output. Deriving this information from the outputs ordering is not great, but this is how Nix currently happens to work.
-While it is possible for a JSON parser to preserve the order of fields, we can not rely on this capability to be present in all JSON libraries.
-
-This representation is extensible and preserves the ordering:
-
-```json
-{ "outputs": [ { "outputName": "out" }, { "outputName": "bin" } ] }
-```
-
 ## Dialog with the user
 
 CLIs don't always make it clear when an action has taken place. For every

@@ -0,0 +1,128 @@
+# JSON guideline
+
+Nix consumes and produces JSON in a variety of contexts.
+These guidelines ensure consistent practices for all our JSON interfaces, for ease of use, and so that experience in one part carries over to another.
+
+## Extensibility
+
+The schema of JSON input and output should allow for backwards compatible extension.
+This section explains how to achieve this.
+
+Two definitions are helpful here, because while JSON only defines one "key-value" object type, we use it to cover two use cases:
+
+ - **dictionary**: a map from names to value that all have the same type.
+   In C++ this would be a `std::map` with string keys.
+
+ - **record**: a fixed set of attributes each with their own type.
+   In C++, this would be represented by a `struct`.
+
+It is best not to mix these use cases, as that may lead to incompatibilities when the schema changes.
+For example, adding a record field to a dictionary breaks consumers that assume all JSON object fields to have the same meaning and type, and dictionary items with a colliding name can not be represented anymore.
+
+This leads to the following guidelines:
+
+ - The top-level (root) value must be a record.
+
+   Otherwise, one can not change the structure of a command's output.
+
+ - The value of a dictionary item must be a record.
+
+   Otherwise, the item type can not be extended.
+
+ - List items should be records.
+
+   Otherwise, one can not change the structure of the list items.
+
+   If the order of the items does not matter, and each item has a unique key that is a string, consider representing the list as a dictionary instead.
+   If the order of the items needs to be preserved, return a list of records.
+
+ - Streaming JSON should return records.
+
+   An example of a streaming JSON format is [JSON lines](https://jsonlines.org/), where each line represents a JSON value.
+   These JSON values can be considered top-level values or list items, and they must be records.
+
+### Examples
+
+This is bad, because all keys must be assumed to be store types:
+
+```json
+{
+  "local": { ... },
+  "remote": { ... },
+  "http": { ... }
+}
+```
+
+This is good, because the it is extensible at the root, and is somewhat self-documenting:
+
+```json
+{
+  "storeTypes": { "local": { ... }, ... },
+  "pluginSupport": true
+}
+```
+
+While the dictionary of store types seems like a very complete response at first, a use case may arise that warrants returning additional information.
+For example, the presence of plugin support may be crucial information for a client to proceed when their desired store type is missing.
+
+
+
+The following representation is bad because it is not extensible:
+
+```json
+{ "outputs": [ "out" "bin" ] }
+```
+
+However, simply converting everything to records is not enough, because the order of outputs must be preserved:
+
+```json
+{ "outputs": { "bin": {}, "out": {} } }
+```
+
+The first item is the default output. Deriving this information from the outputs ordering is not great, but this is how Nix currently happens to work.
+While it is possible for a JSON parser to preserve the order of fields, we can not rely on this capability to be present in all JSON libraries.
+
+This representation is extensible and preserves the ordering:
+
+```json
+{ "outputs": [ { "outputName": "out" }, { "outputName": "bin" } ] }
+```
+
+## Self-describing values
+
+As described in the previous section, it's crucial that schemas can be extended with with new fields without breaking compatibility.
+However, that should *not* mean we use the presence/absence of fields to indicate optional information *within* a version of the schema.
+Instead, always include the field, and use `null` to indicate the "nothing" case.
+
+### Examples
+
+Here are two JSON objects:
+
+```json
+{
+  "foo": {}
+}
+```
+```json
+{
+  "foo": {},
+  "bar": {}
+}
+```
+
+Since they differ in which fields they contain, they should *not* both be valid values of the same schema.
+At most, they can match two different schemas where the second (with `foo` and `bar`) is considered a newer version of the first (with just `foo`).
+Within each version, all fields are mandatory (always `foo`, and always `foo` and `bar`).
+Only *between* each version, `bar` gets added as a new mandatory field.
+
+Here are another two JSON objects:
+
+```json
+{ "foo": null }
+```
+```json
+{ "foo": { "bar": 1 } }
+```
+
+Since they both contain a `foo` field, they could be valid values of the same schema.
+The schema would have `foo` has an optional field, which is either `null` or an object where `bar` is an integer.
@@ -12,7 +12,7 @@
   For how Nix uses content addresses, see:
 
     - [Content-Addressing File System Objects](@docroot@/store/file-system-object/content-address.md)
-    - [content-addressed store object](#gloss-content-addressed-store-object)
+    - [Content-Addressing Store Objects](@docroot@/store/store-object/content-address.md)
     - [content-addressed derivation](#gloss-content-addressed-derivation)
 
   Software Heritage's writing on [*Intrinsic and Extrinsic identifiers*](https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers) is also a good introduction to the value of content-addressing over other referencing schemes.
@@ -137,9 +137,12 @@
 
 - [content-addressed store object]{#gloss-content-addressed-store-object}
 
-  A [store object] whose [store path] is determined by its contents.
+  A [store object] which is [content-addressed](#gloss-content-address),
+  i.e. whose [store path] is determined by its contents.
   This includes derivations, the outputs of [content-addressed derivations](#gloss-content-addressed-derivation), and the outputs of [fixed-output derivations](#gloss-fixed-output-derivation).
 
+  See [Content-Addressing Store Objects](@docroot@/store/store-object/content-address.md) for details.
+
 - [substitute]{#gloss-substitute}
 
   A substitute is a command invocation stored in the [Nix database] that

@@ -24,9 +24,11 @@ Info about a [store object].
 
   An array of [store paths][store path], possibly including this one.
 
-* `ca` (optional):
+* `ca`:
 
-  Content address of this store object's file system object, used to compute its store path.
+  If the store object is [content-addressed],
+  this is the content address of this store object's file system object, used to compute its store path.
+  Otherwise (i.e. if it is [input-addressed]), this is `null`.
 
 [store path]: @docroot@/store/store-path.md
 [file system object]: @docroot@/store/file-system-object.md
@@ -37,28 +39,30 @@ Info about a [store object].
 These are not intrinsic properties of the store object.
 In other words, the same store object residing in different store could have different values for these properties.
 
-* `deriver` (optional):
+* `deriver`:
 
-  The path to the [derivation] from which this store object is produced.
+  If known, the path to the [derivation] from which this store object was produced.
+  Otherwise `null`.
 
   [derivation]: @docroot@/glossary.md#gloss-store-derivation
 
 * `registrationTime` (optional):
 
-  When this derivation was added to the store.
+  If known, when this derivation was added to the store.
+  Otherwise `null`.
 
-* `ultimate` (optional):
+* `ultimate`:
 
   Whether this store object is trusted because we built it ourselves, rather than substituted a build product from elsewhere.
 
-* `signatures` (optional):
+* `signatures`:
 
   Signatures claiming that this store object is what it claims to be.
   Not relevant for [content-addressed] store objects,
   but useful for [input-addressed] store objects.
 
-  [content-addressed]: @docroot@/glossary.md#gloss-content-addressed-store-object
-  [input-addressed]: @docroot@/glossary.md#gloss-input-addressed-store-object
+[content-addressed]: @docroot@/store/store-object/content-address.md
+[input-addressed]: @docroot@/glossary.md#gloss-input-addressed-store-object
 
 ### `.narinfo` extra fields
 

@@ -135,18 +135,37 @@ static std::regex shVarName("[A-Za-z_][A-Za-z0-9_]*");
 /**
  * Write a JSON representation of store object metadata, such as the
  * hash and the references.
+ *
+ * @note Do *not* use `ValidPathInfo::toJSON` because this function is
+ * subject to stronger stability requirements since it is used to
+ * prepare build environments. Perhaps someday we'll have a versionining
+ * mechanism to allow this to evolve again and get back in sync, but for
+ * now we must not change - not even extend - the behavior.
  */
 static nlohmann::json pathInfoToJSON(
     Store & store,
     const StorePathSet & storePaths)
 {
-    nlohmann::json::array_t jsonList = nlohmann::json::array();
+    using nlohmann::json;
+
+    nlohmann::json::array_t jsonList = json::array();
 
     for (auto & storePath : storePaths) {
         auto info = store.queryPathInfo(storePath);
 
-        auto & jsonPath = jsonList.emplace_back(
-            info->toJSON(store, false, HashFormat::Nix32));
+        auto & jsonPath = jsonList.emplace_back(json::object());
+
+        jsonPath["narHash"] = info->narHash.to_string(HashFormat::Nix32, true);
+        jsonPath["narSize"] = info->narSize;
+
+        {
+            auto & jsonRefs = jsonPath["references"] = json::array();
+            for (auto & ref : info->references)
+                jsonRefs.emplace_back(store.printStorePath(ref));
+        }
+
+        if (info->ca)
+            jsonPath["ca"] = renderContentAddress(info->ca);
 
         // Add the path to the object whose metadata we are including.
         jsonPath["path"] = store.printStorePath(storePath);