lib/types.nix: implement getSubOptions for either #422272

Open
lilyball wants to merge 7 commits into NixOS:master from lilyball:push-sxtxlvwouuno

@lilyball
Member

@lilyball lilyball commented Jul 4, 2025

This implements getSubOptions, getSubModules, and substSubModules for the either type. This allows us to generate documentation for a bunch of options that were getting missed before.
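To illustrate the gap being described, here is a minimal sketch. It assumes `<nixpkgs/lib>` resolves to a nixpkgs checkout, and the names `sub`, `wrapped`, and `someOpt` are hypothetical (they do not come from this PR). A submodule type exposes its options through getSubOptions; once it is wrapped in either, a checkout without this change is expected to fall back to the default getSubOptions, which returns an empty set.

```nix
let
  lib = import <nixpkgs/lib>;
  inherit (lib) types mkOption;

  # Hypothetical submodule type with a single documented option.
  sub = types.submodule {
    options.bar = mkOption {
      type = types.str;
      description = "A string option inside the submodule.";
    };
  };

  # The same submodule, reachable only through `either`.
  wrapped = types.either types.str sub;
in
{
  # Asking the submodule directly: its `bar` option is visible.
  direct = sub.getSubOptions [ "someOpt" ] ? bar;

  # Asking through `either`: whether `bar` is visible depends on
  # whether `either` implements getSubOptions.
  viaEither = wrapped.getSubOptions [ "someOpt" ] ? bar;
}
```

Evaluated with `nix-instantiate --eval --strict`, `direct` should be true. `viaEither` is expected to be false on a checkout that predates this PR and true with it, since the submodule is a direct child of the either.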

Because either is the key to making recursive types, I had to be careful about the implementation here to avoid breaking on a type like (pkgs.formats.json {}).type. To that end, this implementation only looks at child types if they're submodule or another either. This does mean types like either str (listOf submodule) will still be missed, but I'm not sure how else to handle this besides recursively searching all nested types to see if any are equal (==) to the current type, and I don't like that solution. I also considered simply stopping on attrsOf and listOf and allowing other children (this would allow e.g. either str (nullOr submodule)) but I thought that was sufficiently fragile. And I considered looking for any recursively-defined type in nixpkgs and overriding getSubOptions/getSubModules explicitly, but that would mean types defined outside of nixpkgs could still be broken by this.

For the case of either submodule submodule I decided that getSubOptions should merge all returned options, but getSubModules and substSubModules should ensure only one child actually has submodules. I don't expect this to matter because a type like that isn't really going to work; this logic makes more sense for something like either submodule (attrsOf submodule), except for the fact that we won't recurse into the attrsOf (to avoid problems on recursive types).

With this change, I found a number of options whose documentation was missing or broken. Fixes for each of those modules are included as separate commits. Most of them were pretty straightforward, but the tor module had a lot of generated options and it wasn't always clear what the best way to handle it was.

I tested this with nix-build nixos/release.nix -A manual.aarch64-linux; I'm not sure if there's any other documentation set that I should have tested too. I also didn't touch the lib tests. I'm not sure if this change should have an explicit test written for it, but I did make sure lib/tests/modules.sh passes. I'm also not sure if this change warrants a release note.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • Nixpkgs 25.11 Release Notes (or backporting 25.05 Nixpkgs Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
  • NixOS 25.11 Release Notes (or backporting 25.05 NixOS Release notes)
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other contributing documentation in corresponding paths.

Add a 👍 reaction to pull requests you find important.

Member Author

I'm not entirely sure about this one, from my reading of the module it looks like flags isn't supposed to be set by users directly, it's calculated from the per-flag attributes.

@nix-owners nix-owners bot requested review from hsjobeki, infinisil, peti and roberth July 4, 2025 03:56
@nixpkgs-ci nixpkgs-ci bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 6.topic: module system About "NixOS" module system internals 6.topic: lib The Nixpkgs function library 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. labels Jul 4, 2025
@lilyball
Member Author

lilyball commented Jul 7, 2025

A few thoughts on this PR:

  • I could split this up into several separate PRs, since the option documentation fixes don't depend on the either change (they just don't affect anything without it)
  • Because this change really only affects documentation (and nixos-option) maybe it is safe enough to go ahead and do the naive recursion and just fix nixpkgs's recursive types, since any third parties that define their own recursive types will only be affected if they also build documentation (though they would still hit the issue if they use nixos-option to inspect any option using their recursive type). I haven't tested to see if this would pick up any more options though, so maybe I should find that out.
  • home-manager and nix-darwin might want to test their own documentation against this PR to see if they have any options that need documentation fixes.

@lilyball
Member Author

lilyball commented Jul 7, 2025

I tested out the "make either fully recursive" option and while it does add more options, they're entirely within services.tor, such as services.tor.settings.ControlPort.*.GroupWritable, and they're all ultimately of type oneOf [ ... (listOf (oneOf [ ... submodule ]))]. The downside is I had to update 12 recursive types in formats plus another 14 modules that declared their own custom recursive types. That's a lot more recursive types than I expected to find.

Alternatively this also suggests we could just update the 2 affected types in services.tor (by overriding getSubOptions/getSubModules/substSubModules to defer directly to the submodule) to get the same effect without making either fully recursive. There may be options in home-manager or nix-darwin that would also benefit from the fully-recursive either, but the fact that in all of nixpkgs it's just 2 of the types in services.tor that are affected suggests that a construct like either (listOf submodule) … (where the submodule isn't also at the top level of the either) is sufficiently rare that it may not show up anywhere else at all.

Edit: actually, the fully recursive change hurts some of the tor documentation because it modifies services.tor.settings.DNSPort sub-options so that e.g. services.tor.settings.DNSPort.IsolateClientAddr is now services.tor.settings.DNSPort.*.IsolateClientAddr, which happens because the either submodule (listOf submodule) now overrides the left's options with the right. So if we do want fully recursive we need to be smarter about merging options from both sides, though also this suggests that my getSubOptions impl might want to do t2.getSubOptions prefix // t1.getSubOptions prefix just to prioritize options from the left side.

Also we can't just trivially update those two types in services.tor as an alternative, because of how substSubModules works it's a lot more complicated to modify those types to defer to the submodule.

@lilyball
Member Author

lilyball commented Jul 7, 2025

Another thing to consider here is either module (listOf module) such as services.tor.settings.TransPort, which really should generate option documentation for both the direct sub-options and the listOf sub-options, e.g. both services.tor.settings.TransPort.IsolateClientAddr and services.tor.settings.TransPort.*.IsolateClientAddr. But right now this can't happen because even though listOf adds "*" to the prefix, this isn't reflected in the returned attrset, so there's no way to merge the sub-options from the module and the listOf module, unless I want to change either to return something like { left = t1.getSubOptions prefix; right = t2.getSubOptions prefix; } and that has negative consequences for nixos-option.

With that context, we could consider changing listOf and attrsOf to put the placeholder into the returned attrset, e.g. having listOf return { "*" = elemType.getSubOptions (prefix ++ [ "*" ]); }. That way this can be merged with the non-list version of the same module. The downside is nixos-option would have to be updated to expect this.

@lilyball lilyball force-pushed the push-sxtxlvwouuno branch from 6fe4329 to 21241df Compare July 7, 2025 11:04
@lilyball
Member Author

lilyball commented Jul 7, 2025

The changes to either I just pushed are:

  • getSubOptions now uses lib.recursiveUpdateUntil so either moduleA moduleB where the two submodules define distinct options nested within the same key will work, e.g. either (submodule [{options.foo.bar = mkOption { … }; }]) (submodule [{options.foo.qux = mkOption { … }; }])
  • getSubOptions now resolves conflicts in favor of the left type if both types define options
  • getSubModules now returns the left type's submodules if both types define submodules (this affects services.postgrey)
  • substSubModules similarly now substitutes the left type's submodules if both types define submodules
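The merge behavior described in the first two bullets can be sketched as follows. This is a minimal illustration, assuming `<nixpkgs/lib>` resolves to a nixpkgs checkout; `opt`, `left`, and `right` are hypothetical stand-ins for what the two child types' getSubOptions might return, and the predicate mirrors the one quoted from the PR later in this thread.

```nix
let
  lib = import <nixpkgs/lib>;

  # Hypothetical helper: a minimal option declaration carrying only a description.
  opt = description: lib.mkOption { inherit description; };

  # Stand-ins for the sub-options of the left and right child types.
  left = {
    foo.bar = opt "left foo.bar";
    conflict = opt "left conflict";
  };
  right = {
    foo.qux = opt "right foo.qux";
    conflict = opt "right conflict";
  };

  # Stop descending as soon as either side is an option. At that point
  # the last attrset argument wins, so `left` is passed last.
  merged = lib.recursiveUpdateUntil (
    _: a: b:
    lib.isOption a || lib.isOption b
  ) right left;
in
{
  fooKeys = builtins.attrNames merged.foo; # [ "bar" "qux" ]
  conflict = merged.conflict.description; # "left conflict"
}
```

Distinct options nested under the same key (`foo`) are combined, while a genuine conflict is resolved in favor of the left side because it is the final argument.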

@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 26, 2025
@lilyball
Member Author

Rebased to fix merge conflict (conflict was due to the tor module being formatted)

@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Aug 11, 2025
Contributor

@hsjobeki hsjobeki left a comment

Could you split the changes in types.nix and the non-lib-related things into separate PRs? I.e. the fixes to individual modules should be separated out. It is common practice for us to avoid mixing lib changes with nixos fixes. I'll spend a bit more time looking at this over the next few days. I fear that the changes in nixos might be due to a breaking change in the either type, whose behavior we must preserve for downstream users. Moving the lib change out of this PR would make this clearer to us.

@roberth
Member

roberth commented Aug 19, 2025

Mixing isn't so bad by itself.
The reason we usually ask this is that lib changes tend to require very thorough scrutiny, which shouldn't block other improvements even if they're somewhat related, which seems to be the case here.

@lilyball
Member Author

I certainly can split this up, though the non-lib changes here are all fairly meaningless without the lib changes, as that documentation simply isn't rendered without the lib changes. So there's no problem with "blocking" those changes on the lib changes.

If I split this up, would it be better to do one PR for all the non-lib stuff, or a separate PR for each module (e.g. turning each commit into its own PR)?

@lilyball
Member Author

lilyball commented Sep 6, 2025

I fear that the changes in nixos might be due to a breaking change in the either type.

To clarify on this point, the changes in nixos are to fix code that is currently broken but that brokenness wasn't noticed because the options aren't rendered into documentation until the lib change. Some of it is documentation where the nix code is simply wrong (e.g. the dconf module said example = literalExpression ''…'' without importing that, so the fix is to say example = lib.literalExpression ''…''), some of it fixes the documentation to conform to the restrictions in place when building the manual (pay-respects used a deprecated function and had a default key that referenced something from another module without using defaultText, rspamd also did the latter), and some was just options that were missing descriptions (movim and tor).

All of this stuff could be detected by using the repl to poke around at the module structure, but it went unnoticed because none of these options ended up in the documentation manual before. So technically this either change could be a breaking change just by exposing buggy code to documentation rendering, but getting those options in the rendered documentation is the whole point of this change.

Contributor

@hsjobeki hsjobeki left a comment

After looking into this for a short while, I am not sure if implementing getSubOptions for either is the right approach to solve the documentation problem. getSubOptions is only meant for documentation, to automatically traverse options and types via `optionAttrSetToDocList`. With either (and derived types such as oneOf) it is possible to create recursive types.

In the case of those recursive types there is no formally correct way to stop / break at a certain point. That's why we didn't implement getSubOptions historically, I think.

Even before this PR either was already broken on recursive types - take this simple recursive type for example:

simpleJson = types.oneOf [  # renders into a tree of either
  types.str
  (types.attrsOf simpleJson)
  (types.listOf simpleJson)
];

# ... a module
{
  options.foo = mkOption {
    type = simpleJson;
  };
}

This would fail to produce docs, because either has a recursive description, which would yield an infinitely long string in this case.

The actual JSON type solves this by overriding the description (this is also not ideal):

          valueType =
            nullOr (oneOf [
              #...
            ])
            // {
              description = "JSON value";
            };
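As a rough, self-contained illustration of the override pattern above, here is a sketch that applies it to the simpleJson example. It assumes `<nixpkgs/lib>` resolves to a nixpkgs checkout; the option name `foo`, its description, and the assigned value are hypothetical.

```nix
let
  lib = import <nixpkgs/lib>;
  inherit (lib) types mkOption;

  # Recursive type. The description override keeps the auto-generated
  # description of the nested `either` types from expanding forever.
  simpleJson =
    types.oneOf [
      types.str
      (types.attrsOf simpleJson)
      (types.listOf simpleJson)
    ]
    // {
      description = "simple JSON value";
    };

  eval = lib.evalModules {
    modules = [
      {
        options.foo = mkOption {
          type = simpleJson;
          description = "Hypothetical option using the recursive type.";
        };
        # A finite value: an attrset containing a list of strings.
        config.foo = {
          a = [ "b" ];
        };
      }
    ];
  };
in
eval.config.foo
```

Type checking only recurses as deep as the value itself, so a finite value like this one should evaluate without looping even though the type is defined in terms of itself.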

We could maybe think about other solutions such as providing documentation hints, instead of trying to smartly recurse an infinite structure.

Some ideas how to solve this problem: Add something like a docsModel / docsType which requires to be finite and is provided through the interface while creating the type?

@lilyball
Member Author

@hsjobeki

Even before this PR either was already broken on recursive types

For the description, yeah, which is why any recursive types have to override that. nixpkgs already has a bunch of definitions of recursive types and they all override the description. I don't really see any alternative to this for recursive types, you must provide a description no matter how you construct it because any automatically-constructed description will be infinite.

And I'm not sure why you said "Even before this PR", as this PR doesn't introduce any new brokenness. It restricts the getSubOptions implementation to recurse only when it's provably safe to do so; any infinite type that's actually usable in practice requires some sort of container type, such as a listOf or attrsOf, and so we can have either recurse as long as it stops on one of those. But since we can't actually say "stop on those two", we can only control the next step after the either, that's why I went with the current implementation of recursing only into another either (which means we're still controlling the recursion) or into submodule. And you cannot construct a recursive type with just either and submodule without also requiring an infinitely-sized value (and recursing infinitely to check that value's type), so that's fine.

Some ideas how to solve this problem: Add something like a docsModel / docsType which requires to be finite and is provided through the interface while creating the type?

Can you please explain what you mean by this? Please bear in mind the actual thing this PR aims to solve is submodules that are currently being omitted from documentation because they're part of finite types that use either. I did consider the idea that either should just unconditionally recurse and that any recursively-defined types need to break that recursion themselves, and it turns out that within nixpkgs alone there are 26 such types that all need to do something like

       type =
         let
           valueType =
             nullOr (oneOf [
               bool
               int
               float
               str
               path
               (attrsOf valueType)
               (listOf valueType)
             ])
             // {
               description = "JSON value";
+              getSubOptions = prefix: {};
+              getSubModules = null;
+              substSubModules = m: null;
             };
         in
         valueType;

That's a lot of recursive types to update just to make either recursive, plus any types defined outside of nixpkgs would break too.

The other option here is to say that any type that uses either and has submodules needs to instead use some other variant, some either' that does recurse, but that approach means we'll continue to have a lot of undocumented options as people just don't realize their submodules aren't getting documented properly.

I have considered introducing an unconditionally-recursive either' type just to fix the remaining options in the tor module that are still missing with this PR, but I haven't actually proposed that because I feel like if the tor module is the only module that actually has this problem, perhaps the tor module needs to have its own scoped solution (e.g. having the tor module define such an either' type itself).

@hsjobeki
Contributor

hsjobeki commented Oct 20, 2025

@hsjobeki

Even before this PR either was already broken on recursive types

And I'm not sure why you said "Even before this PR", as this PR doesn't introduce any new brokenness.

Hm, I think I phrased this wrong. Given that either, as you phrased it, is key to recursive types for finite values, the type itself might be infinite; we cannot really make good assumptions (unless, and please correct me if I am wrong here ;) ) about an abort criterion for when to terminate a potential tree walk. This is what I referred to as broken behavior, and it might be the reason nobody has implemented getSubOptions for either yet.

For documentation generation in nixpkgs, optionAttrSetToDocList is used, which internally traverses options via getSubOptions.
I didn't follow closely why we need to implement substSubModules and getSubModules (I guess this might be a good idea potentially).
It seems we lack sufficient testing of the either type :/ because that would change behavior.
Have you looked at mergeOptionDecls and fixupOptionType in this context? I struggle to work out in my head how these three changes affect each other.

I'll try to start with some reasoning on getSubOptions:

Given the traversal function:

          getSubOptions =
            prefix:
            lib.recursiveUpdateUntil
              (
                _: a: b:
                lib.isOption a || lib.isOption b
              )
              (optionalAttrs (isEitherOrSubmodule t2) (t2.getSubOptions prefix))
              (optionalAttrs (isEitherOrSubmodule t1) (t1.getSubOptions prefix));

I have some questions:

  • How does it stop traversing and at what point?

For example

  moduleOrSelf = types.either (submodule {
    options.bar = mkOption {
      type = types.str;
      description = "A string option inside the submodule.";
    };
  }) (moduleOrSelf) // {
    description = "Either a submodule or an attribute set of that.";
  };

This would not fail to generate docs prior to this change.
This would make this PR a breaking change to lib - because we can't know what types are composed using either in the wider ecosystem. Does that sound reasonable? Or am I being overly cautious?

  • How does it handle either submoduleA submoduleB (with overlapping / non-overlapping options)
    (Yes, I saw your comment in the description; this might fall into the underspecified / untested area of the module system)

For example

  types.either (submodule {
    options.bar = mkOption {
      type = types.str;
      description = "A string option inside the submodule.";
    };
    options.conflict = mkOption {
      type = types.str;
      description = "A string option inside the submodule.";
    };
  }) (submodule {
    options.foo = mkOption {
      type = types.str;
      description = "A string option inside the submodule.";
    };
    options.conflict = mkOption {
      type = types.int;
      description = "An integer option inside the submodule.";
    };
  });

It would probably show slightly wrong docs (as I found from testing), which is maybe better than the current state of no docs.
So from that standpoint I tend to support this PR. We can try to push this idea forward if you add some more tests that ensure this is a non-breaking change.

On the other hand, merging options using recursiveUpdate could probably lead to surprises.

@hsjobeki
Contributor

What I meant by docsModel / docsType.

Something along these lines:

# New type, or other solution that allows for user defined and formally correct abort criteria
recursiveJson = types.namedUnion "json" [
  types.str
  types.number
  (types.attrsOf (types.typePlug recursiveJson))  # Explicit recursion marker; docs generation returns the name, instead of the type
  (types.listOf (types.typePlug recursiveJson))   # Explicit recursion marker
];

This is just an idea, and is probably the other extreme: forcing users to control their own recursive types because they cannot safely be traversed.

For this PR, I would expect some tests to be added to ensure we don't cause downstream breakages.
Before going ahead I'd still like to ask @infinisil or @roberth for their opinion on this PR, as they have been working with the module system much longer than I have.

@lilyball
Member Author

How does it stop traversing and at what point?

The logic here only controls one step of the traversal. So the answer is that it will stop traversing as soon as it hits a child of either that is not itself submodule or either.

In your example moduleOrSelf type, it's true that it wouldn't fail to generate docs prior to this PR, but the type itself is already broken, because what it will do is recurse infinitely the moment you assign something other than an attrset, function, or path to the option. If you have

{
    options.someOpt = mkOption {
        type = moduleOrSelf;
    };
    config.someOpt = [ "foo" ];
}

This will end up recursing infinitely as the check function for the first nested submodule type fails, and so it falls back to the second type, which recurses and attempts the submodule again, etc. So the fact that this change will cause documentation generation to fail on this type doesn't matter, since the type is already broken.

because we can't know what types are composed using either in the wider ecosystem

This argument here is precisely why I went with the limited approach of only recursing when it's provably safe to do so, instead of unconditionally recursing and simply fixing all 26 infinite types in nixpkgs. Limiting the recursion to nested either and submodule types is provably safe because you cannot build an infinite type out of just those two without already being broken. As I said before, any infinite type that's actually usable and correct requires some container type in the infinite recursion (e.g. a list or attrset).

How does it handle either submoduleA submoduleB (with overlapping / non-overlapping options)

I believe the answer is that the option definition from submoduleA will win, though it's been a while at this point since I worked through that logic. This PR uses recursiveUpdateUntil to merge the subOptions from both types and it orders the update such that the first type "wins" conflicts.

We can try to push this idea forward if you added some more tests, that ensure this is a non-breaking change.

What sort of tests are you envisioning?


Something I did talk about in an earlier comment is that if you have a type that looks like

types.either (submodule {
  options.bar = mkOption {
    type = types.str;
    description = "A string option in the submodule.";
  };
}) (listOf (submodule {
  options.bar = mkOption {
    type = types.str;
    description = "A string option in the listOf submodule.";
  };
}))

then we should be able to generate docs for both of these, but we'll actually only generate them for one of them, and that's because the listOf getSubOptions looks like

getSubOptions = prefix: elemType.getSubOptions (prefix ++ [ "*" ]);

If instead we do

listOf = {
  getSubOptions = prefix: { "*" = elemType.getSubOptions (prefix ++ [ "*" ]); };
};
attrsWith = {
  getSubOptions = prefix: { "<${placeholder}>" = elemType.getSubOptions (prefix ++ [ "<${placeholder}>" ]); };
};

Then doing either submoduleFoo (listOf submoduleFoo) or either submoduleFoo (attrsOf submoduleFoo) will generate option documentations for both versions. I haven't made that change in this PR largely because we'd have to update nixos-option to understand this as well, and it might impact any third-party implementations of something like nixos-option (e.g. I actually have my own nixos-option implementation I wrote years ago and it would be affected by the change I'm proposing). So I'm inclined to make that change as a separate PR after we're done dealing with this one.
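To make the effect of that proposed shape concrete, here is a small sketch. It assumes `<nixpkgs/lib>` resolves to a nixpkgs checkout and does not change listOf itself: `direct` and `viaList` are hypothetical stand-ins for what the submodule and the proposed listOf getSubOptions would return, merged with the same predicate this PR uses.

```nix
let
  lib = import <nixpkgs/lib>;

  # Hypothetical helper: a string option with the given description.
  opt =
    description:
    lib.mkOption {
      type = lib.types.str;
      inherit description;
    };

  # Stand-in for the sub-options of the bare submodule.
  direct = {
    bar = opt "A string option in the submodule.";
  };

  # Stand-in for the proposed listOf result, keyed by the "*" placeholder.
  viaList = {
    "*" = {
      bar = opt "A string option in the listOf submodule.";
    };
  };

  # Same merge as in this PR: stop at options, last argument wins.
  merged = lib.recursiveUpdateUntil (
    _: a: b:
    lib.isOption a || lib.isOption b
  ) viaList direct;
in
{
  topLevel = builtins.attrNames merged; # [ "*" "bar" ]
  nested = builtins.attrNames merged."*"; # [ "bar" ]
}
```

Because the list variant now lives under its own `"*"` key, the two sets of sub-options no longer collide, so both `bar` and `*.bar` survive the merge. With the current listOf, both sides would return `bar` at the top level and only one could be kept.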

@lilyball
Member Author

@hsjobeki

To make it non-breaking i would expect something like this to pass:

  recursivePayload =
    (types.either (submodule {
      options.child = mkOption {
        type = types.either recursivePayload (types.nullOr types.str);
        description = "A string option inside the submodule.";
      };
      options.payload = mkOption {
        type = types.either recursivePayload (types.nullOr types.str);
        description = "A string option inside the submodule.";
      };
    }) types.str)
    // {
      description = "recursivePayload";
    };

What would you actually expect for generated documentation for a type like this? Because it looks to me like generating documentation for this type should produce an infinite amount of docs, as it needs to document not just child, but child.child, and child.child.child, and child.child.child.child, and child.child.child.child.child, etc.

This type is not the sort of infinite type we've been talking about so far. This type is not going to hit a recursion loop when evaluating getSubOptions, but I would expect documentation generation to end up trying to walk the option tree forever. I don't think this sort of breakage is particularly useful to talk about, because a type like this is not of any practical use so I wouldn't expect to hit it, and because the way in which this is broken is still broken if we remove either and write this like

recursivePayload =
  types.submodule {
    options.child = mkOption {
      type = recursivePayload;
      description = "An option inside the submodule.";
    };
  };

This type here is like the previous one in that getSubOptions will evaluate just fine, but generating documentation for it will end up trying to generate infinitely nested children.

The closest thing I've seen to this in practice instead looks like

options.someOpt = mkOption {
  type = let
    submoduleType = submodule {
      options.child = mkOption {
        type = types.str;
        description = "A string option inside the submodule.";
      };
    };
  in types.either submoduleType (attrsOf submoduleType);
  description = "Option that's a submodule or attrsOf that submodule.";
};

And this type isn't recursive. Also, today this type won't generate any docs for child at all; with this PR it will generate docs for someOpt.child, and we're still missing someOpt.<name>.child (and this type is the reason why I'm tempted to add an either' that recurses unconditionally, so this type can use that and get docs for both someOpt.child and someOpt.<name>.child).

@MattSturgeon
Contributor

It is perfectly possible to create recursive types without either.

Sure, if you effectively reimplement either. Any recursive type that doesn't either use or reimplement either is rather useless and possibly broken. For example, let type = types.attrsOf type; is recursive without either, but it's pretty useless: the only values allowed are attrsets. Or let type = listOf type;, where all you can do is nest lists, without any other type. Perhaps you can do something gimmicky with that, but it's not a type that's going to see any practical use.

I can see how you'd think this when dealing with fairly primitive types.

With submodules you can have fairly useful recursive types, where a sub-option or freeformType re-uses an outer type recursively. E.g.: https://github.com/nix-community/nixvim/blob/ecb75f49d10fe2823b0822e4e95e53f80e426742/docs/modules/page.nix#L12-L23

I imagine attrTag and other non-trivial types may have useful cases for recursion.

What would you actually expect for generated documentation for a type like this? Because it looks to me like generating documentation for this type should produce an infinite amount of docs, as it needs to document not just child, but child.child, and child.child.child, and child.child.child.child, and child.child.child.child.child, etc.

Correct. That example would need the sub-options to be declared with visible = "shallow" to be useful.

This type is not the sort of infinite type we've been talking about so far. This type is not going to hit a recursion loop when evaluating getSubOptions, but I would expect documentation generation to end up trying to walk the option tree forever.

Given getSubOptions is primarily intended[1] as a way for documentation to get sub-options (or at least, a representation of the sub-options suitable for documentation), I'm not sure the distinction you're drawing is especially useful. The docs walking the option tree forever is the main infinite recursion scenario I'm concerned about.

The only reason to do this is to allow for unconditional recursion in this new type. [...] The name I'd use for this is either'.

Maybe we do need a new type to have general recursion support. Incidentally, nixvim has been using an overlay that replaces either with one that supports recursive sub-options.

I don't think either' is a good name, though. Perhaps eitherRec or recursiveEither better convey its purpose. Or an eitherWith {} that takes configuration settings and is used to construct the general-use either. We'd also want to consider how this affects oneOf.

Footnotes

  1. Note, for example, submodule sub-options are not necessarily the exact same options you see in a "real" configuration eval; they come from a limited eval that only includes modules embedded into the submodule type itself. Therefore, they exclude any modules added via option definitions. You'll notice this subtlety if you try to inspect the sub-options' definitions. They also have a potentially different prefix, producing a potentially different option loc.

@lilyball
Member Author

With submodules you can have fairly useful recursive types, where a sub-option or freeformType re-uses an outer type recursively. E.g.: https://github.com/nix-community/nixvim/blob/ecb75f49d10fe2823b0822e4e95e53f80e426742/docs/modules/page.nix#L12-L23

This is an interesting type. But it's not the type of recursive type I was talking about because it doesn't infinitely recurse in getSubOptions, it just has an infinite amount of options to document unless you break option collection. This type does disprove what I said in my other comment about the recursivePayload type though, since I claimed there that this sort of infinitely-nested-options isn't useful.

Given getSubOptions is primarily intended as a way for documentation to get sub-options (or at least, a representation of the sub-options suitable for documentation), I'm not sure the distinction you're drawing is especially useful. The docs walking the option tree forever is the main infinite recursion scenario I'm concerned about.

If you write something that produces an infinite tree of options, then it's not unreasonable to expect you to break that infinite series of options with something like visible = "shallow". Producing an infinite sea of options is not very common, but producing an infinite recursive type is something that keeps happening. My concern with this PR is being able to produce documentation for options that are not currently documented today, without causing infinite recursion on infinite types (types which, notably, don't actually contain any submodules).

It's certainly possible that someone has, out of tree, written something that would produce an infinite sea of options except it uses either, and so they didn't write visible = "shallow" since they weren't getting the options documented anyway, but I'd expect that to be rather rare if it exists at all, and such a type really should have visible = "shallow" anyway because without that, the code is expressing the intent that it document an infinite amount of options.

I don't think either' is a good name, though. Perhaps eitherRec or recursiveEither better convey its purpose.

What exactly do you think the purpose here is? Because the options that aren't being documented today, that this PR aims to produce documentation for, they aren't recursive. That's the thing, either is breaking documentation on finite types because it doesn't want to recurse infinitely in order to prove that an infinite type doesn't contain a submodule. This PR fixes most of those options, and the remaining ones are still not recursive, they just do something like either someSubmodule (attrsOf someSubmodule). The fact that these types are finite is exactly why I don't want to force users to call some eitherRec type to get them documented, why I want either to just work. And my suggested either' is just "we can't fix either enough to work for all finite options, so if you notice your option isn't getting documented, try using either' instead" (and within nixpkgs it's just the tor submodule that would use this). Ideally we'd flip this around and make either work for everything and say "use either' for infinite types that don't have submodules", but that would break all out-of-tree infinite types.

For the options that this PR has to fix because their documentation doesn't evaluate correctly, I'd wager the module authors didn't even realize those options weren't being documented. If we move ahead with a plan that requires users to switch to some alternative either variant for finite types to make documentation work, then we'll still have a bunch of undocumented options because people won't realize they need it.

@lilyball
Member Author

After thinking about it a bit more, I've realized that a recursive submodule that doesn't set visible = "shallow" but uses either in its definition is already just as broken as a submodule that has an option whose description throws an evaluation error but gets away with it because it's nested in an either. Both such types should be fixed, because the fact that they're currently undocumented is a limitation that we want to lift.

Which is to say, this type

childModule = types.submodule {
  options.foo = mkOption {
    type = types.str;
    description = "A string option";
  };
  options.child = mkOption {
    type = types.either types.str childModule;
    description = "A string or child submodule";
  };
};

is just as broken as this type:

someOption = mkOption {
  type = types.either types.str (types.submodule {
    options.foo = mkOption {
      type = types.str;
      description = functionThatDoesntExist "A string option";
    };
  });
  description = "A string or submodule";
};

Both of them get away with it today, both of them will have an issue with this PR, both of them are buggy and should be fixed regardless of this PR.
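For the first type, the fix being argued for is a one-line change. A minimal sketch, assuming the recursive option should stay in the manual while its nested options are not expanded; the surrounding module and the `tree` option name are hypothetical:

```nix
{ lib, ... }:
let
  inherit (lib) mkOption types;

  childModule = types.submodule {
    options.foo = mkOption {
      type = types.str;
      description = "A string option";
    };
    options.child = mkOption {
      type = types.either types.str childModule;
      description = "A string or child submodule";
      # Document `child` itself, but do not descend into its sub-options,
      # so option collection stops after one level.
      visible = "shallow";
    };
  };
in
{
  # Hypothetical top-level option that uses the recursive type.
  options.tree = mkOption {
    type = childModule;
    description = "Root of the recursive structure";
  };
}
```

With this in place, the manual should list `tree.foo` and `tree.child` and stop there instead of continuing to `tree.child.child` and beyond.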

@hsjobeki
Contributor

hsjobeki commented Oct 27, 2025

I don't think these two cases are equivalent:

  • functionThatDoesntExist
description = functionThatDoesntExist "...";

This is objectively broken code - a bug in the user's code.
The fact that this PR exposes users to their bug is due to laziness.

  • Recursive submodule
childModule = types.submodule {
  options.child = mkOption {
    type = types.either types.str childModule;
  };
};

valid, correct code

  • Type-checks all values correctly
  • Has a clear semantic meaning (recursive tree structure)
  • Works perfectly at runtime
  • Is (presumably) used in real-world configs

The only "problem" is that getSubOptions can't traverse it. But that's a limitation of the documentation system, not a bug in the type.

Proposed solutions:

For case 2, we have several options as we pointed out earlier:

  1. Add cycle/depth tracking to getSubOptions
  2. Add types.typePlug for users to mark recursion points
  3. The ideal solution would involve a migration of getSubOptions to something that can render recursive types properly. This is a generic solution for a generic problem: recursive types exist.

We shouldn't call correct types "broken" just because the documentation system doesn't handle them yet.

If we merge this PR right now it is a breaking change, which was promised to be avoided if possible.
Users of these types have documentation with limitations; after this PR, users of these types would not have documentation at all.


@roberth @infinisil @MattSturgeon: What if we try merging, and revert if anyone raises an issue?
It would be annoying for affected users to figure out why their docs are broken, though, and it is questionable whether they would find this PR.

EDIT: I think this is probably a bad idea. We should explore whether a non-breaking approach is feasible.

@infinisil
Member

Yeah unless we have good reason to believe nobody actually uses such recursive types that would be broken, we really should avoid such potentially breaking changes. It's really discouraging to be met with random infinite recursions for a NixOS update.

Here's a quick draft of a recursion-limit based, backwards/forwards-compatible getSubOptions:

diff --git a/lib/types.nix b/lib/types.nix
index 4e20492909c7..f6c3cd11c799 100644
--- a/lib/types.nix
+++ b/lib/types.nix
@@ -99,6 +99,12 @@ let
       }is accessed, use `${lib.optionalString (loc != null) "type."}nestedTypes.elemType` instead.
     '' payload.elemType;
 
+  # Before getSubOptions.v2, no recursion limit could be specified, so we need to assume one
+  subOptionsv1Limit = 5;
+  subOptions =
+    type: args:
+    if type.getSubOptions ? v2 then type.getSubOptions.v2 args else type.getSubOptions args.prefix;
+
   checkDefsForError =
     check: loc: defs:
     let
@@ -728,7 +734,20 @@ let
           emptyValue = {
             value = [ ];
           };
-          getSubOptions = prefix: elemType.getSubOptions (prefix ++ [ "*" ]);
+          getSubOptions = {
+            __functor =
+              self: prefix:
+              self.v2 {
+                inherit prefix;
+                limit = subOptionsv1Limit;
+              };
+            v2 =
+              { prefix, limit }:
+              subOptions elemType {
+                prefix = prefix ++ [ "*" ];
+                limit = limit - 1;
+              };
+          };
           getSubModules = elemType.getSubModules;
           substSubModules = m: listOf (elemType.substSubModules m);
           functor = (elemTypeFunctor name { inherit elemType; }) // {

Note that subOptionsv1Limit can be set quite high: We don't care if docs blow up, as long as no infinite recursion is produced and we're not missing documentation.
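To make the dual calling convention in the draft concrete, here is a self-contained sketch using toy types rather than the real `lib.types` values. The `toyListOf` and `legacy` names are hypothetical, and the `limit <= 0` stop condition is an assumption: the draft above does not show where the limit is enforced.

```nix
let
  # Assumed limit for callers that still use the old one-argument form.
  subOptionsv1Limit = 5;

  # Dispatch helper from the draft: prefer the v2 entry point if present.
  # `?` evaluates to false when getSubOptions is a plain function.
  subOptions =
    type: args:
    if type.getSubOptions ? v2 then type.getSubOptions.v2 args else type.getSubOptions args.prefix;

  # Toy legacy type: getSubOptions is still a plain function.
  legacy = {
    getSubOptions = prefix: { loc = prefix; };
  };

  # Toy list type: getSubOptions is a callable attrset with a v2 attribute.
  toyListOf = elemType: {
    getSubOptions = {
      __functor =
        self: prefix:
        self.v2 {
          inherit prefix;
          limit = subOptionsv1Limit;
        };
      v2 =
        { prefix, limit }:
        # Assumption: give up once the limit is used up.
        if limit <= 0 then
          { }
        else
          subOptions elemType {
            prefix = prefix ++ [ "*" ];
            limit = limit - 1;
          };
    };
  };

  # A toy type that refers to itself.
  recursive = toyListOf recursive;
in
{
  oldStyle = (toyListOf legacy).getSubOptions [ "a" ];
  newStyle = subOptions (toyListOf legacy) { prefix = [ "a" ]; limit = 2; };
  bounded = recursive.getSubOptions [ "a" ];
}
```

Both `oldStyle` and `newStyle` evaluate to `{ loc = [ "a" "*" ]; }`, so existing callers keep working, while `bounded` terminates with `{ }` once the assumed limit runs out.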

But yeah some explicit recursion type would probably be even better, though harder to get right.

@roberth
Member

roberth commented Oct 28, 2025

  • either moduleA moduleB

This is unfortunately not a useful type.
Here's a doc update that explains it:

tl;dr it always picks moduleA and that's somewhat intentional.

@hsjobeki
Contributor

hsjobeki commented Oct 29, 2025

I like @infinisil's approach; I had something like this in mind.

This goes hand in hand with changing optionAttrSetToDocList' and getSubOptions: to break recursion, the return value should not consist only of options. Counting depth is 'easy', but I think the more correct solution is not far off from that and uses the same proposed interface.

It should allow types to return named models (e.g. Car) that allow for recursive Car references inside and outside of Car.
Our current tooling can simply ignore those for now. More sophisticated option rendering can be developed after that.

That would solve type recursion problems in general; either could be the first type to implement that new interface.

@lilyball
Member Author

childModule = types.submodule {
  options.child = mkOption {
    type = types.either types.str childModule;
  };
};

valid, correct code

  • Type-checks all values correctly
  • Has a clear semantic meaning (recursive tree structure)
  • Works perfectly at runtime
  • Is (presumably) used in real-world configs

The only "problem" is that getSubOptions can't traverse it. But that's a limitation of the documentation system, not a bug in the type.

But it can traverse it. getSubOptions here will return something that looks like { options.child = { … }; }. It's just that the code that calls this will walk the returned options and call getSubOptions again. This isn't a limitation of the documentation system, it's a consequence of the fact that the user defined a module that has an infinite sea of options. Asking to document this module means asking to document the option child, and the option child.child, and the option child.child.child, and the option child.child.child.child, etc.
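The distinction can be sketched with a toy model in plain Nix. Nothing here is the real `lib` code: `childType` stands in for the recursive submodule, `collect` stands in for the documentation walker, and `fuel` is an artificial cap added only so the example terminates.

```nix
let
  # Toy stand-in for the recursive submodule type. A single call to
  # getSubOptions returns exactly one level of options and terminates.
  childType = {
    getSubOptions = prefix: {
      child = {
        loc = prefix ++ [ "child" ];
        type = childType;
      };
    };
  };

  # Toy stand-in for the doc collector: it records the option, then walks
  # the returned sub-options and calls getSubOptions on each of them again.
  collect =
    fuel: opt:
    [ (builtins.concatStringsSep "." opt.loc) ]
    ++ (
      if fuel == 0 then
        [ ]
      else
        builtins.concatMap (collect (fuel - 1)) (builtins.attrValues (opt.type.getSubOptions opt.loc))
    );
in
collect 3 {
  loc = [ "child" ];
  type = childType;
}
```

This evaluates to `[ "child" "child.child" "child.child.child" "child.child.child.child" ]`. Without the `fuel` cap the list would never end, even though every individual `getSubOptions` call returns immediately.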

If you don't try to build documentation, then this type will work perfectly fine as well. There's no infinite recursion when poking at the type. It's only infinite recursion when trying to build documentation, as it gathers the options and finds an infinite amount of them.

  1. Add cycle/depth tracking to getSubOptions

This doesn't fix anything, because getSubOptions isn't recursing infinitely!

The infinite recursion problem with getSubOptions is if we define either to recurse unconditionally, instead of only recursing into either and submodule, and you call this on a recursive type that doesn't contain submodules. It's a very different problem than what you've been talking about. The infinite recursion in getSubOptions is when calling it on a type that doesn't have options, and we risk recursing infinitely to discover that. I do think it would be good to have a solution that would allow for actually finding the options in either str (attrsOf someSubmodule) without causing infinite recursion on formats.json, but that's not a reason to block this PR. And, importantly, doing this doesn't do anything to solve the problem of trying to document an infinite amount of options.
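A toy model of that restriction, in plain Nix. The constructors below are simplified stand-ins, not the real `lib.types` implementations, and merging with `//` is only an approximation of how the PR combines the options of both sides.

```nix
let
  # Toy type constructors; each type carries a name and a getSubOptions.
  str = {
    name = "str";
    getSubOptions = _prefix: { };
  };
  submodule = options: {
    name = "submodule";
    getSubOptions = _prefix: options;
  };
  attrsOf = elemType: {
    name = "attrsOf";
    getSubOptions = prefix: elemType.getSubOptions (prefix ++ [ "<name>" ]);
  };

  # Only descend into children that are a submodule or another either.
  shouldDescend = type: builtins.elem type.name [ "either" "submodule" ];

  either = a: b: {
    name = "either";
    getSubOptions =
      prefix:
      (if shouldDescend a then a.getSubOptions prefix else { })
      // (if shouldDescend b then b.getSubOptions prefix else { });
  };

  # JSON-like recursive type that contains no submodules at all.
  json = either str (attrsOf json);

  someSubmodule = submodule { foo = { }; };
in
{
  # Terminates, because the attrsOf child is never entered.
  jsonLike = json.getSubOptions [ ];
  # Found: the submodule is a direct child of either.
  direct = builtins.attrNames ((either str someSubmodule).getSubOptions [ ]);
  # Still missed: the submodule is hidden behind attrsOf.
  missed = (either str (attrsOf someSubmodule)).getSubOptions [ ];
}
```

Here `jsonLike` and `missed` are both `{ }` and `direct` is `[ "foo" ]`. If `shouldDescend` returned true for every type, evaluating `jsonLike` would recurse forever, which is the case the restriction is meant to avoid.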

If your worry is an infinitely recursive submodule that generates an infinite amount of options without using visible = "shallow" then some sort of recursion limit could be added to optionAttrSetToDocList instead. But I still think it's true that anyone that writes a recursive submodule without visible = "shallow" has broken code and shouldn't be surprised when generating the manual recurses infinitely in optionAttrSetToDocList.

We shouldn't call correct types "broken" just because the documentation system doesn't handle them yet.

A recursive submodule that doesn't set visible = "shallow" isn't correct. It's not that the documentation system doesn't handle them yet, it's that asking to document an infinite amount of options is obviously going to recurse infinitely.


I think I have a better example for the equivalence in broken code here. A recursive submodule without visible = "shallow" is like writing

options.someOpt = mkOption {
  type = types.either types.str (types.submodule {
    options.child = mkOption {
      type = types.str;
    };
  });
};

This type is missing the description of the nested option. With nixpkgs today this is fine because the option will get omitted from docs, but with this PR this will end up producing an error when building the docs, even though the option is perfectly usable. And similarly, a recursive submodule that uses either in its definition will work fine today because it will get skipped in docs, but with this PR the documentation build will start failing with infinite recursion while collecting options even though the submodule is perfectly usable otherwise. In both cases the user forgot to specify something (the description in this example, or visible = "shallow" for the recursive submodule) and that will cause documentation generation to fail.

If we say "we cannot risk breaking any users" then that means we simply cannot fix either to allow for documenting options that get skipped today, because there is nothing we can possibly do that will protect us from users that have documentation issues with options that are currently being skipped. If we instead say we don't want to break any users that have written options that are perfectly correct, then this PR is fine because it doesn't break those users. Recursive types are perfectly fine with this PR, that's the whole reason it limits its recursion to only either and submodule, getSubOptions will not recurse infinitely. The only problem is when a user has created an infinitely recursive set of options without doing anything to prevent that infinite set of options from being documented (e.g. visible = "shallow"), and those users shouldn't be surprised when it breaks.

@hsjobeki
Contributor

I think to move forward with this PR we need to acknowledge the stability guarantees of lib

Before:

  • Recursive either types work fine
  • Their options are skipped in documentation (because either lacks getSubOptions)
  • Users have no way to know this is "wrong"

After:

  • These same types now cause infinite recursion during doc generation -> very hard to debug
  • This affects all downstream users (home-manager, nix-darwin, private configs) who may have recursive types

lib has strong stability guarantees. A change that causes previously-working code to fail is a breaking change.
The fact that these types were never documented doesn't make them broken; it means they worked within the system's limitations.

To respect lib's stability guarantees, we must consider at least:

  • Recursion detection/limitation in optionAttrSetToDocList
    • Add depth limit or cycle detection
    • Throw clear error: "Recursion limit reached at 'path.to.option'. Add 'visible = \"shallow\"' to prevent infinite option tree." (or similar)
    • This prevents hard-to-debug infinite recursions

And/or:

  • Proper support for recursive types. (Can be a follow up)

You wrote:

those users shouldn't be surprised when it breaks

But they will be surprised:

  • visible = "shallow" is not documented as a requirement for recursive submodules
  • The module system doesn't warn/trace about recursive types anywhere.
  • Many users don't build documentation regularly and will discover this later. Likewise finding this PR and applying the correct fix when getting an infinite recursion is pretty hard.
  • Downstream projects (home-manager, nix-darwin, private configs) will break without any warnings

As said: at minimum, this PR should include:

  • Recursion limit in optionAttrSetToDocList with clear error message

Without these safeguards, this PR violates lib's stability guarantees and will cause surprise breakage in the ecosystem.


This is a blocker for merging

Can you add a recursion depth check to optionAttrSetToDocList as part of this PR? That would turn infinite recursion into a clear error message which is a much better user experience. We can also possibly catch that with tryEval for backwards compatibility.
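A rough sketch of what such a check could look like, written against a toy collector rather than the real optionAttrSetToDocList. The `maxDepth` value, the toy types, and the `collectOrSkip` wrapper are all hypothetical; the error text follows the wording suggested above.

```nix
let
  # Hypothetical cap; the thread does not settle on a number.
  maxDepth = 8;

  # Toy stand-ins for option types (not the real lib.types values).
  leafType = {
    getSubOptions = _prefix: { };
  };
  recursiveType = {
    getSubOptions = prefix: {
      child = {
        loc = prefix ++ [ "child" ];
        type = recursiveType;
      };
    };
  };

  showLoc = builtins.concatStringsSep ".";

  # Toy collector: returns the list of option paths and throws once the
  # nesting exceeds maxDepth, instead of recursing forever.
  collect =
    depth: opt:
    if depth > maxDepth then
      throw "Recursion limit reached at '${showLoc opt.loc}'. Add 'visible = \"shallow\"' to prevent infinite option tree."
    else
      [ (showLoc opt.loc) ]
      ++ builtins.concatMap (collect (depth + 1)) (builtins.attrValues (opt.type.getSubOptions opt.loc));

  # Backwards-compatible wrapper: an explicit throw can be caught by
  # tryEval, so on failure we keep the option itself and omit its
  # sub-options, which matches what the docs contain today.
  collectOrSkip =
    opt:
    let
      docs = collect 0 opt;
      result = builtins.tryEval (builtins.deepSeq docs docs);
    in
    if result.success then
      result.value
    else
      builtins.trace "warning: omitting sub-options of ${showLoc opt.loc} from docs" [ (showLoc opt.loc) ];
in
{
  finite = collectOrSkip { loc = [ "a" ]; type = leafType; };
  recursive = collectOrSkip { loc = [ "b" ]; type = recursiveType; };
}
```

Evaluated strictly, `finite` is `[ "a" ]` and `recursive` is `[ "b" ]` after a trace message. Dropping the wrapper and calling `collect 0` directly would surface the actionable error instead.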

@lilyball
Member Author

I'm just returning to this now.

@infinisil What infinite recursion are you trying to solve here? This PR doesn't introduce infinite recursion in getSubOptions, it restricts the recursion such that that won't happen (with this PR, the only option types that will cause getSubOptions to recurse infinitely will also cause the check function to recurse infinitely if given a value that should fail, e.g. let foo = either str foo will recurse infinitely if you try to set this to a number, so it's already a broken type). Putting limits on infinite recursion is only useful if we're going to go for unrestricted recursion, which is certainly worth exploring but I'm not convinced it won't have a performance problem (a depth limit doesn't account for recursive types that recurse multiple times, e.g. having both listOf rootType and attrsOf rootType, those make the recursion exponential instead of linear).
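The exponential concern can be put in numbers with a small back-of-the-envelope sketch. It assumes a toy cost model in which every level of a recursive type triggers one getSubOptions call per recursive child; `visits` is a hypothetical helper, not part of lib.

```nix
let
  # Number of getSubOptions calls made when a type recurses into
  # `branches` children per level and the walk is cut off at `depth`.
  visits =
    branches: depth:
    if depth == 0 then 1 else 1 + branches * visits branches (depth - 1);
in
{
  # One recursive child per level (e.g. only listOf rootType).
  linear = map (visits 1) [ 5 10 20 ];
  # Two recursive children per level (listOf rootType and attrsOf rootType).
  doubled = map (visits 2) [ 5 10 20 ];
}
```

This gives `[ 6 11 21 ]` for `linear` and `[ 63 2047 2097151 ]` for `doubled`, so a depth limit that is harmless for a single recursive branch becomes expensive as soon as a type recurses twice per level.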

This PR does have the potential issue of a recursive submodule that doesn't declare visible = "shallow" and got away with it because it was being omitted from docs before, but a recursion limit in getSubOptions doesn't do anything to change that. @hsjobeki you said "The only "problem" is that getSubOptions can't traverse it.", except it can, it's optionAttrSetToDocList that fails here not getSubOptions. And the solution to this is to write visible = "shallow" on the submodule. A recursion limit just lets us proceed with documentation generation in the face of such a submodule, but it doesn't actually fix that type, and we'll end up generating documentation for N levels of recursive options, which isn't good.

What's more, recursive submodules are pretty rare (or at least, recursive submodules that use either in the recursion), but just having bugs in the definition of options that are omitted from doc generation is unfortunately not that rare, as evidenced by the fact that this PR fixes 5 modules[1] that have such bugs, but didn't have to fix any recursive submodules. We already have to accept that undocumented options may have bugs and there's no way to work around that, such bugs will cause doc generation to fail once those options start being included, and so I really don't see the big deal in saying that recursive submodules (that use either in the recursion) that forgot to specify visible = "shallow" will also end up needing to be fixed (assuming there even are any that will get hit by this).

Footnotes

  1. One module has actually fixed one of its bugs already, I need to rebase this PR, but it just replaced a deprecated function. I think that module will still fail because it doesn't have a defaultText and the default references a config value from another module, which is actually a perfect example of a perfectly-written option that still fails documentation generation and nothing we do will avoid that failure.
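For reference, the usual shape of that kind of fix looks roughly like the following. The option and the referenced config path are made up for illustration and do not correspond to the module in question.

```nix
{ config, lib, ... }:
{
  # Hypothetical option whose default depends on another module's config.
  options.services.example.dataDir = lib.mkOption {
    type = lib.types.str;
    # Evaluating this default requires a real configuration, which the
    # manual build does not have.
    default = "${config.services.other.stateDir}/example";
    # Text rendered in the manual in place of the evaluated default.
    defaultText = lib.literalExpression ''"''${config.services.other.stateDir}/example"'';
    description = "Directory where the example service stores its data.";
  };
}
```

The option behaves the same at evaluation time either way; `defaultText` only changes what the documentation generator has to evaluate.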

Also implement getSubModules and substSubModules. Because either is key
to any recursively-defined type (such as `(pkgs.formats.json {}).type`),
we can't just blindly recurse into both sides or we'll hit infinite
recursion on a recursive type. To that end, this implementation only
recurses into a child type if that type is submodule or another either.
This means we'll generate documentation for `either str submodule` but
not for `either str (attrsOf submodule)`.

This change allows us to generate documentation for a bunch of options
that are currently being missed.

The `example` key of a couple of options didn't evaluate, and this went
unnoticed as this option wasn't part of generated documentation before.

The `default` references a config value from another module, so it needs
`defaultText` as well.

Option descriptions for a submodule were referencing config values from
the parent module. This isn't allowed when building the manual, so just
delete those references.

@lilyball
Member Author

Rebased, and added a new commit to account for a new option that broke the manual rules (referencing a config value from outside the module).

@nixpkgs-ci nixpkgs-ci bot requested review from 6543 and hsjobeki November 21, 2025 04:29
@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 21, 2025
@hsjobeki
Contributor

A recursion limit just lets us proceed with documentation generation in the face of such a submodule, but it doesn't actually fix that type, and we'll end up generating documentation for N levels of recursive options, which isn't good.

That was not my original idea. I wanted to abort with a clear error message that explains how to fix the problem, rather than surfacing the native "infinite recursion" error.

That's the least we should do, because docs will break downstream.

I'm not sure if we should do more; as you pointed out, these breakages might be rare.
Any foreseeable breakage however should contain actionable messages if feasible.

@lilyball
Member Author

I do have a local change I used while diagnosing the breakages from an experiment with doing unrestricted recursion that looks like:

Commit ID: cc6dd6b7351e15a8f5632a8b0e866442a5e6bcf9
Change ID: lvoxlxrlvmlutzrrmwznsooxxlnnsmpy
Author   : Lily Ballard <lily@ballards.net> (2025-07-07 00:57:49)
Committer: Lily Ballard <lily@ballards.net> (2025-11-20 20:22:51)

    WIP: add error contexts

diff --git a/lib/modules.nix b/lib/modules.nix
index e545f9c970..72637d45ec 100644
--- a/lib/modules.nix
+++ b/lib/modules.nix
@@ -1395,7 +1395,7 @@
   # TODO: Merge this into mergeOptionDecls
   fixupOptionType =
     loc: opt:
-    if opt.type.getSubModules or null == null then
+    if builtins.addErrorContext "while evaluating the option at ${showOption loc}" (opt.type.getSubModules or null) == null then
       opt // { type = opt.type or types.unspecified; }
     else
       opt
diff --git a/lib/options.nix b/lib/options.nix
index b1db6ff76e..89a8643a45 100644
--- a/lib/options.nix
+++ b/lib/options.nix
@@ -598,7 +598,7 @@
           let
             ss = opt.type.getSubOptions opt.loc;
           in
-          if ss != { } then optionAttrSetToDocList' opt.loc ss else [ ];
+          if ss != { } then builtins.addErrorContext "while evaluating the option at ${showOption opt.loc}" (optionAttrSetToDocList' opt.loc ss) else [ ];
         subOptionsVisible = if isBool visible then visible else visible == "transparent";
       in
       # To find infinite recursion in NixOS option docs:

I could just pull that into this PR. I don't know how cheap builtins.addErrorContext is so I don't know if this change will have performance implications (I'd want to try timing the manual build to see if that changes, but it's too awkward to do that since I'm on darwin and the --rebuild flag doesn't seem to work with remote builders).

@hsjobeki
Contributor

hsjobeki commented Nov 23, 2025

If you could make it an explicit throw, you could catch that with tryEval, and then it could be made non-breaking. So I think you should change optionAttrSetToDocList as well to make it non-breaking. It could print a warning about options omitted from the docs rather than failing completely, preserving the current behavior in case of incompatible type trees.

@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jan 24, 2026
Labels

  • 2.status: merge conflict (This PR has merge conflicts with the target branch)
  • 6.topic: lib (The Nixpkgs function library)
  • 6.topic: module system (About "NixOS" module system internals)
  • 6.topic: nixos (Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS)
  • 8.has: module (update) (This PR changes an existing module in `nixos/`)
  • 10.rebuild-darwin: 1-10 (This PR causes between 1 and 10 packages to rebuild on Darwin.)
  • 10.rebuild-darwin: 1 (This PR causes 1 package to rebuild on Darwin.)
  • 10.rebuild-linux: 1-10 (This PR causes between 1 and 10 packages to rebuild on Linux.)

5 participants