Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow resolvers access to the config #266

Closed
omry opened this issue Jun 11, 2020 · 18 comments
Closed

Allow resolvers access to the config #266

omry opened this issue Jun 11, 2020 · 18 comments
Labels
enhancement New feature or request
Milestone

Comments

@omry
Copy link
Owner

omry commented Jun 11, 2020

Currently resolvers are not getting access to the config structure.
It will be helpful to allow it.

Example use case:

OmegaConf.register_resolver("len", lambda root, value: len(OmegaConf.select(root, value)))
c = OmegaConf.create({
  'list': [1,2,3],
  'list_len': '${len:list}'
})
assert cfg.list_len == 3
@omry omry added this to the OmegaConf 2.1 milestone Jun 11, 2020
@omry omry added the enhancement New feature or request label Jun 11, 2020
@pereman2
Copy link
Contributor

Hi, want help?

@omry
Copy link
Owner Author

omry commented Jun 28, 2020

Yes.
A PR for this would be appreciated, but keep in mind that it will only get into 2.1.

Backward compatibility can be tricky here:
Passing the config in is probably a breaking change.
A good start would be to make a proposal of how to handle it.

@omry
Copy link
Owner Author

omry commented Jun 28, 2020

There is also the question of what to pass in?

  • The root of the config?
  • The node of the interpolation itself? maybe its parent container?

@pereman2
Copy link
Contributor

  1. I have a pair of questions about the sample code.
  • Since root isn't being passed as argument, does it mean that we will use it as an special word?

  • len(OmegaConf.select(root, value)) would be len(Node) which is not implemented if I'm not wrong.

Am I missing something maybe?

@omry
Copy link
Owner Author

omry commented Jun 28, 2020

I didn't test that code sample obviously :)

  • I am not sure what is the best way to do this. something in the registration API can tell us if we are in a simple lambda mode or in the mode advanced mode. possibly the resolver can be a subclass of some interface or a lambda and have different behavior based on that.
  • len is implemented for both DictConfig and ListConfig, easy enough to check:
In [1]: from omegaconf import OmegaConf
In [2]: len(OmegaConf.create({"a": 10, "b": 20}))
Out[2]: 2
In [3]: len(OmegaConf.create([1,2,3]))
Out[3]: 3

odelalleau added a commit to odelalleau/omegaconf that referenced this issue Aug 12, 2020
@odelalleau
Copy link
Collaborator

I needed this in #321 so I took a stab at it, see 9bb63d2 (look at the changes to register_resolver() in omegaconf.py). In this commit the env resolver was also changed to use this feature, which shows an example on how to use it.

Note that I got rid of the ResolverX protocol classes for typing, which didn't seem actually useful to me (and broke typing checks with the new signature of env()) -- especially now that resolvers can take non-string inputs. But maybe there's a good reason to keep them?

@omry
Copy link
Owner Author

omry commented Aug 12, 2020

Hard to jump into that diff without properly reviewing the whole thing (which I did not do yet).

ResolverX no longer make sense once we start supporting passing non-strings into resolver functions.
It was useful exactly to prevent people from accidentally registering functions that takes non strings, because it was not supported before.

@odelalleau
Copy link
Collaborator

Hard to jump into that diff without properly reviewing the whole thing (which I did not do yet).

Yeah it's a bit mixed up with other stuff, but for this issue you really only need to look at the changes to register_resolver().
The example you wrote above would instead be written:

OmegaConf.register_resolver("len", lambda value, *, root: len(OmegaConf.select(root, value)), config_arg="root")

(I'll add it to the tests)

@omry
Copy link
Owner Author

omry commented Aug 12, 2020

okay, that's pretty nice. I didn't think about using optional named arguments for this.
Can you also add the node of the interpolation itself as node?
this can be useful for relative interpolations (when we add them, and also to know the full key of the node etc.

@odelalleau
Copy link
Collaborator

Can you also add the node of the interpolation itself as node?

Done in 70d6f4b (I used the parent instead of the node, because it was already available and I assume a common use case will be to access other options from the parent)

Doing this made me realize that allowing access to the config doesn't play way with the idea of cached outputs. So I added a mechanism to disable the cache in f7afe4b, and used it to disable the cache for resolvers that use these new options (to access the config or the parent). In particular I disabled the cache for env at the same time, since env can now access arbitrary config items (e.g. by setting an environment variable to ${foo}).

Initially I tried to make it possible to use the cache by using the whole config as key, however this doesn't work well because although configs are hashable, using them as dictionary keys triggers some equality comparisons between configs. And this leads to infinite recursion, presumably because it is trying to resolve interpolations to perform this equality (I haven't looked too deeply into it, as I don't want to stray away too far).

@omry
Copy link
Owner Author

omry commented Aug 13, 2020

The intention with the cache is to protect against a function call returning something unstable.
an example can be a random function, where you will not normally want to really get a different random number everytime you access the config variable, or time based functions - where you will not want the return to change but to basically lock in on the first call.

This works as long as the function is stateless and does not depend on the config.
Adding a flag to disable the cache is good in any case, and in fact we might want to flip the behavior such that by default we do not cache the results (we can start by forcing people to make a decision via a deprecation warning, similar to how we handled other things), and later we can change the default to false.

Hashing the config might offer a way out, but it also sounds pretty expensive to do that on the call (even if we did solve the recursion issue).
Maybe create a followup task to look at it, for now I think having a use_cache flag is solving the problem.

@odelalleau
Copy link
Collaborator

That's also what I was thinking -- taking note to create a follow-up issue once the PR is merged

odelalleau added a commit to odelalleau/omegaconf that referenced this issue Aug 16, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Aug 17, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Sep 9, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Sep 10, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Sep 11, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Sep 16, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Sep 23, 2020
odelalleau added a commit to odelalleau/omegaconf that referenced this issue Oct 27, 2020
@odelalleau
Copy link
Collaborator

odelalleau commented Nov 26, 2020

Following up on this, in the end the version I originally wrote in #321 won't make it to master, so most of the comments above are now irrelevant.

If anyone wants to pick up this issue, a good starting point is this comment, suggesting to add support for ${.} to access the parent node: #321 (comment)

@omry
Copy link
Owner Author

omry commented Feb 10, 2021

I think your changes are mostly addressing this via support for nested interpolations.
While there is no way right now to interpolate to the config root, you can interpolate to any other config node. I think this is good enough for now.

@odelalleau
Copy link
Collaborator

Yes, nested interpolations definitely address most use cases. One potential reason to want access to the node rather than its value could be to resolve interpolations within the resolver itself (that's what my original re-implementation of env was doing, but I dropped support for that when I reverted the changes I had to give access to the parent node).

Something that'd be easy to do would be to add a syntax to access the node object instead of its value. For instance, ${my_resolver:${%a}} could provide directly the node object to the resolver function, at which point it is easy to ask for the parent, root, etc. But we can probably wait until there is a strong use case for this.

@omry
Copy link
Owner Author

omry commented Feb 10, 2021

Value nodes are internal API, and container nodes are already accessible through normal interpolation.
I think what is missing is an interpolation to access the root node, e.g. ${_root_}, but I until I see a real use case I don't think it's worth adding.

Generally, an advantage of supporting it at the API level as opposed to the input level is that the input is determined by the user of the resolver. not by the resolver author. at the same time I think nested interpolations will cover the important use cases.

@omry
Copy link
Owner Author

omry commented Feb 10, 2021

In progress in #445.
See discussion to see how it will address this feature. (not exactly as described).

omry pushed a commit that referenced this issue Feb 12, 2021
* Add a grammar to parse interpolations
* Add support for nested interpolations
* Deprecate register_resolver() and introduce new_register_resolver() (that allows resolvers to use non-string arguments, and to decide whether or not to use the cache)
* The `env` resolver now parses environment variables in a way that is consistent with the new interpolation grammar

Fixes #100 #230 #266 #318
@omry omry closed this as completed Feb 17, 2021
@omry
Copy link
Owner Author

omry commented Mar 13, 2021

This is implemented in #599. See the docs for an example.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants