[configoptional][WIP] Allow configoptional.Optional to wrap scalar values#13524
[configoptional][WIP] Allow configoptional.Optional to wrap scalar values#13524evan-bradley wants to merge 27 commits intoopen-telemetry:mainfrom
Conversation
…tional (open-telemetry#13376)" This reverts commit 1ebb9b2.
|
To help inform the design here, I wanted to compare another option based on a PoC that @jade-guiton-dd wrote which implements a I think the implementation in this PR has the following advantages:
I see the following disadvantages, both of which I think are acceptable:
Overall, I think @jade-guiton-dd I would appreciate your thoughts here as the author of the PoC, let me know if I've misunderstood any part of it or if you think we should incorporate additional considerations while evaluating this. @mx-psi let me know if you have any thoughts as well. If there are no concerns, I will continue to bring this PR to a state where it's ready to review. |
|
One additional thought based on yesterday's discussion on the stability call: if we were working on confmap v2, I would suggest we could explore revising the |
These points are basically the same, but I agree that it's nice to let Note however that this is a departure from how the And I do exactly the same in my PoC to "handle types that require further unmarshaling": I call In that regard,
I think there may be a misunderstanding here: |
|
To give a bit more context, the main point of the
I think making it easier for component authors to build their own Optional with whatever semantics they want without understanding these quirks or implementing two different interfaces is pretty important, because our current Optional implementation has unintuitive semantics inherited from pointers/otlpreceiver (doesn't support disabling a field which is enabled by default, and treats However, I agree that this PR is somewhat simpler to use, so as long as we limit our goal to "supporting scalar values in our own Optional type", I will not push for changing this PR's design. |
|
@jade-guiton-dd Really appreciate the detailed feedback here and context on My overall goal here is to support any kind of wrapper interface similar to |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
308b5ca to
14f8552
Compare
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (63.86%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #13524 +/- ##
==========================================
- Coverage 91.65% 91.58% -0.08%
==========================================
Files 654 656 +2
Lines 42659 42747 +88
==========================================
+ Hits 39100 39148 +48
- Misses 2744 2778 +34
- Partials 815 821 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@jade-guiton-dd @mx-psi follow-up from our discussion yesterday: I've looked through contrib and found a few use cases for the current
Of these, I think (3) will be hardest to solve in a way that doesn't provide a I can think of two ways to go about this. Note that we are bound to our decision for any methods we put on the Optional type, but for confmap we can change our decision or plan for a breaking change for confmap v2.
I don't think there's really a large difference between them given it largely comes down to naming, and I'm not sure which types would want to forgo implementing Let me know what you think and if you have any other approaches you would like me to investigate. |
| if o.flavor == noneFlavor && val == nil { | ||
| // If the Optional is None and the configuration is nil, we do nothing. | ||
| // This replicates the behavior of unmarshaling into a field with a nil pointer. | ||
| return nil | ||
| } |
There was a problem hiding this comment.
I don't think this condition will ever trigger, because val is guaranteed by the hook to be a non-nil value of type T (more generally, of whatever type we returned from ScalarType).
And I think this showcases the restrictiveness of the current ScalarUnmarshaler design: unlike in Unmarshaler, where you can perform basic tests and transformations on the input data before calling confmap.Unmarshal against the appropriate type, here you're forced into choosing a single type to Decode into upfront.
But we want configoptional to be unmarshallable both from a null and from (for example, if T is int) an integer literal. So we need to find a single Go type which can handle both kinds of input and return that from ScalarType.
In our case, it's not very difficult; I think *T should do the trick. But it seems rather inelegant, and more importantly, not really extensible to more diverse forms of input data (for instance, unmarshalling from either an array OR a map, like in #13996), which I think other use cases of a ScalarUnmarshaler interface will want to handle.
So I'm of the opinion that if we want a truly generic "UnmarshalerV2" which can handle all known use cases, including scalars, it should have a similar design to the current Unmarshaler, where we manually call confmap.Unmarshal against the appropriate type, after whatever checks or modifications of the raw YAML data we want to do.
To handle scalars, this means we would need to essentially expose internal.Decode, or more generally something similar to confmap.Unmarshal which can unmarshall an any instead of a confmap.Conf, in the same way that the method receives an any instead of a Conf.
I realize this doesn't really answer the question of "should we have an interface which handles just scalars, or one with both scalars and structs". But I think that even when handling just scalars, there will be use cases that run into this issue.
There was a problem hiding this comment.
Really appreciate your thoughts on this. I want to respond to a few points right now, but I will see if I can provide some additional details soon.
I think this showcases the restrictiveness of the current
ScalarUnmarshalerdesign: unlike inUnmarshaler, where you can perform basic tests and transformations on the input data before callingconfmap.Unmarshalagainst the appropriate type, here you're forced into choosing a single type toDecodeinto upfront.
This is a fair point, the interface is much more restrictive. It's possible we're not there yet and should revise this interface accordingly, but my overall goal is that all config structs can be fully declarative/introspectable from a structural perspective (vs. making validation introspectable, which isn't really feasible). If they were fully declarative, the user would never need to call any Decode or Unmarshal methods themselves, and would likely use as-yet-unwritten interfaces to do the transformations you're describing.
I called this out in point 2 above as something this interface explicitly can't handle. It's possible that it's not feasible to express our config use cases declaratively within the Go type system, but I think it would be an overall quality-of-life improvement if we can pull it off. I wouldn't expect us to land on this, at least fully, until confmap v2.
But we want
configoptionalto be unmarshallable both from anulland from (for example, ifTisint) an integer literal. So we need to find a single Go type which can handle both kinds of input and return that fromScalarType.
I'm sorry, my memory is awful. My understanding was that null is unmarshaled as a nil map[string]any value and we weren't expecting null to be a valid value for scalar fields, but I can't remember where we ended that conversation. Right now we typically use it as a way to say "please use the default value", but given our existing use of e.g. *int right now, I could see it being interpreted differently for scalars. Was the goal to use null as enabled: false for scalar fields? I think it would be feasible to pull that off, if so.
There was a problem hiding this comment.
I think that we discussed this before on the maintainers' call, but I think that using null as enabled: false for scalar fields is probably our best option. As you say, it's feasible to do with the current design (by having ScalarType return *T).
I guess my main problem is that I view Unmarshaler as an escape hatch from the default unmarshaling logic that mapstructure provides. Hardcoding a call to Decode on a fixed type restricts this escape hatch enough that I'm not sure I understand the pratical use cases it targets.
If you can afford to unmarshal into a fixed type and extract the data you want from there in UnmarshalScalar, why not use that fixed type as the config struct, and do the extraction logic at component startup? (In the case of configoptional: if we're going to use a workaround to make configoptional[int] behave like *int with regards to unmarshaling, why not just use a *int directly? Maybe wrap it in a defined type and add convenience methods on it.)
But I understand that, like you say, having arbitrary parsing logic prevents introspection of the YAML's expected structure based on mapstructure tags. I think we have to support arbitrary parsing, because while we can afford to modify confmap to suit our use cases, third-party component authors can't. But we can try to limit the need for using it as much as possible, to make introspection useful in 99% of cases.
And I think one way would be to keep making config utility types like configoptional to avoid component authors having to reach for the escape hatch. Of course, configoptional and co will probably need to use the escape hatch to work, but we can hardcode support for them in our hypothetical future introspection logic.
So for example, one possible practical use case for custom unmarshaling logic, which mapstructure doesn't handle, is "union types", where you want to be able to handle two or more different types of YAML input (maybe null vs int, maybe two different versions of a config struct). For that use case, we could define a configunion wrapper, which uses Unmarshaler + ScalarUnmarshaler or a generic ValueUnmarshaler under the hood, to provide this functionality in a uniform way that we can handle in our introspection logic. (And maybe configoptional[T] could actually just be defined as something like configunion[T, null] for scalars?)
This is a lot of hypotheticals, but what do you think?
There was a problem hiding this comment.
I think that we discussed this before on the maintainers' call, but I think that using
nullasenabled: falsefor scalar fields is probably our best option. As you say, it's feasible to do with the current design (by havingScalarTypereturn*T).
I was actually able to get this to work by tweaking the hook so it's not necessary to make significant changes to the Optional type. I'll put something up soon once I feel confident it's fairly solid.
So for example, one possible practical use case for custom unmarshaling logic, which
mapstructuredoesn't handle, is "union types", where you want to be able to handle two or more different types of YAML input (maybenullvsint, maybe two different versions of a config struct). For that use case, we could define aconfigunionwrapper, which usesUnmarshaler+ScalarUnmarshaleror a genericValueUnmarshalerunder the hood, to provide this functionality in a uniform way that we can handle in our introspection logic. (And maybeconfigoptional[T]could actually just be defined as something likeconfigunion[T, null]for scalars?)
I think we're converging on a common idea for how to approach this, I think this describes something similar to point (2) in my comment on the PR. I'll spend some time thinking about how a configunion type would look. Since you can't pull type parameters out of a type with Go's reflection facilities, we would need to implement an interface that has methods like ScalarType to pull these out. I'm a little concerned about the nesting depth of the types if we want to have a lot of types in a union, so I may want to explore a slightly different approach to this. I'll see if I can write a POC that accomplishes this.
There was a problem hiding this comment.
@jade-guiton-dd following up from our conversation at the 2025-11-03 stability meeting, please see this commit for how I envision allowing config structs to use schemas that differ from their fields while still allowing the final schema to be introspectable: adb7f03 (#13524)
Please note that this is in a rough state right now; I'm missing all sorts of edge cases, the tests don't pass, and the code is a mess (including the scalarunmarshaler hook handling two interfaces). However, this commit should hopefully showcase:
- Removing the need for
configopaque.MapListandconfigoptional.Optionalto implementconfmap.Unmarshaler. They can rely on the new interfaces in this PR to do all unmarshaling without needing to touch the raw map. - Two new interfaces:
MigrateableConfigfor config types that allow fields to be set in YAML that don't correspond to a config struct field, andConfigMigrationfor additional config structs that define these extra schemas. I don't know that I like the use of "migrate" here, I think we may want to go with "layer" or some other term in the future. However, this functionality could be used to migrate between different schema versions for a given component's config.
I think this achieves our goals reasonably well:
- The config is introspectable through the
MigrateableConfiginterface: getting the list of additional structs will be sufficient to programmatically determine the actual schema for a given config struct. - Structs can "hide" fields that can be set in YAML (as in Optional) or declare two entirely different schemas (as with MapList).
- The config is typed at every step of the process, with confmap handling all the ugly details.
- Removes the need for recursion outside confmap.
I think the main point of detraction for this approach is that the implementation in confmap seems like it will be rather intricate. I'm not sure whether we can avoid this or not, or if it's possible to re-use some of our internal logic for working with mapstructure to simplify e.g. cleaning up the input config as we process the layers/migrations/whatever. I stopped cleaning up the implementation so we can stop and see whether we like the general idea.
Feel free to let me know if this doesn't make any sense and I'll do what I can to clear things up.
There was a problem hiding this comment.
I'm a little bit confused about the exact semantics of "layers", but it seems interesting.
It kind of looks like you implemented something like the configunion we talked about, but with the requirement that everything maps into a single struct at the end using an Unmarshaler-like interface? It seems to me like the MigratableConfig interface is essentially ScalarUnmarshaler, but where ScalarType can return multiple possible types to unmarshal into.
But it looks like there is also support for "intersection types", where you can consume certain fields along the way? (Or at least that's my interpretation; I'm not sure if the current code works properly for intersections without adding WithIgnoreUnused in the call to Decode?) That feature seems superfluous since you can usually just have one big struct with all the fields you want to be able to parse, but it sounds like it may be necessary for configoptional specifically, because we can't declare a struct { T `mapstructure:,squash`, Enabled bool }...
I'm also not sure if ScalarUnmarshaler is still useful if there is MigratableConfig. Is it only useful for handling null values?
I think that something like this, with some polish, could hopefully cover most use cases of custom unmarshaling I can think of while keeping things introspectable.
But it still seems to me like a complicated way to prevent users from calling Decode directly, this time on a list of different types instead of just one. This makes the implementation of unmarshaling and the later implementation of introspection more complicated, and if it turns out there is another pattern of recursive Unmarshal calls we want to handle, we would have a hard time adapting the hook + the introspection logic again to support it.
I would suggest an adaptation of one of my previous proposals: what if we have an internal ValueUnmarshaler interface that allows recursive calls to Decode AND provides whatever methods we want for introspection? We then build utilities like configoptional and configunion on top of that (for the latter, probably with a design similar to your MigratableConfig). This would make it easy for us to build more wrappers to replace Unmarshal usages without adding more and more features to the same hook, and while still allowing for introspectability, with no way for users to "accidentally" make their config unintrospectable. (Well, unless they have an any field, or one tagged with remain...)
Something like this:
// this would be in an internal package so it can't be misused
type ValueUnmarshaler interface {
// cfg is unprocessed YAML input, decoder allows calling Decode recursively while forwarding decode options or whatever else we want to avoid recursion for
UnmarshalValue(cfg any, decoder Decoder) error
// kind of similar to `ScalarType` or `Migrations`. although i think more specific methods would be better, it depends what you want out of introspection
GetConfigSchema() any
}(Note: Since both configopaque and configoptional are stable, I don't think we can remove their Unmarshal implementation, so I think refactoring things to use a generic unmarshaling interface would have to wait until a v2.0 anyway unfortunately...)
(Note 2: I think I mentioned this before, but I think that if we can't come to an agreement in a reasonable time frame, it's best that you go with what you think is best. As long as it fulfills the goal for configoptional, I'm willing to approve it.)
There was a problem hiding this comment.
Sorry, I should have annotated the code better, cleaned it up, or both. Thanks through trudging through it.
It kind of looks like you implemented something like the
configunionwe talked about, but with the requirement that everything maps into a single struct at the end using anUnmarshaler-like interface?
Yes, that's basically it.
It seems to me like the
MigratableConfiginterface is essentiallyScalarUnmarshaler, but whereScalarTypecan return multiple possible types to unmarshal into.
That's an interesting observation. It's possible we could consolidate, but I will have to think about it.
But it looks like there is also support for "intersection types", where you can consume certain fields along the way? (Or at least that's my interpretation; I'm not sure if the current code works properly for intersections without adding
WithIgnoreUnusedin the call toDecode?) That feature seems superfluous [...]
This more or less touches on what I meant by "layers" (maybe "fragments" would be better?). Basically you have a config struct that contains a subset of the keys for the component config struct, or even an entirely different set of keys. You unmarshal into that smaller config struct from the user's YAML input, then map from the small config struct to the final component config struct. Along the way it removes these keys with the assumption that they are either not in the resulting config struct, or have already been taken care of by one of the smaller config structs. In addition to the case with configoptional you called out, this also handles MapList, where the user can put either a list or a map for their YAML config in a MapList-typed field.
(Note: Since both
configopaqueandconfigoptionalare stable, I don't think we can remove their Unmarshal implementation, so I think refactoring things to use a generic unmarshaling interface would have to wait until a v2.0 anyway unfortunately...)
Agreed, I just wanted to show that this new interface can support our existing use cases. It's also worth calling out that the configunion/MigrateableConfig thing isn't tied to this PR and would likely be implemented later.
There was a problem hiding this comment.
I don't have time to look into the internal ValueUnmarshaler interface you proposed, but overall it looks like it could work. I do see the opportunity for simplifying things if we don't try to hide Decode from the user, but I'll have to think about it more to make sure it wouldn't make the implementations feel too low-level. I'll take a closer look once I'm back on 2025-11-17.
This would make it easy for us to build more wrappers to replace Unmarshal usages without adding more and more features to the same hook
I think we're looking at two sides of the same coin. My goal would more or less be to have three interfaces, each with its own hook inside confmap:
ValueUnmarshalerto unmarshal wrapped config types (the currentScalarUnmarshalerinterface)FragmentedConfigto union multiple config types into a single result type (the currentMigrateableConfiginterface, I think I like "fragments" the most so far)MapUnmarshalerto provide the user with a no-limits*confmap.Confexperience.
I would expect config structs to implement these as-needed and that they wouldn't conflict with each other. It sounds like the ValueUnmarshaler type you outlined is a combination of these three, which I think would be a developer UX improvement if we can make it work.
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description
This shows a little bit of what scalar support may look like in configoptional. I am still working through the API and may refine it a little more, or at least offer some options. I also need to test this much more extensively, so far I've only been aiming to resolve the recent issue with using the optional type in exporterhelper.
I've made this so we can keep it experimental since confmap is stable, even if it causes some churn now and will cause more if we stabilize it.
Link to tracking issue
Fixes #13421