cty Refinements: Unknown values with tighter range #590

Merged: 6 commits, May 31, 2023

Conversation

@apparentlymart (Contributor) commented on Feb 24, 2023

My upstream project cty, which HCL uses as its type system, has a new feature in its v1.13.0 release where it can often refine the range of an unknown result from an operation based either on innate characteristics of that operation (e.g. many operations never return a null value) or based on already-refined input values.

There's more information on how exactly refinements work, and the scope of the initial design, in the cty documentation on Refinements.

Along with updating to the new cty release that includes that feature, and updating some of the existing tests to expect the newly-refined unknown results, this PR also makes HCL itself aware of refinements in some specific cases. My research indicates that these cases make it more likely for applications like Terraform to raise errors sooner (e.g. during validation or planning rather than during apply), and to more often produce known results in count and for_each arguments, which Terraform currently requires to be known during planning.

The specific cases I implemented here are:

  • In an expression like "https://${hostname}", if hostname is unknown then we still know that the final string will be non-null and will start with https://, which in particular means that result != null and result != "" can both produce known true instead of an unknown bool result. (See the sketch after this list.)
  • In an expression like unknown ? a : b we can propagate some refinements that are shared by both a and b into the result even though we don't know which match arm to choose yet. This isn't super useful on its own but composes with other rules (both here in HCL and down in cty) to increase the likelihood of producing a wholly-known or partially-known result.
  • When processing optional attributes in the typeexpr extension -- which Terraform uses for its input variable type constraint mechanism -- if either the given value or the default value are refined as non-null then we can be sure that the value of that attribute won't be null. Again, this is not super useful on its own but knowing that a value definitely isn't null gives more opportunities for replacing unknown values with wholly-known or partially-known results.
  • When resolving the splat operator [*] we can propagate refinements about the length of the input into the output, as long as we already know the type of the input and so can be sure about whether we're working in the list mode or in the special "null or not" mode.
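
A minimal sketch of that first case, evaluating a template-plus-comparison expression with hclsyntax and an unknown hostname. The expected cty.True result follows from the prefix refinement described above (the filename "example.hcl" is just a placeholder):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	// A template that interpolates an unknown value, compared with "".
	expr, diags := hclsyntax.ParseExpression(
		[]byte(`"https://${hostname}" != ""`),
		"example.hcl", hcl.InitialPos,
	)
	if diags.HasErrors() {
		panic(diags.Error())
	}

	ctx := &hcl.EvalContext{
		Variables: map[string]cty.Value{
			"hostname": cty.UnknownVal(cty.String),
		},
	}
	v, diags := expr.Value(ctx)
	if diags.HasErrors() {
		panic(diags.Error())
	}

	// With the prefix refinement, the comparison should now produce a
	// known result instead of an unknown bool.
	fmt.Println(v) // expected: cty.True
}
```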

There's also one extra thing which does not have immediate benefit but may be useful downstream in Terraform:

  • hcldec has a new "spec" that wraps another spec and applies refinements to its value. Terraform uses hcldec to interpret the content of resource blocks (and similar) in terms of the provider schema, and so if Terraform later allows providers to mention certain static refinements as part of the provider schema then Terraform could potentially check and apply those automatically while decoding the configuration.

    In particular, legacy SDK providers are guaranteed to never return a null value for id, and so a hypothetical change to refine that attribute as never being null would make it easier to use the id of a not-yet-created resource as part of a conditional or other similar expression downstream without getting an unknown result.
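
A small cty-level illustration of that id example (a sketch using cty's RefineNotNull and Equals; the known result assumes cty's refinement-aware comparison behavior, and the id here is hypothetical):

```go
package main

import (
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

func main() {
	// A hypothetical not-yet-created resource id: unknown, but refined as
	// definitely not null (as a legacy SDK provider could guarantee).
	id := cty.UnknownVal(cty.String).RefineNotNull()

	// "id != null" can then produce a known result even though the id
	// itself is still unknown.
	fmt.Println(id.Equals(cty.NullVal(cty.String)).Not()) // expected: cty.True
}
```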

These changes are all consistent with cty's compatibility rule that Unknown Values can become "more known", which HCL effectively passes on to its own callers as a result of using cty as its type system. Therefore applications adopting a new version of HCL that includes these changes should expect to see more tightly-refined results in their tests involving unknown values, and so should expect to update those tests to match as part of upgrading HCL, as long as the new results are actually correctly-refined for the expected final result.

(If this change is accepted, I have a similar sort of change ready to submit to Terraform itself which would make it refine even more cases, again building on the foundations from both cty and HCL.)

@apparentlymart added the labels "enhancement" and "v2 (Relates to the v2 line of releases)" on Feb 24, 2023
@apparentlymart requested a review from a team on February 24, 2023 00:37
@apparentlymart self-assigned this on Feb 24, 2023
This new concept allows constraining the range of an unknown value beyond
what can be captured in a type constraint. We'll make more use of this
in subsequent commits.

If we encounter an interpolated unknown value during template rendering,
we can report the partial buffer we've completed so far as the refined
prefix of the resulting unknown value, which can then potentially allow
downstream comparisons to produce a known false result instead of unknown
if the prefix is sufficient to satisfy them.
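
Expressed directly in cty terms, the refined value the renderer is described as producing looks roughly like this (a sketch against cty v1.13's RefinementBuilder API; the known-false comparison is the expected effect, not a guarantee):

```go
package main

import (
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

func main() {
	// The partial buffer "https://" becomes a prefix refinement on the
	// unknown result of rendering the template.
	result := cty.UnknownVal(cty.String).Refine().
		NotNull().
		StringPrefix("https://").
		NewValue()

	// An empty string can't have that prefix, so this comparison should be
	// a known cty.False rather than an unknown bool.
	fmt.Println(result.Equals(cty.StringVal("")))
}
```
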
When ConditionalExpr has an unknown predicate it can still often infer
some refinement to the range of its result by noticing characteristics
that the two results have in common.

In all cases we can test if either result could be null and return a
definitely-not-null unknown value if not.

For two known numbers we can constrain the range to be between those two
numbers. This is primarily aimed at the common case where the two possible
results are zero and one, which significantly constrains the range.

For two known collections of the same kind we can constrain the length
to be between the two collection lengths.

In these last two cases we can also sometimes collapse the unknown into
a known value if the range gets reduced enough. For example, if choosing
between two collections of the same length we might return a known
collection of that length containing unknown elements, rather than an
unknown collection.
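
A sketch of how this could look through hclsyntax, with an unknown predicate choosing between two known tuples of different lengths. The expected range values follow from the rules described above and are not verified here:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	// An unknown condition choosing between one-element and two-element
	// collections; the arms unify to list(string).
	expr, diags := hclsyntax.ParseExpression(
		[]byte(`cond ? ["a"] : ["b", "c"]`),
		"example.hcl", hcl.InitialPos,
	)
	if diags.HasErrors() {
		panic(diags.Error())
	}

	ctx := &hcl.EvalContext{
		Variables: map[string]cty.Value{
			"cond": cty.UnknownVal(cty.Bool),
		},
	}
	v, diags := expr.Value(ctx)
	if diags.HasErrors() {
		panic(diags.Error())
	}

	rng := v.Range()
	fmt.Println(v.IsKnown())             // false: we don't know which arm wins
	fmt.Println(rng.DefinitelyNotNull()) // expected true: neither arm can be null
	fmt.Println(rng.LengthLowerBound())  // expected 1
	fmt.Println(rng.LengthUpperBound())  // expected 2
}
```
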
If either the given value is refined non-null or if the default value is
refined non-null then the final attribute value after defaults processing
is also guaranteed non-null even if we don't yet know exactly what the
value will be.

This rule is pretty marginal on its own, but refining some types of value
as non-null creates opportunities to deduce further information when the
value is used under other operations later, such as collapsing an unknown
but definitely not null list of a known length into a known list of that
length containing unknown values.
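
The rule itself can be illustrated at the cty level. This is a simplified stand-in for the typeexpr defaults logic, not the actual implementation; the helper name applyDefault is invented for the example:

```go
package main

import (
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

// applyDefault illustrates only the non-null rule described above: if either
// the given value or the default is known to be non-null, the merged result
// can be refined as non-null even while it remains unknown.
func applyDefault(given, def cty.Value) cty.Value {
	result := cty.UnknownVal(given.Type())
	if given.Range().DefinitelyNotNull() || def.Range().DefinitelyNotNull() {
		result = result.RefineNotNull()
	}
	return result
}

func main() {
	given := cty.UnknownVal(cty.String) // might still turn out to be null
	def := cty.StringVal("fallback")    // known, definitely not null

	v := applyDefault(given, def)
	fmt.Println(v.Range().DefinitelyNotNull()) // true
}
```
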
We know that a splat expression can never produce a null result, and also
in many cases we can use length refinements from the source collection to
also refine the destination collection because we know that a splat
expression produces exactly one result for each input element.

This also allows us to be a little more precise in the case where the
splat operator is projecting a non-list/set value into a zero or one
element list and we know the source value isn't null. This refinement is
a bit more marginal since it would be weird to apply the splat operator
to a value already known to be non-null anyway, but the refinement might
come from far away from the splat expression and so could still have
useful downstream effects in some cases.
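
A sketch of the splat case, feeding an unknown list that already carries length refinements through an hclsyntax splat expression. The expected outputs follow from the description above and are not verified here:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	expr, diags := hclsyntax.ParseExpression(
		[]byte(`items[*]`), "example.hcl", hcl.InitialPos,
	)
	if diags.HasErrors() {
		panic(diags.Error())
	}

	// An unknown list whose length has already been refined elsewhere.
	items := cty.UnknownVal(cty.List(cty.String)).Refine().
		NotNull().
		CollectionLengthLowerBound(1).
		CollectionLengthUpperBound(3).
		NewValue()

	ctx := &hcl.EvalContext{
		Variables: map[string]cty.Value{"items": items},
	}
	v, diags := expr.Value(ctx)
	if diags.HasErrors() {
		panic(diags.Error())
	}

	rng := v.Range()
	fmt.Println(rng.DefinitelyNotNull()) // expected true: splat never returns null
	fmt.Println(rng.LengthLowerBound())  // expected 1: one output per input element
	fmt.Println(rng.LengthUpperBound())  // expected 3
}
```
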
This new spec type allows adding value refinements to the results of some
other spec, as long as the wrapped spec does indeed enforce the
constraints described by the refinements.
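
If the new spec is shaped roughly as described — a wrapper spec plus a callback over cty's RefinementBuilder — usage could look like the sketch below. The RefineValueSpec name and its Wrapped/Refine fields are assumptions based on this description, not confirmed API:

```go
package main

import (
	"github.com/hashicorp/hcl/v2/hcldec"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	// Assumed shape: a spec that wraps another spec and applies extra
	// refinements to its result. Type and field names are a guess based
	// on the commit message above.
	spec := &hcldec.RefineValueSpec{
		Wrapped: &hcldec.AttrSpec{
			Name:     "id",
			Type:     cty.String,
			Required: false,
		},
		Refine: func(b *cty.RefinementBuilder) *cty.RefinementBuilder {
			// e.g. a provider schema could promise that id is never null
			return b.NotNull()
		},
	}
	_ = spec
}
```
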
@apparentlymart (Contributor, Author) commented:

The following Terraform issues would either be resolved completely or partially improved by merging this and then adopting it in Terraform:

@apparentlymart (Contributor, Author) commented:

I ended up submitting the Terraform-specific parts of adopting refined values separately: hashicorp/terraform#33234

...but once this PR is merged and released I'll make a new PR in Terraform to adopt the new version of HCL too.

@@ -14,7 +14,7 @@ require (
 	github.com/mitchellh/go-wordwrap v0.0.0-20150314170334-ad45545899c7
 	github.com/sergi/go-diff v1.0.0
 	github.com/spf13/pflag v1.0.2
-	github.com/zclconf/go-cty v1.12.1
+	github.com/zclconf/go-cty v1.13.0
A reviewer (Member) commented:

should we bump this to the latest patch?

@apparentlymart (Contributor, Author) replied:

These changes I made only require 1.13.0, so I don't see a strong reason to force it since any calling application can select a newer version if it wants to. This is only here to represent that this new version of HCL literally won't work if you use v1.12 or earlier. 🤷‍♂️

@apparentlymart (Contributor, Author) replied:

(We could still do it, but I'm inclined to make that a separate change since this particular change doesn't need it.)
