Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

terraform: validate providers' schemas during NewContext #28124

Merged
merged 6 commits into from
Mar 22, 2021

Conversation

mildwonkey
Copy link
Contributor

@mildwonkey mildwonkey commented Mar 17, 2021

Hello! This PR comes with a lead-in question before you look at the code: should we even do this? I think so, but if there's any controversy I can step back and RFC this. [updated with improved error message] If agreed, I will have at least one more commit to improve the error message; I didn't want to spend any more time on it before confirming if this is a good idea in the first place.


The terraform-plugin-sdk has a lovely series of InternalValidate functions that verify the validity of the providers' schemas during Configure (entrypoint here). Terraform assumes that this is run for every provider.

This is not a safe assumption, unfortunately: for example a provider not using our sdk might not implement something like InternalValidate. Most of the mistakes that the validation would catch will result in errors in terraform (for e.g.: if an attribute is both required and computed, we'll get an invalid result after apply error). There's a new potential issue that will result in a panic: an attribute with both a non-nil Type and a NestedType. (this is not yet included in InternalValidate; the sdk doesn't support the new NestedTypes yet).

I started down this path because I wanted Terraform to validate the schema too, so we could catch errors at the beginning of a command instead of during plan or apply (apply being the worst case scenario). I tested this change with a config that used all of our top official providers (aws, azure, google, alicloud, oci, k8s), after which I removed one of the original checks that was in configschema.InternalValidate (only called by our builtin providers), since that caused an error with the oci provider (invalid attribute names - that's a project for another day).

If you prefer to review this commit-by-commit, this commentary on each might be helpful:

  • c5a903b updates the existing InternalValidate test to verify specific errors instead of the number of errors; the Attribute.InternalValidate function in this commit is incomplete (and buggy) so please ignore that.
  • 9f669b5 extracts the attribute validation into a function and refactors Block.InternalValidate to use it. One test error message changed.
  • 02c31cd adds the call to InternalValidate into loadProviderSchemas (called during terraform.NewContext) and fixes a bug with the error message returned by InternalValidate
  • 2dc6960 tweaks the error message so it's closer to the sdk's current error (spoiler: it's not as good)

I don't like this error message as much as the one from the provider, but I am not sure how much I can improve it (I'm sure I can get the specific resource name in!), since in this case the error could be one of a series of diagnostics. I shouldn't just throw everything else away in favor of returning exactly/only the same message as the SDK does (I don't think I'm putting this into words very well; hit me up in slack if you want to talk through it).

Now for the pictures! Here's the current error message from the SDK if a provider has an invalid schema:

Screen Shot 2021-03-16 at 8 30 42 AM

Here's the same error, caught by Terraform during NewContext. UPDATED VERSION It's not as nice, and also notice that the failing resource name is missing:

Screen Shot 2021-03-18 at 9 16 29 AM

This commit adds an attribute-specific InternalValidate which was extracted directly from the block.InternalValidate logic and extended to verify any NestedTypes inside an Attribute. Only one error message changed, since it is now valid to have a cty.NilType for Attribute.Type as long as NestedType is set.
We haven't been able to guarantee that providers are validating their own schemas using (some version of) InternalValidate since providers were split out of the main codebase. This PR adds a call to InternalValidate when provider schemas are initially loaded by NewContext, which required a few other changes:

InternalValidate's handling of errors vs multierrors was a little weird - before this PR, it was occasionally returning a non-nil error which only stated "0 errors occurred" - so I addressed that in InternalValidate. I then tested this with a configuration that was using all of our most popular providers, and found that at least on provider had some invalid attribute names, so I commented that particular validation out. Adding that in would be a breaking change which we would have to coordinate with enablement and providers and (especially in this case) make sure it's well communicated to external provider developers.

I ran a few very unscientific tests comparing the timing with and without this validation, and it appeared to only cause a sub-second increase.
@mildwonkey mildwonkey requested review from a team March 17, 2021 15:01
@codecov
Copy link

codecov bot commented Mar 17, 2021

Codecov Report

Merging #28124 (2bad57f) into main (783936f) will decrease coverage by 0.02%.
The diff coverage is 48.43%.

❗ Current head 2bad57f differs from pull request most recent head 7c7add3. Consider uploading reports for the commit 7c7add3 to get more accurate results

Impacted Files Coverage Δ
configs/configschema/decoder_spec.go 81.60% <ø> (ø)
terraform/schemas.go 62.83% <0.00%> (-2.31%) ⬇️
configs/configschema/internal_validate.go 56.47% <51.66%> (-13.53%) ⬇️
terraform/node_resource_plan.go 96.11% <0.00%> (-1.95%) ⬇️
dag/marshal.go 61.90% <0.00%> (-1.59%) ⬇️
backend/remote/backend_common.go 49.82% <0.00%> (-0.70%) ⬇️
states/statefile/version4.go 58.19% <0.00%> (+0.23%) ⬆️

@mildwonkey
Copy link
Contributor Author

👋 Hi enablement! I don't need a code review from y'all (unless you want to!), but wanted you to see my proposal to run a function similar to InternalValidate before the SDKs would normally run duing ConfigureProvider. (My InternalValidate covers less than yours, so yours is still important).

Copy link
Member

@radeksimko radeksimko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good idea to me, but the SDK team should definitely have a say on this.

@radeksimko radeksimko requested a review from a team March 17, 2021 15:04
Copy link
Contributor

@paddycarver paddycarver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a fan of this.

@radeksimko
Copy link
Member

Maybe one nitpick - is it worth surfacing the provider name too? e.g. change this

... Please report this bug:

to

... Please report this bug to maintainers of hashicorp/random provider:

Pulling the bug report URL out of the Registry would be even cooler, but I think this would at least make it more obvious where exactly to report the bug to? That seems especially relevant when the resource name doesn't hold the prefix of the provider name.

@mildwonkey
Copy link
Contributor Author

Oh geez, I hadn't realized the provider name isn't in there anywhere! Thank you!

Here's the new version. I'll update the top comment, too. I'm not going to include the issues url in this PR since we'd need to do more refactoring than I want to get into right now. We don't have access to the provider installation information at that point, so we wouldn't know if it wasn't a registry provider. It's a great suggestion, though, and we should consider that for future work.

Screen Shot 2021-03-18 at 9 16 29 AM

@radeksimko
Copy link
Member

@mildwonkey I think the ForDisplay value which omits the hostname for the official registry might suffice in this context, but LGTM either way!

if pt.Hostname == DefaultRegistryHost {
return pt.Namespace + "/" + pt.Type
}
return pt.Hostname.ForDisplay() + "/" + pt.Namespace + "/" + pt.Type

@mildwonkey
Copy link
Contributor Author

We generally use the full source for errors and ForDisplay for informational messages only.

@mildwonkey
Copy link
Contributor Author

mildwonkey commented Mar 22, 2021

Here's the most recent version of this error message. I moved the opening paragraph to the end, after the specific error message(s). In this example, I ran terraform plan without running init (I presume, but do not know, that this is the most likely reason a user would see this error message):

Screen Shot 2021-03-22 at 1 04 59 PM

Copy link
Contributor

@pselle pselle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the new message! I hope it leads to more helpful reporting and helping users figure out where their issues lie sooner

// However, we don't raise this here since it's checked by
// InternalValidate.
// This indicates an invalid schema, since it's not valid to define
// both an attribute and a block type of the same name. We assume
Copy link
Contributor

@pselle pselle Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this adjustment of the comment highlighting the assumption (edit: I like this on multiple comments!)

@mildwonkey mildwonkey merged commit b9138f4 into main Mar 22, 2021
@mildwonkey mildwonkey deleted the mildwonkey/validate-me-pls branch March 22, 2021 17:17
@ghost
Copy link

ghost commented Apr 22, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 22, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants