Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure a tag containing external-name is present in all external resources supporting tags #408

Open
LCaparelli opened this issue May 24, 2024 · 3 comments · May be fixed by #409
Open

Ensure a tag containing external-name is present in all external resources supporting tags #408

LCaparelli opened this issue May 24, 2024 · 3 comments · May be fixed by #409
Labels
enhancement New feature or request

Comments

@LCaparelli
Copy link

LCaparelli commented May 24, 2024

What problem are you facing?

Many cloud provider external resources are associated with a non-deterministic ID/external-name. To deal with that, Crossplane providers rely on the crossplane.io/external-name annotation for discovering existing external resources that a managed resource controls.

It's not unusual for Crossplane users to set deletionPolicy to Orphan on production environments to prevent accidental deletions of external-resources in the event of human error or catastrophic failure in Kubernetes clusters requiring re-creation of all resources.

During such events, the MR gets re-created without the annotation, and one of these scenarios follow:

  1. The MR breaks and never syncs, because it identifies the resource you're attempting to manage already exists, and it does not get imported automatically. Human intervention is required.
  2. The MR results in the creation of a different resource, effectively duplicating everything, and the provider is now managing a resource which might not even be at use. In this scenario, the failure to import the existing resource is not reported, and it's marked as healthy (even though the original resource is now orphaned and unmanaged). Not only requires manual intervention, but might go unnoticed until other things start to break.

These problems are aggravated when compositions are used, as it makes it all more complex. For example, imagine a composition that manages DNS hosted zones. As well as creating the zone itself, it might also create NS records on the parent zone, to delegate authority to it.

Within the dropdown you'll find a concrete example of a real scenario we faced.

Detailed Example

To make this example more concrete, let's use AWS' Route53. Assume the zone already exists in AWS, previously created by upjet-aws-provider.

flowchart TD
 subgraph subgraph_aws["AWS"]
        hosted_zone(["Hosted Zone zone1 (foo.bar.baz)"])
        ns_record(["NS Record (on zone bar.baz, points to zone1)"])
  end
    mr_zone["Zone Managed Resource (external-name zone1)"]
    mr_record["Record Managed Resource"]
    xr["XR"] --> mr_zone & mr_record
    mr_zone --> hosted_zone
    mr_record --> ns_record

Loading

However, for one reason or another, the claim/XR/MRs that manage it had to be re-created. The resources were left in AWS. So far no disruption to DNS resolution is happening.

flowchart TD
 subgraph subgraph_aws["AWS"]
        hosted_zone(["Hosted Zone zone1 (foo.bar.baz)"])
        ns_record(["NS Record (on zone bar.baz, points to zone1)"])
  end
Loading

Once the claim, XR and MRs are recreated, a duplicate DNS zone will be created, and it will be empty.

flowchart TD
 subgraph subgraph_aws["AWS"]
        hosted_zone(["Hosted Zone zone1 (foo.bar.baz)"])
        duplicated_zone(["Hosted Zone zone2 (foo.bar.baz)"])
        ns_record(["NS Record (on zone bar.baz, points to zone1)"])
  end
    mr_zone["Zone Managed Resource (external-name zone2)"]
    mr_record["Record Managed Resource"]
    xr["XR"] --> mr_zone & mr_record
    mr_zone --> duplicated_zone
    mr_record --> ns_record

Loading

Because it throws no errors to halt the composition (as far as Crossplane is concerned, everything is as it should be), the NS resource also gets updated on the parent DNS zone, delegating authority to the new, empty DNS zone.

From this point forward, all DNS resolution for the original zone start failing, until an engineer updates the external-name annotation on the hosted zone MR to point to the old zone.

How could Upjet help solve your problem?

Upjet could ensure that a tag with the external-name value is inserted into the external resource, along with crossplane-kind and crossplane-name tags it already currently inserts.

This would allow for importing mechanisms to filter existing resources using tags, then setting the external-name annotation on the managed resource if it finds the external-name tag set on it.

Proof of Concept

To deal exactly with the problem I mentioned we wrote a composition function that uses AWS' Resource Groups Tagging API to filter any AWS resource based on the tag Crossplane/Upjet already currently populates (crossplane-kind, crossplane-name).

https://github.com/gympass/function-aws-importer

The nice thing about this tagging API is that you can make the same API calls regardless of the exact type of the external resource, as long as it supports tags (most resources do). It works for Route53 Hosted Zones, EC2 Security Groups, etc.

It currently requires composition authors to ensure the external-name tag is there themselves, which leads to longer reconciliations, weird corner cases, and overall non-ideal user experience. If Upjet ensured this tag was there itself, we could avoid all of this and make the function itself self-contained.

To have a better idea of how this would work on Upjet, we deployed a fork with the following change: a6c0bc9

This is not necessarily the best way to implement this, but it proves the concept. We've been using this fork along with our function in our development environment with no failures to report so far.

Future

In the future, if this concept proves to be robust, Upjet could even use this tag to assemble the tfstate it produces on observe operations, making the function above useless by incorporating its functionality entirely. One step at a time though :-)

@LCaparelli LCaparelli added the enhancement New feature or request label May 24, 2024
@LCaparelli LCaparelli linked a pull request May 24, 2024 that will close this issue
3 tasks
@jbw976
Copy link
Member

jbw976 commented Jun 3, 2024

@LCaparelli presented on this topic at the May 30 2024 community meeting. Here is a timestamped recording link to when Lucas starts presenting: https://youtu.be/nlBur2-ev0Q?si=_Csv-jJ1f5Elh61X&t=1425 (42:14 if the timestamp doesn't work)

@blakeromano
Copy link

Definitely a big +1 on this. The PR seems to implement this in a way that makes sense to somebody who doesn't know the code and I think will allow for automation like in the POC Composition Function that others will find of massive value.

@LCaparelli
Copy link
Author

Just summarizing what we discussed on the SIG-upjet call:

  • the import problem is a large one, and ideally should be fixed by an approach that doesn't require tagging, as not all resources managed by Upjet support them
  • there was some concern about the approach the importer function takes, in the sense it might twist the annotation's role and make tags the authoritative source for the external name. It would not be the case, as the function never touches the annotation if it's already present AND the proposed change to Upjet would ensure to take the value from the annotation and push it to the tag (never the other way around if the annotation is already there)
  • the proposed change, ignoring any value added from the import function, could still be valuable in the sense of labelling external resources

The general orientation I understood from that call was to maybe move this proposal to upstream Crossplane. Regardless of that, the function we're using for import could also be responsible for ensuring this tag at the external-resource. Or even another function could do that, the main point is to make this off-tree for Upjet.

@ulucinar does this summary sound about right?

I wanted to keep on discussing some finer points, but we ended up hijacking the call for too long and there were other topics. If that's ok I'd like to continue it here.

I personally don't see any harm in adding this tag to taggable resources. We're already doing that for other tags (crossplane-name, crossplane-kind, crossplane-provider-config), so I'm having a tough time understanding why this particular tag would be unwelcome.

Especially now that we have a more mature iteration of composition functions, exporting information like this can open more possibilities for functions. Though the import problem was used as first motivation for this proposal, that does not mean it's the only scenario where this tag could be beneficial.

It could be used for generating metrics as well, for example (we often use tags as way to generate data in our data lake).

I understand that this is not a 100% solution for the import problem. However, what this issue proposes is not a solution for that problem, but to provide additional information through labels/tags so that other machinery can rely on them, for whatever reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants