Add status field to Kubernetes clusters to support discovery workflows #62161

Merged
tigrato merged 6 commits into master from tigrato/eksautocleanup
Jan 2, 2026

Contributor

@tigrato tigrato commented Dec 11, 2025

Adds a new Status field to KubernetesClusterV3 containing discovery-related information. The status includes cloud provider-specific data, starting with AWS, for which it tracks the ARN used for access setup, the integration name, and the role assumed during discovery.

This lets the discovery service persist state about the clusters it discovers, which it needs in order to clean up the access entries it created when a cluster is removed or no longer matches the label filters. Without this information, dangling AWS resources would be left behind after discovery changes.

Changelog: Added cleanup of access entries for EKS auto-discovered clusters when they no longer match the filtering criteria and are removed.
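
To make the cleanup flow concrete, here is a minimal Go sketch of how persisted discovery status could drive access-entry cleanup. Every name in it (`awsStatus`, `kubeCluster`, `staleAccessEntries`, and all field names) is hypothetical and chosen for illustration, and the ARN is a placeholder; this is not the generated `KubernetesClusterV3` API or the discovery service's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// awsStatus is a hypothetical stand-in for the AWS part of the cluster status.
// The PR description names three pieces of data; the field names are assumptions.
type awsStatus struct {
	SetupAccessForARN string // ARN that access was set up for
	Integration       string // integration name used during discovery
	AssumedRole       string // role assumed during discovery
}

// kubeCluster is a simplified, hypothetical view of a discovered cluster.
type kubeCluster struct {
	Name string
	AWS  *awsStatus // nil when no discovery status was persisted
}

// staleAccessEntries returns, keyed by cluster name, the ARN whose access
// entry should be cleaned up because the cluster no longer matches.
// Clusters without persisted status are skipped, since nothing is known
// about what was created for them.
func staleAccessEntries(previous []kubeCluster, stillMatching map[string]bool) map[string]string {
	stale := make(map[string]string)
	for _, c := range previous {
		if stillMatching[c.Name] {
			continue
		}
		if c.AWS == nil || c.AWS.SetupAccessForARN == "" {
			continue
		}
		stale[c.Name] = c.AWS.SetupAccessForARN
	}
	return stale
}

func main() {
	const placeholderARN = "arn:aws:iam::123456789012:role/example" // placeholder value
	previous := []kubeCluster{
		{Name: "prod", AWS: &awsStatus{SetupAccessForARN: placeholderARN}},
		{Name: "dev", AWS: &awsStatus{SetupAccessForARN: placeholderARN}},
		{Name: "legacy"}, // no status persisted
	}
	stale := staleAccessEntries(previous, map[string]bool{"prod": true})

	names := make([]string, 0, len(stale))
	for name := range stale {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

In this sketch only `dev` is reported: `prod` still matches, and `legacy` has no status, so its access entry (if any) is left alone.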

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
Comment on lines +5162 to +5166
// Status is the resource status.
KubernetesClusterStatus Status = 6 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "status"
];
Contributor

Please don't add gogoproto options if they're unnecessary.

Also, style: protobuf field names and protobuf docstrings should follow protobuf styling.

Suggested change
- // Status is the resource status.
- KubernetesClusterStatus Status = 6 [
-   (gogoproto.nullable) = false,
-   (gogoproto.jsontag) = "status"
- ];
+ // the resource status, intended to be ignored by IaC tools
+ KubernetesClusterStatus status = 6;

Contributor Author

The reason I added JSON tags is mostly that the resource is always marshaled using json.Marshal, which might otherwise produce inconsistent field names.

Contributor

The default encoding/json field name is the protobuf field name, so unless we're trying to match a specific Go struct field name for the sake of existing code (which is not the case here), we can just pick the field name for the JSON and live with whatever field name ends up in the Go codegen.
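
As a small illustration of this point: `encoding/json` takes the key from the struct's `json` tag, and Go protobuf code generators typically emit such a tag derived from the protobuf field name. The structs below are hand-written to mimic that shape, assuming a field declared as `KubernetesClusterStatus status = 6;`. They are not the actual generated code for `KubernetesClusterV3`, and the `Discovery` field is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterStatus is a hypothetical status payload used only for this example.
type clusterStatus struct {
	Discovery string `json:"discovery,omitempty"`
}

// cluster mimics what a generator would emit for a lowercase proto field
// named "status": an exported Go field with a json tag using the proto name.
type cluster struct {
	Status *clusterStatus `json:"status,omitempty"`
}

// marshalCluster returns the JSON encoding of c as a string.
func marshalCluster(c cluster) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := marshalCluster(cluster{Status: &clusterStatus{Discovery: "aws"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"status":{"discovery":"aws"}}
}
```

The JSON key comes out as `status` without any explicit `jsontag` option, which is the behavior the comment above relies on.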

Contributor

@espadolini espadolini left a comment

Is this something that fits the "status" field or should it just be in "spec"?

No matter where we put the additional field, what can break if the field is discarded (because the auth might not support it) in the cleanup logic? Will it just keep the existing behavior?

Is it possible for this field to change between writes and thus trigger more writes and reconciliations? Should we change (*KubernetesClusterV3).IsEqual?

Contributor Author

tigrato commented Dec 18, 2025

> Is this something that fits the status field, or should it live in spec?

This belongs in the status field. These values should not be user editable and should not be treated as part of the desired state. They are owned and managed by the discovery process itself. While we allow users to manually create dynamic Kubernetes clusters, we don’t want them to modify these fields, as they are not assumed or consumed unless discovery is attempting to delete the resource.

> Regardless of where we place the additional field, what could break if it’s discarded (for example, if auth doesn’t support it) during cleanup? Would the behavior remain the same?

If the field is discarded, the current behavior remains unchanged. The access entry will simply be left in place for future generations.

> Could this field change between writes and cause additional writes or reconciliations? Do we need to update (*KubernetesClusterV3).IsEqual?

This field cannot change as part of normal writes. The only way it can change is via updates to the discovery configuration or teleport.yaml. Both cases already trigger reconciliation, so this does not introduce additional churn.

It also does not affect Kubernetes heartbeats, since this field, along with other sensitive fields, is discarded during heartbeating.

PS: I forgot that our goderive stuff ignores status fields. Updated

Comment on lines +73 to +76
if kc1.GetStatus().IsEqual(kc2.GetStatus()) {
return services.Equal
}
return services.Different
Contributor

nit for consistency

Suggested change
- if kc1.GetStatus().IsEqual(kc2.GetStatus()) {
- 	return services.Equal
- }
- return services.Different
+ if !kc1.GetStatus().IsEqual(kc2.GetStatus()) {
+ 	return services.Different
+ }
+ return services.Equal

@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from ryanclark December 18, 2025 15:32
return res
}
// Additionally compare Status field using its IsEqual method.
// This is needed because CompareResources ignores Status field of KubeCluster and for most
Contributor

@smallinsky smallinsky Dec 18, 2025

> ... CompareResources ignores Status field of KubeCluster

Are you sure that this is true?

Based on the code, Status is ignored only for the DatabaseV3, UserSpecV2, and AccessList types:

func CompareResources[T any](resA, resB T) int {
	var equal bool
	if hasEqual, ok := any(resA).(compare.IsEqual[T]); ok {
		equal = hasEqual.IsEqual(resB)
	} else {
		equal = cmp.Equal(resA, resB,
			ignoreProtoXXXFields(),
			cmpopts.IgnoreFields(types.Metadata{}, "Revision"),
			cmpopts.IgnoreFields(types.DatabaseV3{}, "Status"),
			cmpopts.IgnoreFields(types.UserSpecV2{}, "Status"),
			cmpopts.IgnoreFields(accesslist.AccessList{}, "Status"),
			cmpopts.IgnoreFields(header.Metadata{}, "Revision"),
			cmpopts.IgnoreUnexported(headerv1.Metadata{}),

			// Managed by IneligibleStatusReconciler, ignored by all others.
			cmpopts.IgnoreFields(accesslist.AccessListMemberSpec{}, "IneligibleStatus"),
			cmpopts.IgnoreFields(accesslist.Owner{}, "IneligibleStatus"),

			cmpopts.EquateEmpty(),
		)
	}
	if equal {
		return Equal
	}
	return Different
}

reference:
https://github.com/gravitational/teleport/blob/master/lib/services/compare.go#L35

So the whole kc1.GetStatus().IsEqual(kc2.GetStatus()) call seems to be redundant here.

Contributor Author

The KubeCluster type implements an IsEqual method, which causes the comparison logic to use hasEqual.IsEqual(resB) instead of falling back to cmp.Equal.

This IsEqual method is generated by Teleport's Go derive plugin, which is responsible for generating comparison code. The plugin intentionally skips status fields during comparison, as shown in this code
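
The dispatch described in this reply can be sketched in isolation. The following is a minimal, self-contained approximation: `isEqualer`, `compareResources`, `kubeCluster`, and `compareKubeClusters` are simplified stand-ins written for this example, not Teleport's actual types or functions, and the fallback branch uses `reflect.DeepEqual` in place of `cmp.Equal` to stay within the standard library.

```go
package main

import (
	"fmt"
	"reflect"
)

const (
	Equal = iota
	Different
)

// isEqualer mirrors the shape of an interface like compare.IsEqual[T].
type isEqualer[T any] interface {
	IsEqual(T) bool
}

// compareResources prefers a type's own IsEqual method and only falls back
// to a generic deep comparison when the type does not provide one.
func compareResources[T any](a, b T) int {
	var equal bool
	if e, ok := any(a).(isEqualer[T]); ok {
		equal = e.IsEqual(b)
	} else {
		equal = reflect.DeepEqual(a, b)
	}
	if equal {
		return Equal
	}
	return Different
}

// kubeCluster is a toy resource with a spec and a status.
type kubeCluster struct {
	Spec   string
	Status string
}

// IsEqual imitates a derived comparison that deliberately skips Status.
func (k kubeCluster) IsEqual(other kubeCluster) bool {
	return k.Spec == other.Spec
}

// compareKubeClusters adds an explicit Status check on top of the
// Status-blind comparison.
func compareKubeClusters(a, b kubeCluster) int {
	if res := compareResources(a, b); res != Equal {
		return res
	}
	if a.Status != b.Status {
		return Different
	}
	return Equal
}

func main() {
	a := kubeCluster{Spec: "eks", Status: "role-a"}
	b := kubeCluster{Spec: "eks", Status: "role-b"}
	fmt.Println(compareResources(a, b) == Equal)    // true: Status is skipped
	fmt.Println(compareKubeClusters(a, b) == Equal) // false: Status differs
}
```

Because the type implements `IsEqual`, the generic fallback and any ignore options configured there are never reached, so a Status-aware comparison has to be added explicitly.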

Contributor

@smallinsky smallinsky Dec 18, 2025

Ah, thanks. The "CompareResources ignores Status" statement suggested that the ignore logic is implemented in CompareResources itself, as it is for fields like IgnoreFields(types.DatabaseV3{}, "Status").

@tigrato tigrato enabled auto-merge January 2, 2026 16:57
@tigrato tigrato added this pull request to the merge queue Jan 2, 2026
Merged via the queue into master with commit 533286b Jan 2, 2026
47 of 49 checks passed
@tigrato tigrato deleted the tigrato/eksautocleanup branch January 2, 2026 19:50
@backport-bot-workflows
Contributor

@tigrato See the table below for backport results.

| Branch | Result |
|---|---|
| branch/v17 | Failed |
| branch/v18 | Failed |

tigrato added a commit that referenced this pull request Jan 5, 2026
#62161)

* Add status field to Kubernetes clusters to support discovery workflows

Adds a new Status field to KubernetesClusterV3 containing discovery-related
information. The status includes cloud provider-specific data, starting with
AWS which tracks the ARN for access setup, integration name, and the assumed
role used during discovery.

This enables the discovery service to persist state about clusters it discovers,
which is needed to properly cleanup created access entries when a cluster is
removed or no longer matches the label filtering. Without this information,
dangling AWS resources would be left behind after discovery changes.

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>

* handle code review comments

* handle code review comments

* add comment

---------

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
21KennethTran pushed a commit that referenced this pull request Jan 6, 2026
#62161)

github-merge-queue bot pushed a commit that referenced this pull request Jan 7, 2026
#62161) (#62599)

github-merge-queue bot pushed a commit that referenced this pull request Jan 7, 2026
#62161) (#62598)

4 participants