-
Notifications
You must be signed in to change notification settings - Fork 1.5k
*: use a human friendly ID to tag and name cloud resources. #1280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: use a human friendly ID to tag and name cloud resources. #1280
Conversation
pkg/asset/cluster/tfvars.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we drop this? A non-uniquified name seems like something we'd never use. Edit: except for creating DNS entries, so nevermind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data/data/aws/bootstrap/main.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're cleaning up, I think we can drop -role, which is already implied by the resource type. Basically, every bootstrap resource can use ${var.cluster_id}-bootstrap unless we create several of a single resource type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine if we keep the names as is. it doesn't hurt here as iam roles do not seem to have any limitation on length of name.
data/data/aws/main.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everybidy needs to have kubernetes.io/cluster/... tag so that destroy captures bootstrap resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we still planning to use the kubernetes.io/cluster/... tag for destroy? Since everything is tagged with the Name tag, can't we search on that instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we still planning to use the
kubernetes.io/cluster/...tag for destroy? Since everything is tagged with theNametag, can't we search on that instead?
Name is resource name and has nothing to do with the cluster. AWS will use kubernetes.io/cluster/... tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of keeping a tag that Kubernetes ignores (like openshiftClusterID) for tearing down bootstrap resources that are not part of the Kubernetes cluster. But if folks think we don't have to worry about the cluster accidentally adopting bootstrap resources, I'm ok collapsing to just the kubernetes.io/cluster/... tag.
staebler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes make me happy.
-
Should the following use cluster_id?
https://github.com/openshift/installer/blob/e9073fd3dcdef7595b7eca927b7944e7e1e76dcd/data/data/aws/bootstrap/main.tf#L37 -
pkg/asset/machines/worker.gois using the UUID instead of the human ID. -
I don't like that the term cluster ID is overloaded. It would be nice if one of the two uses were qualified in some way so that it is evident from the term which of the cluster IDs is being used.
data/data/aws/bootstrap/variables.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add human-readable to the description to increase the differentiation between the UUID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should we differentiate between UUID and HumanID in terraform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Within terraform we do not need to differentiate. However, this is at the boundary between terraform and go. The go code is ambiguous with the term cluster ID. As a reader of the code, it is not immediately clear to me here whether the cluster ID that terraform is expecting is the UUID or the human ID. We could either (1) expect the reader of the code to dig further into the code to figure out which one is expected here or (2) add more information to the description of the variable to make it clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the boundary says give me an identifier for the cluster, it doesn't need to care if it was given an UUID or a ID that is human readable... ?
Let me update it to include constraints like length, characters, because that's what it cares any nothing more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkg/tfvars/tfvars.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be worthwhile to replace this with ClusterDomain. That way the logic for how the cluster domain is constructed is contained in only one place rather than in both go and terraform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done with 3711b75
data/data/aws/vpc/master-elb.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why set Name on resources that have a name property, if we're setting the same value in both? I think it only makes sense when we feel like name must be overly truncated and want the tag for the unabbreviated version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from d6de59a
- All tags include `Name` key with name mathcing the name for the resource.
This allows cluster to find resources using tags based on names.
pkg/asset/installconfig/clusterid.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to trim sensitive characters too, see #1089.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this is also imposing AWS-specific length and (eventually) charset restrictions on a nominally-platform-agnostic property. Will this end up as a race to the bottom? Maybe it should be a platform-specific asset? Or do we expect this to be something almost all platforms will need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this same thing with the PR that I put out earlier. My decision was that for the time being there is no reason to make the HumanID characteristics unique to the platform as we do not have any platforms currently that place more onerous restrictions than we face with ELBs from AWS. It should be easy enough to change to platform-specific HumanIDs in the future if and when they are beneficial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should be able to send cluste_ids to different platforms if that becomes a problem.
IMO this HumanID has enough constraints to work through most.
I think we want to trim sensitive characters too, see #1089.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e9073fd to
cdb93ca
Compare
data/data/aws/main.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation behind bringing this here from the route53 module? I would have expected that we would want to go the other way with these route53 resources and bring aws_route53_record.etcd_a_nodes, aws_route53_record.etcd_cluster, and aws_route53_zone.int to the route53 module instead of bringing route53 resources out of the route53 module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with how the current r53 module is setup it takes private zone id and base_domain, and I think pulling the public zone out makes it so that the module takes ids for both for better consistency.
from d6de59a
Also moves the public zone discover to root module and pass the `public_route53_id`, similar to private zone, to `route53` zone.
and for moving the etcd records into the module, that makes sense. let me see and if the change is simple i'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, my contention is that aws_route32_zone.int would be better placed in the route53 module. If that were moved, then there would be consistency in that neither public zone id nor private zone id would be passed into the route53 module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you then turn off either of them (not that you can do them right now..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. How would you turn off either of them if they were in main instead of the route53 module? How would that be different if they were in the route53 module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example if we move public and private zone into module, then the module needs 2 variables... base_domain cluster_domain. And if I want to switch off the public zone records, I will have to pass empty base_domain IMO i don't like this indicator.
on the other hand if we pass zone ids, in the root module i pass empty zone id for public zone and that IMO is better indicator.
thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were to make the public zone records optional, then there would need to be a variable coming into terraform to cover that configuration. That same variable would be passed down to the route53 module. I would not suggest overloading the semantics of base_domain to use it to indicate whether the public zone records are off or on. In my opinion, adding an additional variable to the route53 module that explicitly configures whether the public zone records are enabled is a worthwhile trade-off for keeping route53-specific resources contained within the route53 module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done with d4ce44e
cdb93ca to
6e6d71f
Compare
Fixed here d2d9bbc
Fixed here 1a2a37f bda7b94 6e6d71f
I think it is fair to say the clusterID gived to 2 options a UUID that you can use where you absolutely want global uniquenes and HumanID where you care about uniqueness to only small level. |
6e6d71f to
e0614f2
Compare
Fair enough. If you think that it will always be completely obvious which specific kind of cluster ID is required for every parameter passed to every function, then I am satisfied. |
data/data/config.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The max length is 27, not 32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done b156efa
data/data/config.tf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/alphabumeric/alphanumeric/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done with 144ab05a5
pkg/asset/installconfig/clusterid.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is of max length maxNameLen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh Yeah :) will fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done with 9f5c0c0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather see the expected replacements written out explicitly in the test cases. This just ends up running the same code that is used in generateHumanID. If there is a bug in the replacement, this won't capture the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
79a821c to
653a9b2
Compare
653a9b2 to
a5b590d
Compare
|
looks like network operator error e2e-aws: level=info msg="Waiting up to 30m0s for the cluster to initialize..."
level=fatal msg="failed to initialize the cluster: timed out waiting for the condition" {
"apiVersion": "config.openshift.io/v1",
"kind": "ClusterOperator",
"metadata": {
"creationTimestamp": "2019-02-20T23:36:45Z",
"generation": 1,
"name": "network",
"resourceVersion": "29767",
"selfLink": "/apis/config.openshift.io/v1/clusteroperators/network",
"uid": "628e2788-3568-11e9-b802-0efd314e1e72"
},
"spec": {},
"status": {
"conditions": [
{
"lastTransitionTime": "2019-02-20T23:36:53Z",
"status": "False",
"type": "Failing"
},
{
"lastTransitionTime": "2019-02-20T23:36:53Z",
"message": "DaemonSet \"openshift-multus/multus\" is not available (awaiting 1 nodes)\nDaemonSet \"openshift-sdn/ovs\" is not available (awaiting 1 nodes)\nDaemonSet \"openshift-sdn/sdn\" is not available (awaiting 1 nodes)",
"reason": "Deploying",
"status": "True",
"type": "Progressing"
},
{
"lastTransitionTime": "2019-02-20T23:36:53Z",
"message": "DaemonSet \"openshift-multus/multus\" is not available (awaiting 1 nodes)\nDaemonSet \"openshift-sdn/ovs\" is not available (awaiting 1 nodes)\nDaemonSet \"openshift-sdn/sdn\" is not available (awaiting 1 nodes)",
"reason": "Deploying",
"status": "False",
"type": "Available"
}
],
"extension": null,
"version": ""
}
},/retest |
|
/retest |
Currently the hive uses the `.clusterID` field as source for the ID used by telemeter and cluster-version-operator. Adding the `infraID` field allows hive to continue using `clusterID` for cluster UUID purpose and `infraID` for ID used to tag/name cloud resources.
b95be7d to
64d894a
Compare
|
missed dropping variable during rebase :P |
router flaking |
staebler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abhinavdahiya, staebler The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
| provider := provider(clustername, config.Networking.MachineCIDR.String(), platform, userDataSecret) | ||
| name := fmt.Sprintf("%s-%s-%d", clustername, pool.Name, 0) | ||
| provider := provider(clusterID, config.Networking.MachineCIDR.String(), platform, userDataSecret) | ||
| name := fmt.Sprintf("%s-%s-%d", clusterID, pool.Name, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should Hive be passing when we call this function? This looks like it's still cluster ID, but should probably be infraID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
N/M found where it was passed in, we're sending infraID, var could probably use a rename though.
image-registry-operator is using the ID from clusterversion object in the cluster to tag s3 buckets [1].
```
$ AWS_PROFILE=ci aws s3api get-bucket-tagging --bucket image-registry-us-east-1-f23dc729905144e6b844ab3d76a42ed7-a306
{
"TagSet": [
{
"Key": "openshiftClusterID",
"Value": "f23dc729-9051-44e6-b844-ab3d76a42ed7"
},
{
"Key": "expirationDate",
"Value": "2019-02-23T05:39+0000"
}
]
}
```
Therefore the destroy code is leaking registry s3 buckets. Adding the `openshiftClusterID` tag to AWS metadata back to make sure we don't
leak resources.
Installer tried to move to single tag for AWS clusters here [2].
[1]: https://github.com/openshift/cluster-image-registry-operator/blob/7228534c826c92dee38d578445c93f2537f0b775/pkg/storage/s3/s3.go#L288
[2]: openshift#1280
The installer no longer sets this since openshift/installer@56f47611 (data/aws: use cluster_id to name all resources, 2019-02-19, openshift/installer#1280). Preserving the file seems fairly straightforward, so I'm just dropping the reconstruction nodes. We may be able to add something similar back once the format stabilizes again.
The installer no longer sets this since openshift/installer@56f47611 (data/aws: use cluster_id to name all resources, 2019-02-19, openshift/installer#1280).
The installer no longer sets this since openshift/installer@56f47611 (data/aws: use cluster_id to name all resources, 2019-02-19, openshift#1280). The credentials operator still sets it on users, but just leak those for now (there's work in progress to fix them).
…t.io
Mostly, this is replacing cluster-config-v1 (which is deprecated [1])
with the Infrastructure config object. As part of that, I'm dropping
UserTags support (see discussion in [2,3]). I'm also replacing the
openshiftClusterID tag with a "kubernetes.io/cluster/{infraName}:
owned" tag [4,5].
[1]: openshift/installer#680
[2]: openshift/api#231 (comment)
[3]: openshift/api#266
[4]: openshift/installer#1280
[5]: https://bugzilla.redhat.com/show_bug.cgi?id=1685089#c11
…t.io
Mostly, this is replacing cluster-config-v1 (which is deprecated [1])
with the Infrastructure config object. As part of that, I'm dropping
UserTags support (see discussion in [2,3]). I'm also replacing the
openshiftClusterID tag with a "kubernetes.io/cluster/{infraName}:
owned" tag [4,5].
UnversionedRESTClientFor (vs. RESTClientFor) avoids errors like:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1304/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-fc9948cc-b767n_cluster-image-registry-operator.log.gz | gunzip
I0423 05:43:03.599910 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g38fc8f9-dirty
...
I0423 05:43:04.942285 1 bootstrap.go:38] generating registry custom resource
E0423 05:43:04.942775 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: GroupVersion is required when initializing a RESTClient, requeuing
InClusterConfig does not set GroupVersion [6] and RESTClientFor
requires it [7] while UnversionedRESTClientFor can fill it in [8].
[1]: openshift/installer#680
[2]: openshift/api#231 (comment)
[3]: openshift/api#266
[4]: openshift/installer#1280
[5]: https://bugzilla.redhat.com/show_bug.cgi?id=1685089#c11
[6]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L422-L428
[7]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L269-L270
[8]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L331-L333
…t.io
Mostly, this is replacing cluster-config-v1 (which is deprecated [1])
with the Infrastructure config object. As part of that, I'm dropping
UserTags support (see discussion in [2,3]). I'm also replacing the
openshiftClusterID tag with a "kubernetes.io/cluster/{infraName}:
owned" tag [4,5].
The trip through clientcorev1.NewForConfig is because calling
RESTClientFor on the InClusterConfig raises:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1304/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-fc9948cc-b767n_cluster-image-registry-operator.log.gz | gunzip
I0423 05:43:03.599910 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g38fc8f9-dirty
...
I0423 05:43:04.942285 1 bootstrap.go:38] generating registry custom resource
E0423 05:43:04.942775 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: GroupVersion is required when initializing a RESTClient, requeuing
InClusterConfig does not set GroupVersion [6] and RESTClientFor
requires it [7] while UnversionedRESTClientFor can fill it in [8].
But UnversionedRESTClientFor leaves NegotiatedSerializer unset:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1305/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-66d8b56784-5wt9v_cluster-image-registry-operator.log.gz | gunzip
I0423 09:21:42.998300 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g64d5f62-dirty
...
I0423 09:21:44.345677 1 bootstrap.go:38] generating registry custom resource
E0423 09:21:44.346403 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: NegotiatedSerializer is required when initializing a RESTClient, requeuing
So I'm washing through clientcorev1.NewForConfig to get defaults for
both [9].
[1]: openshift/installer#680
[2]: openshift/api#231 (comment)
[3]: openshift/api#266
[4]: openshift/installer#1280
[5]: https://bugzilla.redhat.com/show_bug.cgi?id=1685089#c11
[6]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L422-L428
[7]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L269-L270
[8]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L331-L333
[9]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/kubernetes/typed/core/v1/core_client.go#L144-L155
…t.io
Mostly, this is replacing cluster-config-v1 (which is deprecated [1])
with the Infrastructure config object. As part of that, I'm dropping
UserTags support (see discussion in [2,3]). I'm also replacing the
openshiftClusterID tag with a "kubernetes.io/cluster/{infraName}:
owned" tag [4,5].
The trip through clientcorev1.NewForConfig is because calling
RESTClientFor on the InClusterConfig raises:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1304/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-fc9948cc-b767n_cluster-image-registry-operator.log.gz | gunzip
I0423 05:43:03.599910 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g38fc8f9-dirty
...
I0423 05:43:04.942285 1 bootstrap.go:38] generating registry custom resource
E0423 05:43:04.942775 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: GroupVersion is required when initializing a RESTClient, requeuing
InClusterConfig does not set GroupVersion [6] and RESTClientFor
requires it [7] while UnversionedRESTClientFor can fill it in [8].
But UnversionedRESTClientFor leaves NegotiatedSerializer unset:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1305/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-66d8b56784-5wt9v_cluster-image-registry-operator.log.gz | gunzip
I0423 09:21:42.998300 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g64d5f62-dirty
...
I0423 09:21:44.345677 1 bootstrap.go:38] generating registry custom resource
E0423 09:21:44.346403 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: NegotiatedSerializer is required when initializing a RESTClient, requeuing
So I'm washing through clientcorev1.NewForConfig to get defaults for
both [9].
The RBAC change avoids:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1306/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-7f59899bf6-8c8x8_cluster-image-registry-operator.log.gz | gunzip | head -n7
I0423 10:44:51.121540 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-gfa5b59b-dirty
...
I0423 10:44:52.461691 1 bootstrap.go:38] generating registry custom resource
E0423 10:44:52.464035 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: infrastructures "cluster" is forbidden: User "system:serviceaccount:openshift-image-registry:cluster-image-registry-operator" cannot get resource "infrastructures" in API group "" at the cluster scope, requeuing
[1]: openshift/installer#680
[2]: openshift/api#231 (comment)
[3]: openshift/api#266
[4]: openshift/installer#1280
[5]: https://bugzilla.redhat.com/show_bug.cgi?id=1685089#c11
[6]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L422-L428
[7]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L269-L270
[8]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L331-L333
[9]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/kubernetes/typed/core/v1/core_client.go#L144-L155
…t.io
Mostly, this is replacing cluster-config-v1 (which is deprecated [1])
with the Infrastructure config object. As part of that, I'm dropping
UserTags support (see discussion in [2,3]). I'm also replacing the
openshiftClusterID tag with a "kubernetes.io/cluster/{infraName}:
owned" tag [4,5].
The trip through clientcorev1.NewForConfig is because calling
RESTClientFor on the InClusterConfig raises:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1304/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-fc9948cc-b767n_cluster-image-registry-operator.log.gz | gunzip
I0423 05:43:03.599910 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g38fc8f9-dirty
...
I0423 05:43:04.942285 1 bootstrap.go:38] generating registry custom resource
E0423 05:43:04.942775 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: GroupVersion is required when initializing a RESTClient, requeuing
InClusterConfig does not set GroupVersion [6] and RESTClientFor
requires it [7] while UnversionedRESTClientFor can fill it in [8].
But UnversionedRESTClientFor leaves NegotiatedSerializer unset:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1305/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-66d8b56784-5wt9v_cluster-image-registry-operator.log.gz | gunzip
I0423 09:21:42.998300 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-g64d5f62-dirty
...
I0423 09:21:44.345677 1 bootstrap.go:38] generating registry custom resource
E0423 09:21:44.346403 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: NegotiatedSerializer is required when initializing a RESTClient, requeuing
So I'm washing through clientcorev1.NewForConfig to get defaults for
both [9].
The RBAC change avoids:
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/260/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws/1306/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-7f59899bf6-8c8x8_cluster-image-registry-operator.log.gz | gunzip | head -n7
I0423 10:44:51.121540 1 main.go:18] Cluster Image Registry Operator Version: 4.0.0-142-gfa5b59b-dirty
...
I0423 10:44:52.461691 1 bootstrap.go:38] generating registry custom resource
E0423 10:44:52.464035 1 controller.go:235] unable to sync: unable to get storage configuration from cluster install config: infrastructures "cluster" is forbidden: User "system:serviceaccount:openshift-image-registry:cluster-image-registry-operator" cannot get resource "infrastructures" in API group "" at the cluster scope, requeuing
[1]: openshift/installer#680
[2]: openshift/api#231 (comment)
[3]: openshift/api#266
[4]: openshift/installer#1280
[5]: https://bugzilla.redhat.com/show_bug.cgi?id=1685089#c11
[6]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L422-L428
[7]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L269-L270
[8]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/rest/config.go#L331-L333
[9]: https://github.com/kubernetes/client-go/blob/59781b88d0faa9d51be48b4ca49eb08e05246de4/kubernetes/typed/core/v1/core_client.go#L144-L155
AWS:
All the resources are named
<cluster_id>-<something>and all names use-for separator.This helps with consistently naming all the resources, and keeping the cluster_id in the reduces the probability of collisions.
All tags include
Namekey with name mathcing the name for the resource.This allows cluster to find resources using tags based on names.
Also move some of the
Nametags from aws root module into specific modules because settingNameat root level makes it difficult for overriding theNamelocally.All resources are tagged with
kubernetes.io/cluster/<cluster_id>: ownedThe loadbalancer target group names also have a max length of 32, therefore the target groups are using 4 character long suffixes.
Libvirt:
cluster_idOpenstack:
cluster_idopenshiftClusterID: cluster_idto tag all its resources.The machine-api requires that all machine objects reference back to cluster object as specified by
sigs.k8s.io/cluster-api-clusterlabelon the machine objects to make sure masters are correctly adopted. Therfore the name for cluster object is the id used to tag the resources.