-
Notifications
You must be signed in to change notification settings - Fork 531
Cluster profile: add details about the default profile #414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
enhancements/cvo/cluster-profiles.md
Outdated
| as needed. | ||
|
|
||
| Manifests with no explicit inclusion annotations implicitly belong to the `default` profile. | ||
| If no profile is configured, the cluster-version operator should use the `default` profile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking this over some more, we may need a more involved reword. Above in master we have:
This would make the CVO render this manifest only when
CLUSTER_PROFILE=[identifier]
has been specified.
But that's not actually true, because as the next master paragraph points out, you can list multiple profiles. I suggest we reword this section to say something like:
This environment variable would have to be specified in the CVO deployment.
When noCLUSTER_PROFILE=[identifier]variable is specified, thedefaultcluster profile
is in effect. [NOTE: so I guess no need to echo this with a new line]The following annotation may be used to include manifests for a given profile:
include.release.openshift.io/[identifier]=trueManifests may support inclusion in multiple profiles by including as many of these annotations
as needed.
Manifests with any explicit inclusion annotations only belong to the declared profiles.
Manifests with no explicit inclusion annotations implicitly belong to thedefaultprofile.The cluster-version operator will only reconcile manifests which belong to its configured cluster profile.
Also note that the implicit default here somewhat pushes back against @deads2k's point about avoiding exclusion. It means that you don't need to put in the awareness-raising work he wants for the default profile. But it also means that we don't break compat with the existing default-only profile out of the gate. I'm fine with that. I'd also be ok with "too bad, you need to get everyone to annotate their existing manifests with include.release.openshift.io/[identifier]=true before teaching the CVO anything", if folks feel like the implicit default membership is too magical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that you don't need to put in the awareness-raising work he wants for the default profile. But it also means that we don't break compat with the existing default-only profile out of the gate.
I'd favour first getting the 'magic' default behaviour in CVO, then add 'default' awareness to manifests, and finally remove this magic behaviour when everyone has switched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we specify instead that "Manifests with no explicit inclusion annotations implicitly belong to all profiles."
This would mean that all existing manifests, as well as all newly introduced manifests, are included in all profiles, unless someone specifically adds annotations to indicate otherwise. If an annotation is added to a manifest, then annotations will be needed for all profiles in which that manifest should be included.
This approach will help avoid accidental divergence between what functionality is included in different cluster profiles. Any new manifest will be included in all profiles, unless it has annotations for specific profiles.
It will also mean that a new profile (such as a single-node-cluster profile for CRC, for which we are working on an enhancement PR) can be added and then modified incrementally, rather than requiring immediate PRs to be merged into every repo containing any manifests that should be included in that profile.
My assumption here is that many or most manifests, at least initially, will not need to vary between the small set of profiles that exist, and thus will not require any annotations. But if that assumption proves invalid over time, once we've added explicit annotations to every manifest, we could then change the CVO to require an explicit annotation for any manifest to be included in any profile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would mean that all existing manifests, as well as all newly introduced manifests, are included in all profiles, unless someone specifically adds annotations to indicate otherwise... This approach will help avoid accidental divergence between what functionality is included in different cluster profiles
That diverges from David's:
To the common counter argument of "but then it takes a lot of work to create a new profile and all the teams have to aware of it", I say "good!"...
If folks want to push back against all (but the default?) profile being explicitly opt-in for each manifest, I'm not going to stand in the way, but please go fight it out with @deads2k and report back here once you've reached a consensus ;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not looking for any fights - just for clarity on what we'll need to do to add and maintain the single-node profile for CRC, for which we are working on a separate enhancement PR. This profile will ideally remain identical to normal OCP deployments, except without HA, with a few features not important for application development left out, and with things tuned to minimize resource requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deads2k 's previous comment seemed to call for explicitly talking new additions into joining each profile. That's obviously more work, but his point is that thinking each manifest addition through is useful. Including by default in the absence of explicit exclusion-action is also internally consistent. CVO can easily implement either. I want some kind of consensus, so we don't flop back and forth depending on who asked most recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still like my reasoning in #200 (comment). Having to modify every manifest you want to include in a profile does a couple things
- Requires a cost to adding a profile. This cost is updating a few hundred file in a few dozen repos. While this seems large initially, the long term cost born by teams for each new topology is a much larger ongoing effort. If it's not worth the few days to find and update, then the profile may not be valuable enough to support.
- This exposes the profile to every component which can be impacted. And because the repos are separate, each component owner will be explicitly aware and cannot claim ignorance in the future. Component owners should question the change and ensure their component/operator can handle it.
To help provide an example and to be sure we can avoid accidental code defaults, I'm in support of actually updating all the existing manifests to include a reference to the default. I'm even willing to help make those PRs and get them merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with @deads2k's reasoning on this, as long as getting patches merged to add annotations to manifests across all needed repositories does not turn out to be an insurmountable obstacle. We plan to have an enhancement PR for adding a single node profile for CRC in review soon. This will show that most manifests in OpenShift will need to be included in this profile, many with no profile-specific changes. I'd expect we'd need to merge a complete set of initial annotations early in development in order to allow the profile to be used to produce a cluster equivalent to the current CRC. Then additional PRs would be merged to further optimize the profile for CRC. Does this seem workable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deads2k : #482 reminded me that this might still be in flight, and seems like rhbz#1871890 still links a bunch of open PRs, mostly in the kube-API sphere of influence. And seems like maybe there aren't even PRs up yet for many components outside the kube-API sphere of influence. Any preliminary thoughts on whether this is something that can be accomplished with a single driving dev, vs. something where you need to call for teams to self-serve in aos-devel@, and then run around and poke folks who don't act on that call?
|
prefer names that map to responsibility matrix. instead of default we use self-managed-ha |
|
@deads2k 👍 the names sound good to me |
|
And yes, agree that all manifests should be modified |
I dislike acronyms, so I would prefer
I'm ok with this. I guess the concern would be that at some point in the future we might have a profile where something else is managed externally, and might want to guard against this with $ oc adm release extract --to manifests quay.io/openshift-release-dev/ocp-release:4.6.0-fc.1-x86_64
$ grep -lr 'exclude.release.openshift.io/internal-openshift-hosted: "true"' manifests | sortlists a bunch of things that are hard for me to fit under one name. For example, the etcd and kube operators fall under "control plane", but we also exclude the config, insights, autoscaling, and machine-API operators, and those aren't things I'd naively expect to fall under "control plane" naming. |
ok
as you found below, it's more than just the control plane |
But it's not everything. Is there a word or phrase that clearly reflects what is still managed? |
|
@wking not sure I could give you one word, but in general for hosted clusters, we are running components that generally do not need to run on master nodes. Components that should be excluded includes machine api and related components, since those need to be running on masters so they can provision workers. There's some exceptions to that specific to IBM's use case such as the oauth server and the insights operator. The config operator so far seems to manage cloud-specific config stuff which is also not needed for IBM, but not sure it shouldn't be included if we had a more general use case for master-node-less clusters. As painful as it sounds, it may make more sense to have a profile specifically for IBM Cloud, and in the future if we have a more general use case for it, add a new profile for that other use case. @derekwaynecarr thoughts? |
|
since this PR is oriented around naming the existing self-managed-highly-available profile that is our present default, I think we can avoid bike-shedding too much on the name of the other variations. |
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
| no `CLUSTER_PROFILE=[identifier]` variable is specified, the `default` cluster profile | ||
| is in effect. | ||
| no `CLUSTER_PROFILE=[identifier]` variable is specified, the `self-managed-highly-available` | ||
| cluster profile is in effect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self-managed-highly-available seems like a bad name for the "default" profile unless "who manages it" and "whether it is highly available" are the only axes on which future profiles are allowed to differ from the default.
Like, if you weren't peeking ahead at future plans, why would you call out self-managed and highly-available as the defining attributes of the default OCP experience, but not mention live-upgradeable or rhcos-control-plane or various other things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the known axes, e.g. ROKS moves some control plane components off-cluster (so not self-managed) and CRC is dev-oriented, low resource (so not HA). If we invent more axes later, maybe we'll wish we had been more specific, but 🤷🔮
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we invent more axes later, maybe we'll wish we had been more specific
Yeah, I meant we should be less specific; just call it default, rather than trying to guess now all of the ways it's going to be different from other future profiles.
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
This is matches openshift/enhancements#414 and doesn't change existing behavior
|
Everything here has been captured in others PRs. |
This is matches openshift/enhancements#414 and doesn't change existing behavior
This PR better defines the default profile. Previously, there could be a misunderstanding between no-profile, empty-profile and default profile.
More context: openshift/cluster-version-operator#404 (comment)
cc @wking @csrwng