Skip to content

Conversation

@cgwalters
Copy link
Member

See #765 (comment)

MachineConfigPool needs a Spec.Configuration and Status.Configuration
[just like other objects][1] so that we can properly detect state.
Currently there's a race because the render controler may set Status.Configuration
while the pool's Status still has Updated, so one can't check whether the
pool is at a given config.

[1] https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status)

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 16, 2019
@cgwalters
Copy link
Member Author

This doesn't pass units and is definitely broken, just putting up this WIP for early review/tracking.

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 16, 2019
@derekwaynecarr
Copy link
Member

be careful you dont break backwards compatibility with 4.1 and 4.2 clients.

@cgwalters
Copy link
Member Author

be careful you dont break backwards compatibility with 4.1 and 4.2 clients.

Hmm...are there any other clients? Maybe the console...I should probably try that out at some point.

@cgwalters
Copy link
Member Author

OK so the console definitely renders MachineConfigs and MachineConfigPools...but it's super primitive. And hum, looking at the console code it's TypeScript/React, interesting. I don't think what we're doing here would break it, conceptually we're just extending the schema.

@cgwalters cgwalters force-pushed the mcp-spec-status branch 4 times, most recently from 9358d23 to 28c04c2 Compare May 21, 2019 22:05
See openshift#765 (comment)

MachineConfigPool needs a `Spec.Configuration` and `Status.Configuration`
[just like other objects][1] so that we can properly detect state.
Currently there's a race because the render controller may set `Status.Configuration`
while the pool's `Status` still has `Updated`, so one can't reliably check whether the
pool is at a given config.

With this, ownership is clear: the render controller sets the spec, and the node controller
updates the status.

[1] https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status)
@cgwalters cgwalters changed the title WIP: Add a Spec.Configuration to MachineConfigPool Add a Spec.Configuration to MachineConfigPool May 22, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 22, 2019
@cgwalters
Copy link
Member Author

OK, did some more fixes and am lifting WIP on this one. Been playing with it interactively now and it is looking good, but let's see what e2e tests say!

@cgwalters
Copy link
Member Author

Hooray, e2e passed! 🎉

machineCount := int32(len(nodes))

updatedMachines := getUpdatedMachines(pool.Status.Configuration.Name, nodes)
updatedMachines := getUpdatedMachines(pool.Spec.Configuration.Name, nodes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the Spec of an object describes the desired state, shouldn't this routine use Status? we're interested in nodes at a given state (not the desired) - I may be missing something and re-reading https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. We're trying to calculate the Updated/Updating conditions here, so we need to know how many nodes are at our desired state.

I'll admit to not being steeped in the art here - can you think of anything else similar to this?

But, it would feel very strange if these values followed Status - the updatedMachineCount would go down as we updated...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit to not being steeped in the art here - can you think of anything else similar to this?

the template controller is the only one I can think of - it uses Status to tell the render ctrl "I'm done"

but I believe you're right and we're just calculating where we are wrt desired state here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to think of this is; we probably should never have had Configuration.Name as part of Status, it should have always been in Spec.

Having parts of Status should be derived from other Status fields seems wrong.

In effect, the Status.Configuration.Name is more like what the CVO is doing with its history field.

Yet another argument; Status.Configuration is almost meaningless given that multiple configurations could have been rolled out. If any API client wants to know the state of the system, there's really either "fully updated" or "updating". Any more detail on "updating" really requires looking at each node for its currentConfig. The Status.Configuration isn't so useful for that.

@runcom
Copy link
Member

runcom commented May 27, 2019

just a question, LGTM otherwise

/approve

@runcom
Copy link
Member

runcom commented May 28, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 28, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 5121620 into openshift:master May 28, 2019
@runcom
Copy link
Member

runcom commented Jun 14, 2019

/cherrypick release-4.1

@openshift-cherrypick-robot

@runcom: #773 failed to apply on top of branch "release-4.1":

Using index info to reconstruct a base tree...
M	pkg/apis/machineconfiguration.openshift.io/v1/types.go
M	pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
M	pkg/controller/kubelet-config/kubelet_config_controller_test.go
M	pkg/controller/node/node_controller.go
M	pkg/controller/node/node_controller_test.go
M	pkg/controller/node/status.go
M	pkg/controller/render/render_controller_test.go
M	pkg/operator/status.go
M	test/e2e/mcd_test.go
Falling back to patching base and 3-way merge...
Auto-merging test/e2e/mcd_test.go
CONFLICT (content): Merge conflict in test/e2e/mcd_test.go
Auto-merging pkg/operator/status.go
Auto-merging pkg/controller/render/render_controller_test.go
Auto-merging pkg/controller/node/status.go
Auto-merging pkg/controller/node/node_controller_test.go
Auto-merging pkg/controller/node/node_controller.go
Auto-merging pkg/controller/kubelet-config/kubelet_config_controller_test.go
Auto-merging pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
Auto-merging pkg/apis/machineconfiguration.openshift.io/v1/types.go
error: Failed to merge in the changes.
Patch failed at 0001 Add Spec.Configuration to MachineConfigPool, render controller writes it

Details

In response to this:

/cherrypick release-4.1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@runcom
Copy link
Member

runcom commented Jun 14, 2019

oh gosh

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request Jun 20, 2019
Missed this as part of openshift#773
(Doesn't matter yet, just prep for any future changes)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants