-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add APIs for machine phases #531
Add APIs for machine phases #531
Conversation
pkg/apis/cluster/common/consts.go
Outdated
// MachineRunning should be set when machine has joined the cluster and running successfully. | ||
MachineRunning MachinePhase = "Running" | ||
|
||
// MachineRunning should be set when machine is being deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace MachineRunning with MachineTerminating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks for pointing out.
e3a39cf
to
f4c713c
Compare
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit from me
pkg/apis/cluster/common/consts.go
Outdated
) | ||
|
||
// MachineState is the current status of the last performed operation. | ||
type MachineState string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: MachineOperationState
would be a better name as it represents the state of the operation, not of the machine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense, thanks.
Updating the occurrences wherever needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
5f36ff3
to
7426220
Compare
Background for the failing tests: The timeout for the test controlplane is too low, so it often fails to start: This already happened multiple times on different PRs. For the time being the workaround is to hit I've created a PR in kubebuilder to get this configurable so we can increase the timeout: kubernetes-sigs/controller-runtime#169 It will however take some time until we can use that, because even when that PR is merged we have to wait for the next release. |
/retest |
thanks, @alvaroaleman for the information, I will then retry after some time. |
/retest |
@hardikdr Actually the failures here a real, check out the logs |
7426220
to
d58060c
Compare
@alvaroaleman Yes, I just corrected it and test-check seems to be passing. Thanks for the help. |
/lgtm |
@hardikdr Can you quickly explain what this allows us to do what we can currently not do? Because:
Is there anything I missed in the above list? Is there anything I got wrong in the above list? |
Yes sure. 1. Expressiveness:
It seems it will be great to express all three scenarios in a structured fashion.
2. Consumption:
To answer the question:
Though machine-shared-controller and machine-external-controller would still require a contract to update the machineStatus, but that's a separate thread. Turned out to be a long one, but I hope it makes sense :) Feedback is most welcome. |
Thanks for the detailed answer, @hardikdr !
We can derive all this information from
I don't think it should be the concern of the
Can you explain why this is needed? For users we can use events, for other controllers do you have an example use-case where the information available via the
Basically the same question as above, do you have a sample of a controller who needs this? The |
thanks for the comments @alvaroaleman .
Yes, NodeRef is essential and will provide much info, but I am not sure if there is a way to understand if creation/deletion or other operation has failed or succeeded on a particular machine- via NodeRef's existence. Deriving certain possibilities only based on the availability of NodeRef or pointer to wrong machines- seems less desirable.
I would rather expect machine-controller should mostly be resposible for creation/deletion of the machines and reporting right health-status via MachineStatus field. MachineSet controller should then look for Failed machines and try to replace them. This is more from the perspective that MachienSet controller should only make sure to have # of healthy machines and take necessary steps - and not participate in race-condition with MachineController while recreation of machines.
Fundamentally, NodeRef is bound by the possible values available in the NodeObject. We would defininitely want to expand our usecases to include new phases such as Draining/Standby[for barementals] and more. |
Others have suggested using Events instead. Can events be lost however, if a controller reboots for example? From a user perspective I am not sure there is a more reliable way to answer these questions. Is the machine still upgrading, is it being deleted, etc? This is valuable for user feedback within a GUI. |
As discussed during the meeting today, we will merge this in ~24 hours unless there are objections before then. /approve |
/retest |
@chaosaffe thanks for the suggestions. All of them looks good to me. I also made the necessary changes. |
/lgtm |
// specific machine. It should also convey the state of the latest-operation for example if | ||
// it is still on-going, failed or completed successfully. | ||
// +optional | ||
LastOperation LastOperation `json:"lastOperation,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make this a pointer as its an optional struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes sure, done.
45812f1
to
1e19979
Compare
/lgtm |
/retest |
It looks like we've pretty much reached consensus on this PR. Let's make sure there are no further comments or objections at the next meeting and then merge it if all looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor naming suggestion, otherwise lgtm
properties: | ||
description: | ||
type: string | ||
lastUpdateTime: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can we change the name lastUpdateTime
to lastUpdated
just to be consistent with the similar fields in other places and objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks.
Co-Authored-By: hardikdr <[email protected]>
1e19979
to
a6c4f92
Compare
/LGTM |
@sidharthsurana: changing LGTM is restricted to assignees, and only kubernetes-sigs/cluster-api repo collaborators may be assigned issues. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm
…On Wed, Nov 28, 2018, 19:51 k8s-ci-robot, ***@***.***> wrote:
@sidharthsurana <https://github.com/sidharthsurana>: changing LGTM is
restricted to assignees, and only kubernetes-sigs/cluster-api repo
collaborators may be assigned issues.
In response to this
<#531 (comment)>
:
/LGTM
Instructions for interacting with me using PR comments are available here
<https://git.k8s.io/community/contributors/guide/pull-requests.md>. If
you have questions or suggestions related to my behavior, please file an
issue against the kubernetes/test-infra
<https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:>
repository.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#531 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/APFttC5E71J0j4kN-21iFMb5B1jsAj8wks5uztswgaJpZM4XVdFZ>
.
|
/approve |
/hold cancel |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hardikdr, roberthbailey The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I apologize for the late question, but is there a write-up that summarizes a response from Daniel's feedback here: similar to him, I worry about the use of a single phase based on experience with pods. |
@hardikdr -- can you answer @derekwaynecarr's question? I know that you had a chance to connect with @lavalamp at KubeCon and chat about phases / conditions / etc. |
…doc_update note about capv master
What this PR does / why we need it: This PR adds the necessary machines-phases and states in the machine-api stack. Machine phases and states are a way of representing the lifecycle of the machines. PR is based on the following proposal doc: proposal
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: