daemon: Run once mode #126
|
/cc @sdemos @jlebon @kikisdeliveryservice for visibility as active fellow MCD devs. |
e03c02e to
4281ce6
|
/cc @crawford If we can pin down the format of the remote service I think we can actually write a card and execute on this. If at all possible I'd like the remote file to be the same as the ignition config ... OR the same as the CRD. The fewer things we have to parse to execute the same/similar operations the better IMHO. |
|
I would lean toward the format being an ignition config (not CRD). My reasoning for this:
If we don't already generate ignition configs for master/worker (and instead just their CRD forms) - we likely should add them as an output format of the installer asset generation phase (cc @abhinavdahiya ) |
|
@aaronlevy the MCD currently consumes MCs (machine config) which wrap an ignition config with more metadata used for defining what version the host should be. I'm alright with ignition being the file we parse if it's a |
|
That's a fair point - and would tip the scales the other direction :) Thinking a bit more - MCD doesn't really need to do anything special for master or worker profiles - we should technically have an api endpoint for all of those (either real api, or bootstrap api). So I think I was jumping incorrectly to the assumption that it would need a non-api run-once mode for those profiles. Locally, it would just need to execute the bootstrap config. So would this seem reasonable to everyone:
|
👍
That would sound reasonable to me. To reiterate to make sure I'm understanding properly, the api endpoint (full URI with scheme) would be passed to |
|
If my reiteration above is correct I think we have enough to groom a card and do this work. |
|
That sounds good to me. FWIW - naming of flags and such I have no strong feelings about |
|
OK cool. I've created a card based on this and we'll start filling functionality in soon. |
|
@sdemos When you have a few PTAL at what I have so far before I start wiring up a single process mode. |
pkg/daemon/daemon.go
does this mean it has to be an absolute path? is there a way we can support relative paths?
This does mean absolute. We could support relative ... I can add that in my next update.
either way is fine with me, but if it's just absolute we should make sure to be explicit about it, it might be a confusing error if you specify a relative file path and get told that you need to provide a file.
re: Errors, I'm wondering if we need to validate the URI before proceeding or if we think that the errors from the calls that read files/pull configs will be enough?
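One way the up-front validation discussed in this thread could look is sketched below. `validateOnceFrom` is a hypothetical helper name, not something taken from the PR, and the error wording is only an assumption about what an "explicit" message might say. It accepts an http(s) URL or an absolute path and rejects everything else before any read is attempted.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// validateOnceFrom is a hypothetical helper (not the MCD's actual code).
// It accepts an http(s) URL or an absolute file path and rejects anything
// else with a message that names the real problem.
func validateOnceFrom(onceFrom string) error {
	if onceFrom == "" {
		return fmt.Errorf("onceFrom is empty: provide an http(s) URL or an absolute file path")
	}
	// A value with a scheme is treated as a URL; only http and https pass.
	if u, err := url.Parse(onceFrom); err == nil && u.Scheme != "" {
		switch u.Scheme {
		case "http", "https":
			if u.Host == "" {
				return fmt.Errorf("onceFrom %q: URL has no host", onceFrom)
			}
			return nil
		default:
			return fmt.Errorf("onceFrom %q: unsupported scheme %q", onceFrom, u.Scheme)
		}
	}
	// Anything else is treated as a file path and must be absolute.
	if !filepath.IsAbs(onceFrom) {
		return fmt.Errorf("onceFrom %q: relative paths are not supported, use an absolute path", onceFrom)
	}
	return nil
}

func main() {
	for _, v := range []string{"https://example.com/config", "/etc/config.json", "config.json"} {
		fmt.Printf("%q -> %v\n", v, validateOnceFrom(v))
	}
}
```

Validating first means a relative path produces a message about relative paths, rather than a generic "need to provide a file" error from a later read.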
pkg/daemon/daemon.go
obviously you haven't started on this work yet, but it might be worth it to refactor the rest of the daemon a little so the main loop uses the same function internally as whatever is going to get called here so we can feel confident it's the same behavior.
It will take some serious refactoring I'm afraid. The process method utilizes a lot of external calls to get content, update status, etc., not all of which are available in a run-once scenario. But agreed. My first attempt will be to try to decouple some of the functions/methods process calls.
yeah, makes sense. plus there is the work in #130 which is going to change that logic up even further. maybe we can just work on unifying it moving forward.
@sdemos as in encapsulate the logic for runOnce on its own so as to make merging easy, and follow up later to unify them?
yeah, since it would take some significant work to fully unify them today
cmd/machine-config-daemon/start.go
docs only mention URI but it can be a straight path too (e.g. not file://)
Good catch. I'll clarify that this can be a path to a file or a URL. URI does make it sound like http://, ftp://, file://, etc. would be supported when really it's http[s]:// and /....
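The "path or http[s] URL" distinction described here can be sketched as follows. `isRemote` and `readOnceFrom` are hypothetical names used for illustration only, and the sketch uses plain standard-library calls rather than whatever client the daemon actually uses.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// isRemote reports whether the value should be fetched over HTTP(S).
// Other schemes such as file:// or ftp:// are deliberately not recognized.
func isRemote(onceFrom string) bool {
	return strings.HasPrefix(onceFrom, "http://") || strings.HasPrefix(onceFrom, "https://")
}

// readOnceFrom is a hypothetical helper: it returns the raw config bytes
// from either an http(s) URL or a local file path.
func readOnceFrom(onceFrom string) ([]byte, error) {
	if isRemote(onceFrom) {
		resp, err := http.Get(onceFrom)
		if err != nil {
			return nil, fmt.Errorf("fetching %s: %w", onceFrom, err)
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("fetching %s: unexpected status %s", onceFrom, resp.Status)
		}
		return io.ReadAll(resp.Body)
	}
	return os.ReadFile(onceFrom)
}

func main() {
	// Write a throwaway file so the local-path branch can be exercised.
	f, err := os.CreateTemp("", "oncefrom-*.json")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString(`{"example": true}`); err != nil {
		panic(err)
	}
	f.Close()

	data, err := readOnceFrom(f.Name())
	fmt.Println(string(data), err)
}
```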
pkg/daemon/daemon.go
Maybe we should add these ioutils to fsclient.go?:)
Will do.
cmd/machine-config-daemon/start.go
minor typo: "its"
|
(not starting a retest yet as this requires another PR which hasn't merged yet due to yesterday's bot outage. It's being re-reviewed now) |
|
/retest |
|
Will rebase with #134 later today. |
|
Seems like another flake? :( |
We are not seeing this specific error anywhere else yet:
NAME                          STATUS                     ROLES    AGE   VERSION
ip-10-0-12-35.ec2.internal    Ready,SchedulingDisabled   master   1h    v1.11.0+d4cacc0
ip-10-0-137-214.ec2.internal  Ready,SchedulingDisabled   worker   1h    v1.11.0+d4cacc0
ip-10-0-145-104.ec2.internal  Ready,SchedulingDisabled   worker   1h    v1.11.0+d4cacc0
ip-10-0-175-235.ec2.internal  Ready,SchedulingDisabled   worker   1h    v1.11.0+d4cacc0
ip-10-0-23-254.ec2.internal   Ready,SchedulingDisabled   master   1h    v1.11.0+d4cacc0
ip-10-0-44-102.ec2.internal   Ready,SchedulingDisabled   master   1h    v1.11.0+d4cacc0
Waiting for router to be created ...
That makes it seem like it is not a flake. |
|
Ah interesting. Thanks @abhinavdahiya |
|
Agreed, I don't think this is a flake. It does look like a flake we hit previously BUT @yuqi-zhang and I have found cases where nodes are set degraded when they shouldn't be. We're working on this now. |
Signed-off-by: Steve Milner <smilner@redhat.com>
- prepUpdateFromCluster and executeUpdateFromCluster* pulled out of handleNodeUpdate for reuse
- triggerUpdateWithMachineConfig added for triggering with a provided desired config
- triggerUpdate forwards to triggerUpdateWithMachineConfig(nil)
- executeUpdateFromClusterWithMachineConfig added for updating with a provided desired config
- executeUpdateFromCluster forwards to executeUpdateFromClusterWithMachineConfig(nil)
Signed-off-by: Steve Milner <smilner@redhat.com>
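The forwarding pattern in this commit message, where the cluster-driven entry point calls the "WithMachineConfig" variant with nil, might look roughly like the sketch below. Only the two method names come from the commit message. The `MachineConfig` and `Daemon` types, the `fetchDesired` hook and the `applied` slice are stand-ins invented for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// MachineConfig is a stand-in type for this sketch only.
type MachineConfig struct{ Name string }

// Daemon is a stand-in; fetchDesired is a hypothetical hook that
// represents looking up the desired config from the cluster.
type Daemon struct {
	fetchDesired func() (*MachineConfig, error)
	applied      []string // records which configs were applied, for demonstration
}

// triggerUpdate keeps the cluster-driven behaviour by forwarding nil.
func (dn *Daemon) triggerUpdate() error {
	return dn.triggerUpdateWithMachineConfig(nil)
}

// triggerUpdateWithMachineConfig applies the provided desired config.
// A nil config means "ask the cluster for it".
func (dn *Daemon) triggerUpdateWithMachineConfig(desired *MachineConfig) error {
	if desired == nil {
		var err error
		desired, err = dn.fetchDesired()
		if err != nil {
			return fmt.Errorf("fetching desired config: %w", err)
		}
		if desired == nil {
			return errors.New("cluster returned no desired config")
		}
	}
	dn.applied = append(dn.applied, desired.Name)
	return nil
}

func main() {
	dn := &Daemon{fetchDesired: func() (*MachineConfig, error) {
		return &MachineConfig{Name: "from-cluster"}, nil
	}}
	_ = dn.triggerUpdate()
	_ = dn.triggerUpdateWithMachineConfig(&MachineConfig{Name: "provided"})
	fmt.Println(dn.applied)
}
```

Keeping a single implementation behind both entry points means the run-once path and the cluster path share the same update logic and differ only in where the desired config comes from.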
- New: Base instance that works without the cluster. Used in NewClusterDrivenDaemon.
- NewClusterDrivenDaemon: Builds on top of New. Works with cluster resources.
Signed-off-by: Steve Milner <smilner@redhat.com>
Split out the informers creation/start into StartInformer. Idea from @yuqi-zhang. Signed-off-by: Steve Milner <smilner@redhat.com>
When we are in runOnce mode AND the previous MachineConfig does not have a Kind we can assume that there was no previous config to check against. Signed-off-by: Steve Milner <smilner@redhat.com>
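The rule in this commit message can be written as a small predicate. This is a sketch under assumptions: `MachineConfig` is reduced to a stand-in struct with only a `Kind` field, and `hasPreviousConfig` is a hypothetical name, not a function from the PR.

```go
package main

import "fmt"

// MachineConfig is a stand-in carrying only the field this check needs.
type MachineConfig struct {
	Kind string
}

// hasPreviousConfig is a hypothetical helper. In run-once mode a previous
// config with an empty Kind is taken to mean "nothing was applied before",
// so there is nothing to compare the new config against.
func hasPreviousConfig(runOnce bool, prev MachineConfig) bool {
	if runOnce && prev.Kind == "" {
		return false
	}
	return true
}

func main() {
	fmt.Println(hasPreviousConfig(true, MachineConfig{}))
	fmt.Println(hasPreviousConfig(true, MachineConfig{Kind: "MachineConfig"}))
	fmt.Println(hasPreviousConfig(false, MachineConfig{}))
}
```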
Remove StartInformer function, as the creation and start must follow the creation - chroot - check state - start workflow. Modify ClientBuilder creation to use old workflow as well. Signed-off-by: Yu Qi Zhang <jerzhang@redhat.com>
|
/hold cancel
|
func (dn *Daemon) Run(stop <-chan struct{}) error {
	// Catch quickly if we've been asked to run once.
	if dn.onceFrom != "" {
		glog.V(2).Info("Running once per request")
Is this logging clear enough in the flow of the logs? Would "daemon running once per request" be clearer? Thoughts?
I like your wording better. How about I rework some of the log strings in a follow up?
Sounds great!
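For readers without the full diff, the early-exit shape of `Run` under review in this thread can be sketched as below. Only the `onceFrom` check comes from the quoted snippet. The `runOnce` and `loop` fields are hypothetical stand-ins for the two code paths, the standard `log` package replaces glog to keep the sketch self-contained, and returning straight after the single run is an assumption about the rest of the function.

```go
package main

import (
	"fmt"
	"log"
)

// Daemon is a stand-in with just enough state to show the control flow.
type Daemon struct {
	onceFrom string
	runOnce  func(from string) error         // hypothetical single-run path
	loop     func(stop <-chan struct{}) error // hypothetical cluster-driven loop
}

// Run applies a config once and returns when onceFrom is set;
// otherwise it hands control to the long-running loop.
func (dn *Daemon) Run(stop <-chan struct{}) error {
	// Catch quickly if we've been asked to run once.
	if dn.onceFrom != "" {
		log.Print("daemon running once per request")
		return dn.runOnce(dn.onceFrom)
	}
	return dn.loop(stop)
}

func main() {
	dn := &Daemon{
		onceFrom: "/etc/machine-config.json",
		runOnce:  func(from string) error { fmt.Println("applying", from); return nil },
		loop:     func(<-chan struct{}) error { fmt.Println("looping"); return nil },
	}
	if err := dn.Run(nil); err != nil {
		log.Fatal(err)
	}
}
```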
|
Also, thanks for adding comments in this PR @ashcrow , it makes it very easy to read through! 👍 |
|
Some manual testing passes for me both for cluster operation and runOnce with qemu. Will LGTM after logs are cleaned up. Thanks for the work @ashcrow ! |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ashcrow, sdodson, yuqi-zhang
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Thanks @yuqi-zhang and @kikisdeliveryservice! I'll do the log clean up in another PR post merge. |
Non RHCOS nodes will need to apply an MC once and exit.
Requires: #139
/cc @aaronlevy @dustymabe @sdodson
Still todo:
- onceFrom to request resources from the cluster
- onceFrom to be able to run without making requests for state from the cluster