@cgwalters
Member

This is part of the "enabling OSImageURL merges sanely" work.
Still testing this.

@openshift-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Dec 11, 2018
@openshift-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) on Dec 11, 2018
@ashcrow
Member

ashcrow commented Dec 11, 2018

Makes sense at first glance.

/assign @ashcrow

@abhinavdahiya
Contributor

@cgwalters
This looks like "first non-empty"?
Also, please update the docs here.

@jlebon
Member

jlebon commented Dec 11, 2018

Looks sane to me. (Though yeah, this does look like "first non-empty").

@ashcrow
Member

ashcrow commented Dec 11, 2018

/test e2e-aws

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cgwalters
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: ashcrow

If they are not already assigned, you can assign the PR to them by writing /assign @ashcrow in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cgwalters
Member Author

Rebased and some fixups landed, but I still haven't really tested this in anger since master installer is broken.

@abhinavdahiya
Contributor

Case: a user has added a MachineConfig with OSImageURL. Any new updates to the cluster then do not affect the OS image, because the user's value overrides it, leaving the OS image stale.
Is this acceptable?

/cc @crawford

@crawford
Contributor

We're going to need to allow the OS to be overridden, but we should make a stink when it happens. Users will inevitably find this and blow their foot off six months down the road when their cluster updates but the OS doesn't. What is the best way to surface that information? Can we propagate it up via the ClusterOperator CR?
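
A minimal Go sketch of how such an override might be detected, assuming "first non-empty wins" merging. `MachineConfig` here is a simplified stand-in for the real API type, and `overrideWarning` and `releaseDefault` are hypothetical names; how the message would reach the ClusterOperator CR is left open.

```go
package main

import "fmt"

// MachineConfig is a simplified, hypothetical stand-in for the real API type.
type MachineConfig struct {
	Name       string
	OSImageURL string
}

// overrideWarning returns a non-empty message when the effective osImageURL
// (the first non-empty one in configs) differs from releaseDefault, the image
// the cluster release would otherwise use. Both names are assumptions.
func overrideWarning(configs []MachineConfig, releaseDefault string) string {
	for _, mc := range configs {
		if mc.OSImageURL == "" {
			continue // this config has no opinion about the OS image
		}
		if mc.OSImageURL != releaseDefault {
			return fmt.Sprintf("MachineConfig %q overrides osImageURL; the OS will not follow cluster updates", mc.Name)
		}
		return "" // the effective image is the release default
	}
	return "" // no config sets an image at all
}

func main() {
	configs := []MachineConfig{
		{Name: "00-worker"},
		{Name: "05-walters-osimageurl", OSImageURL: "registry.example.com/os@sha256:aaaa"},
	}
	fmt.Println(overrideWarning(configs, "registry.example.com/os@sha256:bbbb"))
}
```

Returning a message rather than an error keeps the override allowed while still giving the operator something to report.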

@cgwalters
Member Author

FWIW I am testing this out via oc create -f of:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 05-walters-osimageurl
spec:
  osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4

@cgwalters
Member Author

OK here's my reboot loop that's confusing me now:

$ oc logs -f pods/machine-config-daemon-smhmf -p
I1214 21:59:26.227709    6118 start.go:51] Version: 3.11.0-355-ged663c2c
I1214 21:59:26.228638    6118 start.go:88] starting node writer
I1214 21:59:26.236246    6118 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1214 21:59:26.359659    6118 daemon.go:115] Booted osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4 (47.208)
I1214 21:59:26.384986    6118 start.go:139] Calling chroot("/rootfs")
I1214 21:59:26.488185    6118 update.go:38] Checking reconcilable for config worker-ccc10e5587a977107fdc3f930b017665 to worker-11a7ff532612deb6c24c475126ceb99a
I1214 21:59:26.488352    6118 update.go:195] Updating files
I1214 21:59:26.488457    6118 update.go:383] Writing file "/etc/containers/registries.conf"
I1214 21:59:26.493520    6118 update.go:383] Writing file "/etc/sysconfig/crio-network"
I1214 21:59:26.498424    6118 update.go:383] Writing file "/var/lib/kubelet/config.json"
I1214 21:59:26.503359    6118 update.go:383] Writing file "/etc/kubernetes/ca.crt"
I1214 21:59:26.508207    6118 update.go:383] Writing file "/etc/sysctl.d/forward.conf"
I1214 21:59:26.513706    6118 update.go:383] Writing file "/etc/kubernetes/kubelet.conf"
I1214 21:59:26.521153    6118 update.go:317] Writing systemd unit "kubelet.service"
I1214 21:59:26.521650    6118 update.go:355] Enabling systemd unit "kubelet.service"
I1214 21:59:26.521895    6118 update.go:264] /etc/systemd/system/multi-user.target.wants/kubelet.service already exists. Not making a new symlink
I1214 21:59:26.521970    6118 update.go:216] Deleting stale data
I1214 21:59:26.522038    6118 update.go:486] Updating OS to registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4
I1214 21:59:26.522089    6118 run.go:13] Running: /bin/pivot registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4
pivot version 0.0.2
I1214 21:59:26.538100    6260 run.go:27] Running: rpm-ostree status --json
I1214 21:59:26.648624    6260 root.go:79] Previous pivot: registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4
I1214 21:59:26.648654    6260 run.go:27] Running: skopeo inspect docker://registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4
I1214 21:59:28.464899    6260 root.go:89] Resolved to: registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4
I1214 21:59:28.465294    6260 root.go:92] Already at target pivot; exiting...
E1214 21:59:28.466575    6118 event.go:259] Could not construct reference to: '&v1.Node{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"osiris-worker-0-6fxwf", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.NodeSpec{PodCIDR:"", ProviderID:"", Unschedulable:false, Taints:[]v1.Taint(nil), ConfigSource:(*v1.NodeConfigSource)(nil), DoNotUse_ExternalID:""}, Status:v1.NodeStatus{Capacity:v1.ResourceList(nil), Allocatable:v1.ResourceList(nil), Phase:"", Conditions:[]v1.NodeCondition(nil), Addresses:[]v1.NodeAddress(nil), DaemonEndpoints:v1.NodeDaemonEndpoints{KubeletEndpoint:v1.DaemonEndpoint{Port:0}}, NodeInfo:v1.NodeSystemInfo{MachineID:"", SystemUUID:"", BootID:"", KernelVersion:"", OSImage:"", ContainerRuntimeVersion:"", KubeletVersion:"", KubeProxyVersion:"", OperatingSystem:"", Architecture:""}, Images:[]v1.ContainerImage(nil), VolumesInUse:[]v1.UniqueVolumeName(nil), VolumesAttached:[]v1.AttachedVolume(nil), Config:(*v1.NodeConfigStatus)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'Reboot' 'Node will reboot into config worker-11a7ff532612deb6c24c475126ceb99a'
I1214 21:59:28.466763    6118 update.go:493] machine-config-daemon initiating reboot: Node will reboot into config worker-11a7ff532612deb6c24c475126ceb99a
$ oc get -o yaml machineconfigs/worker-{ccc10e5587a977107fdc3f930b017665,11a7ff532612deb6c24c475126ceb99a} | grep osImageURL
    osImageURL: ""
    osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4

@cgwalters
Member Author

cgwalters commented Dec 14, 2018

It feels like isDesiredMachineState() must be going wrong. Add more logging there?
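
For illustration, a hedged Go sketch of the kind of logged comparison being suggested. It does not reproduce the daemon's actual isDesiredMachineState(); `needsOSUpdate` is a hypothetical helper, and treating an empty desired URL as "no opinion" is an assumption.

```go
package main

import "log"

// needsOSUpdate reports whether the node would have to pivot to reach
// desiredURL, logging the reason for each outcome. Hypothetical helper,
// not the machine-config-daemon's real code.
func needsOSUpdate(bootedURL, desiredURL string) bool {
	if desiredURL == "" {
		// Assumption: an empty desired osImageURL means "keep what is booted".
		log.Printf("desired osImageURL is empty; keeping booted %q", bootedURL)
		return false
	}
	if bootedURL == desiredURL {
		log.Printf("booted osImageURL already matches desired %q", desiredURL)
		return false
	}
	log.Printf("osImageURL mismatch: booted %q, desired %q", bootedURL, desiredURL)
	return true
}

func main() {
	url := "registry.svc.ci.openshift.org/rhcos/maipo@sha256:36aba8329b7fe0c096aea169f6d0a7bef22d9bfec284337181efbbbd29cf00e4"
	// Booted and desired are identical, as in the log above.
	log.Println("needs OS update:", needsOSUpdate(url, url))
}
```

Logging both values at the decision point would show whether the OS image comparison or some other part of the state check is triggering the reboot.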

@kikisdeliveryservice
Contributor

kikisdeliveryservice commented Dec 14, 2018

@cgwalters I think I ran into this exact problem a few days ago!!

The only logs I still have are the following, but I was getting errors even though the currentConfig was definitely NOT the desiredConfig.

I1211 19:00:24.674303    6026 run.go:27] Running: skopeo inspect docker://registry.svc.ci.openshift.org/rhcos/maipo@sha256:2fc5b08e9120637efc439dae6e8c23a532d00a08f50563c5f8ab062b8be9791f

I1211 19:00:27.555196    6026 root.go:89] Resolved to: registry.svc.ci.openshift.org/rhcos/maipo@sha256:2fc5b08e9120637efc439dae6e8c23a532d00a08f50563c5f8ab062b8be9791f

I1211 19:00:27.555744    6026 root.go:92] Already at target pivot; exiting...

E1211 19:00:27.556345    5536 event.go:259] Could not construct reference to: '&v1.Node{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test1-worker-0-9qhvw", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.NodeSpec{PodCIDR:"", ProviderID:"", Unschedulable:false, Taints:[]v1.Taint(nil), ConfigSource:(*v1.NodeConfigSource)(nil), DoNotUse_ExternalID:""}, Status:v1.NodeStatus{Capacity:v1.ResourceList(nil), Allocatable:v1.ResourceList(nil), Phase:"", Conditions:[]v1.NodeCondition(nil), Addresses:[]v1.NodeAddress(nil), DaemonEndpoints:v1.NodeDaemonEndpoints{KubeletEndpoint:v1.DaemonEndpoint{Port:0}}, NodeInfo:v1.NodeSystemInfo{MachineID:"", SystemUUID:"", BootID:"", KernelVersion:"", OSImage:"", ContainerRuntimeVersion:"", KubeletVersion:"", KubeProxyVersion:"", OperatingSystem:"", Architecture:""}, Images:[]v1.ContainerImage(nil), VolumesInUse:[]v1.UniqueVolumeName(nil), VolumesAttached:[]v1.AttachedVolume(nil), Config:(*v1.NodeConfigStatus)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'Reboot' 'Node will reboot into config a0ee1101e5c425362ac7eab6d7c37b17'

I1211 19:00:27.556541    5536 update.go:492] machine-config-daemon initiating reboot: Node will reboot into config a0ee1101e5c425362ac7eab6d7c37b17

@kikisdeliveryservice
Contributor

Where is the root.go that logs the message "Already at target pivot; exiting..."? I can't find it when I search the codebase.

@cgwalters
Member Author

@cgwalters
Member Author

The more I think about it, the more I don't like having OSImageURL in the MC at all. It doesn't really have "merge" semantics. Everything would be a lot simpler if we just said that MC objects are Ignition (and that's it).

We'll do something else in #183

@cgwalters closed this on Dec 21, 2018
cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request Jan 9, 2019
Today each MC will contain both an Ignition fragment and an
`osImageURL`.  Define "merging" as using the first
non-empty `osImageURL` so we don't have to be very picky about
ordering.

This is a smaller version of
openshift#228

Prep for: openshift#273
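
The "first non-empty `osImageURL`" rule described in this commit message can be sketched in Go as follows. This is a minimal illustration, not the code from the referenced commit: `MachineConfig` is a simplified stand-in for the real API type, `firstNonEmptyOSImageURL` is a hypothetical name, and the registry URLs are placeholders.

```go
package main

import "fmt"

// MachineConfig is a simplified, hypothetical stand-in for the real API type;
// only the fields needed for this illustration are included.
type MachineConfig struct {
	Name       string
	OSImageURL string
}

// firstNonEmptyOSImageURL returns the osImageURL of the first config in the
// slice that sets one, or "" if none does. The caller decides the ordering.
func firstNonEmptyOSImageURL(configs []MachineConfig) string {
	for _, mc := range configs {
		if mc.OSImageURL != "" {
			return mc.OSImageURL
		}
	}
	return ""
}

func main() {
	configs := []MachineConfig{
		{Name: "00-worker"}, // sets no image
		{Name: "05-walters-osimageurl", OSImageURL: "registry.example.com/os@sha256:aaaa"},
		{Name: "10-other", OSImageURL: "registry.example.com/os@sha256:bbbb"},
	}
	// The second entry supplies the value; later entries are ignored.
	fmt.Println(firstNonEmptyOSImageURL(configs))
}
```

Configs that leave the field empty never affect the result, so ordering matters only among configs that actually set an image.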
osherdp pushed a commit to osherdp/machine-config-operator that referenced this pull request Apr 13, 2021
Bug 1804763: add imagestream for oauthproxy image