Add extra filter for checking if registries have changed #461

umohnani8 · 2019-02-19T19:16:59Z

A resync happens about every 20 minutes, which sends an update
event to the image informer even if nothing in the image CR has
changed. Add an extra filter that checks whether the registries
part of the image CR has changed before syncing the image handler
again.

Helps fix #453: the image config will not be re-applied unless an actual change happened in the CR.
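
As an illustration of the filter described above, here is a minimal sketch in Go. It is not the merged code: the `RegistrySources` and `ImageSpec` types and the `registriesChanged` helper are hypothetical stand-ins for the real image CR types, and the comparison simply uses `reflect.DeepEqual` from the standard library.

```go
package main

import (
	"fmt"
	"reflect"
)

// RegistrySources is a hypothetical, trimmed-down stand-in for the
// spec.registrySources section of the image CR.
type RegistrySources struct {
	InsecureRegistries []string
	BlockedRegistries  []string
}

// ImageSpec is a hypothetical stand-in for the image CR spec.
type ImageSpec struct {
	RegistrySources RegistrySources
}

// registriesChanged reports whether the registries part of the spec differs
// between the old and new objects delivered by an update event.
func registriesChanged(oldSpec, newSpec ImageSpec) bool {
	return !reflect.DeepEqual(oldSpec.RegistrySources, newSpec.RegistrySources)
}

func main() {
	before := ImageSpec{RegistrySources: RegistrySources{
		InsecureRegistries: []string{"blah.io"},
	}}
	// A periodic resync delivers the same content again.
	resynced := ImageSpec{RegistrySources: RegistrySources{
		InsecureRegistries: []string{"blah.io"},
	}}
	// A real edit adds a blocked registry.
	edited := ImageSpec{RegistrySources: RegistrySources{
		InsecureRegistries: []string{"blah.io"},
		BlockedRegistries:  []string{"test.io"},
	}}

	fmt.Println("resync needs sync:", registriesChanged(before, resynced))
	fmt.Println("edit needs sync:", registriesChanged(before, edited))
}
```

An update handler could return early when `registriesChanged` is false, so the periodic resync no longer triggers the image handler. Note that `reflect.DeepEqual` treats a nil slice and an empty slice as different, which a real comparison might want to normalize.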

Signed-off-by: Urvashi Mohnani [email protected]

umohnani8 · 2019-02-19T19:19:26Z

@runcom @mrunalp @kikisdeliveryservice PTAL

umohnani8 · 2019-02-19T20:09:33Z

/test e2e-aws
/test e2e-aws-op
/test images

mrunalp · 2019-02-19T22:42:35Z

/test e2e-aws

mrunalp · 2019-02-19T22:42:41Z

/test e2e-aws-op

mrunalp · 2019-02-19T22:42:51Z

/test images

kikisdeliveryservice · 2019-02-19T22:44:45Z

FTR: Lots of CI flakes with 504 gateway timeouts hence the retests. Also these seem to be known flakes being worked on. See https://github.com/openshift/release/issues/2905

Taking a look/testing locally.

kikisdeliveryservice · 2019-02-19T23:22:49Z

~~@umohnani8 : is there something I should be testing for other than watching the logs? If so, lmk!~~

Given the below comment, disregard

runcom · 2019-02-19T23:22:51Z

/hold

discussed offline, we'll need something lighter and more state oriented

runcom · 2019-02-20T00:01:11Z

pkg/controller/container-runtime-config/container_runtime_config_controller.go

woops, dropping.

umohnani8 · 2019-02-20T00:02:17Z

@runcom updated code as discussed on slack.
@kikisdeliveryservice yeah you shouldn't see "Applied Image..." in the logs unless you actually go update the image cr (oc edit image.config.openshift.io cluster)

runcom · 2019-02-20T00:05:06Z

/hold cancel
/approve

/assign kikisdeliveryservice

kikisdeliveryservice · 2019-02-20T00:21:35Z

hi @umohnani8 since the docs are pretty thin on using this feature and im less familiar with it, can you please add step by step instructions on how to test this?

umohnani8 · 2019-02-20T00:22:51Z

@kikisdeliveryservice this is what an example image CR looks like with insecure and blocked registries in it

apiVersion: config.openshift.io/v1
kind: Image
metadata:
  creationTimestamp: 2019-02-19T16:01:32Z
  generation: 6
  name: cluster
  resourceVersion: "282342"
  selfLink: /apis/config.openshift.io/v1/images/cluster
  uid: 9ff69432-345f-11e9-8ea5-0a4bee075008
spec:
  additionalTrustedCA:
    name: ""
  registrySources:
    insecureRegistries:
    - blah.io
    blockedRegistries:
    - test.io
status:
  internalRegistryHostname: image-registry.openshift-image-registry.svc:5000

You can add as many or as few as you want there, and you should see the changes on all the nodes once the controller does its magic :)
The CR usually exists already in a new cluster (so I have noticed with all my clusters) so all you need to do is oc edit image.config.openshift.io cluster. Since it is a cluster wide config, we only care for the CR called "cluster".

umohnani8 · 2019-02-20T00:24:00Z

Looks like the test flakes being fixed in #457
/test unit

kikisdeliveryservice · 2019-02-20T00:26:58Z

thanks @umohnani8 :) will try this out.

kikisdeliveryservice · 2019-02-20T00:34:17Z

/assign kikisdeliveryservice

runcom · 2019-02-20T00:46:13Z

pkg/controller/container-runtime-config/container_runtime_config_controller.go

Let's call this applied

kikisdeliveryservice · 2019-02-20T01:15:59Z

Testing results:

In a cluster running for a while, no more erroneous logging/syncing.
After running oc edit image.config.openshift.io cluster and adding insecure and secure registries:

I0220 00:47:17.343506       1 container_runtime_config_controller.go:600] Applied ImageConfig cluster on MachineConfigPool worker
I0220 00:47:18.649066       1 container_runtime_config_controller.go:600] Applied ImageConfig cluster on MachineConfigPool master
I0220 00:47:22.941391       1 render_controller.go:456] Generated machineconfig worker-cac98dc94bdf965eb43a5bddfb46eb6e from 5 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  00-worker-ssh  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-9446842e-33cd-11e9-8221-02427fa2e0b0-registries  machineconfiguration.openshift.io/v1  }]
I0220 00:47:24.241613       1 render_controller.go:456] Generated machineconfig master-3745275ffcaf2c65dec35fc28e164976 from 5 configs: [{MachineConfig  00-master  machineconfiguration.openshift.io/v1  } {MachineConfig  00-master-ssh  machineconfiguration.openshift.io/v1  } {MachineConfig  01-master-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-master-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-master-94454f94-33cd-11e9-8221-02427fa2e0b0-registries  machineconfiguration.openshift.io/v1  }]
I0220 00:47:28.041363       1 node_controller.go:432] Setting node ip-10-0-138-64.us-west-2.compute.internal to desired config worker-cac98dc94bdf965eb43a5bddfb46eb6e
I0220 00:47:29.441259       1 node_controller.go:432] Setting node ip-10-0-150-226.us-west-2.compute.internal to desired config master-3745275ffcaf2c65dec35fc28e164976
I0220 00:48:46.870972       1 node_controller.go:432] Setting node ip-10-0-160-154.us-west-2.compute.internal to desired config master-3745275ffcaf2c65dec35fc28e164976

Checking MCD:

I0220 00:52:50.479085    4777 update.go:137] Checking reconcilable for config worker-f2f49ab8040cc5907aa3d81afa0be316 to worker-cac98dc94bdf965eb43a5bddfb46eb6e
...
I0220 00:52:51.972143    4777 update.go:663] machine-config-daemon initiating reboot: Node will reboot into config worker-cac98dc94bdf965eb43a5bddfb46eb6e

new config was successfully applied and daemons rebooted.

kikisdeliveryservice · 2019-02-20T01:29:11Z

/lgtm

openshift-ci-robot · 2019-02-20T01:29:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kikisdeliveryservice, runcom, umohnani8

Needs approval from an approver in each of these files:

~~OWNERS~~ [kikisdeliveryservice,runcom]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kikisdeliveryservice · 2019-02-20T02:30:18Z

Flakes: The bootstrap user should successfully login with password decoded from kubeadmin secret & cluster network timeouts.

/retest

openshift-bot · 2019-02-20T05:49:05Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2019-02-20T07:49:54Z

/retest

Please review the full test history for this PR and help us cut down flakes.

runcom · 2019-02-20T08:26:33Z

/retest

openshift-bot · 2019-02-20T09:50:57Z

/retest

Please review the full test history for this PR and help us cut down flakes.

runcom · 2019-02-20T10:40:41Z

CI errors tracked in slack

/retest

The CRC (container runtime config) controller recently added a check to avoid resyncing and recreating the very same registries config if nothing has changed on the image crd side [1]. While that's correct, during an upgrade, the controllers need to generate the MC fragments using the controller version they're at. Since we weren't checking the versions of the controller that generated the registries config, we wrongly assumed the configurations were equal and never generated a new one with the new controller. This patch fixes that by adding a version check before skipping a regeneration on equal content in the registries configs.

Fixes: openshift#487

[1] openshift#461

Signed-off-by: Antonio Murdaca <[email protected]>
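
The version check described in that follow-up can be sketched roughly as below. This is only an illustration under stated assumptions: the `generatedConfig` type, its fields, and the `needsRegeneration` helper are hypothetical names, not taken from the actual controller.

```go
package main

import (
	"bytes"
	"fmt"
)

// generatedConfig is a hypothetical record of a previously generated
// registries config and the controller version that produced it.
type generatedConfig struct {
	Contents          []byte
	ControllerVersion string
}

// needsRegeneration returns true when no config exists yet, when the existing
// config was generated by a different controller version, or when the
// rendered contents differ from what is desired now.
func needsRegeneration(existing *generatedConfig, desired []byte, currentVersion string) bool {
	if existing == nil {
		return true
	}
	// Checking the version first means an upgrade regenerates the config
	// even when the contents are byte-for-byte equal.
	if existing.ControllerVersion != currentVersion {
		return true
	}
	return !bytes.Equal(existing.Contents, desired)
}

func main() {
	desired := []byte("registries config contents")
	existing := &generatedConfig{Contents: desired, ControllerVersion: "v1"}

	fmt.Println("same version, same content:", needsRegeneration(existing, desired, "v1"))
	fmt.Println("new version, same content:", needsRegeneration(existing, desired, "v2"))
}
```

With only the content comparison, the second case would be skipped, which matches the upgrade problem the commit message describes.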

openshift-ci-robot requested review from ashcrow and wking February 19, 2019 19:17

openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 19, 2019

umohnani8 mentioned this pull request Feb 19, 2019

"Applied ImageConfig cluster on MachineConfigPool" logs too frequently? #453

Closed

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 19, 2019

umohnani8 force-pushed the reg-fix branch from 6b776b6 to 9311f4d Compare February 20, 2019 00:00

openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 20, 2019

runcom reviewed Feb 20, 2019

umohnani8 force-pushed the reg-fix branch from 9311f4d to 0de67ca Compare February 20, 2019 00:02

openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 20, 2019

openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Feb 20, 2019

openshift-ci-robot assigned kikisdeliveryservice Feb 20, 2019

runcom reviewed Feb 20, 2019

umohnani8 force-pushed the reg-fix branch from 0de67ca to 215a74f Compare February 20, 2019 00:51

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 20, 2019

openshift-merge-robot merged commit 10520e9 into openshift:master Feb 20, 2019

runcom mentioned this pull request Feb 24, 2019

pkg/controller: fix container runtime registries generation #489

Merged

umohnani8 mentioned this pull request Jul 11, 2019

Mirrors support #805

Merged
