@kikisdeliveryservice
Contributor

Looking at @mkenigs's PR, I realized that the mcbs branch was based on a very old version of master (from November!).

This updates the branch by merging today's master into mcbs as a merge commit, which will let us merge the two one day.

zaneb and others added 30 commits November 9, 2021 17:28
This makes it easier to reuse parts of the function.
With this, we let the installer take care of installing the initial
user-data secret, and we then take over with our managed secret. If we
are upgrading (and thus there is no installer at play), we just create the new
(managed) secret.

This is a cherry-pick of the original patch
1f52e48, which was later reverted by
1c65355.

Signed-off-by: Antonio Murdaca <[email protected]>
Co-authored-by: Zane Bitter <[email protected]>
This will allow us to manage a secret to scale up nodes with
ignition v2 binary installed.

Signed-off-by: Yu Qi Zhang <[email protected]>
(cherry picked from commit 22b8b24)
openshift#2752 fixed the 1-1 mapping for kubeletconfig;
this PR fixes the 1-1 mapping between machineconfig name and pool name for containerruntime config.

Signed-off-by: Qi Wang <[email protected]>
This drop-in that exists for the baremetal and vsphere platforms is
unnecessary.  Cri-o already respects when $CONTAINER_STREAM_ADDRESS is
set in the environment without having to edit the commandline.

Removing this drop-in reduces code fragility without any change in
functionality.

Signed-off-by: Jim Ramsay <[email protected]>
GPG keys were added to the list of reboot exceptions, and inaccurate
documentation was updated. Selected registries.conf changes were
previously listed under the "None" action, but registries.conf changes
always trigger the "Reload Crio" action. Some organizational and wording
changes were made to make this clearer.
Bug 2028731: fixes 1 to 1 containerruntime config mapping
All upgrade candidate nodes in an MCP will have the
`UpdateInProgress: PreferNoSchedule` taint applied.
The MCD will remove the taint once the upgrade is complete.
Since kubernetes/kubernetes#104251 landed,
nodes without a PreferNoSchedule taint receive a higher score.
Before the upgrade starts, the MCC will taint all the nodes in the
cluster that are supposed to be upgraded. Because the MCD removes the
taint once the upgrade is complete, none of the nodes will
have the `UpdateInProgress: PreferNoSchedule` taint afterwards. This ensures
that the node scores are equal again.

Why is this needed?
This reduces pod churn while a cluster upgrade is in progress.
While the non-upgraded nodes in the cluster have the `UpdateInProgress:
PreferNoSchedule` taint, they get a lower score, so pods
prefer to land on untainted (upgraded) nodes. This reduces
the chance of a pod landing on a not-yet-upgraded node, which
would cause one more reschedule.
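
To make the taint lifecycle above concrete, here is a minimal sketch of adding the taint before an update and removing it afterwards. It is not the MCO implementation: `Taint` is a local stand-in for the Kubernetes core/v1 type, and `addTaint` / `removeTaint` are hypothetical helper names.

```go
package main

import "fmt"

// Taint is a local stand-in for the Kubernetes core/v1 Taint type (assumption).
type Taint struct {
	Key    string
	Effect string
}

// updateInProgress mirrors the taint named in the commit message.
var updateInProgress = Taint{Key: "UpdateInProgress", Effect: "PreferNoSchedule"}

// addTaint returns taints with t appended, unless an identical taint is already present.
func addTaint(taints []Taint, t Taint) []Taint {
	for _, existing := range taints {
		if existing == t {
			return taints
		}
	}
	return append(taints, t)
}

// removeTaint returns a new slice containing every taint except t.
func removeTaint(taints []Taint, t Taint) []Taint {
	out := make([]Taint, 0, len(taints))
	for _, existing := range taints {
		if existing != t {
			out = append(out, existing)
		}
	}
	return out
}

func main() {
	var nodeTaints []Taint

	// Controller side: taint the node before its update starts.
	nodeTaints = addTaint(nodeTaints, updateInProgress)
	fmt.Println("taints during update:", nodeTaints)

	// Daemon side: remove the taint once the update is complete.
	nodeTaints = removeTaint(nodeTaints, updateInProgress)
	fmt.Println("taints after update:", nodeTaints)
}
```

Making `addTaint` idempotent means a repeated sync of the same node does not stack duplicate taints.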
To better support multiple release architectures, as well as multiple
developer workstation OSes and architectures, outputting the current
GOOS / GOARCH values helps the developer ensure they are building for
the correct target architecture.
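
As a rough illustration of reporting the build target, the sketch below prints a GOOS / GOARCH pair from Go. The commit does not say where the values are printed, so this is an assumption rather than the actual change; `buildTarget` is a hypothetical helper that prefers the `GOOS` / `GOARCH` environment variables and falls back to the values the running binary was built for.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// buildTarget returns "GOOS/GOARCH". Values reported by getenv win; when a
// variable is unset, the value the running binary was built for is used.
func buildTarget(getenv func(string) string) string {
	goos := getenv("GOOS")
	if goos == "" {
		goos = runtime.GOOS
	}
	goarch := getenv("GOARCH")
	if goarch == "" {
		goarch = runtime.GOARCH
	}
	return goos + "/" + goarch
}

func main() {
	fmt.Println("Building for", buildTarget(os.Getenv))
}
```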
This reduces reliance upon the oc command by further leveraging the
Kubernetes API provided by the framework.ClientSet object. In
particular, this eliminates the need to shell out to oc to set / remove
labels on nodes.

In cases where we do have to shell out (e.g., ExecCmdOnNode), the
following assurances will now be made:
1. Make sure that we have the oc command in our $PATH.
2. Ensure that if we set the path to our Kubeconfig file via the
NewClientSet constructor (as opposed to setting $KUBECONFIG), that oc is
aware of that path.

There are cases where we cannot get the Kubeconfig file because we're
either running in-cluster or with a code-defined Kubeconfig object.
Running ExecCmdOnNode will still fail in those cases. However, the error
message will be more explicit about the cause.
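
The two assurances listed above could look roughly like the following sketch, which uses only the Go standard library. `newCLICommand` and `withKubeconfig` are hypothetical names and the kubeconfig path is a placeholder; the real test helpers may be structured differently.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// withKubeconfig returns a copy of env with a KUBECONFIG entry appended when
// path is non-empty. If env already has a KUBECONFIG entry, os/exec uses the
// last value for a duplicated key, so the appended one takes effect.
func withKubeconfig(env []string, path string) []string {
	if path == "" {
		return env
	}
	out := append([]string{}, env...)
	return append(out, "KUBECONFIG="+path)
}

// newCLICommand verifies that binary is in $PATH and returns a command that
// knows about the given kubeconfig path.
func newCLICommand(binary, kubeconfig string, args ...string) (*exec.Cmd, error) {
	resolved, err := exec.LookPath(binary)
	if err != nil {
		return nil, fmt.Errorf("could not find %q in $PATH: %w", binary, err)
	}
	cmd := exec.Command(resolved, args...)
	cmd.Env = withKubeconfig(os.Environ(), kubeconfig)
	return cmd, nil
}

func main() {
	// The kubeconfig path is a placeholder for illustration only.
	cmd, err := newCLICommand("oc", "/path/to/kubeconfig", "get", "nodes")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("would run:", cmd.Args)
}
```

Checking `$PATH` up front turns a confusing exec failure into an explicit error message.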
This introduces a way of proactively checking for the divergence or
"drift" of the on-disk configuration state from what is specified within
a MachineConfig. Using fsnotify to listen for filesystem events, the
node's on-disk state is validated upon detection of a write event for
any of the files specified by the currently applied MachineConfig.

Files whose contents or mode have changed will cause the node to be
marked Degraded until the cluster admin takes remedial action. This can
be resetting the file back to its known contents / mode, or creating
the forcefile, which causes the current MachineConfig to be re-applied.
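
The validation step described above (comparing contents and mode against the expected state) can be sketched as follows. The fsnotify wiring is omitted so the example stays within the standard library; `expectedFile`, `compareFile`, and `checkOnDisk` are hypothetical names, not the MCD's actual types or functions.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// expectedFile is a hypothetical description of a file as specified by a config.
type expectedFile struct {
	Path     string
	Contents []byte
	Mode     os.FileMode
}

// compareFile reports drift when the permission bits or contents differ.
func compareFile(want expectedFile, gotContents []byte, gotMode os.FileMode) error {
	if gotMode.Perm() != want.Mode.Perm() {
		return fmt.Errorf("mode mismatch for %s: expected %v, got %v",
			want.Path, want.Mode.Perm(), gotMode.Perm())
	}
	if !bytes.Equal(gotContents, want.Contents) {
		return fmt.Errorf("content mismatch for %s", want.Path)
	}
	return nil
}

// checkOnDisk reads the file from disk and compares it with the expectation.
func checkOnDisk(want expectedFile) error {
	info, err := os.Stat(want.Path)
	if err != nil {
		return err
	}
	data, err := os.ReadFile(want.Path)
	if err != nil {
		return err
	}
	return compareFile(want, data, info.Mode())
}

func main() {
	dir, err := os.MkdirTemp("", "drift")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	want := expectedFile{Path: filepath.Join(dir, "example.conf"), Contents: []byte("a=1\n"), Mode: 0o644}
	if err := os.WriteFile(want.Path, want.Contents, want.Mode); err != nil {
		fmt.Println("error:", err)
		return
	}
	// Chmod explicitly so the process umask cannot change the mode.
	if err := os.Chmod(want.Path, want.Mode); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("before edit:", checkOnDisk(want))

	// Simulate an out-of-band edit; a write event would trigger this check.
	if err := os.WriteFile(want.Path, []byte("a=2\n"), want.Mode); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("after edit:", checkOnDisk(want))
}
```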
This PR is to resolve a panic when
`PlatformStatus.VSphere` is nil.
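
A nil guard of the kind this fix implies might look like the sketch below. The types are local stand-ins that only mirror the shape of the real API structs, and `vsphereAPIServerIP` is a hypothetical helper; the actual fix may check the pointer elsewhere.

```go
package main

import "fmt"

// Local stand-ins for the platform status types (assumption: the real structs
// come from the OpenShift API package and have more fields).
type VSpherePlatformStatus struct {
	APIServerInternalIP string
}

type PlatformStatus struct {
	VSphere *VSpherePlatformStatus
}

// vsphereAPIServerIP returns the IP and true only when both the status and
// its VSphere section are present, so callers never dereference nil.
func vsphereAPIServerIP(ps *PlatformStatus) (string, bool) {
	if ps == nil || ps.VSphere == nil {
		return "", false
	}
	return ps.VSphere.APIServerInternalIP, true
}

func main() {
	ip, ok := vsphereAPIServerIP(&PlatformStatus{})
	fmt.Printf("ip=%q ok=%v\n", ip, ok)
}
```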
…aint

[MCC][MCD]: Introduce in progress taint
…cally-check-config-drift

Proactively detect config drift
The event ordering in the controller is not
guaranteed when the same operation is performed repeatedly on an
object: sometimes the events are combined, and sometimes
they are ignored. This commit makes the
TestMakeProgress test more robust by using subtests.

Test output:
go test ./pkg/controller/node  -race -run ^\TestShouldMakeProgress\$ -count 100
ok  	github.com/openshift/machine-config-operator/pkg/controller/node	84.172s
These functions will be needed by both openshift#2802 and openshift#2851,
so adding them here to avoid merge conflicts later.

Moved/renamed newFile -> NewIgnFile
…nits

fix races while syncing node events
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
* On error, clean up any configuration to avoid interference. This
  configuration will be saved in a temporary directory for
  troubleshooting.
* Consolidated the logic to roll back any applied configuration, activate
  a connection profile, reload NM, print network state information, and
  handle exit.
* Avoid using `nmcli device connect` as it will generate a persistent
  connection profile if there wasn't any, possibly changing the state
  the node was initially deployed with.

Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
The Config Drift Monitor (openshift#2795) was
previously unaware of compressed files. The MCD
would decompress a compressed file payload and write the result to disk. However,
the Config Drift Monitor did not know that the file was compressed, so it
compared the compressed contents from the MachineConfig against the
uncompressed contents that were written to disk. Because of that, the
Config Drift Monitor would erroneously degrade the node / MCP.

Fixes: #2032565
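
The mismatch described above can be illustrated with a small sketch that decompresses a payload before comparing it with the on-disk bytes. It assumes gzip compression detected via its magic bytes, which is a simplification; `maybeGunzip`, `matchesOnDisk`, and `gzipBytes` are hypothetical helpers, not the functions used in the fix.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipBytes compresses data; it exists only to build example input.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// maybeGunzip decompresses data when it starts with the gzip magic bytes
// and returns it unchanged otherwise.
func maybeGunzip(data []byte) ([]byte, error) {
	if len(data) < 2 || data[0] != 0x1f || data[1] != 0x8b {
		return data, nil
	}
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// matchesOnDisk compares the decoded config payload with the on-disk bytes.
func matchesOnDisk(configPayload, onDisk []byte) (bool, error) {
	decoded, err := maybeGunzip(configPayload)
	if err != nil {
		return false, err
	}
	return bytes.Equal(decoded, onDisk), nil
}

func main() {
	onDisk := []byte("key=value\n")
	payload := gzipBytes(onDisk)

	// A naive byte comparison sees a difference because payload is compressed.
	fmt.Println("naive comparison equal:", bytes.Equal(payload, onDisk))

	same, err := matchesOnDisk(payload, onDisk)
	fmt.Println("decoded comparison equal:", same, "error:", err)
}
```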
Add helper functions to work with Ignition Configs
If a config change does not contain changes to registries.conf, don't
apply checks specific to registries.conf

Also start using helper functions added in
openshift#2870
configure-ovs: improvements & reset openvswitch configuration on every boot
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

27 similar comments

@openshift-ci
Contributor

openshift-ci bot commented Jan 16, 2022

@kikisdeliveryservice: all tests passed!

@openshift-merge-robot openshift-merge-robot merged commit 4345d4a into openshift:mcbs Jan 16, 2022

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
layering
lgtm: Indicates that a PR is ready to be merged.
