Conversation

@dagrayvid
Contributor

What this PR does / why we need it:
Add functionality to the NodePool controller getConfig(...) function to look in the control plane namespace for any ConfigMaps containing MachineConfigs, with the label hypershift.openshift.io/operator-generated-config: true and the label hypershift.openshift.io/nodePool: <nodepool name>. This is needed in order for the Node Tuning Operator to apply custom tunings which require setting kernel boot parameters.

For just one simple example, this is required to reserve hugepages.

This change has been tested alongside the change in openshift/cluster-node-tuning-operator#456, which adds the ability for NTO to embed the generated MachineConfigs into ConfigMaps.

Which issue(s) this PR fixes (optional, use fixes #<issue_number>(, fixes #<issue_number>, ...) format, where issue_number might be a GitHub issue, or a Jira story):
PSAP-742
See also the enhancement which outlines the plan for this change: openshift/enhancements#1229

Checklist

  • Subject and description added to both commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

@dagrayvid
Contributor Author

/hold

For more manual testing by the NTO team.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 7, 2022
@netlify

netlify bot commented Sep 7, 2022

Deploy Preview for hypershift-docs ready!

🔨 Latest commit: d18bf30
🔍 Latest deploy log: https://app.netlify.com/sites/hypershift-docs/deploys/63406fcecae8c80008761324
😎 Deploy Preview: https://deploy-preview-1729--hypershift-docs.netlify.app/how-to/node-tuning

Contributor

@jmencak jmencak left a comment

Thank you for the PR, David. The docs look good; I found only minor nits.

```yaml
- priority: 20
  profile: openshift-node-hugepages
```
> **_NOTE:_** The `.spec.recommend.match` field is intentionally left blank. In this case, the Tuned profile will be applied to all Nodes in the NodePool where this ConfigMap is referenced. It is advisable to group Nodes with the same hardware configuration into the same NodePool; otherwise, TuneD operands may calculate conflicting kernel parameters for two or more Nodes sharing the same NodePool.
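To make the note concrete, a complete Tuned object with `.spec.recommend.match` left blank might look like the following sketch (the profile name matches the snippet above; the object name and the hugepages values are illustrative):

```yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: hugepages                   # illustrative name
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - name: openshift-node-hugepages
    data: |
      [main]
      summary=Boot time configuration for hugepages
      include=openshift-node
      [bootloader]
      cmdline_openshift_node_hugepages=hugepagesz=2M hugepages=50
  recommend:
  - priority: 20
    profile: openshift-node-hugepages
    # match intentionally omitted: the profile applies to every Node in
    # any NodePool that references the ConfigMap embedding this Tuned
```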
Contributor

This is IMPORTANT, but let's leave it just a NOTE. This bogged me down as the first "user" of the new code. As an aside, yesterday I was testing with the .spec.recommend.match field targeting a single node in the node pool. It is probably the reason I was getting this, and something that still needs to be addressed on the NTO side.

Member

There should be no possible choice for the user to target particular nodes within a NodePool.

Contributor Author

This behaviour is based on how NTO works in standalone OCP. There are many cases where a TuneD profile makes in-place changes to node tunables and no MachineConfig is needed. For example, a user may want to set some sysctl values on one node with particular labels and assign some Pods only to that Node by label.

If we do decide to remove this feature in HyperShift, we can do that, but it would be a change to the NTO code.
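As a sketch of that standalone-style use case (hypothetical names and labels), an in-place sysctl tuning that targets only labeled nodes and needs no MachineConfig could look like:

```yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: example-sysctl              # hypothetical name
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - name: example-sysctl
    data: |
      [main]
      summary=Raise net.core.somaxconn on labeled nodes only
      [sysctl]
      net.core.somaxconn=4096
  recommend:
  - priority: 20
    profile: example-sysctl
    match:
    - label: example.com/high-conn  # hypothetical node label
```

Because this profile touches only runtime sysctls (no [bootloader] section), no kernel boot parameters are calculated and no MachineConfig is generated.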

Contributor Author

Also note that the original issue Jiri hit here has been fixed: if a user does use Node-label-based matching, no MachineConfig will be generated from that Profile.

@dagrayvid dagrayvid force-pushed the nto-machineconfigs branch 3 times, most recently from 4b54181 to ca2314b on September 7, 2022 at 21:02
@dagrayvid
Contributor Author

/retest

@dagrayvid
Contributor Author

openshift/cluster-node-tuning-operator#456 has now merged. From the NTO side, I believe this is ready for merge or further review.

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 12, 2022
@dagrayvid
Contributor Author

@sjenning, the e2e-aws failures look like flakes to me; can you confirm?

@dagrayvid
Contributor Author

/retest

1 similar comment
@dagrayvid
Contributor Author

/retest

Example output:
```
NAME                 TUNED           APPLIED  DEGRADED  AGE
nodepool-1-worker-1  openshift-node  True     False     132m
```
Member

What happens if someone modifies these profiles?

Contributor Author

@enxebre I'm not sure I know what you are asking. Are you wondering what would happen if someone modified the Profile objects from the hosted cluster side?

Member

@enxebre enxebre Oct 5, 2022

Are you wondering what would happen if someone modified the Profile objects from the hosted cluster side?

@dagrayvid Yes. Is it possible to change something on the guest cluster side which the NTO watches and reconciles against management-side config, thereby triggering an upgrade?

Contributor Author

In theory, yes. This was discussed in some of the earlier design discussions about enabling NTO on HyperShift. As in standalone OCP, the NTO Operand (the containerized TuneD daemon) writes the kernel boot parameters calculated by TuneD from the applied profile to the Profile object's status.bootcmdline field. This field is read by the Operator before creating or updating the NTO-generated MachineConfig.

If the Profile object were edited by someone with admin privileges on the guest cluster, the Operator and Operand would reconcile simultaneously. The Operand would reconcile the Profile, overwriting any change in the status. The Operator would also reconcile the Profile, potentially updating the NTO-generated MachineConfig based on the changed status.bootcmdline, racing with the Operand. If the Operator "loses" the race, then after the Operand overwrites the admin user's changes to the Profile, the Operator will reconcile the Profile again, syncing the MachineConfig.

When we discussed this earlier on, the answer was that this should be "okay" as admin users of the hosted cluster already have root access to the nodes (i.e. oc debug).
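To make the flow above concrete, here is a sketch of a Profile object (values illustrative; `status.bootcmdline` is the field the Operand writes and the Operator reads, per the comment above):

```yaml
apiVersion: tuned.openshift.io/v1
kind: Profile
metadata:
  name: nodepool-1-worker-1         # one Profile per Node
  namespace: openshift-cluster-node-tuning-operator
spec:
  config:
    tunedProfile: openshift-node-hugepages
status:
  # Written by the Operand (containerized TuneD), read by the Operator
  # when creating / updating the NTO-generated MachineConfig.
  bootcmdline: hugepagesz=2M hugepages=50
  tunedProfile: openshift-node-hugepages
```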

@enxebre
Member

enxebre commented Sep 23, 2022

Can we please have e2e tests in place for both in-place and replace upgrades, validating e.g. what you describe in the docs, similar to https://github.com/openshift/hypershift/blob/main/test/e2e/nodepool_machineconfig_test.go

@dagrayvid dagrayvid force-pushed the nto-machineconfigs branch 2 times, most recently from b33201e to 3e1ec9a on September 27, 2022 at 02:10
@dagrayvid
Contributor Author

Can we please have e2e tests in place for both in-place / replace validating e.g. what you describe in the docs, similar to https://github.com/openshift/hypershift/blob/main/test/e2e/nodepool_machineconfig_test.go

Thanks @enxebre, I added e2e tests covering both in-place and replace upgrades for the workflow described in the docs.

@enxebre
Member

enxebre commented Oct 6, 2022

Thanks! To summarise the Slack discussion: after considering all the intricacies and possible paths, we'll proceed with the current approach and follow up to enforce read-only guest cluster resources via CEL field immutability, and consider any other mechanism driven from the management side.
/approve

pending rebase, passing tests, and having a closer look at the code details.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 6, 2022
@openshift-ci
Contributor

openshift-ci bot commented Oct 6, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dagrayvid, enxebre


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 6, 2022
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 7, 2022
@dagrayvid
Contributor Author

Thanks @enxebre. I rebased this PR on the latest changes and decided to keep hostedcluster.HostedClusterAnnotation as it was (out of the API) for the sake of simplicity. Tested locally.

@dagrayvid
Contributor Author

/retest

4 similar comments
@dagrayvid
Contributor Author

/retest

@jmencak
Contributor

jmencak commented Oct 10, 2022

/retest

@dagrayvid
Contributor Author

/retest

@dagrayvid
Contributor Author

/retest

@dagrayvid
Contributor Author

/retest

Failures seem unrelated to the PR. The last failure hit the 1h timeout without any failed tests.

@dagrayvid
Contributor Author

/retest

@enxebre
Member

enxebre commented Oct 12, 2022

/lgtm
/hold
to make sure @sjenning has a chance to take a look

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 12, 2022
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 12, 2022
@sjenning
Contributor

lgtm
/hold cancel
e2e flaked on #1798 before
/retest-required

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 12, 2022
@openshift-ci
Contributor

openshift-ci bot commented Oct 12, 2022

@dagrayvid: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                           Commit   Required  Rerun command
ci/prow/e2e-kubevirt-gcp-ovn        b0b864b  false     /test e2e-kubevirt-gcp-ovn
ci/prow/capi-provider-agent-sanity  bff593a  false     /test capi-provider-agent-sanity
ci/prow/e2e-ibmcloud-iks            bff593a  false     /test e2e-ibmcloud-iks

