Skip to content

Add capk user properly when users already defined.#159

Merged
k8s-ci-robot merged 1 commit into
kubernetes-sigs:mainfrom
pjaton:fix-capk-user-add
Jul 29, 2022
Merged

Add capk user properly when users already defined.#159
k8s-ci-robot merged 1 commit into
kubernetes-sigs:mainfrom
pjaton:fix-capk-user-add

Conversation

@pjaton
Copy link
Copy Markdown
Contributor

@pjaton pjaton commented Jul 25, 2022

What this PR does / why we need it:

Cluster-API's Kubeadm bootstrap provider exposes the capability to end-user to specify users to add to control or compute plane nodes. When a end-user does so, then the corresponding users section is generated in the cloud-init bootstrap data. This section conflicts with the one that is currently added by the Kubevirt provider to add the capk user that is used to check each node health.

This PR address this conflict by:

  • adding the capk user to the users section if one exists already
  • if a capk user exists already in this list of users, then, the former is overridden instead
  • if there are not users section in the current cloud-init bootstrap userdata, then add the section with the capk user

Which issue this PR fixes:

fixes #149

Special notes for your reviewer:

Marshaling and unmarshaling yaml can be tricky as comment can easily be lost, ordering modified, etc. To avoid this, this changes relies on Golang yaml.Node to preserve as much of the original format as possible. The only change is on the indentation that might be streamlined (CAPI templates seem to produce inconsistent indentation between different section).

Ignoring the indentation differences, I validated that the difference between the cloud-init generated by the Kubeadm bootstrap provider and the one enhanced by CAPK is only the desired addition of the capk user.

By example, when no user is defined by the end-user:

239a240,246
> users:
>   - name: capk
>     gecos: CAPK User
>     sudo: ALL=(ALL) NOPASSWD:ALL
>     groups: users, admin
>     ssh_authorized_keys:
>       - ecdsa-sha2-nistp384 AAAA...

Release notes:

NONE

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 25, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 25, 2022
@davidvossel
Copy link
Copy Markdown
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Jul 25, 2022
Copy link
Copy Markdown
Contributor

@davidvossel davidvossel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking into this a bit more, I question if we should be doing this cloud-init manipulation at all.

I think the kubeadm config is the place this should be occurring, https://capz.sigs.k8s.io/topics/ssh-access.html#provisioning-ssh-keys-using-machine-templates

It feels a bit like we're fighting against the kubeadm bootstrapper here if people can define their own users and cloudinit configuration in the bootstrapper config.

@davidvossel
Copy link
Copy Markdown
Contributor

It feels a bit like we're fighting against the kubeadm bootstrapper here if people can define their own users and cloudinit configuration in the bootstrapper config.

The alternative here would be that we add something to our templates to inject the capk user and ssh keys.

@coveralls
Copy link
Copy Markdown

coveralls commented Jul 25, 2022

Pull Request Test Coverage Report for Build 2733450232

  • 64 of 76 (84.21%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+2.09%) to 48.383%

Changes Missing Coverage Covered Lines Changed/Added Lines %
controllers/kubevirtmachine_controller.go 64 76 84.21%
Totals Coverage Status
Change from base Build 2732899781: 2.09%
Covered Lines: 778
Relevant Lines: 1608

💛 - Coveralls

@agradouski
Copy link
Copy Markdown
Contributor

@davidvossel agree, it is indeed a hack from the start that only worked until we need to add another user for a tenant cluster via kubeadm config, which is the right way to do it ... however, the issue is that currently we also generate ssh keys dynamically when bootstrapping the cluster. we would also need to include a private key for capk user as part of the template, and I'm not sure if that would be a security issue, as users own tenant cluster configs, and capk user is used for cluster management by cluster-api, even though users technically have a role of cluster admin

so, we can delegate ssh key management for capk user to users, and require ssh keys in our template, as you suggest

or, another alternative can be that we include capk user in kubeadm config (and make it required), without ssh keys, but then still manipulate capk user in machine controller reconcile loop, replacing the ssh key in user config with the dynamically generated one

This change takes into account the case where the user configured
additional users already, including the case when a capk user is already
added.
@pjaton
Copy link
Copy Markdown
Contributor Author

pjaton commented Jul 25, 2022

@davidvossel , @agradouski, I agree that managing the cloud-init user-data should, as much as possible, remains the responsibility of Kubeadm bootstrapper. However, until we can update the logic to assess a node health so that it doesn't require SSH access, I feel that, from a user experience point-of-view, it better to have this provider generate the SSH keys pair and add the user.

It would feel odd to me as a end-user to have to specify a required user when I do not need it for my own use.

If we replace the logic and CAPK no longer requires this user, then keeping this logic encapsulated would also make the upgrade more transparent.

@pjaton
Copy link
Copy Markdown
Contributor Author

pjaton commented Jul 26, 2022

As discussed offline with @agradouski and @davidvossel , opened #163 to keep track of the long term fix.

@pjaton
Copy link
Copy Markdown
Contributor Author

pjaton commented Jul 27, 2022

@davidvossel , @agradouski , knowing that we now have #163 to properly address how CAPK checks nodes health, do you agree to move forward with this change to, at least, improve on the current SSH approach?

@agradouski
Copy link
Copy Markdown
Contributor

@pjaton agree
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 28, 2022
Copy link
Copy Markdown
Contributor

@davidvossel davidvossel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

long term we want to move away from this. But your change here is no worse than what currently exists

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidvossel, pjaton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 29, 2022
@k8s-ci-robot k8s-ci-robot merged commit 58dcd70 into kubernetes-sigs:main Jul 29, 2022
@pjaton pjaton deleted the fix-capk-user-add branch August 1, 2022 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CAPK addition of capk user to Cloud-Init user-data conflict when user(s) specified in Kubeadm configs

5 participants