aws-k8s: pass --hostname-override to kubelet, set it to inst. PrivateDnsName for k8s-1.26 #3033

Merged
merged 5 commits into bottlerocket-os:develop on Apr 20, 2023

@etungsten etungsten commented Apr 19, 2023

Issue number:

Resolves #3028

Description of changes:

    models,packages: add 'hostname-override' for kubelet, add kubelet option

    This adds the 'kubernetes.hostname-override' setting for kubelet's
    '--hostname-override' option.

    pluto: add 'private-dns-name' subcommand

    Adds a new command for retrieving the instance's PrivateDnsName.

    Refactors out code for setting up proxy for the AWS API clients so it
    can be shared between the EKS module and EC2 module.

    models: generate 'hostname-override' and set cp to external in aws-k8s-1.26

    This separates the model for aws-k8s-1.26 so that we generate the
    'hostname-override' setting and set the cloud-provider to 'external' for
    aws-k8s-1.26 variants only.

    packages: set '--cloud-provider' in 1.26 kubelet according to setting

    This lets the 'settings.kubernetes.cloud-provider' setting control what
    gets passed to '--cloud-provider' in kubelet options.

    migrations: add migrations for new 'hostname-override' setting

Testing done:

On aws-k8s-1.26:

Launched aws-k8s-1.26 node into 1.26 cluster in us-west-2 and the node registers itself fine:

$ kubectl get nodes -o wide
NAME                                             STATUS   ROLES    AGE     VERSION               INTERNAL-IP      EXTERNAL-IP     OS-IMAGE                                KERNEL-VERSION   CONTAINER-RUNTIME
i-044decb26a95a5253.us-west-2.compute.internal   Ready    <none>   2m29s   v1.26.2-eks-b106822   192.168.21.148   52.38.212.231   Bottlerocket OS 1.14.0 (aws-k8s-1.26)   5.15.102         containerd://1.6.19+bottlerocket
i-0beac6e5f9da1f6e2.us-west-2.compute.internal   Ready    <none>   2m27s   v1.26.2-eks-b106822   192.168.10.93    52.35.104.85    Bottlerocket OS 1.14.0 (aws-k8s-1.26)   5.15.102         containerd://1.6.19+bottlerocket

Checking on the host, the kubelet command-line arguments include --hostname-override as expected:

# /etc/systemd/system/kubelet.service.d/exec-start.conf
[Service]
ExecStart=
ExecStart=/usr/bin/kubelet \
    --cloud-provider external \
    --kubeconfig /etc/kubernetes/kubelet/kubeconfig \
    --config /etc/kubernetes/kubelet/config \
    --container-runtime=remote \
    --container-runtime-endpoint=unix:///run/containerd/containerd.sock \
    --containerd=/run/containerd/containerd.sock \
    --root-dir /var/lib/kubelet \
    --cert-dir /var/lib/kubelet/pki \
    --hostname-override i-044decb26a95a5253.us-west-2.compute.internal \
    --node-ip ${NODE_IP} \
    --node-labels "${NODE_LABELS}" \
    --register-with-taints "${NODE_TAINTS}" \
    --pod-infra-container-image ${POD_INFRA_CONTAINER_IMAGE}

Running pluto separately:

bash-5.1# pluto private-dns-name
"i-044decb26a95a5253.us-west-2.compute.internal"

On aws-k8s-1.24:

The node joins successfully and, as expected, the hostname-override setting is not generated on boot:

[root@admin]# sudo sheltie
bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "15acf360",
    "pretty_name": "Bottlerocket OS 1.14.0 (aws-k8s-1.24)",
    "variant_id": "aws-k8s-1.24",
    "version_id": "1.14.0"
  }
}
bash-5.1# apiclient get settings.kubernetes.hostname-override
{}

The kubelet command-line arguments do not include --hostname-override:

# /etc/systemd/system/kubelet.service.d/exec-start.conf
[Service]
ExecStart=
ExecStart=/usr/bin/kubelet \
    --cloud-provider aws \
    --kubeconfig /etc/kubernetes/kubelet/kubeconfig \
    --config /etc/kubernetes/kubelet/config \
    --container-runtime=remote \
    --container-runtime-endpoint=unix:///run/containerd/containerd.sock \
    --containerd=/run/containerd/containerd.sock \
    --root-dir /var/lib/kubelet \
    --cert-dir /var/lib/kubelet/pki \
    --node-ip ${NODE_IP} \
    --node-labels "${NODE_LABELS}" \
    --register-with-taints "${NODE_TAINTS}" \
    --pod-infra-container-image ${POD_INFRA_CONTAINER_IMAGE}
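
The difference between the two drop-ins above comes down to two settings. As a rough sketch only (the `KubeletSettings` struct and `kubelet_flags` function are hypothetical names for illustration, not the template logic this PR actually changes), the behaviour can be modelled as: always pass the configured cloud provider, and pass `--hostname-override` only when the setting has a value.

```rust
/// Hypothetical, simplified view of the two settings this PR touches.
struct KubeletSettings {
    cloud_provider: String,
    hostname_override: Option<String>,
}

/// Builds only the flags that differ between the two drop-ins shown above.
fn kubelet_flags(settings: &KubeletSettings) -> Vec<String> {
    let mut flags = vec![
        "--cloud-provider".to_string(),
        settings.cloud_provider.clone(),
    ];
    // Emit --hostname-override only when the setting has a value.
    if let Some(name) = &settings.hostname_override {
        flags.push("--hostname-override".to_string());
        flags.push(name.clone());
    }
    flags
}
```

With `cloud_provider` set to `aws` and no override, this yields just the `--cloud-provider aws` pair, which matches the aws-k8s-1.24 output.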

aws-k8s-1.26 in cluster with custom domain names for VPC:

VPC has DHCP options set to:
[image: screenshot of the VPC DHCP options set]

Launched nodes with changes and the nodes join fine whereas they could not before:

$ kubectl get nodes -o wide
NAME                               STATUS   ROLES    AGE     VERSION               INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                                KERNEL-VERSION   CONTAINER-RUNTIME
i-0a163988779026c87.ec2.internal   Ready    <none>   29s     v1.26.2-eks-b106822   192.168.61.143   34.228.216.230   Bottlerocket OS 1.14.0 (aws-k8s-1.26)   5.15.102         containerd://1.6.19+bottlerocket
ip-192-168-20-172.ec2.internal     Ready    <none>   6m26s   v1.26.2-eks-b106822   192.168.20.172   44.205.9.113     Bottlerocket OS 1.14.0 (aws-k8s-1.26)   5.15.102         containerd://1.6.19+bottlerocket
ip-192-168-26-166.ec2.internal     Ready    <none>   41s     v1.26.2-eks-b106822   192.168.26.166   35.175.118.209   Bottlerocket OS 1.14.0 (aws-k8s-1.26)   5.15.102         containerd://1.6.19+bottlerocket
ip-192-168-45-118.ec2.internal     Ready    <none>   6m9s    v1.26.2-eks-b106822   192.168.45.118   54.209.24.115    Bottlerocket OS 1.14.0 (aws-k8s-1.26)   5.15.102         containerd://1.6.19+bottlerocket

On i-0a163988779026c87, the hostname is under etung.test as expected:

bash-5.1# cat /proc/sys/kernel/hostname 
i-0a163988779026c87.etung.test

bash-5.1# apiclient get settings.kubernetes.hostname-override
{
  "settings": {
    "kubernetes": {
      "hostname-override": "i-0a163988779026c87.ec2.internal"
    }
  }
}

In my kubelet logs:

Apr 19 17:01:25 i-0a163988779026c87.etung.test kubelet[1185]: I0419 17:01:25.927802    1185 kubelet_node_status.go:73] "Successfully registered node" node="i-0a163988779026c87.ec2.internal"

Migration testing

Started on the aws-k8s-1.26 x86 1.13.3 release and updated to image built with changes:

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "752a994d",
    "pretty_name": "Bottlerocket OS 1.13.3 (aws-k8s-1.26)",
    "variant_id": "aws-k8s-1.26",
    "version_id": "1.13.3"
  }
}
bash-5.1# updog whats
aws-k8s-1.26 1.14.0
bash-5.1# updog update -i 1.14.0 -r -n 
Starting update to 1.14.0
Cannot schedule shutdown without logind support, proceeding with immediate shutdown.
Update applied: aws-k8s-1.26 1.14.0
...

After reboot, the hostname-override setting exists and is set:

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "15acf360",
    "pretty_name": "Bottlerocket OS 1.14.0 (aws-k8s-1.26)",
    "variant_id": "aws-k8s-1.26",
    "version_id": "1.14.0"
  }
}
bash-5.1# apiclient get settings.kubernetes.hostname-override
{
  "settings": {
    "kubernetes": {
      "hostname-override": "i-008076a672d2c0a45.us-west-2.compute.internal"
    }
  }
}

Checked datastore:

bash-5.1# cat /var/lib/bottlerocket/datastore/current/live/settings/kubernetes/hostname-override
"i-008076a672d2c0a45.us-west-2.compute.internal"
bash-5.1# cat /var/lib/bottlerocket/datastore/current/live/settings/kubernetes/hostname-override.affected-services 
["kubernetes"]
bash-5.1# cat /var/lib/bottlerocket/datastore/current/live/settings/kubernetes/hostname-override.setting-generator 
"pluto private-dns-name"

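The three files checked above suggest a simple layout: the value file sits at a path derived from the dotted setting key, and each piece of metadata sits next to it with a `.<name>` suffix. A minimal sketch of that mapping, assuming keys contain no quoted segments with embedded dots (`value_path` and `metadata_path` are hypothetical helpers, not Bottlerocket's datastore API):

```rust
use std::path::PathBuf;

/// Root of the live datastore as it appears in the shell output above.
const LIVE_ROOT: &str = "/var/lib/bottlerocket/datastore/current/live";

/// Maps a dotted key such as "settings.kubernetes.hostname-override"
/// to the path of its value file.
fn value_path(key: &str) -> PathBuf {
    let mut path = PathBuf::from(LIVE_ROOT);
    for segment in key.split('.') {
        path.push(segment);
    }
    path
}

/// Metadata lives next to the value file, suffixed with ".<metadata-name>".
fn metadata_path(key: &str, metadata: &str) -> PathBuf {
    let mut os = value_path(key).into_os_string();
    os.push(".");
    os.push(metadata);
    PathBuf::from(os)
}
```
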
Then I downgraded back to 1.13.3 via signpost rollback-to-inactive.

The host boots fine back into 1.13.3. The hostname-override setting and associated metadata are all gone as expected.

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "752a994d",
    "pretty_name": "Bottlerocket OS 1.13.3 (aws-k8s-1.26)",
    "variant_id": "aws-k8s-1.26",
    "version_id": "1.13.3"
  }
}
bash-5.1# ls -al /var/lib/bottlerocket/datastore/current/live/settings/kubernetes/
total 92
drwxr-xr-x.  2 root root 4096 Apr 19 17:23 .
drwxr-xr-x. 14 root root 4096 Apr 19 17:23 ..
-rw-r--r--.  1 root root   74 Apr 19 17:23 api-server
-rw-r--r--.  1 root root    5 Apr 19 17:23 authentication-mode
-rw-r--r--.  1 root root    5 Apr 19 17:23 cloud-provider
-rw-r--r--.  1 root root 1370 Apr 19 17:23 cluster-certificate
-rw-r--r--.  1 root root   13 Apr 19 17:23 cluster-dns-ip
-rw-r--r--.  1 root root   22 Apr 19 17:23 cluster-dns-ip.setting-generator
-rw-r--r--.  1 root root   15 Apr 19 17:23 cluster-domain
-rw-r--r--.  1 root root   14 Apr 19 17:23 cluster-name
-rw-r--r--.  1 root root    2 Apr 19 17:23 max-pods
-rw-r--r--.  1 root root   16 Apr 19 17:23 max-pods.setting-generator
-rw-r--r--.  1 root root   16 Apr 19 17:23 node-ip
-rw-r--r--.  1 root root   15 Apr 19 17:23 node-ip.setting-generator
-rw-r--r--.  1 root root   71 Apr 19 17:23 pod-infra-container-image
-rw-r--r--.  1 root root   27 Apr 19 17:23 pod-infra-container-image.affected-services
-rw-r--r--.  1 root root   57 Apr 19 17:23 pod-infra-container-image.setting-generator
-rw-r--r--.  1 root root   65 Apr 19 17:23 pod-infra-container-image.template
-rw-r--r--.  1 root root   39 Apr 19 17:23 provider-id
-rw-r--r--.  1 root root   19 Apr 19 17:23 provider-id.setting-generator
-rw-r--r--.  1 root root    4 Apr 19 17:23 server-tls-bootstrap
-rw-r--r--.  1 root root    5 Apr 19 17:23 standalone-mode
-rw-r--r--.  1 root root   15 Apr 19 17:23 static-pods.affected-services
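
The listing above is what a downgrade is expected to leave behind: neither the `hostname-override` value nor its metadata files. As an illustration only, using a toy in-memory map rather than Bottlerocket's real migration helpers (`Datastore` and `remove_setting` are hypothetical names), the backward step can be sketched as:

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the datastore: file-like keys mapped to serialized values.
type Datastore = BTreeMap<String, String>;

/// Removes a setting and any "<key>.<metadata>" entries stored alongside it.
fn remove_setting(datastore: &mut Datastore, key: &str) {
    let metadata_prefix = format!("{}.", key);
    // Keep every entry that is neither the value nor one of its metadata files.
    datastore.retain(|k, _| k.as_str() != key && !k.starts_with(&metadata_prefix));
}
```

Other settings in the same directory, such as `node-ip`, are left untouched because their keys do not share the removed key's prefix.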

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@etungsten etungsten changed the title aws-k8s: pass '--hostname-override' to kubelet, set it to inst. PrivateDnsName for k8s-1.26 aws-k8s: pass --hostname-override to kubelet, set it to inst. PrivateDnsName for k8s-1.26 Apr 19, 2023
@yeazelm yeazelm marked this pull request as ready for review April 19, 2023 00:27
@etungsten etungsten marked this pull request as draft April 19, 2023 00:28
@etungsten
Contributor Author

Push above adds a README entry for the new settings.kubernetes.hostname-override setting.

@etungsten
Contributor Author

Push above fixes some typos and makes some wording changes.

Contributor

@zmrow zmrow left a comment

Just a few nits.

README.md Outdated
sources/models/src/lib.rs Outdated
Contributor

@yeazelm yeazelm left a comment

LGTM!

@etungsten
Contributor Author

Push above addresses @zmrow's and @jpmcb's comments about the README entry. I added another note to clarify that kubernetes.hostname-override does not affect network.hostname in any manner.

@etungsten etungsten requested review from jpmcb and zmrow April 19, 2023 20:42
README.md Outdated
Comment on lines +7 to +8
// Limit the timeout for the EC2 describe-instances API call to 5 minutes
const EC2_DESCRIBE_INSTANCES_TIMEOUT: Duration = Duration::from_secs(300);
Contributor

Should we align this with the timeout that the in-tree aws cloud provider uses for the same API call?

I am wary about creating a launch time regression for anyone using a private VPC with no access to the EC2 API. If this call timed out previously and the cloud provider just used the system hostname, then it may not have been noticed.

Contributor Author

@etungsten etungsten Apr 19, 2023

The in-tree aws cloud provider relies on the default retries/timeout in the aws-sdk-go EC2 client: https://github.com/kubernetes/legacy-cloud-providers/blob/74c983775777a7ddd7ebcf5973420adb405cefd4/aws/instances.go#L136-L147

https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/retries-timeouts/#standard-retryer

They don't modify the default client:
https://github.com/kubernetes/legacy-cloud-providers/blob/74c983775777a7ddd7ebcf5973420adb405cefd4/aws/aws.go#L824-L834

> A default value of 2 for maximum retry attempts, making a total of 3 call attempts. This value can be overwritten through the max_attempts configuration parameter.
> Any retry attempt will include an exponential backoff by a base factor of 2 for a maximum backoff time of 20 seconds.

Do we want to align on that?

EKS Optimized AMI is doing 10 retries instead:
https://github.com/awslabs/amazon-eks-ami/pull/1264/files#diff-049390d14bc3ea2d7882ff0f108e2802ad9b043336c5fa637e93581d9a7fdfc2R488-R490

Contributor Author

I still think we should put a time bound on the API call. If we don't, we'll run into #2370 again, where pluto can potentially block boot for 30 minutes (default standard retry configuration).
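
To make "time bound" concrete, here is a std-only sketch of the idea. It is not pluto's implementation, which is async and may enforce the limit differently; `call_with_timeout` is a hypothetical helper that waits for a blocking call up to a fixed limit and then gives up.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `call` on a worker thread and stops waiting after `limit`.
/// Returns None if the limit elapses (or the call panics) before a result
/// arrives; on timeout the worker thread is left to finish on its own.
fn call_with_timeout<T, F>(call: F, limit: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if we timed out; ignore that error.
        let _ = sender.send(call());
    });
    receiver.recv_timeout(limit).ok()
}
```

A caller would pass the API call as the closure and `Duration::from_secs(300)` as the limit to mirror the 5 minute constant in the diff, then treat `None` as a failure to generate the setting rather than blocking boot indefinitely.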

Contributor Author

@etungsten etungsten Apr 19, 2023

> I am wary about creating a launch time regression for anyone using a private VPC with no access to the EC2 API. If this call timed out previously and the cloud provider just used the system hostname, then it may not have been noticed.

From looking at the legacy cloud provider code, I believe the call would have hung for longer than 5 minutes while blocking kubelet from progressing. Every AWS SDK implementation seems to have a different default request timeout, but most are between 100 and 240 seconds, which across 3 attempts would put it over the 5-minute timeout we're enforcing.

In the EKS optimized AMI, I believe the bootstrapping script is probably blocking for longer than 5 minutes as well with the 10 retries (each retry is 1 minute according to the aws-cli docs).
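
The arithmetic behind the SDK estimate can be checked with a small sketch. It assumes every attempt runs to its full request timeout and that the wait before retry n is min(2^n, cap) seconds with no jitter; that backoff formula is a simplification of the quoted retry description, and `worst_case_secs` is a hypothetical helper.

```rust
/// Worst-case seconds spent if every attempt runs to its timeout.
/// Assumes the wait before retry n is min(2^n, max_backoff_secs), with no jitter.
fn worst_case_secs(attempts: u32, per_attempt_timeout_secs: u64, max_backoff_secs: u64) -> u64 {
    // Time spent inside the attempts themselves.
    let timeouts = u64::from(attempts) * per_attempt_timeout_secs;
    // Time spent waiting between attempts (one wait per retry).
    let backoff: u64 = (1..attempts)
        .map(|retry| 2u64.saturating_pow(retry).min(max_backoff_secs))
        .sum();
    timeouts + backoff
}
```

Under those assumptions, 3 attempts with a 100 second timeout come to 306 seconds and with a 240 second timeout to 726 seconds, so both ends of the quoted range exceed the 300 second bound.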

Contributor

@cbgbt cbgbt left a comment

I'm pretty happy with this given that we address @bcressey's comment about timeouts.

) -> Result<impl Into<aws_smithy_client::http_connector::HttpConnector>> {
// Determines whether a request of a given scheme, host and port should be proxied
// according to `https_proxy` and `no_proxy`.
let intercept = move |scheme: Option<&str>, host: Option<&str>, _port| {
Contributor

This is a bit gnarly and probably deserves tests, but I respect that we are just doing a move refactor here.
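
As a starting point for the kind of tests mentioned here, the decision can be isolated into a pure function. The sketch below is a simplified assumption about `no_proxy` matching (a `*` entry bypasses the proxy for every host, and an entry matches the host itself or any subdomain), not the exact logic of the refactored closure; `should_proxy` is a hypothetical name.

```rust
/// Returns true when a request to `host` should go through the proxy.
/// Simplified rules (an assumption, not the refactored code's exact logic):
/// "*" bypasses the proxy for every host, and an entry matches the host
/// itself or any subdomain of it, ignoring a leading dot on the entry.
fn should_proxy(host: Option<&str>, no_proxy: &[&str]) -> bool {
    let host = match host {
        Some(h) => h.to_ascii_lowercase(),
        // Without a host there is nothing to match, so default to proxying.
        None => return true,
    };
    !no_proxy.iter().any(|entry| {
        let entry = entry.trim().trim_start_matches('.').to_ascii_lowercase();
        if entry.is_empty() {
            return false;
        }
        entry == "*" || host == entry || host.ends_with(&format!(".{}", entry))
    })
}
```

Keeping the rule free of I/O like this makes table-style unit tests straightforward, for example checking that `notamazonaws.com` is still proxied when `amazonaws.com` is listed.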

@etungsten
Contributor Author

Push above addresses @bcressey's comments (other than #3033 (comment))

README.md Outdated

**Important note for all Kubernetes variants:** Changing this setting at runtime (not via user-data) can cause issues with kubelet registration, as hostname is closely tied to the identity of the system for both registration and certificates/authorization purposes.

Most users don't need to change this setting. If left unset, the system hostname will be used instead. The `settings.network.hostname` setting can be used to specify the value for both `kubelet` and the host. Only set this override if you intend for the `kubelet` to register with a different name than the host.
Contributor

Nit: shouldn't each sentence be its own line?

@cbgbt cbgbt merged commit 00f5509 into bottlerocket-os:develop Apr 20, 2023
joebowbeerxealth referenced this pull request Sep 11, 2023
This adds a multicall binary to be used as the common entry point for
all Kubernetes CIS benchmark checks.

Signed-off-by: Sean McGinnis <[email protected]>
@joebowbeer

@stmcginnis does kubernetes-checks 4.2.7 also need an update?

76b8b54#r127059533

@stmcginnis
Contributor

76b8b54#r127059533

I responded to that comment, but I'm not sure I'm understanding what you are asking. Can you elaborate?

@joebowbeer

@stmcginnis thanks for response and confirmation!

I was perhaps confused by the wording and status of the test.

I realize that Manual tests are skipped by automation, but they aren't skipped by humans, and it wasn't clear to me that it was the test that was invalid, or that this invalidation was specific to bottlerocket.

It looked to me like this flag was added to EKS Optimized AMI as well.

https://aws.amazon.com/blogs/containers/amazon-eks-now-supports-kubernetes-version-1-26/

The latest version of the EKS Optimized AMI, v20230501, has been updated to include the --hostname-override flag.

@stmcginnis
Contributor

Very fair. The (not valid for Bottlerocket) was added to the title as a compromise. It didn't seem right to just completely ignore it and report PASSED when the state is not what the Kubernetes benchmark states it should be. But it is a valid and safe use of it for Bottlerocket, so FAILED definitely wasn't appropriate either.

There's some talk of adding more documentation around these benchmarks on the Bottlerocket website. I will make sure these inconsistencies in the K8s benchmark are called out to make sure that "not valid" statement is understood.

Development

Successfully merging this pull request may close these issues.

Nodes with custom domain-name fail to join EKS cluster on Kubernetes 1.26
9 participants