coredns pods sometimes fail to start due to trying to bind privileged ports as non-root user #11366
Comments
For now, I work around this issue by pinning an older version with coredns_version: v1.10.1
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
What happened?
On some of my nodes, coredns Pods (currently using the v1.11.1 container image) fail to start with an error about binding a privileged port as a non-root user. On others, they run fine.
As far as I could tell, all my nodes are identical (same OS, same kernel version, same containerd version, same sysctl parameter net.ipv4.ip_unprivileged_port_start=1024).
I am not sure why binding privileged ports as a non-root user works on some nodes and not on others.
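To compare nodes, it may help to check the threshold from inside the namespace where the process actually runs, since net.ipv4.ip_unprivileged_port_start is a per-network-namespace setting and the host value need not match the container's. The following is a minimal sketch, not part of the original report; the function names needs_privilege and current_threshold are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# needs_privilege PORT THRESHOLD
# Prints "yes" when binding PORT requires root or CAP_NET_BIND_SERVICE,
# i.e. when PORT is below net.ipv4.ip_unprivileged_port_start (THRESHOLD).
needs_privilege() {
    port=$1
    threshold=$2
    if [ "$port" -lt "$threshold" ]; then
        echo yes
    else
        echo no
    fi
}

# current_threshold
# Reads the threshold of the network namespace this shell runs in.
# Falls back to the kernel default of 1024 if the file cannot be read.
current_threshold() {
    cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024
}

echo "threshold in this namespace: $(current_threshold)"
echo "port 53 needs privilege: $(needs_privilege 53 "$(current_threshold)")"
```

Running this once on the host and once inside a container's network namespace on a working and a non-working node would show whether the threshold itself differs between them.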
What did you expect to happen?
I would expect that coredns would reliably run on all my cluster's nodes.
How can we reproduce it (as minimally and precisely as possible)?
Since my Kubespray config yields both working and non-working nodes, I tried to reproduce the issue in another way.
I've used the following Corefile (inspired by the coredns config map but with the kubernetes plugin disabled):
and I try to run this with:
nerdctl run \
  -it \
  --rm \
  --network=none \
  --mount type=bind,src=$(pwd)/Corefile,dst=/etc/coredns/Corefile,ro \
  --cap-add=NET_BIND_SERVICE \
  registry.k8s.io/coredns/coredns:v1.11.1 \
  -conf /etc/coredns/Corefile
On some nodes it works, on others I get the aforementioned error.
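One way to see what --cap-add=NET_BIND_SERVICE actually gives the process is to decode the capability masks in /proc/<pid>/status from the host. For a non-root process, a capability generally becomes effective only through file capabilities on the binary or ambient capabilities, so a bit set in CapBnd but not in CapEff would be consistent with the behaviour described here. This is a hedged diagnostic sketch, not a confirmed root cause; has_net_bind_service and report_caps are hypothetical helper names, and reading the masks from the host assumes the coredns image ships no shell.

```shell
#!/bin/sh
# CAP_NET_BIND_SERVICE is capability number 10 in linux/capability.h.
CAP_NET_BIND_SERVICE_BIT=10

# has_net_bind_service MASK
# MASK is the hex string from a CapPrm/CapEff/CapBnd line of /proc/<pid>/status.
# Returns 0 (true) when the CAP_NET_BIND_SERVICE bit is set in MASK.
has_net_bind_service() {
    mask=$((0x$1))
    [ $(( (mask >> CAP_NET_BIND_SERVICE_BIT) & 1 )) -eq 1 ]
}

# report_caps PID
# Prints, for each capability set of PID, whether NET_BIND_SERVICE is present.
report_caps() {
    for capset in CapPrm CapEff CapBnd; do
        hex=$(awk -v key="$capset:" '$1 == key { print $2 }' "/proc/$1/status")
        if has_net_bind_service "$hex"; then
            echo "$capset: NET_BIND_SERVICE set"
        else
            echo "$capset: NET_BIND_SERVICE not set"
        fi
    done
}
```

On a node, report_caps could be called with the PID of the coredns process (for example one found with pgrep), and the output compared between a working and a failing node.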
It appears that NET_BIND_SERVICE does not do anything.
Workarounds:
- adding --sysctl net.ipv4.ip_unprivileged_port_start=0 to the nerdctl run command (not applicable to the Deployment, because Kubespray does not let me override the coredns Deployment to add this under securityContext.sysctls)
- adding --user=0:0 to the nerdctl run command (not applicable to the Deployment, because Kubespray does not let me override the coredns Deployment to add this under securityContext)
- adjusting the Corefile configuration to use a port higher than 1023
- using an older version of coredns (older than v1.11.0), like v1.10.1
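For reference, the first workaround could in principle be applied to the running Deployment by hand, outside Kubespray. The sketch below only builds the merge patch that sets the sysctl under the pod-level securityContext. The helper name sysctl_patch and the kube-system namespace are assumptions, and a later Kubespray run would presumably overwrite the change.

```shell
#!/bin/sh
# sysctl_patch VALUE
# Prints a JSON merge patch that sets net.ipv4.ip_unprivileged_port_start
# to VALUE in the pod-level securityContext of a Deployment's pod template.
sysctl_patch() {
    printf '{"spec":{"template":{"spec":{"securityContext":{"sysctls":[{"name":"net.ipv4.ip_unprivileged_port_start","value":"%s"}]}}}}}\n' "$1"
}

# Example invocation (not executed here; namespace is an assumption):
#   kubectl -n kube-system patch deployment coredns --type=merge -p "$(sysctl_patch 0)"
sysctl_patch 0
```

Printing the patch first makes it easy to review before applying it with kubectl.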
As this comment states, coredns runs as a non-root user since v1.11.0.
It appears that Kubespray sets up the coredns Deployment to run as the default user and does not explicitly adjust the sysctl net.ipv4.ip_unprivileged_port_start. It also doesn't provide much control over the securityContext, so applying any of these workarounds is difficult.
It would probably be good if one of these workarounds were applied by default.
OS
Linux 5.15.0-113-generic x86_64
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Version of Ansible
Irrelevant
Version of Python
Irrelevant
Version of Kubespray (commit)
v2.25.0
Network plugin used
cilium
Full inventory with variables
My configuration is not customized much - using the containerd runtime, etc.
Command used to invoke ansible
Irrelevant
Output of ansible run
Ansible run is all good
Anything else we need to know
No response