Zookeeper not able to form cluster in OKE (Oracle Kubernetes Cluster) #10055
-
The things you shared are unreadable because they are not formatted. But the |
-
I updated the YAML to make it readable. Yes, the issue is that the pods are unable to communicate by name. I can reach them using the pod IP, so it looks DNS-related. OKE uses the CoreDNS service, and I am not sure whether that has something to do with it. In any case, this same configuration used to work in GKE.
-
CoreDNS is used on most Kubernetes clusters. But it can be misconfigured or slow, or there can be an issue on a particular worker node, etc.
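A slow or intermittently failing resolver is easy to miss with a single lookup. The following is a minimal Python sketch of one way to check for that, assuming Python is available in a debug pod scheduled on the suspect worker node. The function name `probe_dns` and its parameters are hypothetical and are not part of Strimzi or CoreDNS. It repeats the lookup and records whether each attempt succeeded and how long it took.

```python
import socket
import time
from typing import Callable, List, Tuple


def probe_dns(
    host: str,
    attempts: int = 5,
    resolver: Callable[[str, object], object] = socket.getaddrinfo,
) -> List[Tuple[bool, float]]:
    """Resolve `host` several times and return (succeeded, seconds) per attempt.

    `resolver` defaults to the system resolver (socket.getaddrinfo), which
    honours the pod's /etc/resolv.conf. It is a parameter only so that the
    function can be exercised without a real DNS server.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            resolver(host, None)  # port is irrelevant for a name lookup
            ok = True
        except socket.gaierror:
            # Raised by the system resolver when the name cannot be resolved.
            ok = False
        results.append((ok, time.monotonic() - start))
    return results
```

Calling `probe_dns("<pod-dns-name>", attempts=20)` from pods on different worker nodes and comparing the results may show whether failures or long lookup times are tied to one node.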
-
There was an issue with CoreDNS on the nodes, which has now been resolved. However, the original issue remains. The logs below are from the Operator.
The error appears for every instance. I created another pod with networking utilities in the same cluster to verify the name with dig.
It resolves there, so I am not sure why ZooKeeper is still complaining.
Logs of zookeeper-0
-
Well, you seem to be still getting |
-
Kubernetes version v1.29.1
Operator version [0.40.0]
Installed with 3 replicas.
helm install strimzi-operator strimzi/strimzi-kafka-operator -n kafka-operator-ns -f values_ha.yaml
kubectl apply -f kafka-ha.yaml -n kafka-ns
Error
2024-05-02 20:10:12,429 WARN Cannot open channel to 1 at election address kafka-cluster-zookeeper-0.kafka-cluster-zookeeper-nodes.kafka-ns.svc/:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager) [QuorumConnectionThread-[myid=2]-15]
java.net.UnknownHostException: kafka-cluster-zookeeper-0.kafka-cluster-zookeeper-nodes.kafka-ns.svc
at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:572)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
at java.base/java.net.Socket.connect(Socket.java:633)
at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:304)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.initiateConnection(QuorumCnxManager.java:384)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$QuorumConnectionReqThread.run(QuorumCnxManager.java:458)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
All ZooKeeper pods have the same error.
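To check every peer rather than only the one named in a given log line, it can help to generate the full list of expected pod DNS names. The sketch below is a hedged illustration: the naming pattern is inferred from the host in the error above, the cluster name `kafka-cluster` and the replica count of 3 are read from this post, and the optional `cluster_domain` suffix (for example `cluster.local`) is an assumption about the cluster's DNS domain. The helper names `peer_hosts` and `unresolved_host` are hypothetical.

```python
import re
from typing import List, Optional


def peer_hosts(cluster: str, namespace: str, replicas: int,
               cluster_domain: Optional[str] = None) -> List[str]:
    """Build the per-pod DNS names, following the pattern seen in the log:
    <cluster>-zookeeper-<i>.<cluster>-zookeeper-nodes.<namespace>.svc
    An optional cluster domain (an assumption, e.g. "cluster.local") is appended.
    """
    suffix = f".{cluster_domain}" if cluster_domain else ""
    return [
        f"{cluster}-zookeeper-{i}.{cluster}-zookeeper-nodes.{namespace}.svc{suffix}"
        for i in range(replicas)
    ]


def unresolved_host(log_line: str) -> Optional[str]:
    """Extract the host name from a java.net.UnknownHostException log line."""
    match = re.search(r"UnknownHostException: (\S+)", log_line)
    return match.group(1) if match else None
```

Each name from `peer_hosts("kafka-cluster", "kafka-ns", 3)` can then be looked up from inside a ZooKeeper pod itself, since a lookup from a separate utility pod does not necessarily reflect the resolver configuration the ZooKeeper container sees.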
values_ha.yaml
kafka-ha.yaml