[Bug]: Kafka broker does not start due to ZooKeeper name resolution error in operator log #9645
-
Please format your post properly so that it is readable before opening anything. Without proper formatting, nobody can read it and it is not really useful.
-
Apologies, I'll try to provide a clearer explanation now:
-
You were right, it was a NetworkPolicy that prevented access to DNS. Once that was resolved, every pod started properly. Thanks for your help.
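For anyone hitting the same `UnknownHostException`, a quick DNS check can help confirm the cause before digging further. The following is only a minimal sketch, assuming Python is available in a pod that is subject to the same network policies as the operator. The helper names `zookeeper_pod_host` and `can_resolve` are hypothetical, and the host name pattern is simply copied from the operator log in this report.

```python
import socket


def zookeeper_pod_host(cluster: str, namespace: str, index: int = 0) -> str:
    """Build the per-pod DNS name in the form seen in the operator log (assumed pattern)."""
    return f"{cluster}-zookeeper-{index}.{cluster}-zookeeper-nodes.{namespace}.svc"


def can_resolve(host: str, port: int = 2181) -> bool:
    """Return True if the local resolver returns at least one address for host."""
    try:
        return len(socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved, e.g. because DNS is unreachable.
        return False


if __name__ == "__main__":
    host = zookeeper_pod_host("pos-batch-broker", "pos-pre")
    print(host, "resolves" if can_resolve(host) else "does NOT resolve")
```

If the name does not resolve even though the ZooKeeper pod is running, blocked DNS traffic (as in this case) is one likely cause. The check only reflects what the pod it runs in can see.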
-
Bug Description
After installing operator version 0.38.0, the Kafka broker never starts because of the following error:
2024-02-06 08:12:24 ERROR VertxUtil:155 - Reconciliation #20(watch) Kafka(pos-pre/pos-batch-broker): Exceeded timeout of 300000ms while waiting for ZooKeeperAdmin connection to pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc:2181 to be connected
2024-02-06 08:12:24 ERROR StaticHostProvider:148 - Unable to resolve address: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc/:2181
java.net.UnknownHostException: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
2024-02-06 08:12:24 ERROR AbstractOperator:284 - Reconciliation #20(watch) Kafka(pos-pre/pos-batch-broker): createOrUpdate failed
io.strimzi.operator.cluster.operator.resource.ZookeeperScalingException: Failed to connect to Zookeeper pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc:2181. Connection was not ready in 300000 ms.
at io.strimzi.operator.cluster.operator.resource.ZookeeperScaler.lambda$connect$7(ZookeeperScaler.java:175) ~[io.strimzi.cluster-operator-0.38.0.jar:0.38.0]
at io.vertx.core.impl.future.FutureImpl$3.onSuccess(FutureImpl.java:141) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.vertx.core.impl.future.FutureImpl.tryComplete(FutureImpl.java:211) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.vertx.core.impl.future.PromiseImpl.tryComplete(PromiseImpl.java:23) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.vertx.core.impl.future.PromiseImpl.onSuccess(PromiseImpl.java:49) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:54) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[io.netty.netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at java.lang.Thread.run(Thread.java:840) ~[?:?]
Steps to reproduce
Install the operator scoped to a single namespace
Create the Kafka resource:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: pos-batch-broker
spec:
  kafka:
    version: 3.6.0
    replicas: 3
    logging:
      type: inline
      loggers:
        kafka.root.logger.level: "ERROR"
        log4j.logger.kafka.authorizer.logger: "ERROR"
    listeners:
      - port: 9092
        type: internal
        tls: false
      - port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.6"
    storage:
      type: jbod
      volumes:
        - type: persistent-claim
          size: 20Gi
          deleteClaim: false
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 20Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator:
      watchedNamespace: pos-pre
      reconciliationIntervalSeconds: 60
ZooKeeper starts, but the Kafka broker never does.
Expected behavior
No response
Strimzi version
0.38.0
Kubernetes version
OpenShift 4.10
Installation method
helm chart
Infrastructure
bare-metal
Configuration files and logs
(⎈|opo:pos-pre)➜ git-ops-strimzi-kafka-operator git:(ops-strimzi-kafka-operator) oc logs -f strimzi-cluster-operator-658944789b-cr7pl
Auto-detection of KUBERNETES_SERVICE_DNS_DOMAIN failed. The default value cluster.local will be used.
2024-02-06 08:01:40 ERROR VertxUtil:155 - Reconciliation #1(watch) Kafka(pos-pre/pos-batch-broker): Exceeded timeout of 300000ms while waiting for Pods resource pos-batch-broker-zookeeper-0 in namespace pos-pre to be ready
2024-02-06 08:01:40 ERROR VertxUtil:155 - Reconciliation #1(watch) Kafka(pos-pre/pos-batch-broker): Exceeded timeout of 300000ms while waiting for Pods resource pos-batch-broker-zookeeper-1 in namespace pos-pre to be ready
2024-02-06 08:01:40 ERROR VertxUtil:155 - Reconciliation #1(watch) Kafka(pos-pre/pos-batch-broker): Exceeded timeout of 300000ms while waiting for Pods resource pos-batch-broker-zookeeper-2 in namespace pos-pre to be ready
2024-02-06 08:01:40 ERROR AbstractOperator:284 - Reconciliation #1(watch) Kafka(pos-pre/pos-batch-broker): createOrUpdate failed
io.strimzi.operator.common.operator.resource.TimeoutException: Exceeded timeout of 300000ms while waiting for Pods resource pos-batch-broker-zookeeper-0 in namespace pos-pre to be ready
at io.strimzi.operator.common.VertxUtil$1.lambda$handle$1(VertxUtil.java:156) ~[io.strimzi.operator-common-0.38.0.jar:0.38.0]
at io.vertx.core.impl.future.FutureImpl$3.onFailure(FutureImpl.java:153) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.vertx.core.impl.future.FutureBase.lambda$emitFailure$1(FutureBase.java:69) ~[io.vertx.vertx-core-4.4.6.jar:4.4.6]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[io.netty.netty-transport-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.100.Final.jar:4.1.100.Final]
at java.lang.Thread.run(Thread.java:840) ~[?:?]
2024-02-06 08:02:43 ERROR StaticHostProvider:148 - Unable to resolve address: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc/:2181
java.net.UnknownHostException: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) ~[?:?]
at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:934) ~[?:?]
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1543) ~[?:?]
at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:852) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
2024-02-06 08:02:44 ERROR StaticHostProvider:148 - Unable to resolve address: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc/:2181
java.net.UnknownHostException: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper.zookeeper-3.8.3.jar:3.8.3]
2024-02-06 08:02:45 ERROR StaticHostProvider:148 - Unable to resolve address: pos-batch-broker-zookeeper-0.pos-batch-broker-zookeeper-nodes.pos-pre.svc/:2181
Additional context
No response