Kafka Liveness probe not working for some cases #10332
-
That is correct and it is by design. You do not want to spam the broker with connections on all the different ports (among other reasons, because it would also spam the logs with errors). You also cannot check any other connections because you cannot assume which connections should or should not be there.
-
OK, understood. In my specific case, Kafka was stuck (as the nc exit code shows) and the liveness probe was not able to catch it, so the pod was Ready but unresponsive. Thanks a lot for your input!
-
Hi, I am starting this discussion to get some views on the current liveness probe in Strimzi. We are facing a problem where, in some cases, a Kafka pod gets into the Ready state but does not respond to requests on any port.
Kafka - 3.2.0
Strimzi - 0.29
```
[kafka@kafka-cluster-kafka-0 kafka]$ netstat -antp | grep LIST
tcp6 0 0 :::9404 :::* LISTEN 143/java
tcp6 51 0 :::9091 :::* LISTEN 143/java
tcp6 0 0 :::9090 :::* LISTEN 143/java
tcp6 51 0 :::9093 :::* LISTEN 143/java
tcp6 51 0 :::9092 :::* LISTEN 143/java
tcp6 0 0 :::42095 :::* LISTEN 143/java
[kafka@kafka-cluster-kafka-0 kafka]$ netstat -lnt | grep -Eq 'tcp6?[[:space:]]+[0-9]+[[:space:]]+[0-9]+[[:space:]]+[^ ]+:9091.*LISTEN[[:space:]]*'
[kafka@kafka-cluster-kafka-0 kafka]$ echo $?
0
[kafka@kafka-cluster-kafka-0 kafka]$ nc -z ::1 9092
[kafka@kafka-cluster-kafka-0 kafka]$ echo $?
1
```
So I looked at the liveness probe to understand why the pod was not restarted. It seems the probe only checks that port 9091 is in the LISTEN state in netstat; it does not check that a connection can actually be made.
Any recommendations on why this may be happening? And is it okay to modify the liveness probe with additional checks so that Kubernetes restarts the pod in situations like the one above?
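For illustration, here is a minimal sketch of the kind of additional check being asked about. It is not the Strimzi probe script. The function names (`port_is_listening`, `port_accepts_connection`, `liveness_check`) are hypothetical, and it assumes `netstat`, `timeout`, and `bash` are available in the container. The first function repeats the LISTEN-state test on netstat output; the second tries to open a real TCP connection, which is the step that failed with `nc -z` above.

```shell
# Hypothetical helpers, not part of Strimzi.

# Reads `netstat -lnt` output on stdin; succeeds if a TCP socket
# is in LISTEN state on the given local port ($1).
port_is_listening() {
  grep -Eq "tcp6?[[:space:]]+[0-9]+[[:space:]]+[0-9]+[[:space:]]+[^ ]+:$1[[:space:]].*LISTEN"
}

# Succeeds only if a TCP connection to host ($1) and port ($2) can be
# opened within the timeout in seconds ($3, default 5).
# Assumes coreutils `timeout` and bash (for the /dev/tcp redirection).
port_accepts_connection() {
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Combined check: listen port ($1) must be in LISTEN state and the
# client port ($2) must accept a connection on localhost.
liveness_check() {
  netstat -lnt | port_is_listening "$1" || return 1
  port_accepts_connection localhost "$2" 5
}
```

With the ports from the output above, a probe could call `liveness_check 9091 9092` and exit non-zero on failure. Note that this opens a new connection on every probe run, which is the trade-off raised in the reply: extra connections and possibly extra log noise on the broker.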