This issue was moved to a discussion.

kafkamirror2 conf not working #7713

Closed
ervikrant06 opened this issue Nov 27, 2022 · 1 comment

@ervikrant06

Describe the bug

  • Created two Kafka clusters in the same k8s cluster, in different namespaces.
  • Tried to set up replication between the two clusters using a KafkaMirrorMaker2 resource.
  • The KafkaMirrorMaker2 pod is stuck in CrashLoopBackOff with the following stack trace:
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching kafka-test1.kube2.example.com found.
        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:353)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:296)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:291)
        at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1357)
        at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1232)
        at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1175)
        at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
        at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
        at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1074)
        at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1061)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1008)
        at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
        at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
        at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
        at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
        at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
        at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1415)
        at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1346)
        at java.base/java.lang.Thread.run(Thread.java:829)

Expected behavior
The KafkaMirrorMaker2 pod should have come up automatically, since the certs are managed by the operator itself.

Environment (please complete the following information):

  • Strimzi operator version: 0.32.0
  • Installation method: YAML files.
  • Kubernetes cluster: Kubernetes v1.24.2

YAML files and logs

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: my-mirror-maker-2
spec:
  version: 3.3.1
  replicas: 1
  connectCluster: "my-target-cluster"
  clusters:
  - alias: "my-source-cluster"
    bootstrapServers: kafka-test.kube2.example.com:9094
    tls:
      trustedCertificates:
        - secretName: test-kafka-cluster-ca-cert
          certificate: ca.crt
  - alias: "my-target-cluster"
    bootstrapServers: kafka-test1.kube2.example.com:9094
    tls:
      trustedCertificates:
        - secretName: test-kafka1-cluster-ca-cert
          certificate: ca.crt
    config:
      # -1 means it will use the default replication factor configured in the broker
      config.storage.replication.factor: -1
      offset.storage.replication.factor: -1
      status.storage.replication.factor: -1
  mirrors:
  - sourceCluster: "my-source-cluster"
    targetCluster: "my-target-cluster"
    sourceConnector:
      config:
        replication.factor: 1
        offset-syncs.topic.replication.factor: 1
        sync.topic.acls.enabled: "false"
    heartbeatConnector:
      config:
        heartbeats.topic.replication.factor: 1
    checkpointConnector:
      config:
        checkpoints.topic.replication.factor: 1
    topicsPattern: ".*"
    groupsPattern: ".*"


$ kubectl get KafkaMirrorMaker2
NAME                DESIRED REPLICAS   READY
my-mirror-maker-2   1

$ kubectl get pod
NAME                                              READY   STATUS             RESTARTS      AGE
my-mirror-maker-2-mirrormaker2-55cbff5d9d-vlf8j   0/1     CrashLoopBackOff   9 (29s ago)   21m

Partial snippet of the Kafka deployment relevant to the issue:

  clientsCa:
    generateCertificateAuthority: true
    validityDays: 1825
  clusterCa:
    generateCertificateAuthority: true
    validityDays: 365
  kafka:

      configuration:
        brokers:
        - advertisedHost: test-kafka1-kafka-0.kube2.example.com
          advertisedPort: 9094
          broker: 0
          annotations:
            external-dns.alpha.kubernetes.io/hostname: test-kafka1-kafka-0.kube2.example.com
        - advertisedHost: test-kafka1-kafka-1.kube2.example.com
          advertisedPort: 9094
          broker: 1
          annotations:
            external-dns.alpha.kubernetes.io/hostname: test-kafka1-kafka-1.kube2.example.com
        - advertisedHost: test-kafka1-kafka-2.kube2.example.com
          advertisedPort: 9094
          broker: 2
          annotations:
            external-dns.alpha.kubernetes.io/hostname: test-kafka1-kafka-2.kube2.example.com
    livenessProbe:
      initialDelaySeconds: 1500
    replicas: 3
    resources:
      limits:
        cpu: "8"
        memory: 16Gi
      requests:
        cpu: "4"
        memory: 8Gi
    storage:
      class: storageclass-wekafs-dir-api
      deleteClaim: false
      size: 100Gi
      type: persistent-claim
    template:
      externalBootstrapService:
        metadata:
          annotations:
            external-dns.alpha.kubernetes.io/hostname: kafka-test1.kube2.example.com

I don't see any SAN embedded in the certs.

I am using a p-dns/e-dns setup on k8s.

@ervikrant06
Author

Added SAN using alternativeNames property.
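
For reference, a minimal sketch of where `alternativeNames` sits in the `Kafka` resource. The listener name, its `type`, and `tls: true` are assumptions, because the listener definition is not shown in the snippet above; only the port and the hostname come from this issue.

```yaml
# Sketch only: listener name and type are assumed, not taken from the issue.
spec:
  kafka:
    listeners:
      - name: external          # assumed listener name
        port: 9094
        type: loadbalancer      # assumed; the snippet above does not show the type
        tls: true
        configuration:
          bootstrap:
            # Extra names added to the SAN list of the broker certificates
            alternativeNames:
              - kafka-test1.kube2.example.com
```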

$ kubectl get secret test-kafka1-kafka-brokers -o jsonpath='{.data.test-kafka1-kafka-0\.crt}' | base64 --decode | openssl x509 -noout -text | grep DNS | tr , '\n' | cut -d: -f2 | grep 'example.com'
kafka-test1.kube2.example.com
test-kafka1-kafka-0.kube2.example.com

This time the KafkaMirrorMaker2 pod logs show a different error.

2022-11-27 10:55:52,493 ERROR [AdminClient clientId=adminclient-1] Connection to node -1 (kafka-test1.kube2.example.com/10.xx.xx.xx:9094) failed authentication due to: Failed to process post-handshake messages (org.apache.kafka.clients.NetworkClient) [kafka-admin-client-thread | adminclient-1]

2022-11-27 10:55:52,494 WARN [AdminClient clientId=adminclient-1] Metadata update failed due to authentication error (org.apache.kafka.clients.admin.internals.AdminMetadataManager) [kafka-admin-client-thread | adminclient-1]
org.apache.kafka.common.errors.SslAuthenticationException: Failed to process post-handshake messages
Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate
        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:117)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:340)
        at java.base/sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:293)
        at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:186)
        at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:172)
        at java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681)
        at java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636)
        at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454)
        at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433)
        at java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637)
        at org.apache.kafka.common.network.SslTransportLayer.read(SslTransportLayer.java:576)
        at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:95)
        at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:452)
        at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:402)
        at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:674)
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:576)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
        at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1415)
        at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1346)
        at java.base/java.lang.Thread.run(Thread.java:829)



2022-11-27 10:55:52,499 ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed) [main]
org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
        at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:77)
        at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:58)
        at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:97)
        at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:80)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.SslAuthenticationException: Failed to process post-handshake messages
        at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
        at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
        at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165)
        at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:71)
        ... 3 more
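
One possible reading of this error: a `bad_certificate` alert that arrives after the handshake usually means the broker expected a client certificate and did not accept what the client sent. That would fit an external listener configured with `authentication: type: tls`, while the KafkaMirrorMaker2 cluster entry above only sets `trustedCertificates`. The listener definition is not shown in this issue, so this is an assumption rather than a confirmed cause. If it applies, the cluster entry would need a client certificate, roughly as sketched here; the secret name `my-mm2-user` is hypothetical.

```yaml
# Sketch only: assumes the target listener requires TLS client authentication.
spec:
  clusters:
    - alias: "my-target-cluster"
      bootstrapServers: kafka-test1.kube2.example.com:9094
      tls:
        trustedCertificates:
          - secretName: test-kafka1-cluster-ca-cert
            certificate: ca.crt
      authentication:
        type: tls
        certificateAndKey:
          secretName: my-mm2-user   # hypothetical secret holding a client cert and key
          certificate: user.crt
          key: user.key
```

Since the two Kafka clusters run in different namespaces, note that the secrets referenced here have to exist in the namespace where the KafkaMirrorMaker2 resource is deployed.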

@strimzi strimzi locked and limited conversation to collaborators Nov 27, 2022
@scholzj scholzj converted this issue into discussion #7714 Nov 27, 2022
