[Broker] Terminate JVM when initialize-cluster-metadata command fails #16039

@lhotari lhotari commented Jun 13, 2022

  • the script gets stuck unless the JVM is terminated explicitly

Fixes an issue where the bin/pulsar initialize-cluster-metadata command gets stuck with these log lines as the last entries:

2022-06-13T11:33:41,805+0000 [main-SendThread(pulsar-luna-uswest1-staging-zookeeper-ca.pulsar.svc.cluster.local:2181)] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server pulsar-luna-uswest1-staging-zookeeper-ca.pulsar.svc.cluster.local/10.56.0.47:2181.
2022-06-13T11:33:41,805+0000 [main-SendThread(pulsar-luna-uswest1-staging-zookeeper-ca.pulsar.svc.cluster.local:2181)] INFO  org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
2022-06-13T11:33:42,814+0000 [main-SendThread(pulsar-luna-uswest1-staging-zookeeper-ca.pulsar.svc.cluster.local:2181)] WARN  org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x0.
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777) ~[?:?]
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:344) ~[org.apache.zookeeper-zookeeper-3.8.0.jar:3.8.0]
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1282) [org.apache.zookeeper-zookeeper-3.8.0.jar:3.8.0]
2022-06-13T11:33:42,921+0000 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x0 closed
2022-06-13T11:33:42,921+0000 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x0
Exception in thread "main" org.apache.pulsar.metadata.api.MetadataStoreException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
	at org.apache.pulsar.metadata.impl.ZKMetadataStore.<init>(ZKMetadataStore.java:108)
	at org.apache.pulsar.metadata.impl.MetadataStoreFactoryImpl.newInstance(MetadataStoreFactoryImpl.java:56)
	at org.apache.pulsar.metadata.impl.MetadataStoreFactoryImpl.createExtended(MetadataStoreFactoryImpl.java:36)
	at org.apache.pulsar.metadata.api.extended.MetadataStoreExtended.create(MetadataStoreExtended.java:40)
	at org.apache.pulsar.PulsarClusterMetadataSetup.initMetadataStore(PulsarClusterMetadataSetup.java:380)
	at org.apache.pulsar.PulsarClusterMetadataSetup.main(PulsarClusterMetadataSetup.java:238)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
	at org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase.waitForConnection(ZooKeeperWatcherBase.java:159)
	at org.apache.pulsar.metadata.impl.PulsarZooKeeperClient$Builder.build(PulsarZooKeeperClient.java:259)
	at org.apache.pulsar.metadata.impl.ZKMetadataStore.<init>(ZKMetadataStore.java:100)
	... 5 more

This PR terminates the JVM explicitly when the command throws an exception, so that the command exits instead of hanging.
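For illustration only, here is a minimal sketch of the general pattern, not the actual diff of this PR. It assumes the hang comes from non-daemon threads that stay alive after `main` throws (a JVM only exits on its own once all non-daemon threads have finished). The class `ExitOnFailureSketch`, the `Command` interface, and `runAndTerminate` are hypothetical names, and `System.exit` is just one way to terminate; the real change in `PulsarClusterMetadataSetup` may differ.

```java
import java.util.function.IntConsumer;

public final class ExitOnFailureSketch {

    /** Hypothetical stand-in for a setup step that may throw. */
    public interface Command {
        void run() throws Exception;
    }

    /**
     * Runs the command and always hands an exit code to the terminator:
     * 0 on success, 1 if anything was thrown. Passing the terminator in
     * keeps the logic testable without killing the JVM.
     */
    public static void runAndTerminate(Command command, IntConsumer terminator) {
        int exitCode = 0;
        try {
            command.run();
        } catch (Throwable t) {
            System.err.println("Command failed: " + t);
            exitCode = 1;
        }
        terminator.accept(exitCode);
    }

    public static void main(String[] args) {
        // A lingering non-daemon thread, standing in for client threads
        // that would otherwise keep the JVM alive after main fails.
        Thread lingering = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // exit quietly when interrupted
            }
        }, "lingering-non-daemon");
        lingering.start();

        // The simulated failure ends the process with status 1 instead of hanging.
        runAndTerminate(() -> {
            throw new IllegalStateException("simulated connection loss");
        }, System::exit);
    }
}
```

Without the explicit `System.exit` call, the uncaught exception would only end the `main` thread and the process would keep running for as long as the lingering thread is alive.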

@MMirelli MMirelli left a comment

LGTM

@gaoran10 gaoran10 left a comment

LGTM

@lhotari lhotari merged commit eb9e0aa into apache:master Jun 14, 2022