[Issue 8093] Fix client lookup hangs when broker restarts #8101
@@ -400,6 +400,11 @@ public boolean registerNamespace(String namespace, boolean ensureOwned) throws P
    private void searchForCandidateBroker(NamespaceBundle bundle,
            CompletableFuture<Optional<LookupResult>> lookupFuture,
            LookupOptions options) {
        if( null == pulsar.getLeaderElectionService() || ! pulsar.getLeaderElectionService().isElected()) {
            LOG.warn("The leader election has not yet been completed! NamespaceBundle[{}]", bundle);
            lookupFuture.completeExceptionally(new IllegalStateException("The leader election has not yet been completed!"));
Thanks for the great contribution. Would it be better to return a retryable exception to the client, so that the client can reconnect later? Does this make sense?
As discussed with @4onni, this will result in a client lookup timeout, and the client will retry the lookup later, so it's not a problem here.
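To make the retry behaviour discussed in this thread concrete, here is a minimal sketch of a caller that re-issues a failed asynchronous lookup a bounded number of times. It is not the Pulsar client's actual code: `RetryingLookupSketch` and `lookupWithRetry` are hypothetical names, and a real client would presumably wait (with backoff) between attempts rather than retry immediately.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical helper, not part of Pulsar: retries an async lookup that may
// fail while the broker is still starting up.
class RetryingLookupSketch {

    // Runs the lookup up to maxAttempts times and completes the returned
    // future with the first success, or with the last error seen.
    static <T> CompletableFuture<T> lookupWithRetry(
            Supplier<CompletableFuture<T>> lookup, int maxAttempts) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(lookup, maxAttempts, result);
        return result;
    }

    private static <T> void attempt(Supplier<CompletableFuture<T>> lookup,
                                    int remaining,
                                    CompletableFuture<T> result) {
        lookup.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (remaining > 1) {
                // Assumption: every failure is treated as retryable here.
                attempt(lookup, remaining - 1, result);
            } else {
                result.completeExceptionally(error);
            }
        });
    }
}
```

With this shape, it matters less which exception type the broker returns, as long as the lookup future fails instead of hanging: a future that never completes gives the caller nothing to retry on.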
/pulsarbot run-failure-checks
Fixes apache#8093

### Motivation
Client hangs forever when all brokers stop and then restart.
There are several steps need to be finished before the broker can be fully started, as illustrated in the pseudo code below:
```
PulsarService#start():
    broker.start(); // Step 1
    webService.start(); // Step 2
    leaderElectionService.start(); // Step 3
```
If a lookup request gets in between Step 2 and Step 3, a NPE would be thrown, which will block all other coming requests from getting processed properly.

### Modifications
Client can only connect to the broker after the election service started successfully

### Verifying this change
This change added tests and can be verified as follows:
- Added 2 test cases under `LeaderElectionServiceTest`

(cherry picked from commit 65cf9c0)
Already merged into branch-2.6 for the 2.6.2 release.
…)" This reverts commit a4a363c.
Fixes #8093
Motivation
Client hangs forever when all brokers stop and then restart.
Several steps must finish before the broker is fully started, as illustrated in the pseudocode below:
PulsarService#start():
    broker.start(); // Step 1
    webService.start(); // Step 2
    leaderElectionService.start(); // Step 3
If a lookup request arrives between Step 2 and Step 3, an NPE is thrown, which blocks all subsequent requests from being processed properly.
Modifications
A client can connect to the broker only after the election service has started successfully.
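The startup window and the guard can be illustrated with a small, self-contained sketch. `ElectionStub`, `BrokerStub` and `StartupWindowSketch` are hypothetical stand-ins rather than Pulsar classes; only the null/`isElected()` check and the exception message mirror the diff in this pull request. Unlike the diff, the sketch reads the service reference once into a local variable.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical stand-in for the leader election service.
class ElectionStub {
    private volatile boolean elected;
    void start() { elected = true; }
    boolean isElected() { return elected; }
}

// Hypothetical stand-in for the broker; it only models the startup order.
class BrokerStub {
    private volatile ElectionStub election; // stays null until Step 3

    void startWebService() {
        // Step 2: from this point on, lookup requests can reach the broker.
    }

    void startLeaderElection() {
        // Step 3: publish the service only after it has started.
        ElectionStub service = new ElectionStub();
        service.start();
        election = service;
    }

    // Fails the future instead of dereferencing a missing election service.
    CompletableFuture<Optional<String>> lookup(String bundle) {
        CompletableFuture<Optional<String>> future = new CompletableFuture<>();
        ElectionStub service = election;
        if (service == null || !service.isElected()) {
            future.completeExceptionally(new IllegalStateException(
                    "The leader election has not yet been completed!"));
            return future;
        }
        future.complete(Optional.of("broker-for-" + bundle));
        return future;
    }
}

class StartupWindowSketch {
    public static void main(String[] args) {
        BrokerStub broker = new BrokerStub();
        broker.startWebService();
        try {
            // Arrives between Step 2 and Step 3.
            broker.lookup("ns/bundle-0").join();
        } catch (CompletionException e) {
            System.out.println("early lookup rejected: " + e.getCause().getMessage());
        }
        broker.startLeaderElection();
        System.out.println("late lookup: " + broker.lookup("ns/bundle-0").join().orElse("none"));
    }
}
```

The early lookup completes exceptionally instead of throwing a `NullPointerException` on the request path, so a caller sees a failed future it can act on.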
Verifying this change
This change added tests and can be verified as follows:
Added 2 test cases under LeaderElectionServiceTest