ZOOKEEPER-3574: Close quorum socket asynchronously to avoid server sh… #1115
2c8dc40 to
44fc0e7
Compare
|
retest maven build |
|
retest maven build |
cb8ac1b to
939a486
Compare
939a486 to
0473b08
Compare
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1 (PreCommit-ZOOKEEPER-github-pr-build-maven/org.apache.zookeeper:zookeeper: 1) |
|
retest maven build |
|
Manually restarted the job on Jenkins and it passed. |
eb6add0 to
5d233e2
Compare
anmolnar
left a comment
A few nitpicks
| protected Socket sock; | ||
| protected MultipleAddresses leaderAddr; | ||
| protected AtomicBoolean sockBeingClosed = new AtomicBoolean(false); |
I think socketClosed would be a better name for this variable, because it's not only guarding the "closing" method.
It is ONLY guarding the closing method. When it is set, the socket might not be closed for a long time, e.g. 30 seconds. But I don't want to name it "socketClosed" because it doesn't mean the socket is closed. Thoughts?
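The distinction being debated here can be illustrated with a minimal sketch. It assumes the flag is flipped with `compareAndSet` at the moment a close is requested, which may be well before the socket is actually closed. The class and method names (`CloseGuardSketch`, `tryStartClose`, `closesStarted`) are hypothetical and are not taken from the PR.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the Learner's close guard; not the PR's code.
class CloseGuardSketch {
    // Set when closing has been requested, not when the socket is actually closed.
    final AtomicBoolean sockBeingClosed = new AtomicBoolean(false);
    // Counts how many callers were allowed to start a close (illustration only).
    final AtomicInteger closesStarted = new AtomicInteger();

    // Returns true only for the single caller that wins the right to start closing.
    boolean tryStartClose() {
        if (!sockBeingClosed.compareAndSet(false, true)) {
            return false; // another caller is already closing the socket
        }
        closesStarted.incrementAndGet();
        return true;
    }

    public static void main(String[] args) {
        CloseGuardSketch guard = new CloseGuardSketch();
        System.out.println(guard.tryStartClose()); // prints true
        System.out.println(guard.tryStartClose()); // prints false
    }
}
```

In this sketch the flag only says "a close has been started", which matches the author's argument that a name like `socketClosed` would overstate what it guarantees.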
| void closeSockSync() { | ||
| try { | ||
| long startTime = Time.currentElapsedTime(); | ||
| sock.close(); |
You might want to add 2 more things here:
- double check that sock is still not null
- set sock = null after the close to prevent further use of the object and let the GC collect it
done.
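A rough sketch of what the two suggestions could look like together is shown below. It is not the code that was committed: `SyncCloseSketch` is a hypothetical class, `System.nanoTime()` stands in for ZooKeeper's `Time.currentElapsedTime()`, and printing stands in for the real logging.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical illustration of the reviewer's two suggestions; not the PR's code.
class SyncCloseSketch {
    Socket sock;

    SyncCloseSketch(Socket sock) {
        this.sock = sock;
    }

    // Closes the socket on the calling thread. Returns false if there was nothing to close.
    boolean closeSockSync() {
        Socket s = sock;
        if (s == null) {
            return false; // suggestion 1: re-check that sock is still set
        }
        try {
            long startNanos = System.nanoTime(); // assumption: stand-in for Time.currentElapsedTime()
            s.close();
            long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;
            System.out.println("Socket closed in " + elapsedMs + " ms");
        } catch (IOException e) {
            System.out.println("Ignoring error while closing socket: " + e);
        } finally {
            sock = null; // suggestion 2: drop the reference so it cannot be reused
        }
        return true;
    }
}
```

Copying the field into a local variable before the null check keeps the check and the `close()` call working on the same object even if another thread clears the field in between.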
|
Can you please introduce a https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool() |
|
Also, better than using a conditional statement in the close method: Always use a ThreadPool, but use a single-thread thread pool or a 'direct' thread pool: |
This is for closing the quorum socket once in a while, not the client sockets. I checked the link you provided and it says a thread will be terminated if it has not been used for 60 seconds; I would expect the lifetime of a learner to be much longer than that. So I don't see any benefit in using a thread pool, or am I missing anything? |
The conditional statement is to preserve the current "blocking" behavior in case it's needed. How do I do that using a thread pool? |
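As one possible answer to the question above, the sketch below shows how an `Executor` could cover both modes: a "direct" executor (`Runnable::run`) runs the task on the calling thread and so keeps the blocking behavior, while a single-thread executor moves the close off the caller. This is only an illustration of the reviewer's idea, not what the PR does; `CloseExecutorSketch`, `chooseExecutor`, `threadThatRan`, and the thread name `CloseSocketThread` are all assumptions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of "always use an Executor"; not the PR's code.
class CloseExecutorSketch {
    // Picks the execution strategy once, instead of branching inside the close method.
    static Executor chooseExecutor(boolean closeSocketAsync) {
        if (closeSocketAsync) {
            // Background daemon thread: the caller returns immediately.
            return Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "CloseSocketThread");
                t.setDaemon(true);
                return t;
            });
        }
        // "Direct" executor: the task runs on the calling thread, so the call blocks.
        return Runnable::run;
    }

    // Runs a trivial task and reports the name of the thread that executed it.
    static String threadThatRan(Executor executor) {
        String[] name = new String[1];
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            name[0] = Thread.currentThread().getName();
            done.countDown();
        });
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return name[0];
    }

    public static void main(String[] args) {
        System.out.println(threadThatRan(chooseExecutor(false)));
        System.out.println(threadThatRan(chooseExecutor(true)));
    }
}
```

Whether this is preferable to a plain conditional is a judgment call; for a socket that is closed only occasionally, the author's point that a pool brings little benefit still stands.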
ed3a9d5 to
c70965a
Compare
lvfangmin
left a comment
+1
Thanks @jhuan31.
eolivelli
left a comment
I will merge soon
Thank you guys!
ztzg
left a comment
LGTM, but 1. untested, and 2. see the documentation question.
| public static final String LEARNER_CLOSE_SOCKET_ASYNC = "learner.closeSocketAsync"; | ||
| public static final boolean closeSocketAsync = Boolean.getBoolean(LEARNER_CLOSE_SOCKET_ASYNC); |
Not trying to block this PR, and things can be improved later, but:
I see that such properties, including learner.asyncSending above it, are not documented. Is there a specific policy regarding which properties are to be mentioned in zookeeperAdmin.md? I am wondering how one can learn about all these knobs, and their raison d'être? (I know the corresponding ticket ID can be found via Git, but that incurs some overhead.)
Thanks for your comments. I will add the flag to zookeeperAdmin. As for testing, I find it hard to construct a unit test for this, but this feature has been enabled in our production for months.
@jhuan31: By "untested," I meant: I have looked into the code, but I haven't tested it :) I understand that it is not always easy to create meaningful tests for socket behavior. Thank you for taking care of the documentation.
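For readers wondering how such a knob is switched on, the following sketch shows how `Boolean.getBoolean` behaves. It reads a JVM system property and returns true only when that property is set to "true" (case-insensitive). The property name is the one shown in the diff snippet above and may differ in the merged code; `FlagSketch` and `isCloseSocketAsync` are hypothetical names.

```java
// Hypothetical illustration of a system-property flag; not the PR's code.
class FlagSketch {
    // Name as it appears in the diff under review; the merged name may differ.
    static final String LEARNER_CLOSE_SOCKET_ASYNC = "learner.closeSocketAsync";

    // Re-reads the system property on every call. A static final field, as in
    // the diff, would instead capture the value once at class initialization.
    static boolean isCloseSocketAsync() {
        return Boolean.getBoolean(LEARNER_CLOSE_SOCKET_ASYNC);
    }

    public static void main(String[] args) {
        System.clearProperty(LEARNER_CLOSE_SOCKET_ASYNC);
        System.out.println(isCloseSocketAsync()); // prints false (unset defaults to false)

        System.setProperty(LEARNER_CLOSE_SOCKET_ASYNC, "true");
        System.out.println(isCloseSocketAsync()); // prints true

        System.setProperty(LEARNER_CLOSE_SOCKET_ASYNC, "yes");
        System.out.println(isCloseSocketAsync()); // prints false (only "true" counts)
    }
}
```

Under these assumptions the flag would be enabled by passing `-Dlearner.closeSocketAsync=true` on the JVM command line; because the diff stores the value in a static final field, changing the property after the class has loaded would have no effect there.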
…utdown stalled by long socket closing time
eolivelli
left a comment
@ztzg as a rule of thumb we have to document each knob.
Usually, once we introduce a knob we are not going to drop it anymore.
@eolivelli Does this PR need more changes? Is it ready for merge? I refer to "learner.closeSocketAsync" in a later PR, #1301, and @symat points out that he can't find it in the code (since this PR is not merged yet) :) |
eolivelli
left a comment
LGTM
Sorry for the delay.
I can merge as soon as I am back at work tomorrow
…utdown stalled by long socket closing time Author: Jie Huang <[email protected]> Reviewers: Enrico Olivelli <[email protected]>, Fangmin Lyu <[email protected]> Closes apache#1115 from jhuan31/ZOOKEEPER-3574