IGNITE-13012 Make node connection checking rely on the configuration. Simplify node ping routine. #7835
Please check the correctness of the Jira issue number in the PR heading. It looks incorrect: IGNITE-13021 is about the new SQL engine, not about connectivity.
Fixed, the heading now references IGNITE-13012. Thanks!
# Conflicts: # modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java
@sergey-chugunov-1985, I see you have good experience with TcpDiscoverySpi. Could you take a look at this ticket too?
hasRemoteSrvNodes = ring.hasRemoteServerNodes();

if (hasRemoteSrvNodes) {
if (hasRemoteSrvNodes)
Why not call the updateLastSentMessageTime method here as well?
"Why not call the updateLastSentMessageTime method here as well?"
We haven't successfully sent the message at this point, because we haven't received RES_OK.
As you can see, we call updateLastSentMessageTime() after spi.readReceipt succeeds or after a proper TcpDiscoveryHandshakeResponse is read. These are the places where we are sure the message was sent and the connection is OK.
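The rule discussed in this thread can be sketched as follows. This is only an illustration, not the ServerImpl code: the class AckedSendTracker and its methods are hypothetical, and a boolean flag stands in for a successful spi.readReceipt or a proper TcpDiscoveryHandshakeResponse.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch: the last-sent timestamp moves only when a send is acknowledged. */
public class AckedSendTracker {
    /** Time source in milliseconds, injected so the behavior is easy to check. */
    private final LongSupplier clock;

    /** Time of the last send that the remote node confirmed. */
    private long lastSentMsgTime;

    public AckedSendTracker(LongSupplier clock) {
        this.clock = clock;
        this.lastSentMsgTime = clock.getAsLong();
    }

    /**
     * @param acknowledged Stand-in (assumption) for a received RES_OK receipt
     *      or a valid handshake response.
     */
    public void onSendAttempt(boolean acknowledged) {
        // An unacknowledged write proves nothing about the connection,
        // so the timestamp is left untouched in that case.
        if (acknowledged)
            lastSentMsgTime = clock.getAsLong();
    }

    public long lastSentMessageTime() {
        return lastSentMsgTime;
    }
}
```

With this shape, a write that never receives RES_OK cannot postpone failure detection.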
… Simplify node ping routine. (apache#7835)
This PR is the first step toward improving and speeding up node failure detection. The goal is simple, predictable, and configurable node pinging.
Fixes:
Connection failure is now detected within IgniteConfiguration.failureDetectionTimeout instead of 500ms + IgniteConfiguration.failureDetectionTimeout.
The connection check interval in TCP discovery now depends on the configured failure detection timeout. The previous fixed 500ms is now the minimum interval. This makes node pinging robust and keeps the failure detection timeout accurate.
Removed the additional connection check. This premature node ping also depended on received messages. For example, if node 2 received no message from the previous node 1 within some time, it sent an extra ping to the next node 3 without waiting for the regular ping. This caused confusion and gave no meaningful guarantees.
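As a rough illustration of an interval that follows the configured timeout but never drops below 500ms, consider the sketch below. The class name and the divisor CHECKS_PER_TIMEOUT are assumptions made for this example. The PR description does not state the actual ratio used in TcpDiscoverySpi.

```java
/** Hypothetical sketch of a connection check interval derived from the failure detection timeout. */
public class ConnectionCheckInterval {
    /** Minimum interval, taken from the description above (the previous fixed 500 ms). */
    public static final long MIN_INTERVAL_MS = 500;

    /** ASSUMPTION: illustrative divisor only, not taken from the PR. */
    public static final int CHECKS_PER_TIMEOUT = 4;

    /** Returns the interval between connection checks for the given timeout, in milliseconds. */
    public static long intervalMs(long failureDetectionTimeoutMs) {
        return Math.max(MIN_INTERVAL_MS, failureDetectionTimeoutMs / CHECKS_PER_TIMEOUT);
    }
}
```

Under these assumed constants, a 10000ms timeout gives a 2500ms interval, while a 1000ms timeout is clamped to the 500ms minimum.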
Behavior changes:
TcpDiscoveryConnectionCheckMessage is not sent if there is message traffic within the actual failure detection timeout, because any message checks the connection.
The failure detection timeout is now the overall timeout since the last message was sent, not a timeout on the current message exchange.
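The two behavior changes can be modeled with a small state holder. This is a sketch under stated assumptions, not the discovery implementation: FailureDetectionSketch, its method names, and the separate check interval parameter are hypothetical, and timestamps are passed in explicitly instead of being read from a clock.

```java
/** Hypothetical model of the behavior changes: one timestamp drives both decisions. */
public class FailureDetectionSketch {
    private final long failureDetectionTimeoutMs;
    private final long checkIntervalMs;

    /** Time of the last message that was sent and acknowledged. */
    private long lastAckedSendMs;

    public FailureDetectionSketch(long failureDetectionTimeoutMs, long checkIntervalMs, long nowMs) {
        this.failureDetectionTimeoutMs = failureDetectionTimeoutMs;
        this.checkIntervalMs = checkIntervalMs;
        this.lastAckedSendMs = nowMs;
    }

    /** Any acknowledged message counts as a connection check. */
    public void onAcknowledgedSend(long nowMs) {
        lastAckedSendMs = nowMs;
    }

    /** A dedicated check message is needed only when regular traffic has been quiet. */
    public boolean needsConnectionCheck(long nowMs) {
        return nowMs - lastAckedSendMs >= checkIntervalMs;
    }

    /** The timeout is measured from the last acknowledged send, not per message exchange. */
    public boolean connectionFailed(long nowMs) {
        return nowMs - lastAckedSendMs >= failureDetectionTimeoutMs;
    }
}
```

Because both decisions read the same timestamp, regular traffic suppresses the dedicated check message and also keeps the failure deadline moving forward.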