Fix/rc active peers #1120

Merged
merged 3 commits into from
Jul 6, 2018

Conversation

mkalinin
Contributor

@mkalinin mkalinin commented Jul 5, 2018

No description provided.

This occurred quite rarely, but it produced a ghost peer connection: one that is actually dead
but is still picked up by the tx and block propagation threads,
so messages were queued in memory with no chance of being released.
ChannelManager#newPeers is a CopyOnWriteArrayList, so when notifyDisconnect was called
the peer was simply removed from a copy of the newPeers list
while a concurrent processNewPeers execution added it to activePeers.
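
The interleaving described above can be replayed on a single thread, which makes it deterministic. The following is only a minimal sketch: `GhostPeerDemo`, the string peer ids, and the use of a `ConcurrentHashMap` for `activePeers` are assumptions made for illustration and are not taken from the real ChannelManager. It relies on the fact that a `CopyOnWriteArrayList` iterator works on a snapshot of the list taken when the iterator is created.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified stand-in for the manager; not the real ChannelManager.
class GhostPeerDemo {
    final List<String> newPeers = new CopyOnWriteArrayList<>();
    final Map<String, String> activePeers = new ConcurrentHashMap<>();

    // Replays the problematic interleaving on one thread so the outcome is deterministic.
    static boolean ghostPeerSurvives() {
        GhostPeerDemo m = new GhostPeerDemo();
        m.newPeers.add("peer-1");

        // "processNewPeers" starts: the iterator sees a snapshot of the list.
        Iterator<String> snapshot = m.newPeers.iterator();

        // "notifyDisconnect" runs in between and removes the peer from both collections.
        m.newPeers.remove("peer-1");
        m.activePeers.remove("peer-1");

        // "processNewPeers" resumes: the snapshot still contains the removed peer.
        List<String> processed = new ArrayList<>();
        while (snapshot.hasNext()) {
            String peer = snapshot.next();
            m.activePeers.put(peer, peer);
            processed.add(peer);
        }
        m.newPeers.removeAll(processed);

        // The disconnected peer is now registered as active: a ghost peer.
        return m.activePeers.containsKey("peer-1");
    }

    public static void main(String[] args) {
        System.out.println("ghost peer in activePeers: " + ghostPeerSurvives());
    }
}
```

In the sketch, nothing ever removes the peer from `activePeers` again, which matches the symptom in the description: propagation threads keep queueing messages for a connection that is already dead.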
@mkalinin mkalinin requested a review from zilm13 July 5, 2018 15:18
@mkalinin mkalinin requested a review from Nashatyrev July 5, 2018 15:18
@coveralls
coveralls commented Jul 5, 2018

Coverage Status

Coverage decreased (-0.02%) to 56.02% when pulling 051fd73 on fix/rc-active-peers into 5913b62 on develop.

Collaborator

@zilm13 zilm13 left a comment

Looks good

@@ -153,7 +153,7 @@ public void connect(Node node) {
         return ids;
     }

-    private void processNewPeers() {
+    private synchronized void processNewPeers() {
Member

Things like channel.disconnect() under a lock make me nervous :(
I.e., we are calling a method that can potentially traverse many components. This is a very deadlock-prone approach, imho.

@@ -344,7 +344,7 @@ public void add(Channel peer) {
         newPeers.add(peer);
     }

-    public void notifyDisconnect(Channel channel) {
+    public synchronized void notifyDisconnect(Channel channel) {
         logger.debug("Peer {}: notifies about disconnect", channel);
         channel.onDisconnect();
Member

The same here. There is no need to call those notify methods under the lock. They can potentially cause deadlocks as a side effect.

@Nashatyrev
Member

Nashatyrev commented Jul 6, 2018

IMHO, synchronizing such a central manager class should be done very carefully.

I don't like the outer calls under locks in the original fix. They are potentially deadlock-prone: even though the current implementation may not deadlock, a minor, harmless-looking change on the other side may cause the problem.

The only thing we need to ensure here is that on a notifyDisconnect call the peer is 100% deleted from either newPeers or activePeers. To achieve this, we had better synchronize only access to those fields.
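
The narrower locking suggested here could look roughly like the sketch below. It is an illustration under assumptions, not the code that was merged: `NarrowLockManager`, the `Peer` interface, `peersLock`, and `isTracked` are hypothetical names, and the real processNewPeers does more work than moving peers between collections. The point it shows is that only access to the two collections is guarded, while the outer call (`onDisconnect`) runs with no lock held.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified manager; names and structure are assumptions for illustration.
class NarrowLockManager {
    interface Peer {
        String id();
        void onDisconnect();
    }

    // A single private lock guards both collections and nothing else.
    private final Object peersLock = new Object();
    private final List<Peer> newPeers = new ArrayList<>();
    private final Map<String, Peer> activePeers = new HashMap<>();

    void add(Peer peer) {
        synchronized (peersLock) {
            newPeers.add(peer);
        }
    }

    // Moves pending peers into the active map atomically with respect to notifyDisconnect.
    void processNewPeers() {
        synchronized (peersLock) {
            for (Peer peer : newPeers) {
                activePeers.put(peer.id(), peer);
            }
            newPeers.clear();
        }
    }

    void notifyDisconnect(Peer peer) {
        // Outer call made without holding the lock, so it cannot deadlock on peersLock.
        peer.onDisconnect();
        synchronized (peersLock) {
            // Whichever collection holds the peer, it is gone after this block.
            newPeers.remove(peer);
            activePeers.remove(peer.id());
        }
    }

    boolean isTracked(Peer peer) {
        synchronized (peersLock) {
            return newPeers.contains(peer) || activePeers.containsKey(peer.id());
        }
    }
}
```

Because the move from `newPeers` to `activePeers` and the removal in `notifyDisconnect` take the same lock, a disconnect either sees the peer in one of the two collections and removes it, or runs before the peer was added at all; the peer cannot reappear in `activePeers` afterwards.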

@mkalinin mkalinin added this to the 1.9.0 milestone Jul 6, 2018
@mkalinin mkalinin merged commit f93d484 into develop Jul 6, 2018

processed.add(peer);
if (addCnt > 0) {
Collaborator

addCnt is not used anymore in this code

@mkalinin mkalinin deleted the fix/rc-active-peers branch December 26, 2018 06:35