Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-69890] Prevent deadlock on websocket agents #595

Merged
merged 5 commits into from
Oct 26, 2022

Conversation

amuniz
Copy link
Member

@amuniz amuniz commented Oct 19, 2022

More information in JENKINS-69890

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@amuniz amuniz requested review from jglick and Vlatombe October 19, 2022 09:48
Comment on lines +628 to +630
// making this call async to avoid potential deadlocks when some thread is holding a lock on the
// channel object while this thread is trying to acquire it to call Transport#terminate
ch.get().executor.submit(() -> transport.terminate(new ChannelClosedException(ch.get(), null)));
Copy link
Member

@Vlatombe Vlatombe Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect a similar change would be needed in onError method if it works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

I just added it as well.

Copy link
Member

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs at least a manual sanity check. You can also check in core WebSocketAgentsTest and JNLPLauncherRealTest.webSocket.

@basil
Copy link
Member

basil commented Oct 21, 2022

I wonder if this could be one cause of occasional flakes in JnlpSlaveRestarterInstallerTest#webSocketReconnection like https://ci.jenkins.io/job/Core/job/jenkins/job/PR-7283/1/testReport/junit/jenkins.slaves.restarter/JnlpSlaveRestarterInstallerTest/Windows_jdk11___Windows_Build___Test___webSocketReconnection/ — I've never seen that test flake with TCP reconnection.

@aneveux
Copy link
Member

aneveux commented Oct 25, 2022

@jglick Good idea!

On top of running the automated tests of that project, I also built it and used it in core in order to run the tests you mentioned, and they went fine.

@amuniz
Copy link
Member Author

amuniz commented Oct 26, 2022

Two spotbugs issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug For changelog: Fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants