-
-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove queue locking requirement when adding new nodes #5450
Conversation
} | ||
fireOnFailure(f, cause); | ||
} | ||
Jenkins jenkins = Jenkins.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entire change is removing locking and changing 8 spaces of indent
protected void updateNewComputer(final Node n, boolean automaticSlaveLaunch) { | ||
final String nodeName = n.getNodeName(); | ||
final Map<Node, Computer> computers = getComputerMap(); | ||
if (computers.containsKey(n)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity checking although technically not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, any specific testing been done?
I've only managed some basic testing, I don't have the facilities to do a good enough stress test. |
@jglick it is years since I looked at this code. I have forgotten most everything I knew. Removing myself from the reviewers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No comment
🤷 Thought I would try! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks sensible 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see anything wrong in this pull request. I suggest to go ahead with merging, but indeed we should be monitoring the Jenkins issue tracker for reported regressions. We have plenty of time until the next baseline, so it is a good time to take some risks
We may merge it in 24 hours if there is no negative feedback. Please see the merge process documentation for more information about the merge process
}); | ||
old.set(nodes.put(node.getNodeName(), node)); | ||
jenkins.updateNewComputer(node); | ||
jenkins.trimLabels(); | ||
// TODO there is a theoretical race whereby the node instance is updated/removed after lock release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As there is no longer a lock above, is this comment obsolete?
With this PR, |
Indeed:
Although the config screen for |
I suspect this somehow broke jenkins/core/src/main/java/hudson/model/Computer.java Lines 882 to 895 in 20082d4
|
Not sure how this causes a problem, but the test issue seems to indicate as much |
Appears to cause JENKINS-69534 |
@@ -141,17 +141,10 @@ public void addNode(final @NonNull Node node) throws IOException { | |||
|
|||
Node oldNode = nodes.get(node.getNodeName()); | |||
if (node != oldNode) { | |||
// TODO we should not need to lock the queue for adding nodes but until we have a way to update the | |||
// computer list for just the new node | |||
AtomicReference<Node> old = new AtomicReference<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the inner class gone, there is no more need for an AtomicReference
so this could be simplified.
See JENKINS-65501.
While investigating I saw that technically addNode can be made lock-free so this is a stab at it. This would greatly help cloud implementations when provisioning a large number of nodes but of course this will not help removal as that will still require locking.
It can be made non locking because a Node with no computers will not have any executors, this will ensure that it does not confuse Queue#maintain(). But because updateComputerList deals with other computers as well this method needs to lock. Adding a separate method for a single new computer should remove this locking requirement.
References:
Queue#maintain()
Proposed changelog entries
Proposed upgrade guidelines
N/A
Submitter checklist
Proposed changelog entries
section only if there are breaking changes or other changes which may require extra steps from users during the upgradeDesired reviewers
@mention
Maintainer checklist
Before the changes are marked as
ready-for-merge
:Proposed changelog entries
are correctupgrade-guide-needed
label is set and there is aProposed upgrade guidelines
section in the PR title. (example)lts-candidate
to be considered (see query).