Deflake and speed up parts of ExecutorStepTest
#224
Calling `setLabelString` on an agent does not persist the node. In older versions of Jenkins the tests were less flaky because adding any Node caused all labels to be re-evaluated, so when creating a few agents and adding labels in a loop, the last agent created would at least deterministically ensure that all previous agents' labels were correct. However, since 2.332 (jenkinsci/jenkins#5882), only the labels that are part of an added or removed node are updated, and the agents were created without labels, which were added later.

This caused the tests to be flaky depending on when the periodic `trimLabels` was called (or at least on other timing-related things).

This was discovered by enabling a `LoggerRule` for `hudson.model.Queue` and observing that the builds would time out because not all the agents had the expected labels, e.g. from `reuseNodesWithSameLabelsInParallelStages`:

```
12.023 [id=141] FINEST hudson.model.Queue#maintain: JobOffer[ #1] rejected part of demo #1: ‘Jenkins’ doesn’t have label ‘foo’
12.023 [id=141] FINEST hudson.model.Queue#maintain: JobOffer[slave0 #0] is a potential candidate for task part of demo #1
12.024 [id=141] FINEST hudson.model.Queue#maintain: JobOffer[slave2 #0] rejected part of demo #1: ‘slave2’ doesn’t have label ‘foo’
12.024 [id=141] FINEST hudson.model.Queue#maintain: JobOffer[slave1 #0] rejected part of demo #1: ‘slave1’ doesn’t have label ‘foo’
12.024 [id=141] FINEST hudson.model.Queue#maintain: JobOffer[ #0] rejected part of demo #1: ‘Jenkins’ doesn’t have label ‘foo’
12.024 [id=141] FINEST hudson.model.Queue#maintain: JobOffer[slave3 #0] rejected part of demo #1: ‘slave3’ doesn’t have label ‘foo’
```

Additionally, creating agents and waiting for them to come online is slow. A pipeline will start and then wait for the node to be available, so we can do other things whilst the agent is connecting.

For the case where we need a number of agents connected before we start to run the pipeline, we now create all the agents before waiting for them to connect.
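The "create everything first, then wait" idea can be sketched outside Jenkins with a toy model. Everything in the block below is hypothetical: `StartAllThenWait`, `connect`, and the sleep that stands in for an agent connecting are illustrations only, not part of Jenkins or its test harness. In the real test, the launching step corresponds to calls such as `r.createSlave("remote quick", null)` from the diff. The sketch only shows that launching all agents before blocking lets their connection times overlap instead of adding up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class StartAllThenWait {

    // Hypothetical stand-in for an agent connecting: sleeps, then reports its name.
    static String connect(String name, long connectMillis) {
        try {
            TimeUnit.MILLISECONDS.sleep(connectMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return name;
    }

    // Launch every agent first, then block until each one is online.
    static List<String> startAllThenWait(int count, long connectMillis) {
        // One thread per agent so that the simulated connections really overlap.
        ExecutorService pool = Executors.newFixedThreadPool(count);
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final String name = "agent" + i;
                pending.add(CompletableFuture.supplyAsync(() -> connect(name, connectMillis), pool));
            }
            // Only now do we wait; results are collected in launch order.
            List<String> online = new ArrayList<>();
            for (CompletableFuture<String> agent : pending) {
                online.add(agent.join());
            }
            return online;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> online = startAllThenWait(3, 200);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(online + " online after about " + elapsedMs + " ms");
    }
}
```

With three simulated agents that each take 200 ms, the total wait should be close to 200 ms rather than 600 ms, which is the kind of saving this change is aiming for.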
5f726ea to 47e6c43
(could switch label to |
These are currently the most flaky of all, so |
One apparent mistake, otherwise looks good.
@@ -139,10 +140,8 @@ public class ExecutorStepTest {
      */
     @Test public void buildShellScriptOnSlave() throws Throwable {
         sessions.then(r -> {
-            DumbSlave s = r.createOnlineSlave();
-            s.setLabelString("remote quick");
+            DumbSlave s = r.createSlave("remote quick", null);
// Note: for Jenkins versions > 2.65, the number of agents must be increased to 5.
// This is due to changes in the Load Balancer (See JENKINS-60563).
int totalAgents = Jenkins.getVersion().isNewerThan(new VersionNumber("2.265")) ? 5 : 3;
BTW the comment does not match the version number string, and both are older than the baseline…could be cleaned up (but best left to another PR I suppose).
ExecutorStepTest
…cutorStepTest.java Co-authored-by: Jesse Glick <[email protected]>