Recover gracefully when a PlaceholderTask is in the queue but the associated build is complete #185

Merged
merged 6 commits into from
Dec 6, 2021
Changes from 2 commits
2 changes: 1 addition & 1 deletion pom.xml
@@ -71,7 +71,7 @@
<useBeta>true</useBeta>
<gitHubRepo>jenkinsci/${project.artifactId}-plugin</gitHubRepo>
<hpi.compatibleSinceVersion>2.40</hpi.compatibleSinceVersion>
-<jenkins-test-harness.version>1664.ve9ed23f5e0f2</jenkins-test-harness.version> <!-- TODO: https://github.com/jenkinsci/jenkins-test-harness/pull/353 -->
+<jenkins-test-harness.version>1666.vd1360abbfe9e</jenkins-test-harness.version>
</properties>
<dependencyManagement>
<dependencies>
@@ -422,6 +422,22 @@ public String getCookie() {
}

@Override public CauseOfBlockage getCauseOfBlockage() {
+Run<?, ?> run = runForDisplay();
Member Author

I think this is safe. If we get here for a step that is just starting or resuming, then the run is already loaded, so this should complete quickly. The only time this should be slow is after a Jenkins restart when the build has already completed: we then end up here without the build having been loaded via some other route, and we trigger the cancellation path.

+if (!stopping && run != null && !run.isLogUpdated()) {
+stopping = true;
+LOGGER.warning(() -> "Refusing to build " + this + " and cancelling it because associated build is complete");
+Timer.get().execute(() -> {
+Queue.getInstance().cancel(this);
+});
+}
+if (stopping) {
+return new CauseOfBlockage() {
+@Override
+public String getShortDescription() {
+return "Stopping " + getDisplayName();
+}
+};
Comment on lines +432 to +437
Member Author

Not sure if an anonymous class or hard-coded text is ok here. My thought was that this cause should not usually be around long enough for anyone to see it, but I guess we should set up localization just in case.

Member

An anonymous class should be fine here. As you say, it ought not appear in the GUI for more than a moment if at all.

+}
return null;
}
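
The pattern in this hunk (notice that the owning build is finished, flip a flag once, cancel the queue item off the calling thread, and report a blockage in the meantime) can be sketched without Jenkins on the classpath. Everything below is a hypothetical stand-in: `StaleTaskSketch`, its `queue` field, and the `buildComplete` supplier are illustrative names, not the plugin's or Jenkins' API, and `volatile` is an assumption made for this sketch rather than something the diff shows.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a queued task; this is not the Jenkins API. */
class StaleTaskSketch {
    private final Queue<StaleTaskSketch> queue;   // stand-in for the build queue
    private final BooleanSupplier buildComplete;  // stand-in for "run exists and its log is no longer updated"
    private final ExecutorService background;     // stand-in for a shared timer/executor
    // Assumption for this sketch: the flag may be read from several threads.
    private volatile boolean stopping;

    StaleTaskSketch(Queue<StaleTaskSketch> queue, BooleanSupplier buildComplete, ExecutorService background) {
        this.queue = queue;
        this.buildComplete = buildComplete;
        this.background = background;
    }

    /** Returns null when the task may run, otherwise a short reason why it is blocked. */
    String getCauseOfBlockage() {
        if (!stopping && buildComplete.getAsBoolean()) {
            stopping = true;
            // Cancel asynchronously so the caller, which may hold queue locks, is not re-entered.
            // Removal is idempotent, so a rare double submission would be harmless here.
            background.execute(() -> queue.remove(this));
        }
        return stopping ? "Stopping" : null;
    }

    public static void main(String[] args) throws InterruptedException {
        Queue<StaleTaskSketch> queue = new ConcurrentLinkedQueue<>();
        ExecutorService background = Executors.newSingleThreadExecutor();
        StaleTaskSketch task = new StaleTaskSketch(queue, () -> true, background);
        queue.add(task);
        System.out.println("cause: " + task.getCauseOfBlockage());
        background.shutdown();
        background.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("queue empty: " + queue.isEmpty());
    }
}
```

The flag check comes first in the condition so that, once cancellation has been requested, later polls return the blockage immediately without re-evaluating whether the build is complete.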

@@ -128,7 +128,6 @@
import static org.junit.Assert.fail;
import org.junit.Assume;
import org.junit.ClassRule;
-import org.junit.Ignore;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
@@ -1268,8 +1267,8 @@ public void accessPermittedOnlyFromCurrentBuild() throws Throwable {
});
}

-@Ignore("TODO safe fix still TBD")
@Test public void placeholderTaskInQueueButAssociatedBuildComplete() throws Throwable {
+logging.record(ExecutorStepExecution.class, Level.FINE).capture(50);
Path tempQueueFile = tmp.newFile().toPath();
sessions.then(r -> {
WorkflowJob p = r.createProject(WorkflowJob.class, "p");
@@ -1289,6 +1288,8 @@ public void accessPermittedOnlyFromCurrentBuild() throws Throwable {
// Create a node with the correct label and let the build complete.
DumbSlave node = r.createOnlineSlave(Label.get("custom-label"));
r.assertBuildStatusSuccess(r.waitForCompletion(b));
+// Remove node so that tasks requiring custom-label are stuck in the queue.
+Jenkins.get().removeNode(node);
});
// Copy the temp queue.xml over the real one. The associated build has already completed, so the queue now
// has a bogus PlaceholderTask.
@@ -1298,7 +1299,10 @@ public void accessPermittedOnlyFromCurrentBuild() throws Throwable {
WorkflowRun b = p.getBuildByNumber(1);
assertFalse(b.isLogUpdated());
r.assertBuildStatusSuccess(b);
-assertThat(Queue.getInstance().getItems(), emptyArray()); // This assertion fails.
+while (Queue.getInstance().getItems().length > 0) {
+Thread.sleep(100L);
+}
+assertThat(logging.getMessages(), hasItem(startsWith("Refusing to build ExecutorStepExecution.PlaceholderTask{runId=p#")));
});
}

Member

BTW the reason there was no newline here before is that this step was a @TestExtension of the test formerly directly above it; the new test now sits between them.
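
As an aside on the polling loop added to the test above: it waits for the queue to drain with no deadline of its own. One possible refinement, not something this PR does, is a small bounded-wait helper. The sketch below uses only the JDK; `BoundedWaitSketch` and `waitUntil` are hypothetical names, and whether the test harness already enforces an overall timeout is not stated on this page.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: poll a condition, but give up after a deadline. */
class BoundedWaitSketch {
    /**
     * Polls {@code condition} every {@code interval} until it holds or {@code timeout} elapses.
     * Returns true if the condition became true, false if the deadline passed first.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            // Compare via subtraction so the check stays correct if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(interval.toMillis());
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 300 ms after start.
        boolean met = waitUntil(() -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(5), Duration.ofMillis(100));
        System.out.println("condition met: " + met);
    }
}
```

In a test like the one above, the condition would be the queue-is-empty check, and the boolean result could be asserted so that a stuck queue fails with a specific message instead of hanging.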
