Adding RemainingActivityListener for diagnosis #643
This looks very good to me.
As far as I can tell, this is an "always on" extension that will be run on every test. Let me know if I've misunderstood.
Correct; the default behavior is to print a warning about a potential detected problem, but otherwise take no action.
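A minimal sketch of the warn-by-default versus fatal behavior described here. The property name `org.jvnet.hudson.test.RemainingActivityListener.fatal` is taken from the `pom.xml` diff in this conversation; the class name `RemainingActivityCheck`, the `report` method, and the way the property is read are hypothetical and are not claimed to match the harness's actual implementation.

```java
import java.util.logging.Logger;

public class RemainingActivityCheck {
    private static final Logger LOGGER = Logger.getLogger(RemainingActivityCheck.class.getName());

    // Property name as it appears in the pom.xml diff; reading it via Boolean.getBoolean is an assumption.
    static final String FATAL_PROPERTY = "org.jvnet.hudson.test.RemainingActivityListener.fatal";

    /**
     * Reports a detected problem (hypothetical helper).
     * Returns false if there is nothing to report, true if a warning was logged,
     * and throws AssertionError when fatal mode is enabled.
     */
    static boolean report(String problem) {
        if (problem == null) {
            return false; // nothing detected
        }
        if (Boolean.getBoolean(FATAL_PROPERTY)) {
            throw new AssertionError("Remaining activity: " + problem);
        }
        LOGGER.warning("Potential problem: " + problem);
        return true;
    }

    public static void main(String[] args) {
        report("executor still busy"); // default: warning only
        System.setProperty(FATAL_PROPERTY, "true");
        try {
            report("executor still busy"); // fatal mode: throws
        } catch (AssertionError e) {
            System.out.println("fatal mode: " + e.getMessage());
        }
    }
}
```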
ICYMI jenkinsci/copyartifact-plugin#206 (comment) (in that case |
This flags builds that are executed with |
Not that I am aware of; do you have an example?
Try running the core test suite with |
Possibly. Is this output visible somewhere? Some PR? Any specific test cases for example?
That doesn't seem to help. Even with

diff --git a/src/main/java/org/jvnet/hudson/test/JenkinsRule.java b/src/main/java/org/jvnet/hudson/test/JenkinsRule.java
index 424bad72..9c460f96 100644
--- a/src/main/java/org/jvnet/hudson/test/JenkinsRule.java
+++ b/src/main/java/org/jvnet/hudson/test/JenkinsRule.java
@@ -1442,6 +1442,7 @@ public class JenkinsRule implements TestRule, MethodRule, RootAction {
      * Asserts that the outcome of the build is a specific outcome.
      */
     public <R extends Run> R assertBuildStatus(Result status, R r) throws Exception {
+        waitForCompletion(r);
         if(status==r.getResult())
             return r;

I still get e.g.
where that test is only running FreeStyleBuild build = rule.buildAndAssertSuccess(project); I get dozens of similar errors. I am afraid this utility is not very useful with this high rate of false positives. Note that I am running with |
I will check if I can reproduce, and adjust accordingly.
I see some failures using

diff --git test/pom.xml test/pom.xml
index 75035b367d..19e08b39d0 100644
--- test/pom.xml
+++ test/pom.xml
@@ -116,7 +116,7 @@ THE SOFTWARE.
     <dependency>
       <groupId>${project.groupId}</groupId>
       <artifactId>jenkins-test-harness</artifactId>
-      <version>2058.va_7b_41a_286207</version>
+      <version>2062.v3efc79721e45</version>
       <scope>test</scope>
       <exclusions>
         <exclusion>
@@ -322,6 +322,7 @@ THE SOFTWARE.
           <systemPropertyVariables>
             <hudson.maven.debug>${mavenDebug}</hudson.maven.debug>
             <buildDirectory>${project.build.directory}</buildDirectory>
+            <org.jvnet.hudson.test.RemainingActivityListener.fatal>true</org.jvnet.hudson.test.RemainingActivityListener.fatal>
           </systemPropertyVariables>
           <reuseForks>false</reuseForks>
         </configuration>

mostly of queue items, which may or may not matter depending on the circumstances, and some executors. For now, |
I'm not sure I would really consider this change complete until it is proven that the fatal mode can be configured permanently in some significant test suite. I would personally not have approved this PR in its current form. Another problem is these false positives:
private static String problem() {
    for (Computer c : Jenkins.get().getComputers()) {
        for (Executor x : c.getAllExecutors()) {
This code is incorrect. QueueTaskFuture#get (which is what JenkinsRule uses to determine build completion) waits until AsyncFutureImpl#set is called to complete the future before notifying consumers. That call is done by WorkUnitContext#synchronizeEnd via Executor#finish1. At that point, Executor#isIdle is still false and the executor is still in Computer#getAllExecutors (which is what the code in this PR is checking). Executor#finish2 is called after Executor#finish1 and calls Computer#removeExecutor, which (possibly asynchronously!) calls executors.remove(e) to remove the executor, at which point Computer#getAllExecutors will finally stop returning the non-idle executor. If Executor#finish1 completes before the executor is removed by Executor#finish2 (possible in certain scheduling scenarios), then the code in this PR will result in a false positive. Similarly, if Executor#finish2 decides to run the removal asynchronously (possible in certain scheduling scenarios), then the code in this PR will also result in a false positive.
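The ordering problem can be illustrated with a small standalone model. This is only a sketch under stated assumptions: it uses plain `java.util.concurrent` types rather than Jenkins classes, the class name `FinishRaceDemo` and the string `"executor-1"` are hypothetical, and a latch forces the unlucky scheduling so the outcome is reproducible. The worker thread completes the future first (standing in for Executor#finish1) and only later removes itself from the list (standing in for Executor#finish2).

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class FinishRaceDemo {

    /** Returns {executors seen right after get(), executors seen after cleanup}. */
    static int[] run() throws Exception {
        List<String> executors = new CopyOnWriteArrayList<>();
        CompletableFuture<String> buildFuture = new CompletableFuture<>();
        CountDownLatch allowCleanup = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            executors.add("executor-1");
            // Stand-in for finish1: the future completes, waking up the waiting test.
            buildFuture.complete("SUCCESS");
            try {
                // Forces the unlucky schedule: cleanup is delayed past the test's check.
                allowCleanup.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Stand-in for finish2: only now does the executor leave the list.
            executors.remove("executor-1");
        });
        worker.start();

        buildFuture.get();                       // the "build" is reported complete
        int seenAfterGet = executors.size();     // the executor is still listed here

        allowCleanup.countDown();
        worker.join();
        int seenAfterCleanup = executors.size(); // the executor is gone once cleanup has run

        return new int[] {seenAfterGet, seenAfterCleanup};
    }

    public static void main(String[] args) throws Exception {
        int[] result = run();
        System.out.println("after get: " + result[0] + ", after cleanup: " + result[1]);
    }
}
```

A check that inspects the list immediately after `get()` returns sees the still-registered executor and reports it, even though nothing is actually leaking.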
There are really two things that need to be solved to make this code correct:
- The code in this PR needs to wait until both Executor#finish1 and Executor#finish2 have been called before calling Computer#getAllExecutors. Currently this code runs after QueueTaskFuture#get, which means that Executor#finish1 has been called but Executor#finish2 has not necessarily been called.
- If Executor#finish2 has been called but has placed its task onto Computer#threadPoolForRemoting (which happens only sometimes), then the code in this PR needs to wait until that task has completed, for example by having Computer#removeExecutor keep track of how many pending removals are in progress.
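The second point (counting pending removals) could look roughly like the following. This is a sketch, not Jenkins code: `PendingRemovalTracker`, `removeExecutorAsync`, and `awaitPendingRemovals` are hypothetical names, a single-thread pool stands in for Computer#threadPoolForRemoting, and the sketch does not address the first point (knowing that Executor#finish2 has been called at all).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PendingRemovalTracker {
    private final List<String> executors = new CopyOnWriteArrayList<>();
    // Stand-in for the thread pool that sometimes performs the removal.
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private int pendingRemovals; // guarded by "this"

    void addExecutor(String e) {
        executors.add(e);
    }

    /** Schedules the removal and counts it as pending until it has actually run. */
    void removeExecutorAsync(String e) {
        synchronized (this) {
            pendingRemovals++;
        }
        pool.execute(() -> {
            try {
                executors.remove(e);
            } finally {
                synchronized (this) {
                    pendingRemovals--;
                    notifyAll(); // wake up anyone blocked in awaitPendingRemovals
                }
            }
        });
    }

    /** Blocks until no removals are pending; returns false if the timeout elapses first. */
    synchronized boolean awaitPendingRemovals(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (pendingRemovals > 0) {
            long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remaining <= 0) {
                return false;
            }
            wait(remaining);
        }
        return true;
    }

    List<String> getAllExecutors() {
        return executors;
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        PendingRemovalTracker tracker = new PendingRemovalTracker();
        tracker.addExecutor("executor-1");
        tracker.removeExecutorAsync("executor-1");
        // Only inspect the executor list once pending removals have drained.
        if (tracker.awaitPendingRemovals(5000)) {
            System.out.println("remaining executors: " + tracker.getAllExecutors());
        }
        tracker.shutdown();
    }
}
```

Incrementing the counter before handing the task to the pool matters: a caller that starts waiting right after `removeExecutorAsync` returns is guaranteed to see the removal as pending rather than racing with its submission.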
Helpful to see cases where #198 revealed something nondeterministic about test execution.