-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Implement parallel executor service without ForkJoinPool
#5060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* More precise clock * Fixed width thread names * Abbreviated package names
To avoid race conditions with other workers that lead to stalling.
var allForkedChildren = forkConcurrentChildren(testTasks, isolatedTasks::add, sameThreadTasks); | ||
executeAll(sameThreadTasks); | ||
var remainingForkedChildren = stealWork(allForkedChildren); | ||
waitFor(remainingForkedChildren); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name remainingForkedChildren
isn't quite right. These can be children that are picked up by other threads or children that are currently blocked by a locked resource.
So before waiting this thread could execute any remaining unexecuted children in a blocking fashion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a proposal? concurrentChildren
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not immediately. I'd rework the logic first then see what falls out.
The logic would have to make a distinction between children that executed by another thread and children that were skipped over to due to resource locks. The latter can be executed blockingly before calling waitFor
. While looping through the resourceLockedChildren
(?) the remainingForkedChildren
would also have to be updated as other threads pick them up.
So maybe name remainingForkedChildren
to childrenExecutedOnAnotherThread
, but that is unwieldy. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I should have prefaced this by noting that the current thread starts to wait to early, there is potentially still work left to be done. But the naming is hiding that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good suggestion fell out in 1ffca7b, not included on this branch yet.
...rg/junit/platform/engine/support/hierarchical/ConcurrentHierarchicalTestExecutorService.java
Show resolved
Hide resolved
...rg/junit/platform/engine/support/hierarchical/ConcurrentHierarchicalTestExecutorService.java
Outdated
Show resolved
Hide resolved
if (result == 0) { | ||
result = Boolean.compare(this.isContainer(), that.isContainer()); | ||
if (result == 0) { | ||
result = Integer.compare(that.attempts, this.attempts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think attempt
should be sorted on first.
---
config:
layout: elk
---
erDiagram
A ||--o{ AA: ""
A ||--o{ AB: ""
A ||--o{ AC: ""
AA ||--o{ AAA: ""
AA ||--o{ AAB: ""
AB ||--o{ ABA: ""
AB ||--o{ ABB: ""
Given this tree. Suppose
- thread 1 is working on AA, claiming resource A,
- thread 2 is working on AB and ABA, no resources needed.
- thread 3 wants to work on AC, and would need resource A.
Item ABB would be available for work stealing, but isn't picked up by thread 3 because every time item AC is considered it readded to the front of the queue with an incremented attempt counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. No it won't because the work stealer will use executeTask
which blocks. So it hangs on ABB
. This doesn't seem efficient.
I need to think about this a bit. It feels like there should be a more efficient way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could work steal across subtrees but then would have to make sure the resource locks are compatible. My intention was to avoid that complication at the expense of needing an extra thread in these cases. I think that tradeoff is acceptable as long as we guarantee that no more than parallelism
tests are running concurrently which is now accomplished via worker leases. Happy to discuss ideas, though, of course!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking about stealing work across subtrees. That's probably not necessary (yet?). Rather, I'm looking to keep the maximum number of created threads low and reuse existing once as much as possible - or when the maximum pool size is limited, optimize utilization.
This is a useful property, for example when using a web driver, it is reused between tests by using a thread locals. So fewer threads means fewer drivers are created.
if (!queueEntries.isEmpty()) { | ||
if (sameThreadTasks.isEmpty()) { | ||
// hold back one task for this thread | ||
sameThreadTasks.add(queueEntries.poll().task); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm observing that invoking poll()
does not preserve queueEntries
insertion order. This means that we put children, a, b and c in the queue they'll come out a, c, b.
To ensure our tests are stable and understandable is probably not what we want to happen. And while there is no guaranteed ordering during test execution, when users do order their tests, I would expect that we pick them up in that order if possible.
Perhaps each entry needs an index in addition to its level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make a test for this too. There are at least a few (2+) entries needed to make the flip happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var claimed = workQueue.remove(entry); | ||
if (claimed) { | ||
LOGGER.trace(() -> "stole work: " + entry); | ||
var executed = tryExecute(entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an intuition that the current thread should await any locks while the queue processing threads should skip over locked tasks. I.e. the reverse of the current setup.
If implemented correctly, the invariant of the current thread is it will finish all nodes of the sub-tree it working on unless they're stolen away. Because the current thread can't steal any work, it is not wasted if it has to await a lock and there is nothing else to do. The queue processing nodes however are wasted, they could be stealing something else.
We can then refine that further by making the current thread also skip over locked tasks and come back to them later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to give that a try on another branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, though I don't know how much work you've already got in progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've now pushed everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, pushed a few more commits. I hope they are not throwing a wrench in your plans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This delays blocking until there is nothing else to do #5078
This PR introduces a new implementation of
HierarchicalTestExecutorService
that runs rests in parallel and has limited work stealing capabilities but is not based onForkJoinPool
. It avoids its pitfalls (such as #3945) and #3108 but may require additional threads because its work stealing is limited to direct children. Contrary to theForkJoinPool
implementation, the new executor service guarantees that no more thanparallelism
test nodes are executed in parallel.My intention is to initially ship this implementation as an opt-in feature (via the new
junit.jupiter.execution.parallel.executor
configuration parameter) in 6.1, make it an opt-out feature in 6.2, and drop support for theForkJoinPool
-based implementation in a later to-be-determined release.The PR is not yet finished but feedback is already welcome! If you use parallel test execution in your projects (or other test engines), it would be great if you could try out the new implementation and report your observations.
Resolves #3108.
I hereby agree to the terms of the JUnit Contributor License Agreement.
Definition of Done
@API
annotations