-
Notifications
You must be signed in to change notification settings - Fork 84
Description
@jerhard and I were toying around with this idea a week or so ago, so I thought I'd write down some of our thoughts so they don't get lost.
The idea is that with the advent of Multicore OCaml one should take another look at whether we can meaningfully parallelize our solvers.
An idea that we had was that a natural point for decomposition would be the globals. I.e, there could be one thread T_main that is in charge of maintaining globals that all side-effects are communicated to. This seems a particular good point as values at globals only grow anyway.
Each sover thread T_solve would still contain a "normal" TD3-style solver, and also keep a map of the values of globals it has observed thus far.
When T_solve accesses a global [g] during solving, it would consult T_main to retrieve the most up-to-date value of the global in T_main and compare with what it has stored internally:
- If the value this thread has for the global internally is greater or equal to the one coming from T_main, no extra destabilization because of new contributions of other solver threads is necessary.
- If it is not, one does a destabilization as if there had been a new local contribution to the global, and the value of the global in the local map is updated.
By the same token, all side-effects a solver thread T_solve makes would also be communicated to T_main so they can be recorded there and potentially be passed on to other threads.
What I am still not sure about is how the work would be distributed to the different solver threads.
- One possibility would be to have a solver thread per (abstract) thread in the program, but that doesn't seem so generic. The advantage would be that one expects the analysis of different threads to be relatively separate from each other.
- The other idea would be that solver threads handle sets of calls (from return to their start point). The thing that would have to change here is how we handle returns from procedure calls. Currently, we simply query the local of the return node. If we potentially want different threads to handle different calls, such communication would need to go via T_main again.
Of course one would also have to think about termination of this modified solver. Naively, one would take it for granted, as updates to globals only happen finitely often, and each T_solve should internally have the same termination properties as the single threaded TD3, but I assume that view is too naive.
Whether this would actually pay off in times of runtime is also something I don't really have an intuition for, but I think it is an interesting thing to think about. It may be a challanging Master's thesis for a very good student, or we could do this ourselves.