My tests of ElementwiseProblem with dask, ray, starmap multiprocessing, and a futures ProcessPoolExecutor all fail in a Jupyter notebook. Every variant except starmap shows the same behavior: the run hangs forever and uses only one CPU core. Even starmap configured with more than one worker (up to 24) only makes the run take longer while still using a single core. That leaves the default runner, LoopedElementwiseEvaluation, which, unexpectedly, is the fastest and the only one that works reliably (it, too, uses only one core, of course). I also tested the futures executor, dask, and ray separately with a runner implementation similar to pymoo's: ray was slow and did not use all CPU cores, dask used all cores but was slower than the futures executor, and the futures executor was the fastest. A sketch of my setup is below.
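For reference, a minimal sketch of how the runners were wired up (pymoo 0.6-style API; SphereProblem, the pool size, and the GA settings are placeholders rather than my actual notebook code):

```python
import numpy as np
from multiprocessing import Pool

from pymoo.algorithms.soo.nonconvex.ga import GA
from pymoo.core.problem import ElementwiseProblem, StarmapParallelization
from pymoo.optimize import minimize


class SphereProblem(ElementwiseProblem):
    """Cheap placeholder objective; the real problem is defined the same way."""

    def __init__(self, **kwargs):
        super().__init__(n_var=10, n_obj=1, xl=-5.0, xu=5.0, **kwargs)

    def _evaluate(self, x, out, *args, **kwargs):
        out["F"] = float(np.sum(x ** 2))


if __name__ == "__main__":
    # starmap variant: hand a multiprocessing pool's starmap to pymoo's runner
    with Pool(24) as pool:
        runner = StarmapParallelization(pool.starmap)
        problem = SphereProblem(elementwise_runner=runner)
        res = minimize(problem, GA(), ("n_gen", 50), seed=1, verbose=False)
        print(res.F)

    # the dask / ray / ProcessPoolExecutor variants only swap out the
    # elementwise_runner object; everything else stays the same
```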
Looking at your code, my assumption is that parallelization with dask and ray introduces a significant amount of overhead (because of serialization). In my opinion, the main advantage of this kind of parallelization shows up with a cloud service (e.g. AWS Lambda functions) at a larger scale. For instance, launching 200 instances in parallel will certainly beat running this on 4 cores.
Can you confirm your results with a computationally heavier problem as well? Say, a (time-discrete) simulation that takes a minute or so per evaluation, like the sketch below. Happy to discuss this a little more here.
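Something along these lines would do for the test; the sleep just stands in for the simulation, and the one-minute duration and problem dimensions are arbitrary:

```python
import time

import numpy as np
from pymoo.core.problem import ElementwiseProblem


class SlowProblem(ElementwiseProblem):
    """Stand-in for an expensive (time-discrete) simulation."""

    def __init__(self, eval_time=60.0, **kwargs):
        super().__init__(n_var=10, n_obj=1, xl=-5.0, xu=5.0, **kwargs)
        self.eval_time = eval_time

    def _evaluate(self, x, out, *args, **kwargs):
        time.sleep(self.eval_time)  # emulate ~1 minute of simulation work
        out["F"] = float(np.sum(x ** 2))
```

With the default LoopedElementwiseEvaluation, a population of 20 individuals then needs roughly 20 minutes per generation, so any speedup (or lack of it) from the parallel runners should be unmistakable.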
I don't think it was caused by overhead. It turns out that ray works when I do not configure the computing resources at all (i.e. the default ray.init() configuration, which picks up all 32 logical cores). However, it sometimes crashes with that setup.
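Roughly, this is how I wrapped ray as a custom elementwise runner (a sketch only; RayParallelization is my own wrapper class and the names are placeholders, not pymoo's API):

```python
import ray

# ray.init(num_cpus=8)    # explicitly limiting resources is where I saw the problems
ray.init()                # default: ray picks up all 32 logical cores -> this mostly works


@ray.remote
def _evaluate_remote(f, x):
    # f is the per-solution evaluation callable pymoo hands to the runner;
    # it is serialized and shipped to a worker for every call
    return f(x)


class RayParallelization:
    """My wrapper; same (f, X) -> list call signature as the other elementwise runners."""

    def __call__(self, f, X):
        # submit one remote task per solution and block until all results are back
        return ray.get([_evaluate_remote.remote(f, x) for x in X])


# problem = SphereProblem(elementwise_runner=RayParallelization())
```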