Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaced user-configured sleep time with condition variables #23

Closed
wants to merge 1 commit into from

Conversation

SandSnip3r
Copy link

Describe the changes

I thought it would be useful to do away with the arbitrary sleeps in this library. I've replaced the sleeps with std::condition_variables.

  • In the case when a worker thread is idle and waiting for more work, the worker thread will wait on a condition variable (task_available_condition_variable) to be notified once work is added.
  • In the case of wait_for_tasks, the thread will wait on a different condition variable (tasks_done_condition_variable). Each time a worker thread finishes some work, it will notify on this condition variable, possibly waking up wait_for_tasks if all of the work is actually completed.
  • In the case of parallelize_loop, yet another condition variable (loop_done_condition_variable; this time, a local variable) is waited on to synchronize the blocks of the loop. Once a block of the loop is completed, if it was the last block, it will notify the condition variable.

paused as a public data member of the class posed an issue with the condition variables. If the user un-pauses execution, the pool should wake up all worker threads. For this reason, I made paused private and provided public functions to modify the value.

Previously, tasks was protected by an std::mutex (now named tasks_mutex) and tasks_total was an std::atomic. Since tasks_total is tested under the condition variable, it made sense to make it no longer atomic and to move it under the protection of the tasks_mutex.

I adapted thead_pool_test.cpp for the changes to the interface.

The file thead_pool_test.cpp used carriage returns and linefeeds for line endings, while thread_pool.hpp only used linefeeds. I removed the carriage returns from thread_pool_test.cpp to make them consistent.

Testing

Have you tested the new code using the provided automated test program thread_pool_test.cpp and/or performed any other tests to ensure that it works correctly? If so, please provide information about the test system(s):

Yes, I built thead_pool_test.cpp and ran it hundreds of times (to hopefully cover some different possible thread-execution-order possibilities) built by two different compilers:

  • Compiler g++ version 8.4.0, flags -std=c++17 -pthread -O3
  • Compiler MSVC 14.29, with flag \std:c++17 in default debug mode (\Od)
  • Compiler MSVC 14.29, with flag \std:c++17 in default release mode (\O2)

I also created my own little load testing tool which creates a bunch of tasks (which just sleep) with random delays.

Machine details:

  • Intel I7-8700K
  • Windows 10 (and Ubuntu 18.04 running in WSL 1)

…d public "paused" variable with pause()/resume() functions. Adapted tests for changes.
@bshoshany
Copy link
Owner

Hi @SandSnip3r and thanks a lot for the comprehensive PR! :) This is a very substantial change, which is also not backwards-compatible. It will take me some time (perhaps even a few weeks) to go over it and decide whether to merge it or not, and I may make a few changes before merging.

I do like the idea of eliminating the user-configured sleep time since it makes the library easier to use, but on the other hand, there is also benefit in having the sleep time as a parameter that can be tweaked to optimize performance. In my testing, I found that tweaking the sleep time can indeed result in a moderate performance boost (depending on the specific system and the specific operation being performed). If we use condition variables instead, then we rely on the implementation to decide how to implement the waiting, in a way that we have no control over. Have you compared the performance of the thread pool with sleep vs. condition variables?

@simsong
Copy link

simsong commented Sep 16, 2021

I do not think that the "user-configured sleep time makes the library easier to use." I think that it is easier-to-use if the library uses condition variables and doesn't employ sleeping and polling. This is a good proposed change, assuming it does not introduce any deadlocks.

@bshoshany
Copy link
Owner

@simsong: I think you misunderstood me, I wrote that "I do like the idea of eliminating the user-configured sleep time since it makes the library easier to use", i.e. it's the elimination which makes it easier to use since there is one less parameter the user has to configure manually :)

@simsong
Copy link

simsong commented Sep 16, 2021

Yes, I misread your message. I guess I was thrown by the second part. Optimizing performance with "sleep time" is only giving a performance boost because you are playing off the amount of time spent polling vs. the amount of time spent waiting. With condition variables there is no trade-off. It will always give the best performance because no time is spent polling or waiting, and the over head of condition variables is quite minimal in comparison.

@bshoshany
Copy link
Owner

Very well, but I'm not sure about the statement that "condition variables will always give the best performance". I do agree that it may be true in principle, on purely theoretical grounds, but I want to test it quantitatively on different CPUs, OSs, compilers, etc. and see if this statement is indeed true. Might take me some time because the academic year just started and I'm teaching 4 courses at the moment.

@bshoshany
Copy link
Owner

Hi, sorry for disappearing for so long! I've been really busy with work over the last few months, and unfortunately will still be busy for the next couple of months as well since I'm preparing a brand new course for the spring term. As soon as I have time to work on this project, I plan to incorporate condition variables instead of sleep (whether the way @SandSnip3r suggested or some other way, I need to do some testing first) plus many other changes that I have on my TODO list (which currently has 32 items!). I estimate that a new release of this library will be posted around March or April. Meanwhile I'm closing this pull request. Thanks again for your contribution :)

@bshoshany bshoshany closed this Dec 22, 2021
@bshoshany
Copy link
Owner

Update: v3.0.0, to be released in the next few days, will incorporate the changes discussed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants