#include <time.h>
#include <dmlc/base.h>          // dmlc::BeginPtr
#include <dmlc/logging.h>
#include <dmlc/thread_local.h>  // dmlc::ThreadLocalStore
#include <dmlc/thread_group.h>
#include <dmlc/omp.h>
#include <mxnet/c_api.h>
#include <mxnet/engine.h>
#include <mxnet/ndarray.h>
#include <dmlc/timer.h>
#include <cstdio>
#include <thread>
#include <chrono>
#include <mutex>                // std::mutex, std::unique_lock
#include <vector>

struct A {
  std::vector<int*> a;
};

static std::mutex api_lock;

int ThreadSafetyTest() {
  // Serialize the whole body so only one thread touches the store at a time.
  std::unique_lock<std::mutex> lock(api_lock);
  // Per-thread instance of A.
  A *ret = dmlc::ThreadLocalStore<A>::Get();
  std::vector<int*> tmp_inputs;
  tmp_inputs.reserve(10);
  for (int i = 0; i < 10; ++i) {
    tmp_inputs.push_back(new int(i));  // never freed in this repro
  }
  ret->a.clear();
  ret->a.reserve(10);
  for (int i = 0; i < 10; ++i) {
    ret->a.push_back(tmp_inputs[i]);
  }
  // Print the address of the thread-local vector's backing buffer.
  LOG(INFO) << dmlc::BeginPtr(ret->a);
  return 0;  // the function is declared int, so it must return a value
}

int main(int argc, char const *argv[]) {
  auto func = [&](int num) {
    ThreadSafetyTest();
  };
  std::vector<std::thread> worker_threads(5);
  int count = 0;
  for (auto&& i : worker_threads) {
    i = std::thread(func, count);
    count++;
  }
  for (auto&& i : worker_threads) {
    i.join();
  }
}
If you look at the value printed for BeginPtr(ret->a), it is the same across different threads, which means the vectors point to the same address and can cause issues. The issue does not reproduce when using MX_THREAD_LOCAL.
This may not be a thread-safety issue as such, but rather a consequence of the difference in destructor behavior between the two thread-local implementations. Projects like MXNet depend on the lifetime of the thread_local data extending beyond the thread's lifetime. I have opened a PR that lets dependent projects choose the implementation: #573
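One way to tell the two explanations apart is to keep every thread alive until all of them have recorded their buffer address. The sketch below is only an illustration under stated assumptions: it uses plain C++11 `thread_local` rather than dmlc::ThreadLocalStore, and `Slot`, `GetSlot`, and `CountDistinctLiveBuffers` are hypothetical names that do not exist in dmlc-core. While all threads are alive, their thread-local vectors must own distinct buffers, so an address that repeats across threads that ran one after another may simply be memory reused after the earlier thread's data was destroyed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-thread struct A in the report.
struct Slot {
  std::vector<int> a;
};

// Returns the calling thread's own Slot. This uses plain C++11 thread_local,
// not dmlc::ThreadLocalStore, so it only approximates the reported setup.
Slot* GetSlot() {
  static thread_local Slot slot;
  return &slot;
}

// Starts n threads and keeps all of them alive until every thread has
// recorded the address of its thread-local buffer. Returns the number of
// distinct addresses that were recorded.
std::size_t CountDistinctLiveBuffers(int n) {
  assert(n > 0);
  std::mutex mu;
  std::condition_variable cv;
  std::set<const int*> seen;
  int arrived = 0;

  auto work = [&]() {
    Slot* s = GetSlot();
    s->a.assign(10, 0);  // allocate a buffer owned by this thread's Slot
    std::unique_lock<std::mutex> lock(mu);
    seen.insert(s->a.data());
    ++arrived;
    cv.notify_all();
    // Block until all threads have recorded, so no Slot is destroyed early.
    cv.wait(lock, [&] { return arrived == n; });
  };

  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back(work);
  }
  for (auto& t : threads) {
    t.join();
  }
  return seen.size();
}
```

Calling `CountDistinctLiveBuffers(5)` should yield 5, because five simultaneously live vectors cannot share a buffer. Running the threads strictly one after another instead, as the lock in the repro allows when a thread exits before the next one starts, gives no such guarantee, since the allocator is free to hand the same address back once the previous thread's data is gone.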
MXNet relies heavily on ThreadLocalStore to keep its C APIs thread safe. I observed that ThreadLocalStore itself appears to have thread-safety issues, probably because of GCC bugs with thread_local (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60673, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81880).
The program above is a minimal reproduction.