Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

fix naive engine for multi-threaded inference #15574

Merged
merged 3 commits into from
Aug 21, 2019
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 12 additions & 1 deletion src/engine/naive_engine.cc
Original file line number Diff line number Diff line change
Expand Up @@ -238,7 +238,11 @@ class NaiveEngine final : public Engine {
static_cast<NaiveEngine*>(engine)->req_completed_ = true;
}
// whether action is completed
bool req_completed_;
#if DMLC_CXX11_THREAD_LOCAL
static thread_local bool req_completed_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#else
static MX_THREAD_LOCAL bool req_completed_;
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces an additional overhead for thread local, and the number of ops in mxnet model can be in the scale of 100s. I am not sure we should add this without a build flag since we are adding overhead for single threaded use cases too.

Copy link
Contributor

@marcoabreu marcoabreu Jul 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please not another build flag - at some point we're gonna spend more money on CI than the NASA :D

Could we make some benchmarks to determine the impact here?

I'm not that deep inside the engine code, so excuse me if that's a stupid question. But why are we declaring the internal variables of the engine as thread local instead of creating thread-local instantiations of the engines. Each thread gets its own engine (or multiple engines or a single engine is shared across threads, synchronized with locks) and we switch all static variables to private ones. The current design of the engine doesn't seem to have the creation of multiple instantiations in mind so we should fix that instead of making all the static variables threadlocal.

/*! \brief whether it is during shutdown phase*/
std::atomic<bool> shutdown_phase_{false};
// CPU stream
Expand All @@ -261,5 +265,12 @@ class NaiveEngine final : public Engine {
Engine *CreateNaiveEngine() {
return new NaiveEngine();
}

#if DMLC_CXX11_THREAD_LOCAL
thread_local bool NaiveEngine::req_completed_ = false;
#else
MX_THREAD_LOCAL bool NaiveEngine::req_completed_ = false;
#endif

} // namespace engine
} // namespace mxnet