Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ThreadPool] Solve thread transitions issue #4344

Merged
merged 10 commits into from
Nov 20, 2019
Merged

Conversation

FrozenGene
Copy link
Member

@FrozenGene FrozenGene commented Nov 15, 2019

This pr solves thread transitions issue.

When we use OpenCV + TVM, we will produce much thread transitions so that OpenCV + TVM is slow. Auto tuning has issue too. The reason is we bind task 0 to master cpu by default.

More detail / background and reproduceable test case: https://discuss.tvm.ai/t/use-tvm-darknet-to-detect-vidoes-after-relay-build-module-build-cv2-ops-cost-much-more-time/4730/

After this PR, we don't bind task 0 to master cpu by default but export one environment TVM_EXCLUDE_WORKER0 to set. If users set TVM_EXCLUDE_WORKER0 be 1, we will bind task 0 to master as previous.

@vinx13 @yidawang @tqchen

Copy link
Contributor

@yidawang yidawang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am positive with adding an ENV VAR to specify exclude_work0_. But we should try to reason what default value we should set. We may have use cases which run on only one or two threads. In such a scenario, setting exclude_work0_ to be false would not be ideal.

@tqchen
Copy link
Member

tqchen commented Nov 17, 2019

Copy link
Contributor

@yidawang yidawang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment. Thanks.

@tqchen tqchen added the status: need update need update based on feedbacks label Nov 17, 2019
@tqchen
Copy link
Member

tqchen commented Nov 17, 2019

@FrozenGene please act on the review comments

@FrozenGene
Copy link
Member Author

FrozenGene commented Nov 17, 2019

sorry for the late reply because of other things. have changed the comments.

@FrozenGene
Copy link
Member Author

ping @vinx13 @yidawang @tqchen have any comments?

@tqchen tqchen added status: need review and removed status: need update need update based on feedbacks labels Nov 19, 2019
Copy link
Contributor

@yidawang yidawang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest setting the default value to be true, due to the following reason: "We may have use cases which run on only one or two threads. In such a scenario, setting exclude_work0_ to be false would not be ideal."

@FrozenGene
Copy link
Member Author

Hmm...I think Auto Tuning and OpenCV + TVM is very common and many users will forget this things, so I prefer it be false by default. However, we could add comment on exclude_worker_0 and suggest to be true when to use single thread. Also open to listen more people's comment. @vinx13 @tqchen guys could tag anyone interested.

@tqchen
Copy link
Member

tqchen commented Nov 19, 2019

I think there seems to be two concerns:

  • Whether we want to use cpu as worker0
  • Whether we want bind the master thread to CPU0

The problematic case seems can be resolved by not binding the master thread, but still include as part of the worker thread. By default, we bind the rest of the threads but not the master one.

@tqchen
Copy link
Member

tqchen commented Nov 19, 2019

Another way to resolve the problem, which might be better, would be register the fork handler, which reset the affinity at fork time.

https://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html

@yidawang
Copy link
Contributor

Hmm...I think Auto Tuning and OpenCV + TVM is very common and many users will forget this things, so I prefer it be false by default. However, we could add comment on exclude_worker_0 and suggest to be true when to use single thread. Also open to listen more people's comment. @vinx13 @tqchen guys could tag anyone interested.

At least the Auto tuning case is about binding the master thread or not, which we have discussed here. After reviewing all these use cases, I think by default we probably 1) don't bind the master thread; 2) exclude work0, i.e. bind the first task to core 0.

@FrozenGene
Copy link
Member Author

@vinx13 @yidawang @tqchen I use pthread_atfork function to avoid master thread affinity derived by child threads. I find this solution suggested by @tqchen more elegant. But open to listen more comments.

Copy link
Contributor

@yidawang yidawang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the minor editing issues inline.

src/runtime/threading_backend.cc Outdated Show resolved Hide resolved
src/runtime/threading_backend.cc Outdated Show resolved Hide resolved
}
pthread_atfork(nullptr, nullptr, ThreadGroup::Impl::BindMasterThreadToCore0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function may not be available under android. let us only do it under linux

src/runtime/threading_backend.cc Show resolved Hide resolved
@FrozenGene
Copy link
Member Author

@vinx13 @tqchen @yidawang I wish I have understood your concerns / comments completely.

@tqchen re Android, I don't provide binding master thread env like Linux. does it make sense?

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some final nits

src/runtime/threading_backend.cc Outdated Show resolved Hide resolved
src/runtime/threading_backend.cc Show resolved Hide resolved
Copy link
Contributor

@yidawang yidawang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will keep an eye on this today to catch up the 0.6 release.

src/runtime/threading_backend.cc Show resolved Hide resolved
src/runtime/threading_backend.cc Outdated Show resolved Hide resolved
src/runtime/threading_backend.cc Show resolved Hide resolved
src/runtime/threading_backend.cc Outdated Show resolved Hide resolved
@FrozenGene
Copy link
Member Author

@yidawang @tqchen update code according to your comments.

Copy link
Contributor

@yidawang yidawang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am signing off. Just one side note, as we have more and more ENV VARs, each of which has its default value and a number of possible configurations, we probably should have a place to summarize them.

@yidawang
Copy link
Contributor

Thanks @FrozenGene for the work

@FrozenGene
Copy link
Member Author

I am signing off. Just one side note, as we have more and more ENV VARs, each of which has its default value and a number of possible configurations, we probably should have a place to summarize them.

I agree it. Like this: https://mxnet.apache.org/api/faq/env_var

@vinx13 vinx13 merged commit 6040b6f into apache:master Nov 20, 2019
@vinx13
Copy link
Member

vinx13 commented Nov 20, 2019

Thanks @FrozenGene @yidawang @tqchen this is merged

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Nov 26, 2019
* [ThreadPool] Solve thread transitions issue

* Use pthread_atfork to avoid master thread affinity be derived by child.

* Code Format

* comment of exclude_worker0_

* set full cpu affinity

* Redundant blank line

* CPPLint

* CPPLint namespace

* CPPLint

* Fix the wrong logic of bind master thread.
yongwww pushed a commit to neo-ai/tvm that referenced this pull request Nov 26, 2019
* [ThreadPool] Solve thread transitions issue

* Use pthread_atfork to avoid master thread affinity be derived by child.

* Code Format

* comment of exclude_worker0_

* set full cpu affinity

* Redundant blank line

* CPPLint

* CPPLint namespace

* CPPLint

* Fix the wrong logic of bind master thread.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants