Multi-threaded inference broken with MKLDNN #15576
Comments
Hey, this is the MXNet Label Bot.
MXNet does not support multithreading at the interface level, not even with lock-based access. The only way to use MXNet in a multi-threaded fashion is through a job queue that is consumed by a sticky thread.
@wuxun-zhang @ZhennanQin please take a look at this issue, thanks.
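The job-queue pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: StickyWorker and Submit are hypothetical names, not part of MXNet, and the jobs are arbitrary callables standing in for calls into the library. The point is that only the single worker thread ever executes a job, while any number of caller threads may enqueue them.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper (not part of MXNet): one "sticky" worker thread runs
// every job, so a non-thread-safe library is only ever entered from that thread.
class StickyWorker {
 public:
  StickyWorker() : worker_([this] { Run(); }) {}

  ~StickyWorker() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_one();
    worker_.join();  // remaining jobs are drained before the worker exits
  }

  // Enqueue a job from any thread and get a future for its result.
  template <typename F>
  auto Submit(F job) -> std::future<decltype(job())> {
    using R = decltype(job());
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(job));
    std::future<R> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mu_);
      jobs_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result;
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // stop requested and queue is empty
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // executed outside the lock, always on the worker thread
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> jobs_;
  bool stop_ = false;
  std::thread worker_;  // declared last so the members above exist before Run() starts
};
```

A caller would wrap each library call in a lambda, pass it to Submit, and wait on the returned future. Throughput is limited to one job at a time, which is the price of keeping all library calls on a single thread.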
@pengzhao-intel @wuxun-zhang @ZhennanQin The difference between
It's very strange. I think the model parameters are read-only; would that affect the MKLDNN calls?
MKLDNN did not support multi-threading before v1.0, because it shares internal scratch memory across all operators. Running two MKLDNN operators simultaneously in the same process is therefore not guaranteed to produce correct results. For now, we suggest switching to multiple instances with shared memory as a substitute for multi-threading.
@ZhennanQin I think MKLDNN is OK in multi-threading, the possible reason is that calling mkldnn related method like
Similar race condition on ndarray: #9862
Looks like the @zheng-da, can we remove the memcpy in these two methods and make model weights truly read-only? It will fix the broken parallel inference example.
@arcadiaphy did your local solution work for this issue?
@pengzhao-intel My local solution works fine, but it's not suitable for an official PR.
Description
I want to do multi-threaded inference with shared model parameters, so I'm testing the MXPredCreateMultiThread API in the cpp example. I found that the example is broken with more than one thread on an MKLDNN build: the output of model inference is not deterministic. With an openblas build, everything is normal.
Environment info (Required)
Build info (Required if built from source)
MXNet commit hash: 4d07d78 (latest commit)
Minimum reproducible example
I've slightly modified the cpp example to print the first 10 numbers in the output ndarray:
Download the code and replace the example/image-classification/predict-cpp folder.
To run the example, the changes in PR #15574 are needed to patch the mxnet code.
The output
openblas 2 threads:
mkldnn 2 threads:
The results change randomly with every execution.
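A check for this kind of nondeterminism can be sketched as below. This is a minimal harness under stated assumptions, not the code from the modified example: OutputsMatchAcrossThreads is a hypothetical name, and infer stands in for one forward pass that returns the output values (for instance, the first 10 numbers of the output ndarray). It runs infer once on the calling thread as a baseline, then once per thread concurrently, and reports whether every result is identical to the baseline.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical harness (not part of MXNet). `infer` is an assumed stand-in
// for a single forward pass that returns the output values.
inline bool OutputsMatchAcrossThreads(
    const std::function<std::vector<float>()>& infer,
    std::size_t num_threads) {
  const std::vector<float> reference = infer();  // single-threaded baseline

  std::vector<std::vector<float>> results(num_threads);
  std::vector<std::thread> threads;
  threads.reserve(num_threads);
  for (std::size_t i = 0; i < num_threads; ++i) {
    // Each thread writes only to its own slot, so the harness adds no race.
    threads.emplace_back([&results, &infer, i] { results[i] = infer(); });
  }
  for (auto& t : threads) t.join();

  // Exact comparison: a deterministic model on fixed input should reproduce
  // the baseline exactly.
  for (const auto& r : results) {
    if (r != reference) return false;
  }
  return true;
}
```

Because a race may not manifest on every run, a result of true from a single call does not prove thread safety; repeating the check many times makes a failure more likely to surface.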