Significant slowdown in some DGL models #16603
Comments
Which commit on 1.6? |
If there is no slowdown on operators, but the models suffer significantly, what's your guess? What might be the cause? Any direction we can start deep diving? |
@zheng-da can you share the output of |
@zachgk, please assign to accesstorohit@? |
@access2rohit, @ChaiBapchya are you guys looking at this? |
Waiting on @zheng-da for some details |
@ChaiBapchya it would be worth checking whether PR #16526 caused some of the slowdown. |
Tried reproducing.
Build flags
DGL & MXNet versions
Log
...
|
@zheng-da can you share the build flags you used for the MXNet 1.6.x/master branch? |
@anirudh2290 We need to know the exact build flags used for 1.6.x before we start our deep dive. |
I tested with the nightly build, so the build flags are whatever the nightly build usually uses. |
@zheng-da which version was it? cu100 or 101? Was it with or without MKL? It would be much easier if you could simply give us the output of this: |
I just tried the experiment again and there is no problem. The command to run the experiment:
You can use the following commands to install MXNet. The problem is very easy to reproduce. You can install the MKLDNN version if you want. It makes no difference.
|
I recently compared the performance of DGL KGE models on MXNet 1.5 and the current master branch, and noticed a significant slowdown. On MXNet 1.5, it takes 12 seconds to run 1000 batches; on master it now takes 20 seconds. After some profiling, there appears to be no slowdown in the individual operators.
To reproduce the problem, please install DGL 0.4 and download the DGL KGE package by cloning the DGL repo. The DGL KGE package is under apps/kg. Run the following command:
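For anyone trying to confirm the numbers, here is a minimal sketch of how the per-batch cost and relative slowdown could be measured and compared. The helper names `time_batches` and `slowdown` are hypothetical and are not part of DGL or MXNet; `step` stands in for whatever runs one training batch.

```python
import time


def time_batches(step, num_batches=1000):
    """Return wall-clock seconds taken to call `step` num_batches times.

    `step` is a hypothetical zero-argument callable that runs one batch.
    """
    start = time.perf_counter()
    for _ in range(num_batches):
        step()
    # MXNet executes asynchronously, so in a real run any pending work
    # should be flushed (e.g. with mx.nd.waitall()) before reading the clock.
    return time.perf_counter() - start


def slowdown(baseline_s, current_s):
    """Relative slowdown of `current_s` versus `baseline_s`, as a fraction."""
    return (current_s - baseline_s) / baseline_s


# Figures reported above: 12 s (MXNet 1.5) vs 20 s (master) per 1000 batches.
print(f"per-batch: {12 / 1000 * 1e3:.0f} ms -> {20 / 1000 * 1e3:.0f} ms")
print(f"slowdown: {slowdown(12, 20):.0%}")
```

With the reported figures this works out to roughly 12 ms versus 20 ms per batch, a slowdown of about 67%.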