CachedOp performance regression #15067
Comments
Hey, this is the MXNet Label Bot.
We recently merged a PR to improve CachedOp, #14931. I am not sure whether it caused this issue.
So same benchmark with
Yeah, I suspect the problem comes with the GPU.
I will do a test run on it.
Has the issue been solved? |
I recently ran a benchmark of CachedOp performance and saw a regression in the results. Please see the table below:
I would like to highlight the GPU performance comparison. On P2 there is a performance gain with the flag set, but on P3 there is a regression.
In theory, setting these two flags should improve performance, since memory is reused. However, on the larger GPU it does not seem to perform well.
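For anyone trying to reproduce the comparison, here is a minimal timing sketch. It is not the benchmark script from this issue. It assumes the two flags are `static_alloc` and `static_shape` passed to Gluon's `hybridize()`, which the text above does not state explicitly. The helper names (`time_forward`, `speedup`, `bench_mxnet`), the ResNet-18 model, and the batch size are all hypothetical choices for illustration.

```python
import time
from statistics import mean


def time_forward(run_once, sync=lambda: None, warmup=10, runs=100):
    """Return the mean wall-clock latency of run_once() in milliseconds."""
    for _ in range(warmup):
        run_once()
    sync()  # drain any asynchronous work queued during warm-up
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        sync()  # wait for completion so the sample covers the real work
        samples.append((time.perf_counter() - start) * 1e3)
    return mean(samples)


def speedup(baseline_ms, flagged_ms):
    """Ratio > 1 means the flagged run is faster; < 1 means a regression."""
    return baseline_ms / flagged_ms


def bench_mxnet(use_gpu=True, batch=32):
    """Hypothetical driver; needs mxnet installed. Not the issue's script."""
    import mxnet as mx
    from mxnet.gluon.model_zoo import vision

    ctx = mx.gpu(0) if use_gpu else mx.cpu()
    x = mx.nd.random.uniform(shape=(batch, 3, 224, 224), ctx=ctx)
    results = {}
    # Assumption: "these two flags" are static_alloc and static_shape.
    for flags in ({}, {"static_alloc": True, "static_shape": True}):
        net = vision.resnet18_v1()
        net.initialize(ctx=ctx)
        net.hybridize(**flags)
        # mx.nd.waitall blocks until all queued asynchronous work is done.
        results[bool(flags)] = time_forward(lambda: net(x), mx.nd.waitall)
    return speedup(results[False], results[True])
```

Running `bench_mxnet()` on both a P2 and a P3 instance and comparing the two ratios should show whether the flags help on one GPU and hurt on the other. The explicit synchronization matters because MXNet executes operators asynchronously, so timing without it would measure only the enqueue cost.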
I used the nightly build
Benchmark Script