MKLDNN fallback when not recording gradients and calling backwards #12411
This PR is waiting on the merge of #12019.
1ccef95 to a56a306
Still a work in progress. I am working with a customer on a different issue and will fix this next.
@mxnet-label-bot [pr-work-in-progress]
@azai91 requesting an update on this PR.
@azai91 Any update on this PR?
@azai91 Thanks for the contribution!
@azai91 Any update on this PR? You could close it and reopen the PR once the changes are ready.
@azai91 could you address the CI failure?
@mxnet-label-bot add [pr-awaiting-testing]
Still investigating. This PR goes beyond the original issue I was addressing.
tests/python/unittest/test_gluon.py
Outdated
check_hybrid_static_memory()
check_hybrid_static_memory(static_alloc=True)
check_hybrid_static_memory(static_alloc=True, static_shape=True)
check_hybrid_static_memory(train_mode=[True, False])
Shouldn't it be train_modes?
@azai91 Thanks for taking the time to dive into the issue. Could you resolve the conflict and trigger CI if you are still working on this?
@azai91 - can you please rebase and fix CI issues? @pengzhao-intel - You may be interested in this PR?
Actually, this is a system-level issue: how do we handle the situation where backward is called with the ... We will have an offline discussion first and come back to this later, @TaoLv.
Seems I missed some context on this issue. What's the expectation in MXNet if backward is called with is_train=false? What's the grad_output for the backward function and, if needed, what's the workspace for the backward function?
@azai91 Can you please rebase this PR?
@azai91 ping again! thanks
@apeforest - @TaoLv raised a good question here. Can you please help us answer -
Ping, any update @azai91?
@azai91 Could you please rebase and fix the CI issues?
@mxnet-label-bot update [pr-awaiting-response, pr-work-in-progress]
Description
This PR addresses #10994. There is a case where users may want to run the backward pass without recording any gradients (see https://mxnet.incubator.apache.org/api/python/autograd/autograd.html#mxnet.autograd.record; this behavior should be addressed in a later PR, as it does not make sense). MKLDNN does not handle this case, so we fall back instead.
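To illustrate the fallback idea described above, here is a minimal pure-Python sketch. It is not MXNet or MKLDNN code: every function name is hypothetical, and it assumes (for illustration only) that the optimized backward depends on a workspace that the forward pass saves only in training mode. When that workspace is missing, the dispatcher falls back to a reference implementation that recomputes what it needs from the inputs.

```python
# Hypothetical sketch: none of these names are MXNet or MKLDNN APIs.

def forward(xs, is_train):
    """Toy 'optimized' ReLU forward; saves a mask workspace only in training mode."""
    out = [max(x, 0.0) for x in xs]
    workspace = [x > 0 for x in xs] if is_train else None
    return out, workspace

def backward_optimized(grad_out, workspace):
    """Optimized backward: relies on the workspace from a training-mode forward."""
    return [g if keep else 0.0 for g, keep in zip(grad_out, workspace)]

def backward_reference(grad_out, xs):
    """Reference backward: recomputes the mask from the inputs, no workspace needed."""
    return [g if x > 0 else 0.0 for g, x in zip(grad_out, xs)]

def backward(grad_out, xs, workspace):
    """Use the optimized path only when its workspace exists; otherwise fall back."""
    if workspace is not None:
        return backward_optimized(grad_out, workspace), "optimized"
    return backward_reference(grad_out, xs), "fallback"

xs = [-1.0, 2.0, 3.0]
grad_out = [1.0, 1.0, 1.0]

# Forward in training mode, then backward: the optimized path is usable.
_, ws_train = forward(xs, is_train=True)
grads_train, path_train = backward(grad_out, xs, ws_train)

# Forward in inference mode, then backward: no workspace, so we fall back.
_, ws_infer = forward(xs, is_train=False)
grads_infer, path_infer = backward(grad_out, xs, ws_infer)
```

Both paths produce the same gradients in this toy case; only the route taken differs. Whether the real operators can always recompute their state this way is exactly the open question raised in the review comments.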