Fallback _embedding_bag_backward and force sparse=false #7584
Conversation
```cpp
call(grad, indices_, offsets_, offset2bag, bag_size_, max_indices_,
     num_weights, scale_grad_by_freq, mode, /*sparse=*/false,
     per_sample_weights_opt, padding_idx);
}
```
Can we add a test to test/test_operations.py that runs the embedding bag forward and backward and makes sure it doesn't crash? That way we can prevent it from regressing.
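A minimal sketch of what such a test could look like (illustrative only, assuming the usual test/test_operations.py conventions; the test name and shapes here are made up, not the test that landed):

```python
import unittest

import torch
import torch_xla.core.xla_model as xm


class EmbeddingBagTest(unittest.TestCase):

    def test_embedding_bag_sparse_backward(self):
        device = xm.xla_device()
        # sparse=True is only supported with mode="sum".
        bag = torch.nn.EmbeddingBag(10, 3, mode="sum", sparse=True).to(device)
        indices = torch.tensor([0, 1, 2, 3], device=device)
        offsets = torch.tensor([0, 2], device=device)
        # Forward pass.
        out = bag(indices, offsets)
        # Backward pass: used to raise a dispatch error on XLA because the
        # sparse path builds a sparse COO tensor; with the dense fallback it
        # should simply complete.
        out.sum().backward()
        # The fallback produces a dense gradient for the weights.
        self.assertEqual(bag.weight.grad.shape, bag.weight.shape)


if __name__ == "__main__":
    unittest.main()
```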
```python
fresh = tensor.clone()
# Make this tensor a leaf tensor by detaching and resetting its
# requires_grad property.
fresh = fresh.detach()
```
Very cool. I wonder why we need to make the tensor a leaf tensor. What would happen if we don't do it?
It's so we can grab its tensor.grad and compare.
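For context, a small standalone illustration (not code from this PR): by default autograd only populates `.grad` on leaf tensors (unless `retain_grad()` is called), so a plain clone, being the output of an autograd op, never gets one.

```python
import torch

t = torch.arange(4.0, requires_grad=True)

clone = t.clone()           # non-leaf: output of an autograd op
fresh = t.clone().detach()  # leaf: cut out of the autograd graph
fresh.requires_grad_(True)  # track gradients again; still a leaf

(clone * 2).sum().backward()
print(clone.grad)  # None (plus a warning): non-leaf .grad is not retained

(fresh * 2).sum().backward()
print(fresh.grad)  # tensor([2., 2., 2., 2.]) -- leaf, so .grad is populated
```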
Fix: #6008
This PR, in addition to pytorch/pytorch#129691, adds support for embedding bag calls when the `sparse` parameter is true.

Problem: when `sparse=true`, `_embedding_bag_backward` called the `at::_sparse_coo_tensor_unsafe_symint` function, which returns a sparse tensor. Since PyTorch/XLA does not support sparse tensors, this resulted in a dispatch error (see the original issue).

Solution: although ideally we should support sparse tensors, in the short term we decided (in an offline discussion) to fall back to the dense backward function.
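A sketch of the user-facing call pattern this fixes (hypothetical repro, assuming a standard torch_xla setup; before this change, the `backward()` call below hit the dispatch error):

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
# sparse=True requires mode="sum".
bag = torch.nn.EmbeddingBag(100, 16, mode="sum", sparse=True).to(device)
indices = torch.randint(0, 100, (8,), device=device)
offsets = torch.tensor([0, 4], device=device)

loss = bag(indices, offsets).sum()
# Previously dispatched to at::_sparse_coo_tensor_unsafe_symint and failed
# on XLA; with the fallback, the dense backward path runs instead.
loss.backward()
print(bag.weight.grad.shape)  # dense gradient: torch.Size([100, 16])
```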
cc @miladm @JackCaoG