
Torch-DirectML Layer Norm Produces Incorrect Result with Non-contiguous Input #588

Closed
NullSenseStudio opened this issue May 31, 2024 · 2 comments
Labels
pytorch-directml Issues in PyTorch when using its DirectML backend

Comments

@NullSenseStudio

torch-directml: 0.2.1.dev240521
python: 3.11.7

```python
import torch
import torch_directml
from torch.nn.functional import layer_norm

device = torch_directml.device()

input = torch.randn(4, 4, 4)
weight = torch.randn(4)
bias = torch.randn(4)

# Normalize over the last dim of a permuted (non-contiguous) view.
cpu_output = layer_norm(input.permute(0, 2, 1), [4], weight, bias)
dml_output = layer_norm(input.to(device).permute(0, 2, 1), [4], weight.to(device), bias.to(device)).cpu()
print((cpu_output - dml_output).std().item(), (cpu_output - dml_output).abs().max().item())
```

```
0.6974620223045349 3.6198389530181885
```

The result is nowhere near the CPU output (or that of other devices).

```python
dml_output = layer_norm(input.to(device).permute(0, 2, 1).contiguous(), [4], weight.to(device), bias.to(device)).cpu()
print((cpu_output - dml_output).std().item(), (cpu_output - dml_output).abs().max().item())
```

```
7.381072464340832e-08 2.384185791015625e-07
```

But it works as expected once the input is made contiguous.
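For reference, layer normalization itself is stride-agnostic: a correct implementation should give identical results on a non-contiguous view and on its contiguous copy, since strides only describe memory layout. A minimal NumPy sketch of that invariant (`layer_norm_np` is a hypothetical reference implementation written for this illustration, not part of torch-directml):

```python
import numpy as np

def layer_norm_np(x, weight, bias, eps=1e-5):
    # Reference layer norm over the last axis, like layer_norm(x, [4], w, b).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # biased variance, as torch uses
    return (x - mean) / np.sqrt(var + eps) * weight + bias

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 4))
w = rng.standard_normal(4)
b = rng.standard_normal(4)

view = x.transpose(0, 2, 1)            # non-contiguous view, no data copied
assert not view.flags["C_CONTIGUOUS"]

out_view = layer_norm_np(view, w, b)
out_copy = layer_norm_np(np.ascontiguousarray(view), w, b)
assert np.allclose(out_view, out_copy)  # layout must not change the math
```

This is exactly the equivalence that holds on CPU above and that the DirectML backend violates.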

```python
dml_output = layer_norm(input.to(device)[::2], [4], weight.to(device), bias.to(device)).cpu()
```

This non-contiguous input raises an error instead:

```
File "...\torch\nn\functional.py", line 2546, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: m_device->CreateOperator(&opDesc, IID_PPV_ARGS(&op))
```
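As an aside, step slicing like `[::2]` also yields a strided view rather than a copy, which is why it too is non-contiguous. A quick NumPy illustration of the memory-layout property (NumPy is used here only to show the layout; the failing operator itself is in the DirectML backend):

```python
import numpy as np

x = np.zeros((4, 4, 4))
half = x[::2]  # every other block along dim 0: a strided view, not a copy
print(half.shape)                  # (2, 4, 4)
print(half.flags["C_CONTIGUOUS"])  # False: the axis-0 stride skips a block
print(np.ascontiguousarray(half).flags["C_CONTIGUOUS"])  # True after copying
```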
@joshjkim

joshjkim commented Jun 7, 2024

Hi @NullSenseStudio , thanks for your feedback. We'll be including a fix to address non-contiguous inputs in layer_norm in our upcoming torch-directml build releasing soon.

@joshjkim joshjkim added the pytorch-directml Issues in PyTorch when using its DirectML backend label Jun 17, 2024
@joshjkim

@NullSenseStudio We just released our new build that addresses the layer_norm issue. Please run `pip install torch-directml --upgrade` to update to torch-directml 0.2.2.dev240614.
