
Torch-DirectML Layer Norm Produces Incorrect Result with Non-contiguous Input #588

Closed
NullSenseStudio opened this issue May 31, 2024 · 2 comments
Labels
pytorch-directml Issues in PyTorch when using its DirectML backend

Comments

@NullSenseStudio

torch-directml: 0.2.1.dev240521
python: 3.11.7

```python
import torch
import torch_directml
from torch.nn.functional import layer_norm

device = torch_directml.device()

input = torch.randn(4, 4, 4)
weight = torch.randn(4)
bias = torch.randn(4)

# Normalize over the last dim of a permuted (non-contiguous) view.
cpu_output = layer_norm(input.permute(0, 2, 1), [4], weight, bias)
dml_output = layer_norm(input.to(device).permute(0, 2, 1), [4], weight.to(device), bias.to(device)).cpu()
print((cpu_output - dml_output).std().item(), (cpu_output - dml_output).abs().max().item())
```

```
0.6974620223045349 3.6198389530181885
```

The result is nowhere near the CPU output (or that of other devices).

```python
dml_output = layer_norm(input.to(device).permute(0, 2, 1).contiguous(), [4], weight.to(device), bias.to(device)).cpu()
print((cpu_output - dml_output).std().item(), (cpu_output - dml_output).abs().max().item())
```

```
7.381072464340832e-08 2.384185791015625e-07
```

But it works as expected once the input is made contiguous.
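For reference, layer normalization itself is stride-agnostic: a correct implementation should give identical results on a non-contiguous view and on its contiguous copy, since strides only describe memory layout. A minimal NumPy sketch of that invariant (`layer_norm_np` is a hypothetical reference implementation written for this illustration, not part of torch-directml):

```python
import numpy as np

def layer_norm_np(x, weight, bias, eps=1e-5):
    # Reference layer norm over the last axis, like layer_norm(x, [4], w, b).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # biased variance, as torch uses
    return (x - mean) / np.sqrt(var + eps) * weight + bias

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 4))
w = rng.standard_normal(4)
b = rng.standard_normal(4)

view = x.transpose(0, 2, 1)            # non-contiguous view, no data copied
assert not view.flags["C_CONTIGUOUS"]

out_view = layer_norm_np(view, w, b)
out_copy = layer_norm_np(np.ascontiguousarray(view), w, b)
assert np.allclose(out_view, out_copy)  # layout must not change the math
```

This is exactly the equivalence that holds on CPU above and that the DirectML backend violates.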

```python
dml_output = layer_norm(input.to(device)[::2], [4], weight.to(device), bias.to(device)).cpu()
```

This non-contiguous input raises an error instead:

```
File "...\torch\nn\functional.py", line 2546, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: m_device->CreateOperator(&opDesc, IID_PPV_ARGS(&op))
```
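As an aside, step slicing like `[::2]` also yields a strided view rather than a copy, which is why it too is non-contiguous. A quick NumPy illustration of the memory-layout property (NumPy is used here only to show the layout; the failing operator itself is in the DirectML backend):

```python
import numpy as np

x = np.zeros((4, 4, 4))
half = x[::2]  # every other block along dim 0: a strided view, not a copy
print(half.shape)                  # (2, 4, 4)
print(half.flags["C_CONTIGUOUS"])  # False: the axis-0 stride skips a block
print(np.ascontiguousarray(half).flags["C_CONTIGUOUS"])  # True after copying
```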
@joshjkim

joshjkim commented Jun 7, 2024

Hi @NullSenseStudio , thanks for your feedback. We'll be including a fix to address non-contiguous inputs in layer_norm in our upcoming torch-directml build releasing soon.

@joshjkim joshjkim added the pytorch-directml Issues in PyTorch when using its DirectML backend label Jun 17, 2024
@joshjkim

@NullSenseStudio We just released our new build that addresses the layer_norm issue. Please run `pip install torch-directml --upgrade` to update to torch-directml 0.2.2.dev240614.
