
@crashbussy
Contributor

PR Category

Execute Infrastructure

PR Types

New features

Description

Add 0-size Tensor support for paddle.Tensor.matmul.

The change process is as follows:

Searching the PaddleAPITest report/0size_tensor logs for paddle.Tensor.matmul revealed an [accuracy error], suggesting the forward pass is at fault. Note the message `(shapes (0, 100, 1, 40), (0, 100, 40) mismatch)`, i.e., a shape mismatch:
```
2025-03-05 15:27:11.945992 test begin: paddle.Tensor.matmul(Tensor([0, 100, 1],"float64"), Tensor([0, 1, 40],"float64"), )

[accuracy error] paddle.Tensor.matmul(Tensor([0, 100, 1],"float64"), Tensor([0, 1, 40],"float64"), )

Not equal to tolerance rtol=0.01, atol=0.01

(shapes (0, 100, 1, 40), (0, 100, 40) mismatch)
 x: array([], shape=(0, 100, 1, 40), dtype=float64)
 y: array([], shape=(0, 100, 40), dtype=float64)
```
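As a ground-truth reference, NumPy's batched-matmul semantics produce exactly the expected (0, 100, 40) shape for these inputs, so the failing case can be reproduced outside Paddle (a minimal sketch using numpy.matmul):

```python
import numpy as np

# Same shapes as in the failing log: both inputs have a 0-size batch dim.
x = np.zeros((0, 100, 1), dtype=np.float64)
y = np.zeros((0, 1, 40), dtype=np.float64)

out = np.matmul(x, y)
# The batch dim stays 0; the matrix dims are 100x1 @ 1x40 -> 100x40.
print(out.shape)  # (0, 100, 40)
```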
Forward fix:

a. Searching the Paddle codebase for `def matmul` shows that the core implementation of matmul calls `_C_ops.matmul`.
b. Searching paddle/phi/ops/yaml for that `_C_ops` matmul shows that matmul's InferMeta function is `MatmulInferMeta`:

```yaml
- op: matmul
  args : (Tensor x, Tensor y)
  output : Tensor(out)
  infer_meta :
    func : MatmulInferMeta
    param: [x, y, false, false]
```

c. Search the codebase for `MatmulInferMeta` and check whether its dims (shape) inference is correct (for matmul the inference is already correct, so no change is needed).
d. Search paddle/phi/kernels for matmul to collect all kernel implementations of matmul. Five files are involved:

  • paddle/phi/kernels/cpu/matmul_kernel.cc
  • paddle/phi/kernels/gpu/matmul_kernel.cu
  • paddle/phi/kernels/impl/matmul_kernel_impl.h
  • paddle/phi/kernels/matmul_kernel.h
  • paddle/phi/kernels/gpu/weight_only_linear_grad_kernel.cu
    The .cc and .cu files both include the first two .h files as headers, so only the .h files need modification. Since matmul_kernel.h and matmul_kernel_impl.h must not (and cannot) duplicate the definition, only matmul_kernel_impl.h was modified.

Note that the following files only include paddle/phi/kernels/matmul_kernel.h as a header and do not contain a `MatmulKernel` symbol themselves, so they are not considered:

  • paddle/phi/kernels/impl/svdvals_grad_kernel_impl.h
  • paddle/phi/kernels/impl/eigvalsh_grad_kernel_impl.h
  • paddle/phi/kernels/impl/lstsq_kernel_impl.h
  • paddle/phi/kernels/impl/eigh_grad_kernel_impl.h
  • paddle/phi/kernels/impl/qr_grad_kernel_impl.h
  • paddle/phi/kernels/sparse/gpu/fused_attention_kernel.cu

In paddle/phi/kernels/impl/matmul_kernel_impl.h, the original code (which was buggy) was removed and the following code was added to complete the fix:
```cpp
if (x.numel() == 0 || y.numel() == 0) {
  // dims() returns a copy, so the local DDims can be mutated directly;
  // no const_cast is needed.
  auto x_dims = x.dims();
  auto y_dims = y.dims();

  if (transpose_x && x_dims.size() >= 2) {
    std::swap(x_dims[x_dims.size() - 1], x_dims[x_dims.size() - 2]);
  }
  if (transpose_y && y_dims.size() >= 2) {
    std::swap(y_dims[y_dims.size() - 1], y_dims[y_dims.size() - 2]);
  }

  // Batch dimensions are everything except the trailing two matrix dims.
  std::vector<int64_t> x_batch_dims(x_dims.data(),
                                    x_dims.data() + x_dims.size() - 2);
  std::vector<int64_t> y_batch_dims(y_dims.data(),
                                    y_dims.data() + y_dims.size() - 2);

  std::vector<int64_t> bcast_dims;
  if (!funcs::BroadcastTwoVec(x_batch_dims, y_batch_dims, &bcast_dims)) {
    PADDLE_THROW(phi::errors::InvalidArgument(
        "Failed to broadcast input batch dimensions."));
  }

  std::vector<int64_t> out_shape(bcast_dims.begin(), bcast_dims.end());

  // The swaps above already applied any transpose, so M and N are simply
  // the trailing dims; re-checking transpose_x/transpose_y here would
  // apply the transpose twice.
  int64_t m = x_dims[x_dims.size() - 2];
  int64_t n = y_dims[y_dims.size() - 1];
  out_shape.push_back(m);
  out_shape.push_back(n);

  out->Resize(make_ddim(out_shape));
  ctx.template Alloc<T>(out);
  return;
}
```
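The batch-dimension broadcast performed by `funcs::BroadcastTwoVec` above follows the usual right-aligned NumPy broadcasting rules, in which 0-size dims broadcast like any other value (1 vs 0 gives 0). A minimal Python sketch of that rule; the helper name `broadcast_batch_dims` is made up for illustration and is not a Paddle API:

```python
def broadcast_batch_dims(a, b):
    """Broadcast two batch-shape lists using right-aligned NumPy rules.

    Paired dimensions must be equal, or one of them must be 1.
    """
    n = max(len(a), len(b))
    # Pad the shorter shape with leading 1s so both have the same rank.
    a = [1] * (n - len(a)) + list(a)
    b = [1] * (n - len(b)) + list(b)
    out = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError(f"cannot broadcast {da} and {db}")
    return out

# Batch dims from one of the failing cases: (1,) against (0,) gives (0,).
print(broadcast_batch_dims([1], [0]))  # [0]
```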

Unit tests added:

In test/legacy_test/test_matmul_op.py, unit tests with 0-size tensor inputs were added:
```python
import paddle

# All test cases.
# Format: (x_shape, y_shape, expected_out_shape, dtype)
test_cases = [
    ((0, 100, 1), (0, 1, 40), (0, 100, 40), "float64"),
    ((0, 100, 1), (0, 1, 4), (0, 100, 4), "float64"),
    ((0, 100, 1), (1, 1, 40), (0, 100, 40), "float64"),
    ((0, 100, 1), (1, 1, 4), (0, 100, 4), "float64"),
    ((0, 12, 197, 197), (0, 12, 197, 64), (0, 12, 197, 64), "float16"),
    ((0, 12, 197, 197), (0, 12, 197, 64), (0, 12, 197, 64), "float32"),
    ((1, 0, 1), (1, 1, 40), (1, 0, 40), "float64"),
    ((1, 0, 1), (1, 1, 4), (1, 0, 4), "float64"),
    ((1, 100, 1), (0, 1, 40), (0, 100, 40), "float64"),
    ((1, 100, 1), (0, 1, 4), (0, 100, 4), "float64"),
    ((1, 100, 1), (1, 1, 0), (1, 100, 0), "float64"),
    ((112, 0, 197, 197), (112, 0, 197, 64), (112, 0, 197, 64), "float16"),
    ((112, 0, 197, 197), (112, 0, 197, 64), (112, 0, 197, 64), "float32"),
    ((112, 12, 0, 197), (112, 12, 197, 64), (112, 12, 0, 64), "float16"),
    ((112, 12, 0, 197), (112, 12, 197, 64), (112, 12, 0, 64), "float32"),
    ((112, 12, 197, 197), (112, 12, 197, 0), (112, 12, 197, 0), "float16"),
    ((112, 12, 197, 197), (112, 12, 197, 0), (112, 12, 197, 0), "float32"),
]

def run_all_matmul_tests():
    for idx, (x_shape, y_shape, expected_shape, dtype) in enumerate(test_cases):
        print(f"\nTest {idx + 1} begin: paddle.Tensor.matmul(Tensor({x_shape}), Tensor({y_shape}), dtype={dtype})")
        try:
            x = paddle.zeros(x_shape, dtype=dtype)
            y = paddle.zeros(y_shape, dtype=dtype)
            result = x.matmul(y)

            # Tensor.shape is a list, so normalize both sides to tuples
            # before comparing against the expected tuple.
            if tuple(result.shape) != tuple(expected_shape):
                raise AssertionError(
                    f"[accuracy error] paddle.Tensor.matmul(Tensor({x_shape}), Tensor({y_shape}), dtype={dtype})\n\n"
                    f"Not equal to tolerance rtol=0.01, atol=0.01\n\n"
                    f"(shapes {result.shape}, {expected_shape} mismatch)\n"
                    f" x: array([], shape={x.shape}, dtype={dtype})\n"
                    f" y: array([], shape={y.shape}, dtype={dtype})"
                )
            print(f"[PASS] Shape is correct: {result.shape}")
        except Exception as e:
            print(e)

if __name__ == "__main__":
    run_all_matmul_tests()
```
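The expected shapes in the table above can be cross-checked against NumPy, whose matmul implements the same batch-broadcast semantics (a sketch over a few of the cases, using the default float64 throughout for simplicity):

```python
import numpy as np

# A few (x_shape, y_shape, expected_out_shape) cases from the test table.
cases = [
    ((0, 100, 1), (0, 1, 40), (0, 100, 40)),
    ((1, 100, 1), (0, 1, 4), (0, 100, 4)),
    ((112, 12, 0, 197), (112, 12, 197, 64), (112, 12, 0, 64)),
    ((112, 12, 197, 197), (112, 12, 197, 0), (112, 12, 197, 0)),
]

for x_shape, y_shape, expected in cases:
    out = np.matmul(np.zeros(x_shape), np.zeros(y_shape))
    assert out.shape == expected, (x_shape, y_shape, out.shape)
print("all expected shapes match numpy")
```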

Note: the original code was buggy. The line
`std::vector<std::int64_t> out_dims(x_dims.size() - 1 + y_dims.size() - 1);`
is correct for 2-D matmul, but wrong for multi-dimensional (batched) matmul.
The correct approach: take the broadcast result for the batch dimensions and perform the matrix multiplication over the last two dimensions, so the output rank equals the broadcast batch rank plus 2 (the last two dims).
After the rewrite, input tensors of arbitrary rank (>= 2) are supported.

A minimal example test:
```python
import paddle

# Test data
x = paddle.randn([0, 100, 1])  # shape: [0, 100, 1]
y = paddle.randn([1, 1, 4])    # shape: [1, 1, 4]

result = x.matmul(y)
print(result.shape)  # prints [0, 100, 4], not the previous [0, 100, 1, 4]
```

@paddle-bot

paddle-bot bot commented May 19, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label May 19, 2025
@crashbussy
Contributor Author

This is my second PR for the same task. The first PR ([0-size Tensor No.318] Add 0-size Tensor support for paddle.Tensor.matmul API.
#72780) hit some confusing CI log errors, so I wanted to try again. The content is unchanged; it is simply resubmitted.

@crashbussy crashbussy closed this May 19, 2025