[0-size Tensor No.271] Add 0-size Tensor support for paddle.take_along_axis API. #73499
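Your PR has been submitted successfully. Thank you for contributing to the open-source project!
I have added all the necessary verification tests locally.
Conclusion:
Please sign the CLA and fix the CodeStyle pipeline.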
README.md
Outdated
PaddlePaddle is provided under the [Apache-2.0 license](LICENSE).
This file does not need to be modified; please revert it.
Got it! My apologies!
The change has been reverted!
080875a to 8f0cf6a
8f0cf6a to b128ba7
Codecov Report
❌ Your patch status has failed because the patch coverage (25.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## develop #73499 +/- ##
==========================================
Coverage ? 25.00%
==========================================
Files ? 1
Lines ? 8
Branches ? 0
==========================================
Hits ? 2
Misses ? 6
Partials ? 0
f6d123a to d67f794
if (index.numel() == 0) {
  dev_ctx.template Alloc<T>(out);
  return;
}
if (x.numel() == 0) {
  phi::Full<T, Context>(
      dev_ctx, common::vectorize(out->dims()), static_cast<T>(0), out);
  return;
}
Hi! The usual fix for 0-size Tensors these days looks like this:
if (out && out->numel() == 0) {
  dev_ctx.template Alloc<T>(out);
  return;
}
You can refer to this example PR 😊: #72821
Because the shape of the output out has already been set, as long as infermeta is correct you can simply allocate memory for it and return.
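However, when the input x is 0-size but index is not, the output out has a nonzero number of elements, so the guard is skipped and the test fails. The original approach is therefore more comprehensive.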
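> However, when the input x is 0-size but index is not, the output out has a nonzero number of elements, so the guard is skipped and the test fails. The original approach is therefore more comprehensive.
That works 👍 In that case you can refer to my earlier PR: #71961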
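The disagreement in this thread comes down to which tensor the early-return guard inspects. The following is a minimal Python sketch, not Paddle code: it models shapes as plain tuples and assumes only what the thread states, namely that the output takes the shape of index. All helper names are hypothetical.

```python
from math import prod


def numel(shape):
    """Number of elements in a tensor of the given shape."""
    return prod(shape)


def infer_out_shape(x_shape, index_shape):
    # Assumption taken from the PR description: out takes the shape of index.
    return tuple(index_shape)


def guard_on_out(x_shape, index_shape):
    """Reviewer's suggestion: return early only when the output is 0-size."""
    return numel(infer_out_shape(x_shape, index_shape)) == 0


def guard_on_inputs(x_shape, index_shape):
    """Author's version: return early when index or x is 0-size."""
    return numel(index_shape) == 0 or numel(x_shape) == 0


# (x_shape, index_shape) pairs; the second is the scenario raised in the thread.
cases = {
    "index is 0-size": ((2, 3, 5), (2, 0, 5)),
    "x is 0-size, index is not": ((2, 0, 5), (2, 3, 5)),
    "neither is 0-size": ((2, 3, 5), (2, 3, 5)),
}

results = {
    name: (guard_on_out(*shapes), guard_on_inputs(*shapes))
    for name, shapes in cases.items()
}

for name, (on_out, on_inputs) in results.items():
    print(f"{name}: out-only guard fires={on_out}, input guards fire={on_inputs}")
```

In the second case the output has 30 elements, so an out-only guard does not fire and the kernel would go on to index into an empty x, which is the situation the separate x.numel() check is there to handle.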
# --- Scenario 1: input x is 0-size, index is not 0-size ---
class TestTakeAlongAxis0Size(OpTest):
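You could try inheriting from the TestTakeAlongAxisOp class and overriding the init_data() method instead of creating a new class 😉
That said, the approach below is not wrong either.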
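After switching to inherit from TestTakeAlongAxisOp, the two new test classes raised errors: the NumPy reference computation and the broadcasting logic in the parent class cannot handle the 0-size special case, so I abandoned the inheritance approach.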
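> After switching to inherit from TestTakeAlongAxisOp, the two new test classes raised errors: the NumPy reference computation and the broadcasting logic in the parent class cannot handle the 0-size special case, so I abandoned the inheritance approach.
Is the problem in NumPy's support for 0-size arrays?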
> Is the problem in NumPy's support for 0-size arrays?
It seems so. The inherited parent-class method precomputes the expected answer with numpy.take_along_axis, but when the input array has a dimension of size 0, NumPy cannot handle the case either and raises an out-of-bounds error (IndexError: index 0 is out of bounds for axis 1 with size 0).
OK~
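The NumPy behaviour described above can be reproduced in a few lines. The sketch below uses the shapes from the PR description ([2, 0, 5] for the input, [2, 3, 5] for the index). The helper reference_take_along_axis is hypothetical and is not the reference used in the PR's tests; it assumes the expected output for a 0-size input is all zeros, mirroring the phi::Full(..., 0, out) branch in the kernel diff.

```python
import numpy as np

x = np.zeros((2, 0, 5), dtype=np.float32)    # 0-size input
index = np.zeros((2, 3, 5), dtype=np.int64)  # non-empty index

# numpy.take_along_axis has nothing to gather from along axis 1.
try:
    np.take_along_axis(x, index, axis=1)
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)


def reference_take_along_axis(x, index, axis):
    """Hypothetical test reference that tolerates 0-size operands."""
    if index.size == 0:
        # Nothing to gather: the output is empty and has the shape of index.
        return np.empty(index.shape, dtype=x.dtype)
    if x.size == 0:
        # Assumption: zero-filled output, as in the kernel's Full(0) branch.
        return np.zeros(index.shape, dtype=x.dtype)
    return np.take_along_axis(x, index, axis=axis)


expected = reference_take_along_axis(x, index, axis=1)
print(expected.shape)
```

A reference of this kind would let a test build its expected output without calling numpy.take_along_axis on an empty array; whether zero is the right fill value is a design decision for the kernel, not something NumPy defines.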
344a067 to 5ec5f82
@luotao1 The pipeline has not been triggered yet 😶‍🌫️
4f0be74 to 7498d1b
@luotao1 Could you please trigger the pipeline for me?
Try pushing a commit to trigger it, for example by merging the latest develop.
        self.check_output(check_pir=self.check_pir)

    def test_check_grad(self):
        self.grad = np.zeros_like(self.inputs['Input']).astype(self.dtype)
Is self.grad here actually unused?
It is indeed unused; I will remove it in the next update.
7498d1b to e4d74a7
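This may be because it is a first-time PR, which needs approval from a maintainer.
I still cannot get it to trigger.
Got it, I will finish the changes as soon as possible.
It has been triggered 😓. Please wait for the results.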
e4d74a7 to 9e28676
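Some CI jobs still have not been triggered, for example PR-CI-Windows-OPENBLAS (which does not require approval). You can try pushing another commit; if that does not work, open a new PR.
OK, got it.
Superseded by the new PR #73736. This PR's commit history became messy and it ran into CI problems, so it is being closed. All fixes have been moved to the new PR.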


PR Category
Execute Infrastructure
PR Types
Improvements
Description
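Add 0-size Tensor support to the paddle.take_along_axis API. The changes were developed as follows:
Problem reproduction and analysis:
Reproduced the bug with PaddleAPITest. In PaddleAPITest's --accuracy=True accuracy-comparison mode, the parameter names of paddle.take_along_axis(arr, index, axis) and torch.take_along_dim(input, indices, dim) do not match, so the parameter-binding error TypeError: missing a required argument appeared repeatedly and the problem could not be pinpointed.
After switching to PaddleAPITest's --paddle_only=True mode, the Python-level error TypeError: take_along_axis() got an unexpected keyword argument 'index' was triggered, which exposed the inconsistency in parameter names between the API front end and back end.
Wrote a unit test with unittest's OpTest framework. With the kernel still unfixed, it reproduced the underlying C++ error:
Forward Fix: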
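a. Locate the API: in Paddle/python/paddle/tensor/manipulation.py, found the Python definition def take_along_axis(...), whose core implementation calls _C_ops.take_along_axis.
b. Locate the operator definition: grep showed that this operator has no standalone .yml file; it is defined in paddle/phi/ops/yaml/ops.yaml.
c. Check InferMeta: following ops.yaml, found the TakeAlongAxisInferMeta function in paddle/phi/infermeta/binary.cc. Analysis showed that its out->set_dims(index.dims()) logic already infers the correct output shape for 0-size Tensors, so no change is needed.
d. Modify the kernel:
* From the grep results, the CPU kernel file is paddle/phi/kernels/cpu/take_along_axis_kernel.cc.
* Following the standard fix pattern, added a guard for the 0-size case at the start of the TakeAlongAxisKernel function. The core logic checks whether index.numel() is 0, because the output shape is determined entirely by index.
* The fix is as follows:
e. Apply the same principle to the CPU, GPU, and XPU kernels.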
Backward Fix:
Add Unit Test:
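In test/legacy_test/test_take_along_axis_op.py, added a new test class TestTakeAlongAxis0Size, which inherits from op_test.OpTest.
It covers the broadcast scenario in which arr is 0-size (shape [2, 0, 5]) and index is not 0-size (shape [2, 3, 5]).
The setUp method is as follows:
The test calls check_output and check_grad, which verifies both forward and backward correctness.
Test results
Running the added OpTest unit test on the feature/fix_take_along_axis_0size branch gives OK, which shows that the fix works.
PaddleAPITest's --accuracy mode cannot be used. In --paddle_only mode, the test passes after the fix.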
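The two TypeError messages reported during reproduction can be illustrated without either framework. The sketch below uses stand-in functions, not the real Paddle or PyTorch APIs. The torch-like parameter names are copied from the description above; naming the paddle-like stub's second parameter indices is an assumption made only so that the keyword 'index' is rejected as in the reported message. The config_kwargs dictionary is a hypothetical test configuration.

```python
import inspect


def paddle_like_take_along_axis(arr, indices, axis):
    """Stand-in only; the parameter name `indices` is an assumption."""
    return ("paddle-like", arr, indices, axis)


def torch_like_take_along_dim(input, indices, dim):
    """Stand-in with the parameter names quoted in the PR description."""
    return ("torch-like", input, indices, dim)


# Hypothetical test configuration written with the keyword `index`.
config_kwargs = {"arr": "x", "index": "idx", "axis": 1}

errors = {}

# 1) Binding one framework's keywords to the other's signature fails
#    before any computation runs.
try:
    inspect.signature(torch_like_take_along_dim).bind(**config_kwargs)
except TypeError as exc:
    errors["bind"] = str(exc)

# 2) Calling the paddle-like stub directly with an unknown keyword fails too.
try:
    paddle_like_take_along_axis(**config_kwargs)
except TypeError as exc:
    errors["call"] = str(exc)

for kind, message in errors.items():
    print(f"{kind}: TypeError: {message}")

# Renaming the offending keyword lets the call go through.
rename = {"index": "indices"}
fixed_kwargs = {rename.get(key, key): value for key, value in config_kwargs.items()}
result = paddle_like_take_along_axis(**fixed_kwargs)
```

Both failures happen at argument binding, before any kernel code runs, which is consistent with the observation that the accuracy-comparison mode could not reach the 0-size bug itself.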