BERT model embedding layer: OpenCL results do not match CPU results on an x86 + NVIDIA platform #2424

tzhang2014 · 2023-06-12T02:40:23Z

Platform (include target platform as well if cross-compiling):

X86

GitHub Version:

2.5.1
If downloading the source as a ZIP, provide the download date and the git revision from the ZIP comment (run 7z l PATH/TO/ZIP and search for Comment in the output, e.g. Comment = bc80b11110cd440aacdabbf59658d630527a7f2b). If using git clone, provide the commit id from the first line of the git commit output.

Compiling Method:

Paste the cmake arguments or the path of the build script used here, as well as the full log of the cmake process (here or on pastebin).
cmake .. \
-DMNN_BUILD_TEST=ON \
-DMNN_CUDA=ON \
-DMNN_CUDA_PROFILE=ON \
-DMNN_OPENCL=ON \
-DMNN_BUILD_QUANTOOLS=ON \
-DMNN_BUILD_DEMO=ON \
-DMNN_BUILD_CONVERTER=ON \
-DMNN_BUILD_BENCHMARK=ON \
-DMNN_SEP_BUILD=OFF \
-DMNN_BUILD_OPENCV=ON \
-DMNN_IMGCODECS=ON

Build Log:

Paste log here or pastebin

demo_embed_debug.zip


jxt1234 · 2023-06-14T12:24:52Z

Received.

Qxinyu · 2023-06-16T07:42:00Z

You could modify source\backend\opencl\execution\image\EltwiseExecution.cpp and change line 210 from
return new EltwiseExecution(inputs, "in0-sign(in1)*in0/(fabs(in1)>(FLOAT4)((FLOAT)0.0000001)?fabs(in1):(FLOAT4)((FLOAT)0.0000001))", op, backend);
to:
return new EltwiseExecution(inputs, "in0-floor(sign(in1)*in0/(fabs(in1)>(FLOAT4)((FLOAT)0.0000001)?fabs(in1):(FLOAT4)((FLOAT)0.0000001)))*in1", op, backend);
Then rebuild and try again.

tzhang2014 · 2023-06-16T09:03:09Z

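> You could modify source\backend\opencl\execution\image\EltwiseExecution.cpp and change line 210 from return new EltwiseExecution(inputs, "in0-sign(in1)*in0/(fabs(in1)>(FLOAT4)((FLOAT)0.0000001)?fabs(in1):(FLOAT4)((FLOAT)0.0000001))", op, backend); to: return new EltwiseExecution(inputs, "in0-floor(sign(in1)*in0/(fabs(in1)>(FLOAT4)((FLOAT)0.0000001)?fabs(in1):(FLOAT4)((FLOAT)0.0000001)))*in1", op, backend); Then rebuild and try again.

Hello, after the replacement, the model I submitted now matches. However, my actual model has three inputs. The weights for one of the inputs exceed 25 MB, so I trimmed that input out of the uploaded model. With all three inputs, the results still do not match.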

Qxinyu · 2023-06-16T09:08:50Z

Could you share the three-input model?

tzhang2014 · 2023-06-16T09:31:28Z

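> Could you share the three-input model?

I cannot upload it to GitHub. Could you give me an email address? I will send the test case together with the model.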

Qxinyu · 2023-06-16T10:37:07Z

Please send it to this email address:
[email protected]

tzhang2014 · 2023-06-17T03:41:36Z

> Please send it to this email address: [email protected]

Sent, please check.

Qxinyu · 2023-06-21T03:26:24Z

In my local test with this model, the OpenCL and CPU results are consistent.

tzhang2014 · 2023-06-21T04:00:15Z

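> In my local test with this model, the OpenCL and CPU results are consistent.

The attachment contains the log output from my side. Is it the same as yours? I am using MNN 2.5.1.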
log.zip

Qxinyu · 2023-06-21T04:04:11Z

I have reproduced the issue on my side. Let me track it down first.

Qxinyu · 2023-06-26T04:41:09Z

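After I corrected the input on my side, the CPU and OpenCL results are now consistent, but they differ from the results shown in the log you provided. Could you try MNNV2Basic.cpp on your side, change the input in it, and check whether the results are correct?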

tzhang2014 · 2023-06-26T06:00:11Z

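> After I corrected the input on my side, the CPU and OpenCL results are now consistent, but they differ from the results shown in the log you provided. Could you try MNNV2Basic.cpp on your side, change the input in it, and check whether the results are correct?

OK, I will give it a try. Thanks.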

tzhang2014 · 2023-06-26T09:01:45Z

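> After I corrected the input on my side, the CPU and OpenCL results are now consistent, but they differ from the results shown in the log you provided. Could you try MNNV2Basic.cpp on your side, change the input in it, and check whether the results are correct?

Hello, using basic, I changed filename << pwd << "input0.txt" to filename << pwd << "input_ids.txt", and the CPU and OpenCL results still differ. Without that change the outputs are the same, but only because the code feeds all zeros as input.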

Qxinyu · 2023-06-26T09:21:24Z

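With basic, you can set the runMask parameter to 2 and create an output folder under the pwd directory. After the run, this dumps the input and output of every layer, so you can check whether the CPU and OpenCL dumps are consistent.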

tzhang2014 · 2023-06-26T11:25:15Z

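> With basic, you can set the runMask parameter to 2 and create an output folder under the pwd directory. After the run, this dumps the input and output of every layer, so you can check whether the CPU and OpenCL dumps are consistent.

That is exactly how I tested it, and the two results differ...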

Qxinyu · 2023-06-26T11:29:02Z

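Does the dump show that the first layer's input is the same, but the results diverge after the first binary op? On my side, the earlier mismatch happened because the input already differed at the first layer. After I fixed that, the results became the same.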

tzhang2014 · 2023-06-26T11:36:14Z

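> Does the dump show that the first layer's input is the same, but the results diverge after the first binary op? On my side, the earlier mismatch happened because the input already differed at the first layer. After I fixed that, the results became the same.

I was only looking at the final output.txt.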

tzhang2014 · 2023-06-26T11:50:46Z

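> Does the dump show that the first layer's input is the same, but the results diverge after the first binary op? On my side, the earlier mismatch happened because the input already differed at the first layer. After I fixed that, the results became the same.

Why would the OpenCL input differ from the CPU input? Both read input_ids.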

Qxinyu · 2023-06-26T11:59:23Z

OpenCL involves copying data from the CPU to the GPU. Do you have a DingTalk account? We could chat privately.

tzhang2014 · 2023-06-26T12:01:45Z

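> OpenCL involves copying data from the CPU to the GPU. Do you have a DingTalk account? We could chat privately.

Yes, I am in the MNN DingTalk group. How do I add you?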

Qxinyu · 2023-06-26T12:14:23Z

What is your name in the group?

tzhang2014 · 2023-06-26T12:16:33Z

> What is your name in the group?

I sent a message in group 1. Did you see it?

Qxinyu · 2023-06-26T12:18:58Z

Try searching for 共进. I am in group 3.

tzhang2014 · 2023-06-26T12:26:27Z

> Try searching for 共进. I am in group 3.

I have joined group 3 and sent you a contact request. Did you see it?

jxt1234 added the bug Something isn't working label Jun 14, 2023

jxt1234 closed this as completed Jun 29, 2023

This was referenced Jul 5, 2023

[MNN:Sync] Sync Internal 2.6.0 #2469

Closed

2.6 version sync #2470

Merged
