<gpu> and <cpu> generates totally different results #50

DrustZ · 2015-09-21T15:48:10Z

Help.
I modified /guide/neuralnet/convnet.cu and added my own function.
When I use the -cpu parameter, things go well: the error rate declines.
However, -gpu generates totally different results: the error just stays at about 0.9 and doesn't change at all.
It seems there may be a bug in mshadow's GPU implementation, but I don't know.

I didn't add my own GPU code; I just use mshadow's (like <xpu>).

For CPU, I used a BLAS library, and my CUDA version is 7.0.
Please help.
@tqchen @antinucleon


DrustZ · 2015-09-21T15:51:03Z

Besides, when I change the configuration of stride, ksize, and pad, the outputs also differ.
Some configurations work well on both CPU and GPU, but some only on CPU.

DrustZ · 2015-09-22T03:50:47Z

Is there any possibility that GPU and CPU behave differently on = and Copy?
I use a lot of = between tensors and expressions, following the original convnet version.
I also use Copy where = doesn't work (like tensor[i] = tensor2.Slice(a, a+1) or tensor[i] = tensor2[i]).
What's more, I used expressions like tensor.Slice(i, i+1) = some expression. Is this valid, or must I turn it into Copy(tensor.Slice(i, i+1), exp, stream_)?

Please help.
@tqchen @antinucleon

DrustZ · 2015-09-22T03:51:49Z

I set the stream on all tensors, and GPU mode didn't report any errors.
Is there a bug or some other reason that GPU and CPU behave differently?

DrustZ · 2015-09-22T03:52:44Z

Besides, I used a std::vector to store tensors,
like vector[i] = tensor.

DrustZ · 2015-09-22T03:53:14Z

I'd be really grateful if someone has an idea that could help me.

antinucleon · 2015-09-22T03:57:59Z

Could you post your code as a gist so I can have a glance?

tqchen · 2015-10-03T03:56:33Z

Note that = has special semantics in mshadow. When you assign one tensor to another, it is a pointer copy, not an assignment of the contents. So if you intended to copy the contents, use

Copy(dst, src);

Or

dst = F<op::identity>(src);

Or

dst = 1.0f * src;
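
To make the distinction above concrete, here is a minimal sketch that does not use mshadow at all. It only models the idea of a handle that holds a pointer, so that assigning one handle to another rebinds the pointer while an explicit copy moves the elements. TensorView, CopyElements, SharesStorage, DemoResult, and RunDemo are hypothetical names made up for this illustration, not mshadow types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor handle (NOT mshadow's Tensor type):
// a non-owning pointer to the data plus the number of elements.
struct TensorView {
  float *dptr;
  std::size_t size;
};

// Element-wise copy of the contents; plays the role a deep copy would play.
void CopyElements(TensorView dst, const TensorView &src) {
  assert(dst.size == src.size);
  for (std::size_t i = 0; i < src.size; ++i) {
    dst.dptr[i] = src.dptr[i];
  }
}

// True when both handles point at the same underlying buffer.
bool SharesStorage(const TensorView &a, const TensorView &b) {
  return a.dptr == b.dptr;
}

struct DemoResult {
  bool aliased_after_assign;    // handle now points at the source buffer
  float dst_first_after_assign; // destination buffer's first element
  bool aliased_after_copy;      // handles still point at separate buffers
  float dst_first_after_copy;   // destination buffer's first element
};

DemoResult RunDemo() {
  std::vector<float> src_buf = {1.0f, 2.0f, 3.0f};
  std::vector<float> dst_buf = {0.0f, 0.0f, 0.0f};
  TensorView src{src_buf.data(), src_buf.size()};
  TensorView dst{dst_buf.data(), dst_buf.size()};

  DemoResult r;

  // Handle assignment: only the pointer is rebound, dst_buf is untouched.
  TensorView handle = dst;
  handle = src;
  r.aliased_after_assign = SharesStorage(handle, src);
  r.dst_first_after_assign = dst_buf[0];

  // Explicit copy: the elements are written into dst_buf.
  CopyElements(dst, src);
  r.aliased_after_copy = SharesStorage(dst, src);
  r.dst_first_after_copy = dst_buf[0];

  return r;
}
```

Calling RunDemo() from a main function shows that after the handle assignment the destination buffer still holds its old values, and that only the explicit copy changes it. If the real code relies on = between two tensors to move data, this kind of aliasing is one thing worth checking.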


szha · 2019-08-04T00:52:12Z

This code base has been donated to the Apache MXNet project per #373, and this repo is deprecated. Future development and issue tracking should continue in Apache MXNet.

DrustZ mentioned this issue Sep 22, 2015

add stride_x_ and stride_y_ for more flexible configuration #51

Merged

yajiedesign pushed a commit to yajiedesign/mshadow that referenced this issue Oct 6, 2015

Merge pull request dmlc#50 from tqchen/master

35c7762

[BUGFIX] concurrency

sxjscience mentioned this issue May 27, 2016

sym.Sqrt gradient inf ? apache/mxnet#2261

Closed
