This repository has been archived by the owner on Aug 11, 2020. It is now read-only.

<gpu> and <cpu> generates totally different results #50

Open
DrustZ opened this issue Sep 21, 2015 · 8 comments

Comments

@DrustZ
Contributor

DrustZ commented Sep 21, 2015

Help.
I modified /guide/neuralnet/convnet.cu to add my own function.
When I use the -cpu parameter, things go well and the error rate declines.
However, -gpu generates totally different results: the error just stays at around 0.9 and doesn't change at all.
It seems there may be a bug in mshadow's GPU implementation, but I don't know.

I didn't add my own GPU code; I just use mshadow's (like <xpu>).

For CPU, I used a BLAS library. My CUDA version is 7.0.
Please help.
@tqchen @antinucleon

@DrustZ
Contributor Author

DrustZ commented Sep 21, 2015

Besides, when I change the configuration of stride, ksize and pad, the outputs are also not the same.
Some configurations work well on both CPU and GPU, but some only on CPU.
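
One way to narrow down which configuration diverges is to dump a layer's output from both runs and compare the values element by element. The following is only a minimal sketch, assuming both outputs have already been copied to host memory as `std::vector<float>`; `MaxAbsDiff` is a hypothetical helper, not part of mshadow.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an mshadow function): returns the largest
// absolute element-wise difference between two host-side buffers.
inline float MaxAbsDiff(const std::vector<float> &cpu_out,
                        const std::vector<float> &gpu_out) {
  assert(cpu_out.size() == gpu_out.size());
  float worst = 0.0f;
  for (std::size_t i = 0; i < cpu_out.size(); ++i) {
    worst = std::max(worst, std::fabs(cpu_out[i] - gpu_out[i]));
  }
  return worst;
}
```

A small difference (rounding noise) is expected between CPU and GPU floating-point results; a large one at a specific layer points at where the two code paths start to disagree.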

@DrustZ
Contributor Author

DrustZ commented Sep 22, 2015

Is there any possibility that GPU and CPU behave differently on = and Copy?
I use a lot of = between tensors and expressions, following the original convnet version.
I also use Copy where = can't work (like tensor[i] = tensor2.Slice(a, a+1) or tensor[i] = tensor2[i]).
What's more, I used expressions like tensor.Slice(i, i+1) = some expression. Is this valid, or must I turn it into Copy(tensor.Slice(i, i+1), exp, stream_)?

Please help.
@tqchen @antinucleon

@DrustZ
Contributor Author

DrustZ commented Sep 22, 2015

I set the stream on all tensors, and in GPU mode no error was reported.
Is there a bug or some other reason that GPU and CPU behave differently?

@DrustZ
Contributor Author

DrustZ commented Sep 22, 2015

Besides, I used a std::vector to store tensors,
like vector[i] = tensor.
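
If the tensor type is a lightweight handle (a pointer plus shape information), storing it in a `std::vector` stores only the handle, so every stored entry keeps referring to the original buffer. The sketch below illustrates that with a hypothetical `TensorHandle` struct; it is a stand-in written for this example, not mshadow's `Tensor` class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a non-owning tensor handle.
// This is NOT mshadow's Tensor class.
struct TensorHandle {
  float *dptr;       // pointer to the first element
  std::size_t size;  // number of elements
};

// Stores a handle in a vector, then writes to the original buffer,
// and returns what the stored handle sees afterwards.
inline float ReadThroughStoredHandle() {
  std::vector<float> buffer(4, 0.0f);
  TensorHandle t{buffer.data(), buffer.size()};

  std::vector<TensorHandle> saved(1);
  saved[0] = t;      // copies pointer and size only, no element data
  buffer[0] = 5.0f;  // a later write to the original buffer

  return saved[0].dptr[0];  // 5.0f: the stored handle aliases the buffer
}
```

Under this assumption, a vector of handles is not a set of snapshots: if the underlying buffer is overwritten or freed later, every stored entry is affected.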

@DrustZ
Contributor Author

DrustZ commented Sep 22, 2015

I'd be really grateful if someone has an idea that could help me.

@antinucleon
Contributor

Could you post your code as a gist so I can have a glance?

@tqchen
Member

tqchen commented Oct 3, 2015

Note that = has special semantics in mshadow. When you assign one tensor to another, it is a pointer copy rather than an element-wise assignment. If you intended it to be an assignment, use

Copy(dst, src);

Or

dst = F<op::identity>(src);

Or

dst = 1.0f * src;
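
The difference between a pointer copy and an element-wise copy can be sketched without mshadow. The `TensorView` struct and `CopyElements` function below are hypothetical stand-ins written for this illustration; they only mimic the handle semantics described above and are not mshadow's actual classes or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor handle: it does not own its data,
// it only points at a buffer. This is NOT mshadow's Tensor class.
struct TensorView {
  float *dptr;       // pointer to the first element
  std::size_t size;  // number of elements
};

// Element-wise copy, analogous in spirit to Copy(dst, src).
inline void CopyElements(TensorView dst, TensorView src) {
  assert(dst.size == src.size);
  for (std::size_t i = 0; i < src.size; ++i) dst.dptr[i] = src.dptr[i];
}

// Plain handle assignment: the destination buffer is left untouched.
inline bool HandleAssignLeavesBufferUntouched() {
  std::vector<float> a(3, 0.0f), b(3, 1.0f);
  TensorView dst{a.data(), a.size()};
  TensorView src{b.data(), b.size()};
  dst = src;  // copies the pointer: dst now aliases b, a is unchanged
  return a[0] == 0.0f && dst.dptr == b.data();
}

// Element-wise copy: the destination buffer receives the values.
inline bool ElementCopyWritesBuffer() {
  std::vector<float> a(3, 0.0f), b(3, 1.0f);
  TensorView dst{a.data(), a.size()};
  TensorView src{b.data(), b.size()};
  CopyElements(dst, src);  // writes through dst.dptr into a
  return a[0] == 1.0f && a[2] == 1.0f && dst.dptr == a.data();
}
```

In this sketch, `dst = src` silently rebinds the handle and leaves the old buffer with stale values, which is the kind of mistake that can produce results that never change during training.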

yajiedesign pushed a commit to yajiedesign/mshadow that referenced this issue Oct 6, 2015
@szha
Member

szha commented Aug 4, 2019

This code base has been donated to the Apache MXNet project per #373, and this repo is deprecated. Future development and issue tracking should continue in Apache MXNet.
