Weird problem on Jetson TX2 #21

Open
s7ev3n opened this issue May 21, 2018 · 7 comments

Comments

@s7ev3n

s7ev3n commented May 21, 2018

Hi, thanks for your implementation.
I successfully trained a model on my own dataset, which has two classes, and I am deploying it on a Jetson TX2. However, something very weird happens: the results on the Jetson TX2 are totally different from the results on my two servers (I trained the model on one server and tested it on the other), with the same TensorFlow, CUDA, and cuDNN versions. The output on the Jetson TX2 looks random and changes every time I run a test!
Are there any hints on how to solve this problem? Thanks so much.
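
One way to put a number on the "changes every time" symptom is to feed a single fixed input through the model several times and measure how far the outputs drift. The sketch below is framework-agnostic: `predict_fn` is a hypothetical stand-in for whatever runs inference (for example, a wrapper around a session call) and is assumed to return a flat sequence of floats. The two toy models exist only to make the example runnable.

```python
import random


def max_run_to_run_diff(predict_fn, model_input, runs=5):
    """Run predict_fn on the same input several times and return the
    largest absolute element-wise difference from the first run."""
    reference = list(predict_fn(model_input))
    worst = 0.0
    for _ in range(runs - 1):
        output = list(predict_fn(model_input))
        if len(output) != len(reference):
            raise ValueError("output size changed between runs")
        for ref_value, value in zip(reference, output):
            worst = max(worst, abs(ref_value - value))
    return worst


# Hypothetical stand-ins for a real model, for illustration only.
def stable_model(x):
    return [2.0 * v for v in x]


def unstable_model(x):
    return [v + random.random() for v in x]


if __name__ == "__main__":
    sample = [0.1, 0.2, 0.3]
    print("stable:", max_run_to_run_diff(stable_model, sample))
    print("unstable:", max_run_to_run_diff(unstable_model, sample))
```

Tiny differences can be normal on a GPU, so the interesting case is a value near zero on the servers and a large one on the TX2 for the same checkpoint and input.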

s7ev3n

@flomed

flomed commented Jun 21, 2018

Hey, I'm also having this issue. Did you manage to solve it, or do you have any idea what the problem could be?

flo

@Archon512

Could you post the exact versions of TF and CUDA you are using? I was having the same issue but solved it by using tf.nn.conv2d instead of the unstable code from tf.contrib.slim. Apparently something goes wrong in slim's implementation of convolutions. Either try an older version or bypass the slim module completely.

@s7ev3n
Author

s7ev3n commented Jun 21, 2018

@flomed @Archon512
Hi, I think I solved this weird problem.
I trained this model on two servers with the same environment: TF 1.6, CUDA 9, cuDNN 7, Python 2.7. The Jetson TX2 has JetPack 3.2, TF 1.6, CUDA 9, cuDNN 7, Python 2.7. I managed to borrow another TX2 board, which has JetPack 3.1, TF 1.3, CUDA 8, cuDNN 5 (I think), and when I tested the model on it, it magically worked.
Yes, as Archon512 said, slim may not be well supported on the TX2, especially atrous conv (I think this op causes the problem), so I will rewrite this model using tf.nn.
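
To test the suspicion about the atrous conv op, one option is to run a tiny input through that layer on the TX2 and compare the result against a reference computed on the CPU. The following is a minimal pure-Python sketch of a single-channel 2-D atrous (dilated) convolution with VALID padding and stride 1. It is an illustrative reference only, not the slim or tf.nn implementation, and the function name is hypothetical.

```python
def atrous_conv2d_reference(image, kernel, rate):
    """Single-channel 2-D atrous convolution, VALID padding, stride 1.

    image and kernel are lists of rows. Kernel taps are spaced `rate`
    pixels apart (rate=1 is an ordinary convolution). As in most deep
    learning frameworks, the kernel is not flipped.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    # Effective kernel size once the gaps between taps are included.
    eff_h = (k_h - 1) * rate + 1
    eff_w = (k_w - 1) * rate + 1
    out_h = len(image) - eff_h + 1
    out_w = len(image[0]) - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")

    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[y + i * rate][x + j * rate] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


if __name__ == "__main__":
    # 5x5 ramp image and a 2x2 kernel that sums two diagonal taps.
    image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
    kernel = [[1.0, 0.0], [0.0, 1.0]]
    for row in atrous_conv2d_reference(image, kernel, rate=2):
        print(row)
```

With rate=2 the 2x2 kernel covers a 3x3 window, so a 5x5 input yields a 3x3 output. If the device output for the same input and weights differs substantially from such a reference, that would support the idea that the op, rather than the trained model, is at fault.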

@flomed

flomed commented Jun 21, 2018

OK, so before I was using TF 1.8, CUDA 9, cuDNN 7, Python 2.7 on the TX2. Now I have installed TF 1.3, CUDA 8, cuDNN 6, Python 2.7, but the resulting images still look completely random and different from the ones on my PC. Keep me up to date if you manage to rewrite the model and get it working.
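
With several version combinations reported by this point, it may help to line them up and see which components actually differ between a setup that worked and one that did not. The sketch below hard-codes the versions quoted in the comments above. The dictionary labels and the helper name are illustrative only, and the cuDNN 5 entry was itself reported as uncertain.

```python
# Versions as quoted in the comments above; labels are illustrative.
ENVIRONMENTS = {
    "s7ev3n servers (correct)": {
        "tensorflow": "1.6", "cuda": "9", "cudnn": "7", "python": "2.7"},
    "s7ev3n TX2, JetPack 3.2 (random)": {
        "tensorflow": "1.6", "cuda": "9", "cudnn": "7", "python": "2.7"},
    "borrowed TX2, JetPack 3.1 (correct)": {
        "tensorflow": "1.3", "cuda": "8", "cudnn": "5", "python": "2.7"},
    "flomed TX2, first try (random)": {
        "tensorflow": "1.8", "cuda": "9", "cudnn": "7", "python": "2.7"},
    "flomed TX2, second try (random)": {
        "tensorflow": "1.3", "cuda": "8", "cudnn": "6", "python": "2.7"},
}


def diff_environments(first, second):
    """Return {component: (first_version, second_version)} for every
    component whose version differs between the two environments."""
    components = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in components
        if first.get(name) != second.get(name)
    }


if __name__ == "__main__":
    working = ENVIRONMENTS["borrowed TX2, JetPack 3.1 (correct)"]
    failing = ENVIRONMENTS["flomed TX2, second try (random)"]
    print(diff_environments(working, failing))
```

An empty diff between a working and a failing setup, as with the servers and the JetPack 3.2 board, suggests the cause lies in something this table does not capture.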

@s7ev3n
Author

s7ev3n commented Jun 22, 2018

@flomed
Did you compile TF 1.3 on the TX2 yourself, or did you download a precompiled .whl file and install it? I highly suggest compiling TF on your own.

@flomed

flomed commented Jun 22, 2018

I was using a precompiled .whl file. I thought I'd give a different precompiled TF 1.3 file a try, and it's working now. Thanks for the tip.
Here is a link to the repo: https://github.com/jetsonhacks/installTensorFlowJetsonTX

@zmqp111

zmqp111 commented Sep 19, 2018

@s7ev3n
Hi, s7ev3n.

I have the same problem on my TX1.
I am using JetPack 3.3, TF 1.9.0, CUDA 9.0, and cuDNN 7.0.

I tested on video, but my frame rate is only 1~2 fps.

How can I change the TX1 system?
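
Before changing the system, it can be useful to measure the frame rate in a repeatable way so that any change can be compared against a baseline. The sketch below times a per-frame callable and reports frames per second. `fake_process_frame` is a hypothetical placeholder for the real capture-plus-inference step, and its sleep only simulates work.

```python
import time


def measure_fps(process_frame, frames, warmup=1):
    """Call process_frame on each frame and return frames per second.

    The first `warmup` frames are processed but not timed, since the
    first calls are often slower because of one-time setup.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("need more frames than warmup")
    start = time.time()
    for frame in timed:
        process_frame(frame)
    elapsed = time.time() - start
    return len(timed) / elapsed if elapsed > 0 else float("inf")


# Hypothetical placeholder for real per-frame work, for illustration only.
def fake_process_frame(frame):
    time.sleep(0.01)
    return frame


if __name__ == "__main__":
    print("fps: %.1f" % measure_fps(fake_process_frame, range(20)))
```

Timing the inference call separately from video decoding and display would show which stage dominates the 1~2 fps figure.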


4 participants