
Running tf_unet in distributed mode #263

Open
Mijyuoon opened this issue Apr 17, 2019 · 2 comments

Comments

@Mijyuoon

I am trying to run this code in distributed TensorFlow mode and have modified it accordingly (i.e. using MonitoredTrainingSession and so on). However, running under a monitored training session fails with a data dimension mismatch error:

```
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must be broadcastable: logits_size=[41664,2] labels_size=[60900,2]
	 [[{{node cost/softmax_cross_entropy_with_logits}}]]
	 [[{{node results/pixel_wise_softmax/truediv}}]]
```

(Full error log)

I don't fully understand why this error is happening; other code samples I've converted to distributed mode work fine. Is there anything else that might need changing to support distributed mode?
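
For context, here is a minimal sketch of the kind of conversion described above, using the stock TF 1.x distributed APIs. The cluster addresses, the one-layer dummy "network", and `next_batch` are illustrative stand-ins, not tf_unet's actual code:

```python
import tensorflow as tf  # TF 1.x

# Illustrative cluster layout; hostnames and task index are placeholders.
cluster = tf.train.ClusterSpec({
    "ps": ["ps0:2222"],
    "worker": ["worker0:2222", "worker1:2222"],
})
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Variables go to the parameter servers, ops to this worker.
with tf.device(tf.train.replica_device_setter(cluster=cluster)):
    # Stand-in for building the tf_unet graph (unet.Unet); a real run would
    # construct the U-Net here instead of this one-layer dummy.
    x = tf.placeholder(tf.float32, [None, None, None, 1])
    y = tf.placeholder(tf.float32, [None, None, None, 2])
    logits = tf.layers.conv2d(x, 2, 1)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        labels=tf.reshape(y, [-1, 2]), logits=tf.reshape(logits, [-1, 2])))
    global_step = tf.train.get_or_create_global_step()
    train_op = tf.train.AdamOptimizer().minimize(cost, global_step=global_step)

with tf.train.MonitoredTrainingSession(master=server.target,
                                       is_chief=True,
                                       checkpoint_dir="/tmp/tf_unet_dist") as sess:
    while not sess.should_stop():
        batch_x, batch_y = next_batch()  # hypothetical data provider
        sess.run(train_op, feed_dict={x: batch_x, y: batch_y})
```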

@jakeret (Owner)

jakeret commented Apr 24, 2019

Sorry for the late reply.
I'm wondering if something is not quite right with the size cropping of the labels prior to computing the cross entropy.
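
To illustrate the idea: with unpadded convolutions the U-Net output is spatially smaller than the input, so the labels have to be center-cropped to the logits' shape before the cross entropy. The error above fits that picture, since the flattened labels (60900 pixels) are larger than the logits (41664 pixels). The sketch below is a generic in-graph crop, assumed for illustration; `crop_labels_to_logits` is not tf_unet's own API (tf_unet crops the label batch before feeding it in):

```python
import tensorflow as tf  # TF 1.x

def crop_labels_to_logits(labels, logits):
    # Center-crop NHWC `labels` so the spatial dims match `logits`.
    lh, lw = tf.shape(logits)[1], tf.shape(logits)[2]
    off_h = (tf.shape(labels)[1] - lh) // 2
    off_w = (tf.shape(labels)[2] - lw) // 2
    return labels[:, off_h:off_h + lh, off_w:off_w + lw, :]

y = tf.placeholder(tf.float32, [None, None, None, 2])       # one-hot labels
logits = tf.placeholder(tf.float32, [None, None, None, 2])  # network output

y_cropped = crop_labels_to_logits(y, logits)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.reshape(y_cropped, [-1, 2]),
    logits=tf.reshape(logits, [-1, 2])))
```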

@Mijyuoon (Author)

Well, I figured out that if you disable the summary saver hook, it no longer crashes. Not sure what that has to do with anything, but I've been able to run it that way for now.
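
For reference, assuming the workaround refers to the hook MonitoredTrainingSession installs by default: passing None for both summary arguments suppresses the default SummarySaverHook. Whether this fixes the shape mismatch or merely masks it is an open question:

```python
import tensorflow as tf  # TF 1.x

# Setting both summary arguments to None stops MonitoredTrainingSession from
# installing its default SummarySaverHook (the workaround described above).
with tf.train.MonitoredTrainingSession(
        master=server.target,            # `server` as in the earlier sketch
        is_chief=True,
        checkpoint_dir="/tmp/tf_unet_dist",
        save_summaries_steps=None,
        save_summaries_secs=None) as sess:
    while not sess.should_stop():
        ...  # training step as before
```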
