Gb/multi gpu #138
Conversation
@bnb32 can you take a look at what might be going on with the data-centric test? I have no idea how my edits would affect that code: https://github.com/NREL/sup3r/actions/runs/3876667898/jobs/6610776941
Wish we could restructure to avoid duplicating some code but overall an elegant solution to the multi-gpu problem.
                        training_weights,
                        device_name=f'/gpu:{i}',
                        **calc_loss_kwargs))
            for i, future in enumerate(futures):
why don't we need as_completed() here?
Also, is applying the different gradients like this better than applying the average of the gradients once?
We could use as_completed(), but the futures finish at basically the same time, so I was just keeping it simple. And I don't think either approach is clearly better; averaging with a single weight update might actually be more intuitive, but this is nice and simple.
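For readers following along, here is a minimal sketch of the pattern being discussed, assuming a grad_fn that returns (gradients, loss_details) as in the diff context above; the batch splitting, the optimizer call, and the helper name are illustrative assumptions rather than the actual sup3r implementation.

```python
from concurrent.futures import ThreadPoolExecutor

import tensorflow as tf


def multi_gpu_step(grad_fn, optimizer, low_res, hi_res_true,
                   training_weights, **calc_loss_kwargs):
    """Illustrative helper: split a batch across visible GPUs, compute
    gradients in parallel, and apply each GPU's gradient in turn."""
    # Fall back to one worker if no GPU is visible (sketch only).
    n_gpu = max(1, len(tf.config.list_physical_devices('GPU')))

    # Assumes the batch dimension is divisible by n_gpu.
    lr_chunks = tf.split(low_res, n_gpu, axis=0)
    hr_chunks = tf.split(hi_res_true, n_gpu, axis=0)

    futures = []
    with ThreadPoolExecutor(max_workers=n_gpu) as exe:
        for i in range(n_gpu):
            futures.append(exe.submit(grad_fn,
                                      lr_chunks[i], hr_chunks[i],
                                      training_weights,
                                      device_name=f'/gpu:{i}',
                                      **calc_loss_kwargs))

        # Iterating in submission order (no as_completed()) is fine here:
        # each GPU gets an equal slice, so the futures finish at nearly the
        # same time. Each gradient is applied separately rather than being
        # averaged into a single weight update.
        for future in futures:
            grad, loss_details = future.result()
            optimizer.apply_gradients(zip(grad, training_weights))

    return loss_details
```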
@@ -960,3 +1113,54 @@ def _tf_generate_wind(self, low_res, hi_res_topo):
                raise RuntimeError(msg) from e

        return hi_res

    @tf.function()
Having this copied code hurts me, but alas...
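To make the comment concrete, here is a rough sketch of the kind of per-device, @tf.function-decorated gradient calculation that tends to get duplicated; the function and argument names are illustrative, not the actual sup3r methods.

```python
import tensorflow as tf


@tf.function()
def get_single_grad(generator, loss_fn, low_res, hi_res_true,
                    training_weights, device_name='/gpu:0'):
    """Compute the loss and its gradients for one batch slice on one
    device (illustrative only)."""
    with tf.device(device_name):
        with tf.GradientTape(watch_accessed_variables=False) as tape:
            tape.watch(training_weights)
            hi_res = generator(low_res)
            loss = loss_fn(hi_res, hi_res_true)
        grad = tape.gradient(loss, training_weights)
    # Mirrors the "return grad, loss_details" shape seen in the diff.
    return grad, {'loss': loss}
```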
        return grad, loss_details

    def run_gradient_descent(self, low_res, hi_res_true, training_weights,
This is a nice, simple solution. Well done!
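For orientation, a hedged sketch of how the reviewed method might be called from a training loop; only the positional arguments come from the signature visible in the diff, while the model object, the weights attribute, and the return value are assumptions.

```python
# Hypothetical call site; model and generator_weights are illustrative names.
loss_details = model.run_gradient_descent(low_res, hi_res_true,
                                           model.generator_weights)
```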
… multi_gpu flag to train entry point
Force-pushed from 3680e90 to d6e2cd1
No description provided.
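Since the PR carries no description, here is a hedged sketch of how the multi_gpu flag mentioned in the commit message might surface at the train entry point; every argument name other than multi_gpu is an assumption, not the documented sup3r API.

```python
# Hypothetical training call; multi_gpu=True asks the model to split each
# batch across all visible GPUs (other argument names are assumptions).
model.train(batch_handler, n_epoch=100, multi_gpu=True)
```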