Is there a workaround for the ResourceExhaustedError?
This is what happens when I run main.py with a custom env:
Traceback (most recent call last):
File "main.py", line 125, in <module>
main()
File "main.py", line 103, in main
stats = algo.train(env, args, summary_writer)
File "[...]\Deep-RL-Keras\A2C\a2c.py", line 100, in train
self.train_models(states, actions, rewards, done)
File "[...]\Deep-RL-Keras\A2C\a2c.py", line 67, in train_models
self.c_opt([states, discounted_rewards])
File "[...]\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "[...]\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "[...]\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
run_metadata_ptr)
File "[...]\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[177581,177581] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
[[{{node sub_17}} = Sub[T=DT_FLOAT, _class=["loc:@gradients_1/sub_17_grad/Reshape_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_2_0_1, dense_6/BiasAdd)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
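For reference, the hint at the end can be acted on where the backend function is built. This is a minimal sketch, assuming Keras 2.2.x on the TensorFlow 1.x backend; the placeholders and loss below are hypothetical stand-ins for whatever a2c.py actually uses to build c_opt:

```python
import tensorflow as tf
import keras.backend as K

# Ask TF to list the live tensors when an OOM is raised, as the hint suggests.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

# In the Keras 2.2.x TF backend, K.function forwards `options` to the session
# callable it builds. The placeholders and loss here are hypothetical stand-ins
# for the real inputs/updates used to build c_opt.
states_ph = K.placeholder(shape=(None, 4))    # hypothetical state batch
rewards_ph = K.placeholder(shape=(None,))     # hypothetical discounted rewards
loss = K.mean(K.square(rewards_ph))           # hypothetical critic loss
c_opt = K.function([states_ph, rewards_ph], [loss], options=run_options)
```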
Hi,
I am not familiar with this error, but it does seem like you are dealing with a very large tensor ([177581, 177581]). Have you tried narrowing down where this tensor comes from? Playing with the batch size and input size should also help.
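One common way to end up with a square [N, N] tensor under a Sub op is broadcasting: if the discounted rewards are fed with shape (N,) while the critic output has shape (N, 1), the subtraction in the loss silently expands to (N, N). This is only a guess at what sub_17 is doing here, but it is cheap to rule out. A minimal sketch with hypothetical placeholder names:

```python
import keras.backend as K

# Hypothetical placeholders standing in for the critic target and critic output.
rewards_ph = K.placeholder(shape=(None,))     # discounted rewards, shape (N,)
value_out = K.placeholder(shape=(None, 1))    # critic output, shape (N, 1)

# Bad: (N,) - (N, 1) broadcasts to (N, N). With N = 177581 that is a
# ~126 GB float32 tensor, easily enough to OOM on CPU.
advantage_bad = rewards_ph - value_out
print(K.int_shape(advantage_bad))   # (None, None)

# Better: bring both sides to the same rank before subtracting.
advantage_ok = rewards_ph - K.flatten(value_out)
print(K.int_shape(advantage_ok))    # (None,)
```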
Hi,
thanks for the response. What do you mean by input size? I tried a lower batch size, but the error still occurs. I will have a closer look at the project later on.
Hi,
Apologies for the late reply! It seems like your environment/state is very large, which causes the network to produce some very large tensors at some point. You could try checking where this large tensor comes from, and optionally change some of the network parameters.
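One quick way to check where the tensor comes from is to look up the op named in the error message and print the shapes of the tensors flowing into and out of it, right after the models are built in the same process. A small sketch, assuming the Keras TF 1.x backend and the sub_17 name taken from the traceback above:

```python
import keras.backend as K

# Look up the op named in the OOM message and print the shapes around it.
graph = K.get_session().graph
sub_op = graph.get_operation_by_name("sub_17")
for t in list(sub_op.inputs) + list(sub_op.outputs):
    print(t.name, t.shape)
```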