NaN error with continuous action space #24

Open
GraV1337y opened this issue Dec 1, 2021 · 2 comments
@GraV1337y
Contributor

No description provided.

@GraV1337y GraV1337y added the bug Something isn't working label Dec 1, 2021
@GraV1337y GraV1337y self-assigned this Dec 1, 2021
@GraV1337y
Contributor Author

[W python_anomaly_mode.cpp:102] Warning: Error detected in ExpBackward. Traceback of forward call that caused the error:
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/threading.py", line 912, in _bootstrap
self._bootstrap_inner()
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/threading.py", line 954, in _bootstrap_inner
self.run()
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/threading.py", line 892, in run
self._target(*self._args, **self._kwargs)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/learner.py", line 1154, in _train_loop
self._process_training_data(data, timing, wait_stats)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/learner.py", line 1104, in _process_training_data
train_stats = self._train(buffer, batch_size, experience_size, timing)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/learner.py", line 729, in _train
result = self.actor_critic(tail = True, core_output = core_outputs, with_action_distribution=True)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
output = self.module(*inputs, **kwargs)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/model.py", line 114, in forward
return self.forward_tail(core_output, with_action_distribution)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/model.py", line 92, in forward_tail
action_distribution_params, action_distribution = self.action_parameterization(core_output)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/model_utils.py", line 424, in forward
action_distribution = get_action_distribution(self.action_space, raw_logits=action_distribution_params)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/utils/action_distributions.py", line 85, in get_action_distribution
return ContinuousActionDistribution(params=raw_logits)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/utils/action_distributions.py", line 277, in __init__
self.stddevs = self.log_std.exp()
(function _print_stack)
Exception in thread Thread-4:
Traceback (most recent call last):
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/threading.py", line 954, in _bootstrap_inner
self.run()
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/threading.py", line 892, in run
self._target(*self._args, **self._kwargs)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/learner.py", line 1154, in _train_loop
self._process_training_data(data, timing, wait_stats)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/learner.py", line 1104, in _process_training_data
train_stats = self._train(buffer, batch_size, experience_size, timing)
File "/work/smyawege/multi-sample-factory/multi_sample_factory/algorithms/appo/learner.py", line 834, in _train
loss.backward()
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/_tensor.py", line 306, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: Function 'ExpBackward' returned nan values in its 0th output.
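
The trace above points at `self.log_std.exp()` as the operation whose backward pass produced NaN. A possible guard, sketched here in plain Python rather than PyTorch so it runs anywhere, is to reject NaN inputs and clamp the log standard deviation to a bounded range before exponentiating. The bounds `LOG_STD_MIN` and `LOG_STD_MAX` and both helper names are hypothetical and not taken from the project; this is a sketch of the idea, not the fix that was applied.

```python
import math

# Hypothetical bounds, chosen only for illustration.
LOG_STD_MIN = -20.0
LOG_STD_MAX = 2.0


def safe_stddev(log_std: float) -> float:
    """Clamp log_std into [LOG_STD_MIN, LOG_STD_MAX], then exponentiate."""
    if math.isnan(log_std):
        # A NaN here means the network output is already corrupted;
        # clamping cannot repair it, so fail loudly instead.
        raise ValueError("log_std is NaN")
    clamped = min(max(log_std, LOG_STD_MIN), LOG_STD_MAX)
    return math.exp(clamped)


def all_finite(values) -> bool:
    """Return True only if no value is NaN or +/-inf."""
    return all(math.isfinite(v) for v in values)
```

Clamping only protects against overflow or underflow in `exp`. If the parameters are already NaN, as in the second log where `loc` is entirely NaN, the cause lies upstream (for example in the loss), and a finiteness check such as `all_finite` only helps locate where it first appears.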

@GraV1337y
Contributor Author

[2021-11-29 15:35:15,078][06710] High loss value: l:34460.8203 pl:0.0162 vl:34460.8242 exp_l:-0.0215 kl_l:0.0000 (recommended to adjust the --reward_scale parameter)
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
[2021-11-29 15:35:15,096][06953] Unknown exception on policy worker
Traceback (most recent call last):
File "/work/grudelpg/sample-factory/sample_factory/algorithms/appo/policy_worker.py", line 247, in _run
self._handle_policy_steps(timing)
File "/work/grudelpg/sample-factory/sample_factory/algorithms/appo/policy_worker.py", line 109, in _handle_policy_steps
policy_outputs = self.actor_critic(observations, rnn_states)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/grudelpg/sample-factory/sample_factory/algorithms/appo/model.py", line 112, in forward
result = self.forward_tail(x, with_action_distribution=with_action_distribution)
File "/work/grudelpg/sample-factory/sample_factory/algorithms/appo/model.py", line 92, in forward_tail
action_distribution_params, action_distribution = self.action_parameterization(core_output)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/grudelpg/sample-factory/sample_factory/algorithms/appo/model_utils.py", line 424, in forward
action_distribution = get_action_distribution(self.action_space, raw_logits=action_distribution_params)
File "/work/grudelpg/sample-factory/sample_factory/algorithms/utils/action_distributions.py", line 57, in get_action_distribution
return ContinuousActionDistribution(params=raw_logits)
File "/work/grudelpg/sample-factory/sample_factory/algorithms/utils/action_distributions.py", line 254, in __init__
normal_dist = Normal(self.means, self.stddevs)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/distributions/normal.py", line 50, in __init__
super(Normal, self).__init__(batch_shape, validate_args=validate_args)
File "/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/torch/distributions/distribution.py", line 55, in __init__
raise ValueError(
ValueError: Expected parameter loc (Tensor of shape (24, 8)) of distribution Normal(loc: torch.Size([24, 8]), scale: torch.Size([24, 8])) to satisfy the constraint Real(), but found$
tensor([[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan]], device='cuda:0')
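
The "High loss value" warning at the top of this log shows the value loss (`vl`) dominating the total and suggests adjusting `--reward_scale`. Assuming that parameter acts as a multiplicative factor on rewards, and that the value loss is a squared error against the returns, the loss shrinks with the square of the scale. The small plain-Python sketch below illustrates that relationship; the return values and the scale of 0.01 are made up for the example and are not taken from this run.

```python
def mean_squared_error(predictions, targets):
    """Plain mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def scale(values, factor):
    """Multiply every value by the same factor."""
    return [v * factor for v in values]


# Made-up returns and an untrained critic that predicts zero everywhere.
returns = [150.0, -220.0, 310.0]
value_predictions = [0.0, 0.0, 0.0]
reward_scale = 0.01  # hypothetical setting

loss_raw = mean_squared_error(value_predictions, returns)
loss_scaled = mean_squared_error(value_predictions, scale(returns, reward_scale))

# The ratio equals reward_scale ** 2 up to floating-point rounding.
ratio = loss_scaled / loss_raw
```

Under these assumptions, a scale of 0.01 reduces a squared-error value loss by a factor of about 10,000, which is why a large `vl` is often addressed by rescaling rewards before looking elsewhere.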
