Structure of ReplayBuffer._storage #610

FHainzl · 2019-12-10T10:44:20Z

In deepq.ReplayBuffer, the docstring for ReplayBuffer._storage says
[(np.ndarray, float, float, np.ndarray, bool)]
but shouldn't it actually be
[(np.ndarray, [float], float, np.ndarray, bool)]?

I think this would be correct for two reasons:

1. Actions could be multi-dimensional.
2. The docstring for ReplayBuffer.add claims
   :param action: ([float]) the action,
   which I think is inconsistent with the storage docstring.

Am I missing something here?


araffin · 2019-12-10T19:49:11Z

Hello,

Good point.

In fact, it depends on whether the buffer is used with discrete actions (e.g. DQN) or continuous actions (e.g. SAC).

For discrete: action should be int
For continuous: action should be np.ndarray (in fact a numpy array of type float32)

Overall, np.ndarray would be the most accurate type to document. Feel free to submit a PR that fixes this inconsistency ;)
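To make the discrete/continuous distinction concrete, here is a minimal sketch of a tuple-based buffer. It is not the stable-baselines implementation: the class `SimpleReplayBuffer`, the `Action` and `Transition` aliases, and the `sample_actions` helper are all hypothetical names chosen for illustration. It only assumes that each stored entry is a tuple of (observation, action, reward, next observation, done), as the docstring in question describes.

```python
import random
from typing import List, Tuple, Union

import numpy as np

# Hypothetical aliases: an int for discrete actions, an np.ndarray for continuous ones.
Action = Union[int, np.ndarray]
Transition = Tuple[np.ndarray, Action, float, np.ndarray, bool]


class SimpleReplayBuffer:
    """Illustrative ring buffer of transition tuples (not the library's class)."""

    def __init__(self, size: int) -> None:
        self._storage: List[Transition] = []
        self._maxsize = size
        self._next_idx = 0

    def __len__(self) -> int:
        return len(self._storage)

    def add(self, obs_t, action, reward, obs_tp1, done) -> None:
        data = (obs_t, action, reward, obs_tp1, done)
        if self._next_idx >= len(self._storage):
            self._storage.append(data)
        else:
            # Buffer is full: overwrite the oldest entry.
            self._storage[self._next_idx] = data
        self._next_idx = (self._next_idx + 1) % self._maxsize

    def sample_actions(self, batch_size: int) -> np.ndarray:
        # Sample indices with replacement and stack only the action field.
        idxes = [random.randint(0, len(self._storage) - 1) for _ in range(batch_size)]
        return np.array([self._storage[i][1] for i in idxes])


obs = np.zeros(4, dtype=np.float32)

# Discrete case (DQN-style): the action is a single int.
discrete = SimpleReplayBuffer(size=2)
discrete.add(obs, 1, 0.5, obs, False)

# Continuous case (SAC-style): the action is a float32 array.
continuous = SimpleReplayBuffer(size=2)
continuous.add(obs, np.array([0.1, -0.3], dtype=np.float32), 0.5, obs, False)

discrete_batch = discrete.sample_actions(3)      # one scalar per sample
continuous_batch = continuous.sample_actions(3)  # one action vector per sample
```

In this sketch the sampled batch of discrete actions is one-dimensional, while the continuous batch gains a second axis for the action dimension, which is why a single `float` in the storage docstring cannot describe both cases.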

* Bump version
* Add a message to PPO2 assert (closes #625)
* Update replay buffer doctring (closes #610)
* Don't specify a version for pytype
* Fix `VecEnv` docstrings (closes #577)
* Typo
* Re-add python version for pytype

araffin added the documentation and question labels Dec 10, 2019

araffin added a commit that referenced this issue Dec 19, 2019

Update replay buffer doctring (closes #610)

a355aec

araffin mentioned this issue Dec 19, 2019

Release v2.9.0 #629

Merged

11 tasks

araffin closed this as completed in #629 Dec 19, 2019
