[Optimization] Replay Buffers shouldn't use copy when using np.array #112
Comments
Good catch. Mind if I include this in a PR with credits to you (#110)? If you find a bunch more such quirks and optimizations, feel free to bundle them up into a PR though :) |
I can make a large one that deals with code optimisation/quirks and leave #110 for algorithm performance, it's up to you :) |
Ok, let's make another PR for optimizations like this 👍. Let's wait until agent performance has been matched so things do not crisscross too badly. |
After some more investigation, I believe that we don't need to copy, as the assignment is done on the original data, i.e. where the buffer is located, so the current assignment copies the data twice: once from data to the temporary, and once more from the temporary to x.
https://numpy.org/doc/stable/user/basics.indexing.html#assigning-values-to-indexed-arrays |
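A minimal sketch of the point above, assuming a preallocated NumPy buffer loosely resembling a replay buffer (the names `buffer`, `obs`, and `pos` are illustrative and not taken from `buffers.py`). Assigning into an indexed slot writes the values into the buffer's own memory, so later in-place changes to the source array should not show up in the buffer:

```python
import numpy as np

# Hypothetical preallocated buffer: 4 slots, each holding a 3-element observation.
buffer = np.zeros((4, 3), dtype=np.float32)
pos = 0

obs = np.array([1.0, 2.0, 3.0], dtype=np.float32)

# Indexed assignment copies the values of obs into the buffer's memory.
buffer[pos] = obs

# Mutating the source afterwards does not affect the stored row.
obs[:] = -1.0

stored_row_unchanged = bool(np.array_equal(buffer[pos], [1.0, 2.0, 3.0]))
shares_memory = bool(np.shares_memory(buffer, obs))
print(stored_row_unchanged, shares_memory)
```

This only covers the case where the destination is a preallocated array. Storing the array object itself in a Python list would keep a reference, which may be the kind of by-reference issue that motivates a conservative copy.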
We need to be extra careful with that one, I remember having weird issues that apparently came from changes by reference (hence the extra conservative copy that is currently in the code). But if everything is copied, then perfect. |
But if everything is copied, then perfect.
I played a bit with it in a local implementation, and it seemed to get slower when I removed the second copy, so I suggest that we investigate both scenarios when the time comes.
|
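One way to investigate both scenarios is a small micro-benchmark. The sketch below is only an assumption about how such a comparison could look: the buffer shape, the observation size, and the function names are made up for illustration, and real timings will depend on the machine and on the array sizes actually used.

```python
import timeit

import numpy as np

# Hypothetical sizes, chosen only for illustration.
buffer = np.zeros((1000, 84), dtype=np.float32)
obs = np.random.rand(84).astype(np.float32)


def store_with_extra_copy(i):
    # np.array copies by default, then .copy() copies again.
    buffer[i % len(buffer)] = np.array(obs).copy()


def store_with_single_copy(i):
    # One temporary copy, then the assignment writes into the buffer.
    buffer[i % len(buffer)] = np.array(obs)


def store_direct(i):
    # No temporary: the assignment itself writes the values into the buffer.
    buffer[i % len(buffer)] = obs


def bench(fn, number=2000):
    # Total wall-clock seconds for `number` calls of fn.
    return timeit.timeit(lambda: fn(0), number=number)


results = {
    fn.__name__: bench(fn)
    for fn in (store_with_extra_copy, store_with_single_copy, store_direct)
}
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f} s")
```

Running it several times and with realistic observation shapes would give a fairer picture than a single run.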
Another (general) optimization suggestion for numpy operations would be using numba or pytorch tensors. |
Why would pytorch tensors be faster than numpy? |
Using pytorch tensors doesn't necessarily improve performance, mainly because of the need to perform the computations @araffin mentioned. Moreover, going cpu->gpu->cpu will probably degrade performance due to transfer latency. The data is usually rather small and fits in the cache, so the transfers will probably dominate the runtime. |
Pardon, it was a general suggestion, not related specifically to this issue. |
Related to #49
In
stable-baselines3/stable_baselines3/common/buffers.py
Lines 198 to 208 in 23afedb
stable-baselines3/stable_baselines3/common/buffers.py
Lines 346 to 349 in 23afedb
We call np.array(x).copy(). This is unnecessary because np.array has the argument "copy", which is True by default: https://numpy.org/doc/stable/reference/generated/numpy.array.html
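A short sketch of the `copy` default described above, using a throwaway array `x` (an illustrative name). It checks that `np.array(x)` already returns memory separate from `x`, so the trailing `.copy()` produces a second copy, and it contrasts this with `np.asarray`, which returns the input unchanged when it is already an ndarray and no dtype is requested:

```python
import numpy as np

x = np.arange(5)

a = np.array(x)          # copy=True by default, so a owns fresh memory
b = np.array(x).copy()   # a second copy of an already-copied array
c = np.asarray(x)        # no copy: x is already an ndarray

a_is_copy = not np.shares_memory(a, x)
b_is_copy = not np.shares_memory(b, x)
c_is_same_object = c is x

print(a_is_copy, b_is_copy, c_is_same_object)
```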