[rllib] Replace ray.get() with ray_get_and_free() to optimize memory usage #4586
Conversation
@robertnishihara @guoyuhong do you see any performance issues with calling ray.internal.free up to a couple hundred times a second here?
Test PASSed.
Test PASSed.
Test FAILed.
Test FAILed.
@ericl Maybe you need to do some tests.
Hm yeah, that might be a high overhead if we broadcast to all nodes for each object created. I could also batch the objects and periodically flush them.
Test PASSed.
Test PASSed.
Test FAILed.
Test PASSed.
Test FAILed.
Test PASSed.
Test FAILed.
Test PASSed.
Test FAILed.
Test FAILed.
jenkins retest this please
Test PASSed.
Test FAILed.
Test PASSed.
Test FAILed.
cc @joneswong
    now = time.time()
    if (len(_to_free) > MAX_FREE_QUEUE_SIZE
            or now - _last_free_time > FREE_DELAY_S):
Batch the free calls to avoid too much overhead.
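For readers following the thread, here is a minimal sketch of the batching condition quoted above. It is not the code from this PR: the `BatchedFreer` class, the injected `free_fn` and `clock` parameters, and the two threshold values are assumptions made so the example runs without a Ray cluster. Only the names `MAX_FREE_QUEUE_SIZE`, `FREE_DELAY_S`, `_to_free`, and `_last_free_time` come from the diff.

```python
import time

# Assumed values for illustration only; the thresholds in the PR may differ.
MAX_FREE_QUEUE_SIZE = 100
FREE_DELAY_S = 10.0


class BatchedFreer:
    """Queues object IDs and releases them in batches (hypothetical helper)."""

    def __init__(self, free_fn, clock=time.time):
        # free_fn stands in for a call such as ray.internal.free.
        self._free_fn = free_fn
        self._clock = clock
        self._to_free = []
        self._last_free_time = clock()

    def add(self, object_ids):
        """Queue IDs and flush if the queue is too long or too old.

        Returns True if this call triggered a flush.
        """
        if not isinstance(object_ids, list):
            object_ids = [object_ids]
        self._to_free.extend(object_ids)

        now = self._clock()
        if (len(self._to_free) > MAX_FREE_QUEUE_SIZE
                or now - self._last_free_time > FREE_DELAY_S):
            self._free_fn(self._to_free)
            self._to_free = []
            self._last_free_time = now
            return True
        return False
```

With these assumed thresholds, a caller producing a couple hundred objects per second would issue a free call roughly every hundred objects instead of once per object, which is the overhead reduction the comment is after.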
Test PASSed.
Test PASSed.
Test FAILed.
    """
    worker = ray.worker.get_global_worker()

    if ray.worker._mode() == ray.worker.LOCAL_MODE:
oh this is nice
This PR contains changes that help with memory issues ray-project#4586
What do these changes do?
When ray.get() is called in RLlib, the result should be freed immediately. This prevents object store memory from growing unnecessarily.
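As a rough illustration of why freeing right after the fetch matters, the toy below compares a fetch loop with and without an explicit free. `ToyObjectStore`, `get_and_free`, and `run` are hypothetical names, not Ray's API, and the toy store never evicts anything on its own, which is a simplification of how a real object store behaves.

```python
class ToyObjectStore:
    """Stand-in for an object store; not Ray's API."""

    def __init__(self):
        self._objects = {}
        self._next_id = 0

    def put(self, value):
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = value
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def free(self, object_ids):
        for object_id in object_ids:
            self._objects.pop(object_id, None)

    def size(self):
        return len(self._objects)


def get_and_free(store, object_id):
    """Fetch a result, then release its slot in the store right away."""
    result = store.get(object_id)
    store.free([object_id])
    return result


def run(store, fetch, steps=1000):
    """Put one object per step, fetch it, and report what is left behind."""
    for step in range(steps):
        object_id = store.put([step] * 10)  # e.g. one sample batch
        fetch(store, object_id)
    return store.size()


plain = run(ToyObjectStore(), lambda store, oid: store.get(oid))
freed = run(ToyObjectStore(), get_and_free)
print(plain, freed)  # prints "1000 0"
```

In the plain loop every fetched object stays resident, while the get-then-free loop leaves the store empty after each step.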
Related issue number
(many previous issues)
Linter
I've run scripts/format.sh to lint the changes in this PR.