avoid calling gc.collect and cuda.empty_cache #34514
Rocketknight1
left a comment
Yes, this seems like a good speed fix! cc @LysandreJik @ArthurZucker for core maintainer review
LysandreJik
left a comment
Smart! Should a helper method be made that only runs on CPU?
Both the gc.collect call and the torch device check could be moved into the backend_empty_cache method (or another method that wraps both).
Yes, a helper method is nice. Will update.
82e3add to 6620320
updated. So far it doesn't call
6620320 to 4403c5a
1f47700 to 18d6d5d
* update * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
What does this PR do?
Let's avoid calling gc.collect and cuda.empty_cache while the tests are running on CPU.
Running the GPT2 tests takes 60 seconds on main and 20 seconds on this PR.
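
For illustration, the idea from the review (wrap both calls in one helper that does nothing on CPU) could look roughly like the sketch below. This is not the code merged in this PR: the name cleanup_sketch, its gc_collect parameter, and the boolean return value are hypothetical, and only the CUDA backend is handled.

```python
import gc


def cleanup_sketch(device: str, gc_collect: bool = False) -> bool:
    """Hypothetical test helper: release accelerator memory, skip CPU.

    Returns False when the call was skipped because the device is CPU,
    True otherwise.
    """
    if device == "cpu":
        # Nothing to release on CPU, so skip the garbage-collection pass
        # and the cache call entirely.
        return False

    if gc_collect:
        # Optional: collect unreachable Python objects first so their
        # device memory can actually be released.
        gc.collect()

    if device.startswith("cuda"):
        # Imported lazily so CPU-only runs never touch torch here.
        import torch

        if torch.cuda.is_available():
            torch.cuda.empty_cache()

    # Other accelerator backends are intentionally left out of this sketch.
    return True
```

A test's tearDown would then call cleanup_sketch(device, gc_collect=True) instead of calling gc.collect and the cache function unconditionally, so CPU runs return immediately.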