Call cuda empty_cache to prevent OOM when quantizing model #2671

AllentDan · 2024-10-28T08:56:34Z

No description provided.

* Call cuda empty_cache to prevent OOM when quantizing model
* empty cache during export and after forward
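The PR has no description, so as a rough illustration of the technique its title names, here is a minimal sketch of freeing cached CUDA memory between quantization steps. This is not the PR's actual diff: `release_cached_gpu_memory`, `quantize_layers`, and `quantize_one` are hypothetical names, and the only real library calls are `torch.cuda.is_available()` and `torch.cuda.empty_cache()`.

```python
import gc

try:
    import torch
except ImportError:  # let the sketch run where PyTorch is not installed
    torch = None


def release_cached_gpu_memory():
    """Collect unreferenced Python objects, then release unused cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Frees memory held by PyTorch's caching allocator that no live tensor uses.
        torch.cuda.empty_cache()


def quantize_layers(layers, quantize_one):
    """Hypothetical per-layer loop: process one layer, then free cached memory."""
    results = []
    for layer in layers:
        results.append(quantize_one(layer))
        # Assumption: clearing after each step keeps peak memory lower.
        release_cached_gpu_memory()
    return results
```

`torch.cuda.empty_cache()` does not free memory that live tensors still occupy, so it only helps once intermediate tensors from the forward pass or export step have gone out of scope.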


AllentDan added 2 commits October 28, 2024 16:56

Call cuda empty_cache to prevent OOM when quantizing model

93f5c7c

empty cache during export and after forward

b1d3717

lvhan028 approved these changes Oct 31, 2024


lvhan028 added the improvement label Oct 31, 2024

lvhan028 merged commit dde5d23 into InternLM:main Oct 31, 2024
4 of 5 checks passed

lvhan028 pushed a commit that referenced this pull request Nov 5, 2024

Call cuda empty_cache to prevent OOM when quantizing model (#2671)

28c8b79

* Call cuda empty_cache to prevent OOM when quantizing model
* empty cache during export and after forward

AllentDan added a commit to AllentDan/lmdeploy that referenced this pull request Nov 13, 2024

Call cuda empty_cache to prevent OOM when quantizing model (InternLM#…

17c95a1

…2671)

* Call cuda empty_cache to prevent OOM when quantizing model
* empty cache during export and after forward
