This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

retry GPU memory allocation if fragmented #16194

Merged
merged 1 commit into from
Sep 18, 2019


szha
Member

@szha szha commented Sep 18, 2019

Description

Retry GPU memory allocation when it fails because of fragmentation.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Retry GPU memory allocation when it fails because of fragmentation.

Comments

  • Memory fragmentation happens at runtime when the total free memory exceeds the requested chunk size, but no contiguous region is large enough to satisfy the request.
  • I'm changing the memory pool allocation logic for when cudaErrorMemoryAllocation occurs:
    1. release all cached unused memory chunks
    2. retry memory allocation
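
The release-then-retry steps above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `PooledAllocator`, `RawAlloc`, `RawFree`, and `ReleaseAll` are hypothetical names, and the device allocator is injected as a callback so the sketch compiles without CUDA. In a CUDA setting, `raw_alloc` would presumably wrap `cudaMalloc` and return `nullptr` when it reports `cudaErrorMemoryAllocation`, and `raw_free` would wrap `cudaFree`.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical size-bucketed memory pool (illustrative names, not MXNet's).
class PooledAllocator {
 public:
  // Assumption: the raw allocator returns nullptr when the device is out of
  // memory (e.g. a wrapper that maps cudaErrorMemoryAllocation to nullptr).
  using RawAlloc = std::function<void*(std::size_t)>;
  using RawFree = std::function<void(void*)>;

  PooledAllocator(RawAlloc raw_alloc, RawFree raw_free)
      : raw_alloc_(std::move(raw_alloc)), raw_free_(std::move(raw_free)) {}

  ~PooledAllocator() { ReleaseAll(); }

  void* Alloc(std::size_t size) {
    // Fast path: reuse a cached chunk of exactly this size.
    auto it = cache_.find(size);
    if (it != cache_.end() && !it->second.empty()) {
      void* ptr = it->second.back();
      it->second.pop_back();
      return ptr;
    }
    // Slow path: ask the device for fresh memory.
    void* ptr = raw_alloc_(size);
    if (ptr == nullptr) {
      // Step 1: release all cached unused chunks back to the device.
      ReleaseAll();
      // Step 2: retry the allocation once.
      ptr = raw_alloc_(size);
    }
    return ptr;  // still nullptr if the retry also failed
  }

  // Returning memory to the pool only caches it; the device still sees it
  // as allocated until ReleaseAll() runs.
  void Free(void* ptr, std::size_t size) { cache_[size].push_back(ptr); }

  void ReleaseAll() {
    for (auto& bucket : cache_) {
      for (void* ptr : bucket.second) raw_free_(ptr);
    }
    cache_.clear();
  }

 private:
  RawAlloc raw_alloc_;
  RawFree raw_free_;
  std::unordered_map<std::size_t, std::vector<void*>> cache_;
};
```

Retrying only once keeps the failure path bounded: if the allocation still fails after the cache has been emptied, the pool has nothing more to give back, so the caller should see the error.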

@eric-haibin-lin eric-haibin-lin merged commit 479ab46 into apache:master Sep 18, 2019

2 participants