Enable external CUDA allocator in ORTModule. #6745
Conversation
    # CPP extension to get torch CUDA allocator's alloc and free function addresses
    self._use_external_cuda_allocator = True
    if self._use_external_cuda_allocator:
Is it allowed to do something like

    model1 = ORTModule(model1)
    model2 = ORTModule(model2)

in the same process (Python interpreter)? Just checking that `load_inline` or `self._torch_cuda_allocator.cuda_caching_allocator_raw_delete_address()` can handle this.
I haven't personally tried it, but I don't see why it wouldn't work. It will just recompile and recreate the binary file.
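For reference, a minimal sketch of what such an inline extension could look like (the extension name and the alloc-side function name are assumptions; only `cuda_caching_allocator_raw_delete_address` and the use of `load_inline` appear in the discussion above):

```python
# Sketch only: expose the addresses of PyTorch's CUDA caching allocator
# functions so they can be handed to ONNX Runtime as provider options.
from torch.utils.cpp_extension import load_inline

_cpp_source = """
#include <cstddef>
#include <c10/cuda/CUDACachingAllocator.h>

size_t cuda_caching_allocator_raw_alloc_address() {
    // Address of c10::cuda::CUDACachingAllocator::raw_alloc (cudaMalloc replacement).
    return reinterpret_cast<size_t>(&c10::cuda::CUDACachingAllocator::raw_alloc);
}

size_t cuda_caching_allocator_raw_delete_address() {
    // Address of c10::cuda::CUDACachingAllocator::raw_delete (cudaFree replacement).
    return reinterpret_cast<size_t>(&c10::cuda::CUDACachingAllocator::raw_delete);
}
"""

# load_inline compiles and caches the extension; building a second ORTModule in
# the same process would simply recompile/reuse the binary, as noted above.
_torch_cuda_allocator = load_inline(
    name="torch_cuda_allocator_addresses",  # hypothetical extension name
    cpp_sources=[_cpp_source],
    functions=[
        "cuda_caching_allocator_raw_alloc_address",
        "cuda_caching_allocator_raw_delete_address",
    ],
)

torch_alloc = _torch_cuda_allocator.cuda_caching_allocator_raw_alloc_address()
torch_free = _torch_cuda_allocator.cuda_caching_allocator_raw_delete_address()
```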
    providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    provider_options = [{"device_id": str(self._device.index)}, {}]
    if self._use_external_cuda_allocator:
        provider_options = [{"device_id": str(self._device.index), "cuda_external_alloc": str(self._torch_alloc), "cuda_external_free": str(self._torch_free)}, {}]
Suggested change:

    - provider_options = [{"device_id": str(self._device.index), "cuda_external_alloc": str(self._torch_alloc), "cuda_external_free": str(self._torch_free)}, {}]
    + provider_options = [{"device_id": str(_utils.get_device_index(self._device)), "cuda_external_alloc": str(self._torch_alloc), "cuda_external_free": str(self._torch_free)}, {}]
Thanks, @thiagocrepaldi. This seems like a good suggestion, but standard software engineering practice would be to make this change in a separate PR, since I'm not touching this particular piece of code (i.e. `"device_id": str(self._device.index)`). Your suggestion is a cleanup, and I do not want to mix it with enabling an external allocator.
    if self._use_external_cuda_allocator:
        provider_options = [{"device_id": str(self._device.index), "cuda_external_alloc": str(self._torch_alloc), "cuda_external_free": str(self._torch_free)}, {}]
    else:
        provider_options = [{"device_id": str(self._device.index)}]
Suggested change:

    - provider_options = [{"device_id": str(self._device.index)}]
    + provider_options = [{"device_id": str(_utils.get_device_index(self._device))}]
Same as above.
Thanks, @SherlockNoMad and @thiagocrepaldi, for the review. Thanks, @baijumeswani, for answering the questions regarding the torch no_grad memory test.
Enables an external CUDA allocator (i.e. the PyTorch CUDA caching allocator) and also removes all references to torch.cuda.empty_cache(): we should not be clearing the cache, since doing so can hurt throughput, and the PyTorch allocator already releases memory on an as-needed basis.
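A rough sketch of how the allocator addresses end up in the CUDA execution provider's options when a session is created (the model path, device index, and the `_torch_cuda_allocator` handle from the sketch above are placeholders/assumptions, not the exact ORTModule wiring):

```python
import onnxruntime

# Addresses obtained from the inline extension sketched earlier; ORT will call
# these instead of cudaMalloc/cudaFree, so both frameworks share one caching
# allocator and torch.cuda.empty_cache() is no longer needed to hand memory back.
torch_alloc = _torch_cuda_allocator.cuda_caching_allocator_raw_alloc_address()
torch_free = _torch_cuda_allocator.cuda_caching_allocator_raw_delete_address()

providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
provider_options = [
    {
        "device_id": "0",                         # placeholder device index
        "cuda_external_alloc": str(torch_alloc),  # address of raw_alloc
        "cuda_external_free": str(torch_free),    # address of raw_delete
    },
    {},  # no extra options for the CPU provider
]

session = onnxruntime.InferenceSession(
    "model.onnx",                                 # placeholder model path
    providers=providers,
    provider_options=provider_options,
)
```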