Add RMM Python docs (rapidsai#632)

shwina · web-flow · commit fa552c2d96c1 · 2020-11-24T14:55:22.000Z
Closes rapidsai#630

Authors:
  - Ashwin Srinath <shwina@users.noreply.github.com>

Approvers:
  - Keith Kraus
  - AJ Schmidt
  - Mark Harris

URL: rapidsai#632
diff --git a/.gitignore b/.gitignore
@@ -107,7 +107,7 @@ instance/
 .scrapy
 
 # Sphinx documentation
-docs/_build/
+python/docs/_build/
 
 # PyBuilder
 target/
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,7 @@
 
 - PR #596 Add `tracking_memory_resource_adaptor` to help catch memory leaks
 - PR #608 Add stream wrapper type
+- PR #632 Add RMM Python docs
 
 ## Improvements
 
diff --git a/ci/release/update-version.sh b/ci/release/update-version.sh
@@ -49,4 +49,7 @@ sed_runner 's/'"RMM VERSION .* LANGUAGES"'/'"RMM VERSION ${NEXT_FULL_TAG} LANGUA
 
 sed_runner 's/version=.*/version=\"'"${NEXT_FULL_TAG}"'\",/g' python/setup.py
 
-sed_runner 's/'"PROJECT_NUMBER         = .*"'/'"PROJECT_NUMBER         = ${NEXT_SHORT_TAG}"'/g' doxygen/Doxyfile
+sed_runner 's/'"PROJECT_NUMBER         = .*"'/'"PROJECT_NUMBER         = ${NEXT_SHORT_TAG}"'/g' doxygen/Doxyfile
+
+sed_runner 's/'"version =.*"'/'"version = \"${NEXT_SHORT_TAG}\""'/g' python/docs/conf.py
+sed_runner 's/'"release =.*"'/'"release = \"${NEXT_FULL_TAG}\""'/g' python/docs/conf.py
diff --git a/python/docs/Makefile b/python/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/python/docs/api.rst b/python/docs/api.rst
@@ -0,0 +1,28 @@
+API Reference
+==============
+
+High-level API
+--------------
+
+.. automodule:: rmm.rmm
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
+Memory Resources
+----------------
+
+.. automodule:: rmm.mr
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
+Module contents
+---------------
+
+.. automodule:: rmm
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/python/docs/basics.md b/python/docs/basics.md
@@ -0,0 +1,165 @@
+# RMM - the RAPIDS Memory Manager
+
+Achieving optimal performance in GPU-centric workflows frequently requires
+customizing how GPU ("device") memory is allocated.
+
+RMM is a package that enables you to allocate device memory
+in a highly configurable way. For example, it enables you to
+allocate and use pools of GPU memory, or to use
+[managed memory](https://developer.nvidia.com/blog/unified-memory-cuda-beginners/)
+for allocations.
+
+You can also easily configure other libraries like Numba and CuPy
+to use RMM for allocating device memory.
+
+## Installation
+
+See the project [README](https://github.com/rapidsai/rmm) for how to install RMM.
+
+## Using RMM
+
+There are two ways to use RMM in Python code:
+
+1. Using the `rmm.DeviceBuffer` API to explicitly create and manage
+   device memory allocations
+2. Transparently via external libraries such as CuPy and Numba
+
+RMM provides a `MemoryResource` abstraction to control _how_ device
+memory is allocated in both cases.
+
+### DeviceBuffers
+
+A DeviceBuffer represents an **untyped, uninitialized device memory
+allocation**.  DeviceBuffers can be created by providing the
+size of the allocation in bytes:
+
+```python
+>>> import rmm
+>>> buf = rmm.DeviceBuffer(size=100)
+```
+
+The size of the allocation and the memory address associated with it
+can be accessed via the `.size` and `.ptr` attributes respectively:
+
+```python
+>>> buf.size
+100
+>>> buf.ptr
+140202544726016
+```
+
+DeviceBuffers can also be created by copying data from host memory:
+
+```python
+>>> import rmm
+>>> import numpy as np
+>>> a = np.array([1, 2, 3], dtype='float64')
+>>> buf = rmm.DeviceBuffer.to_device(a.tobytes())
+>>> buf.size
+24
+```
+
+Conversely, the data underlying a DeviceBuffer can be copied to the
+host:
+
+```python
+>>> np.frombuffer(buf.tobytes())
+array([1., 2., 3.])
+```
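Because a DeviceBuffer is untyped, the element type lives only on the host side: the 24 bytes above are three 8-byte floats, and the same type has to be supplied again when the copied-back bytes are interpreted (`np.frombuffer` defaults to `float64`, which is why the call above needs no `dtype`). The host-only sketch below uses the standard-library `array` module in place of NumPy and touches no device memory; it only illustrates the byte accounting.

```python
from array import array

# "d" is a C double: 8 bytes per element, the same width as NumPy's float64.
values = array("d", [1.0, 2.0, 3.0])

# The raw bytes carry no type information, just like a DeviceBuffer's contents.
raw = values.tobytes()
nbytes = len(raw)  # 3 elements * 8 bytes each

# Reading the bytes back requires naming the element type again.
restored = array("d")
restored.frombytes(raw)
```

Interpreting the same bytes with a different type code would not fail, but it would yield different values, so the type should be tracked alongside the buffer.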
+
+### MemoryResource objects
+
+`MemoryResource` objects are used to configure how device memory allocations are made by
+RMM.
+
+By default, if no `MemoryResource` is set explicitly, RMM uses `CudaMemoryResource`, which
+allocates device memory with `cudaMalloc`.
+
+`rmm.reinitialize()` provides an easy way to initialize RMM with specific memory resource options
+across multiple devices. See `help(rmm.reinitialize)` for full details.
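
As a rough sketch of the multi-device case, the snippet below collects `rmm.reinitialize()` options in a plain dictionary and applies them in a separate function. The helper names `pool_options` and `apply_options` are hypothetical, and the keyword names (`pool_allocator`, `managed_memory`, `initial_pool_size`, `devices`) are assumed to match the installed RMM version; `help(rmm.reinitialize)` remains the authority. The import is deferred so that the option-building part runs even on a machine without a GPU.

```python
GIB = 2**30


def pool_options(initial_gib, devices):
    """Build keyword arguments for rmm.reinitialize (hypothetical helper)."""
    return {
        "pool_allocator": True,                 # use a pool instead of plain cudaMalloc
        "managed_memory": False,                # set True to back the pool with managed memory
        "initial_pool_size": initial_gib * GIB,  # size in bytes
        "devices": devices,                     # CUDA device IDs to initialize
    }


def apply_options(options):
    """Apply the options; requires RMM and the listed GPUs (hypothetical helper)."""
    import rmm  # deferred so this module imports without RMM installed

    rmm.reinitialize(**options)


options = pool_options(initial_gib=1, devices=[0, 1])
# apply_options(options)  # uncomment on a machine with RMM and two GPUs
```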
+
+For lower-level control, the `rmm.mr.set_current_device_resource()` function can be
+used to set a different MemoryResource for the current CUDA device.  For
+example, enabling the `ManagedMemoryResource` tells RMM to use
+`cudaMallocManaged` instead of `cudaMalloc` for allocating memory:
+
+```python
+>>> import rmm
+>>> rmm.mr.set_current_device_resource(rmm.mr.ManagedMemoryResource())
+```
+
+> :warning: The default resource must be set for any device **before**
+> allocating any device memory on that device.  Setting or changing the
+> resource after device allocations have been made can lead to unexpected
+> behaviour or crashes. See [Multiple Devices](#multiple-devices)
+
+As another example, `PoolMemoryResource` allows you to allocate a
+large "pool" of device memory up-front. Subsequent allocations will
+draw from this pool of already allocated memory.  The example
+below shows how to construct a PoolMemoryResource with an initial size
+of 1 GiB and a maximum size of 4 GiB. The pool uses
+`CudaMemoryResource` as its underlying ("upstream") memory resource:
+
+```python
+>>> import rmm
+>>> pool = rmm.mr.PoolMemoryResource(
+...     upstream=rmm.mr.CudaMemoryResource(),
+...     initial_pool_size=2**30,
+...     maximum_pool_size=2**32
+... )
+>>> rmm.mr.set_current_device_resource(pool)
+```
+
+Similarly, to use a pool of managed memory:
+
+```python
+>>> import rmm
+>>> pool = rmm.mr.PoolMemoryResource(
+...     upstream=rmm.mr.ManagedMemoryResource(),
+...     initial_pool_size=2**30,
+...     maximum_pool_size=2**32
+... )
+>>> rmm.mr.set_current_device_resource(pool)
+```
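
To make the pooling idea concrete, here is a toy model in plain Python. It is not how RMM implements `PoolMemoryResource`: `CountingUpstream` and `ToyPool` are hypothetical classes, nothing is ever freed, and no device memory is involved. It only shows that a pool asks its upstream for memory once up front, serves later requests from that reservation, and goes back to the upstream only when it has to grow (up to the maximum size).

```python
class CountingUpstream:
    """Toy stand-in for an upstream resource; it only counts requests."""

    def __init__(self):
        self.calls = 0
        self.bytes_reserved = 0

    def allocate(self, nbytes):
        self.calls += 1
        self.bytes_reserved += nbytes


class ToyPool:
    """Toy pool: reserves memory from upstream in bulk, then hands out offsets."""

    def __init__(self, upstream, initial_pool_size, maximum_pool_size):
        self.upstream = upstream
        self.maximum_pool_size = maximum_pool_size
        self.capacity = 0
        self.used = 0
        self._grow(initial_pool_size)  # the single up-front reservation

    def _grow(self, nbytes):
        if self.capacity + nbytes > self.maximum_pool_size:
            raise MemoryError("toy pool would exceed maximum_pool_size")
        self.upstream.allocate(nbytes)
        self.capacity += nbytes

    def allocate(self, nbytes):
        shortfall = self.used + nbytes - self.capacity
        if shortfall > 0:
            self._grow(shortfall)  # only now is the upstream contacted again
        offset = self.used
        self.used += nbytes
        return offset


GIB = 2**30
MIB = 2**20

upstream = CountingUpstream()
pool = ToyPool(upstream, initial_pool_size=1 * GIB, maximum_pool_size=4 * GIB)

offsets = [pool.allocate(256 * MIB) for _ in range(4)]  # exactly fills the 1 GiB pool
calls_after_four = upstream.calls

pool.allocate(1)  # the pool is full, so it grows by the shortfall
calls_after_growth = upstream.calls
```

In this model the four 256 MiB requests cost a single upstream call, which is the motivation for pooling: calls to the underlying allocator are comparatively expensive, so batching them tends to pay off for workloads with many small allocations.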
+
+Other MemoryResources include:
+
+* `FixedSizeMemoryResource` for allocating fixed blocks of memory
+* `BinningMemoryResource` for allocating blocks within specified "bin" sizes from different memory
+  resources
+
+MemoryResources are highly configurable and can be composed together in different ways.
+See `help(rmm.mr)` for more information.
+
+### Using RMM with CuPy
+
+You can configure [CuPy](https://cupy.dev/) to use RMM for memory
+allocations by setting the CuPy CUDA allocator to
+`rmm_cupy_allocator`:
+
+```python
+>>> import rmm
+>>> import cupy
+>>> cupy.cuda.set_allocator(rmm.rmm_cupy_allocator)
+```
+
+### Using RMM with Numba
+
+You can configure Numba to use RMM for memory allocations using the
+Numba [EMM Plugin](http://numba.pydata.org/numba-doc/latest/cuda/external-memory.html#setting-the-emm-plugin).
+
+This can be done in two ways:
+
+1. Setting the environment variable `NUMBA_CUDA_MEMORY_MANAGER`:
+
+   ```shell
+   $ NUMBA_CUDA_MEMORY_MANAGER=rmm python (args)
+   ```
+
+2. Using the `set_memory_manager()` function provided by Numba:
+
+   ```python
+   >>> from numba import cuda
+   >>> import rmm
+   >>> cuda.set_memory_manager(rmm.RMMNumbaManager)
+   ```
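
The environment-variable route can also be taken from inside a Python program. This sketch assumes Numba reads `NUMBA_CUDA_MEMORY_MANAGER` once, when it is first imported, so the variable has to be set before any `numba` import in the process. The variable names `numba_already_imported` and `manager_setting` are illustrative only.

```python
import os
import sys

# If Numba is already loaded, setting the variable now is assumed to be too late.
numba_already_imported = "numba" in sys.modules

# Equivalent to launching with NUMBA_CUDA_MEMORY_MANAGER=rmm on the command line,
# provided this runs before the first `import numba`.
os.environ["NUMBA_CUDA_MEMORY_MANAGER"] = "rmm"

manager_setting = os.environ["NUMBA_CUDA_MEMORY_MANAGER"]
```

If `numba_already_imported` is true, calling `cuda.set_memory_manager()` as in the second option is likely the safer choice.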
diff --git a/python/docs/conf.py b/python/docs/conf.py
diff --git a/python/docs/index.rst b/python/docs/index.rst