
Add RMM User Guide #2087

Open

bdice wants to merge 14 commits into rapidsai:staging from bdice:docs-overhaul

Conversation

@bdice
Collaborator

@bdice bdice commented Oct 11, 2025

Description

Adds a comprehensive User Guide for RMM and expands the existing programming guide.

Contributes to #1562, #1694, #2035.

New pages:

  • Introduction — Overview of RMM's purpose and key abstractions
  • Installation — Build and install instructions for C++ and Python
  • Choosing a Memory Resource — Decision guide for selecting the right MR
  • Pool Allocators — PoolMemoryResource, ArenaMemoryResource, BinningMemoryResource configuration and best practices
  • Stream-Ordered Allocation — Async allocation patterns and stream safety
  • Managed Memory — Unified memory usage with prefetching strategies
  • Logging and Profiling — Allocation logging, statistics tracking, and the rmm.statistics profiler

Expanded:

  • Programming Guide — Memory resources, containers, adaptors, library integrations (CuPy, Numba, PyTorch, Thrust), multi-device usage

All C++ examples use the 26.06 API: set_current_device_resource_ref(), pass-by-value adaptor constructors, get_bytes_counter()/get_allocations_counter(), required stream arguments for device_buffer, and copyable value-type resources.
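The counter accessors mentioned above follow the familiar wrapping-adaptor pattern. A toy Python sketch of that pattern (names hypothetical, not the RMM API):

```python
# Toy sketch (not the RMM API) of the counting-adaptor pattern behind
# accessors like get_bytes_counter()/get_allocations_counter(): wrap an
# upstream allocator and record totals on each allocation.
class CountingAdaptor:
    def __init__(self, upstream):
        self._upstream = upstream   # callable: nbytes -> buffer
        self.bytes = 0              # total bytes requested
        self.allocations = 0        # total allocation calls

    def allocate(self, nbytes):
        self.allocations += 1
        self.bytes += nbytes
        return self._upstream(nbytes)

mr = CountingAdaptor(bytearray)
buf = mr.allocate(1024)
print(mr.allocations, mr.bytes)  # 1 1024
```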

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot

copy-pr-bot bot commented Oct 11, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@bdice
Collaborator Author

bdice commented Nov 11, 2025

I've broken this PR into a few parts that are ready to merge:

After #2137 merges, I will start breaking up the new user guide documents into their own PRs.

rapids-bot bot pushed a commit that referenced this pull request Nov 11, 2025
This PR improves a few small issues in the Python documentation.

Split off from #2087.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Matthew Murray (https://github.com/Matt711)

URL: #2139
@bdice bdice removed the status in libcudf Nov 11, 2025
rapids-bot bot pushed a commit that referenced this pull request Nov 12, 2025
This is split off of #2087.

I am overhauling the RMM documentation. This is the first set of changes, which includes a new theme and a reorganization of the C++ docs. All docs now use Markdown/MyST.

The next phases will include docstring tweaks to fix various formatting/cross-linking issues (see #2138 and #2139 for current progress on this), an expansion of the Python API docs, and adding user guides for various features.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Matthew Murray (https://github.com/Matt711)
  - Rong Ou (https://github.com/rongou)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #2137
rapids-bot bot pushed a commit that referenced this pull request Nov 12, 2025
This PR improves a few small issues in the C++ documentation.

Split off from #2087.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Nghia Truong (https://github.com/ttnghia)

URL: #2138
@bdice bdice self-assigned this Jan 9, 2026
bdice added 2 commits April 2, 2026 01:44
Update all C++ code examples to use set_current_device_resource_ref()
instead of set_current_device_resource(&ptr), pass resource refs by
value to adaptor constructors, use get_bytes_counter/get_allocations_counter
instead of fictional get_statistics(), add compute-sanitizer UM flags,
fix managed_memory multi-GPU example, improve choosing_memory_resources
managed memory example with PrefetchResourceAdaptor, and fix incorrect
upstream= keyword args in pool_allocators.md.
@bdice bdice changed the base branch from main to staging April 2, 2026 04:56
@coderabbitai

coderabbitai bot commented Apr 2, 2026

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added comprehensive user guide with programming guide, installation instructions, memory resource selection guidance, and configuration examples.
    • Added documentation for stream-ordered allocation, managed memory, pool allocators, and logging capabilities.
  • New Features

    • Improved memory resource architecture with enhanced composability and shared ownership semantics.
    • Updated resource APIs for better consistency across synchronous and asynchronous allocation patterns.

Walkthrough

This PR migrates RMM's memory resource architecture from a custom device_memory_resource base class to CUDA C++ Core Library (CCCL) memory resource concepts. The refactoring removes the legacy base class, replaces it with a cuda::mr::shared_resource<...impl> pattern across all resources and adaptors, updates allocation APIs to include alignment parameters, splits implementations into detail headers and source files, and updates the per-device resource API to use type-erased async resource references instead of raw pointers. Extensive test and documentation updates accompany the changes.
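The shared-ownership delegation described above can be sketched generically in Python (a toy model, not CCCL/RMM code): a copyable handle delegates to a single shared, non-copyable implementation object.

```python
# Toy sketch (not CCCL/RMM code) of the shared_resource pattern: copies of
# the public resource share one underlying implementation object.
class PoolImpl:
    """Non-copyable implementation holding the allocation state."""
    def __init__(self):
        self.allocated = 0

    def allocate(self, nbytes):
        self.allocated += nbytes
        return bytearray(nbytes)

class SharedResource:
    """Copyable handle delegating to a shared impl."""
    def __init__(self, impl=None):
        self._impl = impl if impl is not None else PoolImpl()

    def copy(self):
        return SharedResource(self._impl)  # copies share one impl

    def allocate(self, nbytes):
        return self._impl.allocate(nbytes)

a = SharedResource()
b = a.copy()
b.allocate(128)
print(a._impl.allocated)  # 128 -- both handles see the same state
```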

Changes

Cohort / File(s) Summary
Core Architecture Removal
cpp/include/rmm/mr/device_memory_resource.hpp, cpp/include/rmm/detail/cccl_adaptors.hpp
Removed legacy device_memory_resource base class and CCCL adaptor wrappers that previously bridged RMM and CCCL resource types.
Memory Resource Base/Core Headers
cpp/include/rmm/mr/cuda_memory_resource.hpp, cpp/include/rmm/mr/managed_memory_resource.hpp, cpp/include/rmm/mr/pinned_host_memory_resource.hpp, cpp/include/rmm/mr/system_memory_resource.hpp
Converted from inheriting device_memory_resource to direct CCCL-style implementations with allocate(cuda::stream_ref, bytes, alignment) / deallocate(...) methods, property hooks, and equality operators.
Memory Resource Adaptor Headers
cpp/include/rmm/mr/aligned_resource_adaptor.hpp, cpp/include/rmm/mr/arena_memory_resource.hpp, cpp/include/rmm/mr/binning_memory_resource.hpp, cpp/include/rmm/mr/callback_memory_resource.hpp, cpp/include/rmm/mr/failure_callback_resource_adaptor.hpp, cpp/include/rmm/mr/limiting_resource_adaptor.hpp, cpp/include/rmm/mr/logging_resource_adaptor.hpp, cpp/include/rmm/mr/pool_memory_resource.hpp, cpp/include/rmm/mr/prefetch_resource_adaptor.hpp, cpp/include/rmm/mr/statistics_resource_adaptor.hpp, cpp/include/rmm/mr/tracking_resource_adaptor.hpp, cpp/include/rmm/mr/thread_safe_resource_adaptor.hpp
Converted from templated device_memory_resource-derived classes to non-templated classes deriving from cuda::mr::shared_resource<detail::..._impl>, enabling shared ownership and implementation delegation.
Async CUDA Memory Resources
cpp/include/rmm/mr/cuda_async_memory_resource.hpp, cpp/include/rmm/mr/cuda_async_view_memory_resource.hpp, cpp/include/rmm/mr/cuda_async_managed_memory_resource.hpp
Refactored to derive from cuda::mr::shared_resource<...impl>, exposing async allocation/deallocation APIs with stream references and alignment, plus synchronous variants.
Fixed-Size & Stream-Ordered Resources
cpp/include/rmm/mr/fixed_size_memory_resource.hpp, cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp
Updated to use new shared_resource pattern and cuda::stream_ref-based APIs; removed inheritance from device_memory_resource.
Implementation Detail Headers
cpp/include/rmm/mr/detail/*_impl.hpp (aligned, arena, binning, callback, cuda_async*, failure_callback, fixed_size, limiting, logging, pool, prefetch, sam_headroom, statistics, tracking, thread_safe)
Added new implementation headers defining non-copyable/move-deleted classes handling allocation logic, upstream storage, and property/equality semantics.
Resource Reference & Per-Device APIs
cpp/include/rmm/resource_ref.hpp, cpp/include/rmm/mr/per_device_resource.hpp
Removed CCCL adaptor dependencies and updated type aliases to directly use cuda::mr refs; changed per-device setters to return any_resource instead of raw pointers.
Source Implementation Files
cpp/src/mr/*.cpp, cpp/src/mr/detail/*.cpp
Added constructor/accessor implementations delegating to underlying cuda::mr::make_shared_resource<...impl>(...) and forwarding calls to shared implementations.
Memory Resource Support
cpp/include/rmm/detail/export.hpp
Added RMM_CONSTEXPR_FRIEND macro for Doxygen-compatible property friend declarations.
Stream CUDA Compilation
cpp/CMakeLists.txt
Added src/cuda_stream.cpp and numerous new detail/implementation source files to the rmm library build.
Test Updates
cpp/tests/CMakeLists.txt, cpp/tests/mr/*.hpp, cpp/tests/mr/*.cpp, cpp/tests/mr/*.cu, cpp/tests/mock_resource.hpp, cpp/tests/*_tests.cpp
Refactored test fixtures, mocks, and benchmarks to use new allocation/deallocation APIs with alignment, removed device_memory_resource dependencies, added new CCCL-based test suites, and disabled alignment-related failure tests.
Benchmark Updates
cpp/benchmarks/*/...bench.*
Changed from std::shared_ptr<device_memory_resource> and owning_wrapper patterns to cuda::mr::any_resource<device_accessible> type-erased references.
Python Bindings
python/rmm/rmm/librmm/*.pxd, python/rmm/rmm/pylibrmm/*.pyx, python/rmm/rmm/pylibrmm/memory_resource/*.pyx
Updated Cython declarations to use device_async_resource_ref instead of device_memory_resource*, changed internal storage from shared_ptr to unique_ptr with make_device_async_resource_ref helpers, and updated per-device APIs.
Documentation
docs/user_guide/*.md, docs/conf.py, docs/index.md
Added comprehensive user guides (introduction, installation, programming guide, choosing resources, stream allocation, managed memory, pool allocators, logging), updated documentation index and configuration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Rationale: This refactoring is substantial and pervasive, affecting 500+ files across core headers, implementations, tests, benchmarks, Python bindings, and documentation. While the pattern is consistent (device_memory_resource → shared_resource delegation), the scope is massive and verifying correctness requires checking: (1) proper impl delegation in each resource type, (2) consistent allocation/deallocation signatures and alignment handling across all resources/adaptors, (3) property and equality operator correctness, (4) test suite adequacy and mock updates, (5) Python binding type correctness, and (6) documentation accuracy. The changes are heterogeneous enough that each resource/adaptor requires separate reasoning despite following a common pattern.

Possibly related issues

Possibly related PRs

Suggested labels

breaking, improvement

Suggested reviewers

  • gforsyth
  • lamarrr

Collaborator Author

I like this information but I want to have an agent verify that all the code will compile and run.

Contributor

I haven't used it, but myst does support a {code-cell} directive: https://mystmd.org/guide/notebooks-with-markdown#code-cell

But I don't know how that would work for C++ (I know there are jupyter kernels for C++), or whether we want to run these every time we build the docs (probably not).

Collaborator Author

Some of this overlaps with the recently-improved CUDA Programming Guide: https://docs.nvidia.com/cuda/cuda-programming-guide/04-special-topics/unified-memory.html

Reviewers: What do you think about this? Keep? Reduce? Delete? Please leave signpost comments on the parts that you think are valuable to mention in the user guide.

Collaborator Author

I'm inclined to delete this page. There are a few tiny pieces of this that might be valuable, but they should probably be copied into other pages.

Reviewers: What do you think? Please leave signpost comments on the parts that you think are valuable to mention in the user guide.

Collaborator Author

As above, some of this could be deleted if we point to the CUDA Programming Guide on Asynchronous Execution. https://docs.nvidia.com/cuda/cuda-programming-guide/02-basics/asynchronous-execution.html

Reviewers: What do you think? Please leave signpost comments on the parts that you think are valuable to mention in the user guide.

@bdice bdice added doc Documentation non-breaking Non-breaking change labels Apr 3, 2026
@bdice bdice marked this pull request as ready for review April 3, 2026 21:28
@bdice bdice requested a review from GregoryKimball April 3, 2026 21:28
Contributor

@wence- left a comment

This needs a huge amount of work

Comment on lines +11 to +25
`````{tabs}
````{code-tab} c++
#include <rmm/mr/cuda_async_memory_resource.hpp>
#include <rmm/mr/per_device_resource.hpp>

rmm::mr::cuda_async_memory_resource mr;
rmm::mr::set_current_device_resource_ref(mr);
````
````{code-tab} python
import rmm

mr = rmm.mr.CudaAsyncMemoryResource()
rmm.mr.set_current_device_resource(mr)
````
`````
Contributor

Can we avoid pushing the set_current_device_resource model in our examples? As we're seeing in all the libraries we have, it is best to manage resources explicitly (not for lifetime reasons, now we have any_resource ownership).


// Use 80% of GPU memory, rounded down to nearest 256 bytes
auto [free_memory, total_memory] = rmm::available_device_memory();
std::size_t pool_size = (static_cast<std::size_t>(total_memory * 0.8) / 256) * 256;
Contributor

rmm::align_down is a public function.
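The round-down arithmetic in the excerpt (and in `rmm::align_down`) can be sketched in Python, assuming a power-of-two alignment:

```python
# Sketch of round-down-to-alignment arithmetic, assuming a power-of-two
# alignment: clear the low bits below the alignment boundary.
def align_down(value: int, alignment: int) -> int:
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return value & ~(alignment - 1)

total_memory = 16 * 1024**3  # illustrative: a 16 GiB device
pool_size = align_down(int(total_memory * 0.8), 256)
print(pool_size % 256)  # 0
```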

Comment on lines +41 to +42
rmm::mr::managed_memory_resource managed_mr;
rmm::mr::pool_memory_resource pool_mr{managed_mr, pool_size};
Contributor

Question: Should we not be recommending cuda_async_managed_memory_resource (at least on cuda-13)?

Collaborator Author

That feature is currently experimental. I am seeing mixed results on performance, and addressing those with the CUDA memory team. Long term this should be the preferred direction.

#include <rmm/mr/cuda_async_memory_resource.hpp>
#include <rmm/mr/per_device_resource.hpp>

rmm::mr::cuda_async_memory_resource mr;
Contributor

question: Should we be recommending the "default" async pool, rather than this one that makes its own mempool?

Collaborator Author

No. We need the custom mempool to enable Blackwell decompression engine support and a custom release threshold. We don’t want to alter the flags on the default mempool.

Contributor

OK, we should explain this, because this is a trade-off.


### CudaAsyncMemoryResource

The `CudaAsyncMemoryResource` uses CUDA's driver-managed memory pool (via `cudaMallocAsync`). This is the **recommended default** for most applications.
Contributor

This is a correct, but misleading, statement.

It use a driver-managed pool. But, crucially, does not use the default mempool for the device.

Collaborator Author

Yes, we can clarify that.

# Synchronize to ensure allocation completes
stream.synchronize()

# Now safe to do CPU operations with buffer.ptr
Contributor

This is misleading. The ptr is always available on the CPU immediately, even when stream ordered.


# Create a pool that maintains stream-ordered semantics
pool = rmm.mr.PoolMemoryResource(
rmm.mr.CudaAsyncMemoryResource(), # stream-ordered upstream
Contributor

Upstream doesn't have to be stream ordered. And again, I think we shouldn't be advocating Pool around CudaAsync.

Comment on lines +171 to +172
with stream:
kernel[100, 10](cuda.as_cuda_array(buffer).view('float32'), 1000)
Contributor

This doesn't launch the kernel on stream, but rather the default stream.

Comment on lines +255 to +260
# BAD: May access uninitialized memory
# some_function(buffer.ptr)

# GOOD: Synchronize first
stream.synchronize()
some_function(buffer.ptr)
Contributor

This is misleading, for the same reason above. The ptr is always valid on the CPU. So "accessing" from the CPU is a meaningless statement.

Comment on lines +265 to +286
```python
stream = rmm.cuda_stream()

def allocate_and_use():
buffer = rmm.DeviceBuffer(size=1000, stream=stream)
# Launch kernel using buffer
kernel[...](buffer.ptr)
# BAD: buffer is deallocated when function returns
# but kernel may still be running!

allocate_and_use()
stream.synchronize() # May crash - buffer already freed
```

Fix: Keep buffer alive until synchronization:

```python
stream = rmm.cuda_stream()
buffer = allocate_and_use() # Return the buffer
stream.synchronize() # Now safe
buffer = None # Explicit cleanup after sync
```
Contributor

This is wrong. If kernel is launched on stream then there is no problem.

Comment on lines +31 to +37
The choice of resource determines the underlying type of memory and thus its accessibility from host or device.
For example, the `cuda_async_memory_resource` uses a pool of memory managed by the CUDA driver.
This resource is recommended for most applications, because of its performance and support for asynchronous (stream-ordered) allocations. See [Stream-Ordered Allocation](stream_ordered_allocation.md) for details.
As another example, the `managed_memory_resource` provides unified memory for CPU+GPU, and is recommended for applications exceeding the available GPU memory.

See [Choosing a Memory Resource](choosing_memory_resources.md) for guidance on the available memory resources, performance considerations, and how they fit into efficient CUDA application design strategies.
[NVIDIA Nsight™ Systems](https://developer.nvidia.com/nsight-systems) can be used to profile memory resource performance.
Contributor

@TomAugspurger Apr 7, 2026

Agreed, but I do appreciate the link to "Choosing a memory resource". "Which one should I pick" is a natural first question to ask.

Having just that, after defining a memory resource, would be sufficient.

Resource adaptors wrap and add functionality to existing resources.
For example, the `statistics_resource_adaptor` can be used to track allocation statistics.
The `logging_resource_adaptor` logs allocations to a CSV file.
Adaptors are composable: wrap multiple adaptors for combined functionality.
Contributor

Use "Resource adaptors" here, instead of just "Adaptors"? Or are we using those interchangeably?
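As an aside, the CSV logging and composability described in the excerpt can be sketched generically (a toy allocator pattern, not the RMM API; the column names are hypothetical):

```python
import csv
import io

# Toy sketch (not the RMM API) of a composable logging adaptor: it wraps any
# upstream allocator and records each allocation as a CSV row. The upstream
# may itself be another adaptor, which is what makes adaptors composable.
class LoggingAdaptor:
    def __init__(self, upstream, stream):
        self._upstream = upstream
        self._writer = csv.writer(stream)
        self._writer.writerow(["action", "size"])  # hypothetical log format

    def allocate(self, nbytes):
        self._writer.writerow(["allocate", nbytes])
        return self._upstream(nbytes)

log = io.StringIO()
mr = LoggingAdaptor(bytearray, log)  # upstream could be another adaptor
buf = mr.allocate(64)
print(log.getvalue().splitlines()[-1])  # allocate,64
```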


### 3. Containers

RMM provides [RAII](https://en.cppreference.com/w/cpp/language/raii.html) container classes that manage memory lifetime.
Contributor

Note that this is C++ specific? Or generalize things a bit to apply to python or C++.

Memory resources aim to serve the needs of a wide range of applications, from data science and machine learning to high-performance simulation.

RMM's memory resources leverage CUDA features like **stream-ordered** (asynchronous) pipeline parallelism, **managed** memory (also known as unified virtual memory, UVM), and **pinned** memory, making it easier to write complex workflows that optimally use both device and host memory.
The integrations provided in RMM allow memory resources to benefit memory management across libraries frequently used together, such as **PyTorch** and **RAPIDS**.
Contributor

"allow memory resources to benefit memory management" is a bit awkward.

Maybe "RMM provides integrations with other GPU libraries, enabling uniform memory handling for your entire application." or something like that.

And maybe link to the "Integration with GPU libraries below."


### Python: Using Memory Event Logging

Enable logging by wrapping your memory resource with `LoggingResourceAdaptor`:
Contributor

General comment, we should be able to link to Python API docs with something like

{ref}`rmm.mr.LoggingResourceAdaptor`

(assuming we're building this with myst, which I think we are).

I'm not sure about the c++ side.

These page faults can significantly impact performance, especially for:
- First-touch access patterns
- Random memory access
- Large datasets that don't fit in GPU memory
Contributor

I don't understand this third bullet, since I'd assume that "larger than GPU memory" is a precondition for the page fault?

My best guess is that this suggests something about repeated page faults as subsets of a large dataset are paged in and out of GPU memory?


Labels

doc Documentation non-breaking Non-breaking change

Projects

Status: Review
Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants