
Fix CCCL main compatibility: explicit alignment and NVCC host/device diagnostics #2343

Merged
rapids-bot[bot] merged 1 commit into rapidsai:main from bdice:fix/cccl-main-compat
Apr 1, 2026

Conversation

@bdice
Collaborator

@bdice bdice commented Apr 1, 2026

Description

CCCL main (3.4.0) introduces two breaking changes that affect RMM:

  1. Deprecated default-alignment overloads: the overloads of allocate_sync, deallocate_sync, allocate, and deallocate on resource_ref types that omit the alignment argument are now deprecated. Under -Werror, the deprecation warning becomes a build failure.

  2. basic_resource_ref copy constructor is __host__-only: thrust_allocator inherits __host__ __device__ from thrust::device_malloc_allocator, so its implicitly-generated copy constructor calls the __host__-only cccl_async_resource_ref copy constructor, triggering NVCC error 20011.
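
The first breaking change can be illustrated with a minimal host-only sketch. `mock_resource_ref`, `sketch_alignment`, and `is_aligned` are hypothetical names introduced here, not CCCL or RMM APIs; the struct only mimics the shape of a deprecated default-alignment overload sitting next to an explicit-alignment one. Any call to the one-argument overload would emit a deprecation warning, which -Werror promotes to an error, while the two-argument call compiles cleanly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for a resource_ref-like type (NOT the CCCL API).
struct mock_resource_ref {
  // Deprecated overload without an alignment parameter. Calling it warns,
  // and -Werror turns that warning into a build failure.
  [[deprecated("pass an explicit alignment")]]
  void* allocate_sync(std::size_t bytes)
  {
    return allocate_sync(bytes, alignof(std::max_align_t));
  }

  // Preferred overload: the caller states the alignment explicitly.
  void* allocate_sync(std::size_t bytes, std::size_t alignment)
  {
    return ::operator new(bytes, static_cast<std::align_val_t>(alignment));
  }

  void deallocate_sync(void* ptr, std::size_t /*bytes*/, std::size_t alignment) noexcept
  {
    ::operator delete(ptr, static_cast<std::align_val_t>(alignment));
  }
};

// Assumed value for illustration only; RMM code would pass
// rmm::CUDA_ALLOCATION_ALIGNMENT instead.
constexpr std::size_t sketch_alignment = 256;

// Helper to check that a pointer honours the requested alignment.
inline bool is_aligned(void const* ptr, std::size_t alignment)
{
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

A call such as `ref.allocate_sync(100, sketch_alignment)` avoids the deprecated path entirely, which is the approach this PR takes at each call site.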

Changes

  • cpp/include/rmm/detail/cccl_adaptors.hpp: Pass rmm::CUDA_ALLOCATION_ALIGNMENT explicitly to all forwarded allocate/deallocate/allocate_sync/deallocate_sync calls on the underlying resource_ref, avoiding the deprecated default-alignment codepath.

  • cpp/include/rmm/mr/thrust_allocator_adaptor.hpp: Provide explicit copy/move constructors with RMM_EXEC_CHECK_DISABLE so the #pragma nv_exec_check_disable has a function body to attach to (it has no effect on = default). This mirrors the pattern used for device_uvector's destructor and move constructor.

  • cpp/tests/mr/resource_ref_conversion_tests.cpp: Pass explicit alignment to allocate_sync/deallocate_sync on the type-erased synchronous_resource_ref to avoid the same deprecation error.
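
The constructor change in thrust_allocator_adaptor.hpp follows a pattern that can be sketched roughly as below. This is not the RMM source: `SKETCH_HOST_DEVICE`, `SKETCH_EXEC_CHECK_DISABLE`, `host_only_ref`, and `allocator_sketch` are hypothetical names, and the `_Pragma` form of the macro is an assumption about how such a macro could be spelled. The point is only that the copy and move constructors have real bodies, so a pragma placed before them has a function definition to apply to.

```cpp
#include <cassert>
#include <utility>

// Stand-in macros (hypothetical names). Under NVCC they expand to the
// annotations; in a host-only build they expand to nothing.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#define SKETCH_EXEC_CHECK_DISABLE _Pragma("nv_exec_check_disable")
#else
#define SKETCH_HOST_DEVICE
#define SKETCH_EXEC_CHECK_DISABLE
#endif

// Hypothetical member type standing in for a reference whose copy
// constructor is host-only.
struct host_only_ref {
  int id;
};

class allocator_sketch {
 public:
  explicit allocator_sketch(host_only_ref mr) : _mr(mr) {}

  // Written with a body instead of `= default`, so the pragma above it
  // attaches to an actual function definition.
  SKETCH_EXEC_CHECK_DISABLE
  SKETCH_HOST_DEVICE allocator_sketch(allocator_sketch const& other) : _mr(other._mr) {}

  SKETCH_EXEC_CHECK_DISABLE
  SKETCH_HOST_DEVICE allocator_sketch(allocator_sketch&& other) noexcept
    : _mr(std::move(other._mr))
  {
  }

  // Assignment operators can remain defaulted in this sketch.
  allocator_sketch& operator=(allocator_sketch const&) = default;
  allocator_sketch& operator=(allocator_sketch&&)      = default;

  int resource_id() const { return _mr.id; }

 private:
  host_only_ref _mr;
};
```

Compiled without NVCC, both macros vanish and the class behaves like an ordinary copyable type, so the sketch shows the structure of the fix rather than the diagnostic itself.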

Validated by building with CCCL main (clean 119/119 targets including tests and benchmarks).

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

…diagnostics

CCCL main (3.4.0) deprecates default-alignment overloads of
allocate_sync/deallocate_sync/allocate/deallocate on resource_ref types,
promoting the deprecation warning to an error under -Werror. Additionally,
basic_resource_ref's copy constructor is __host__-only, which causes NVCC
error 20011 when thrust_allocator's implicitly-generated copy constructor
(inherited as __host__ __device__ from thrust::device_malloc_allocator)
attempts to copy the cccl_async_resource_ref member.

cccl_adaptors.hpp: Pass rmm::CUDA_ALLOCATION_ALIGNMENT explicitly to all
forwarded allocate/deallocate calls on the underlying resource_ref, avoiding
the deprecated default-alignment codepath.

thrust_allocator_adaptor.hpp: Provide explicit copy/move constructors with
RMM_EXEC_CHECK_DISABLE so the pragma has a function body to attach to
(it has no effect on = default). This mirrors the pattern used for
device_uvector's destructor and move constructor.

resource_ref_conversion_tests.cpp: Pass explicit alignment to
allocate_sync/deallocate_sync on the type-erased synchronous_resource_ref
to avoid the same deprecation error.
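
As a rough illustration of the forwarding described for cccl_adaptors.hpp, the sketch below wraps an upstream object and always supplies a fixed alignment when forwarding. All names (`recording_upstream`, `adaptor_sketch`, `sketch_alignment`) are hypothetical, and the assumption that the wrapper's own interface takes no alignment argument is made purely for the example; the real adaptor signatures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Assumed value for illustration; RMM uses rmm::CUDA_ALLOCATION_ALIGNMENT.
constexpr std::size_t sketch_alignment = 256;

// Hypothetical upstream that records the alignment it was asked for.
struct recording_upstream {
  std::size_t last_alignment{0};

  void* allocate_sync(std::size_t bytes, std::size_t alignment)
  {
    last_alignment = alignment;
    return ::operator new(bytes, static_cast<std::align_val_t>(alignment));
  }

  void deallocate_sync(void* ptr, std::size_t /*bytes*/, std::size_t alignment) noexcept
  {
    last_alignment = alignment;
    ::operator delete(ptr, static_cast<std::align_val_t>(alignment));
  }
};

// Hypothetical adaptor: every forwarded call names the alignment explicitly
// instead of relying on an upstream default.
class adaptor_sketch {
 public:
  explicit adaptor_sketch(recording_upstream& upstream) : _upstream{&upstream} {}

  void* allocate_sync(std::size_t bytes)
  {
    return _upstream->allocate_sync(bytes, sketch_alignment);
  }

  void deallocate_sync(void* ptr, std::size_t bytes) noexcept
  {
    _upstream->deallocate_sync(ptr, bytes, sketch_alignment);
  }

 private:
  recording_upstream* _upstream;
};
```

Because the alignment is spelled out at the forwarding site, the wrapper no longer depends on whether the upstream keeps, deprecates, or removes its default-alignment overloads.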
@bdice bdice requested a review from a team as a code owner April 1, 2026 18:46
@bdice bdice requested review from lamarrr and vyasr April 1, 2026 18:46
@bdice bdice added the non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) labels Apr 1, 2026
@bdice bdice moved this to In Progress in RMM Project Board Apr 1, 2026
@bdice bdice self-assigned this Apr 1, 2026
@coderabbitai

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Corrected memory alignment in resource allocation operations across adapters.
    • Ensured proper initialization and assignment semantics for allocator objects.
  • Refactor

    • Updated allocation operations to explicitly specify memory alignment parameters.

Walkthrough

This PR updates memory allocation and deallocation methods across RMM wrapper classes to explicitly pass a rmm::CUDA_ALLOCATION_ALIGNMENT argument to underlying resource calls, and refactors constructors and assignment operators in the thrust allocator adaptor with execution check disable annotations.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CCCL Adaptor Alignment Forwarding: cpp/include/rmm/detail/cccl_adaptors.hpp | Added include for <rmm/aligned.hpp> and updated allocate_sync, deallocate_sync, allocate, and deallocate methods in both cccl_resource_ref and cccl_async_resource_ref to pass rmm::CUDA_ALLOCATION_ALIGNMENT as an explicit alignment argument to underlying resource calls. |
| Thrust Allocator Adaptor Constructor Refactoring: cpp/include/rmm/mr/thrust_allocator_adaptor.hpp | Converted default constructor from = default to explicit empty body, added explicit copy and move constructors with proper member initialization, added defaulted copy and move assignment operators, and annotated all constructors and templated copy constructor with RMM_EXEC_CHECK_DISABLE. Added include for <rmm/detail/exec_check_disable.hpp>. |
| Test Alignment Updates: cpp/tests/mr/resource_ref_conversion_tests.cpp | Updated ForwardPropertyAdaptor::TypeEraseSyncAdaptor test to pass explicit rmm::CUDA_ALLOCATION_ALIGNMENT argument in allocate_sync and deallocate_sync calls. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • rongou
  • PointKernel
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.36%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly addresses the main changes: fixing CCCL main compatibility by adding explicit alignment parameters and handling NVCC host/device diagnostics. |
| Description check | ✅ Passed | The description comprehensively explains the two breaking changes from CCCL main and maps each change to the specific file modifications, making it clearly related to the changeset. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
cpp/include/rmm/mr/thrust_allocator_adaptor.hpp (1)

9-17: Add explicit #include <utility> for std::move in public header.

This header uses std::move in the move constructor (lines 104, 106) but does not directly include <utility>. While transitive includes may mask this dependency, public API headers should explicitly include all headers for functionality they use.

♻️ Suggested fix
 #include <rmm/detail/exec_check_disable.hpp>
 #include <rmm/detail/export.hpp>
 #include <rmm/detail/thrust_namespace.h>
 #include <rmm/mr/per_device_resource.hpp>
 #include <rmm/resource_ref.hpp>

 #include <thrust/device_malloc_allocator.h>
 #include <thrust/device_ptr.h>
 #include <thrust/memory.h>
+#include <utility>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/rmm/mr/thrust_allocator_adaptor.hpp` around lines 9 - 17, This
header uses std::move in the move constructor of thrust_allocator_adaptor but
doesn't include <utility>; add an explicit `#include` <utility> to the top of
cpp/include/rmm/mr/thrust_allocator_adaptor.hpp so the public header directly
provides the declaration for std::move (ensuring the move constructor code in
thrust_allocator_adaptor compiles even without transitive includes).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc5ecd2c-5d99-48ec-9880-4c1111d4b3a7

📥 Commits

Reviewing files that changed from the base of the PR and between 68ec86f and c3c3aeb.

📒 Files selected for processing (3)
  • cpp/include/rmm/detail/cccl_adaptors.hpp
  • cpp/include/rmm/mr/thrust_allocator_adaptor.hpp
  • cpp/tests/mr/resource_ref_conversion_tests.cpp

Comment on lines +122 to 125
RMM_EXEC_CHECK_DISABLE
template <typename U>
thrust_allocator(thrust_allocator<U> const& other)
: _mr(other.resource()), _stream{other.stream()}, _device{other._device}

⚠️ Potential issue | 🟠 Major

Rebind construction should copy the allocator’s _mr, not resource().

Line 125 initializes _mr from other.resource(), but this class’s actual upstream RMM state is _mr/get_upstream_resource(). The non-templated constructors only populate _mr, so rebinding from thrust_allocator<U> can drift away from the original RMM resource here.

🛠️ Proposed fix
   template <typename U>
   thrust_allocator(thrust_allocator<U> const& other)
-    : _mr(other.resource()), _stream{other.stream()}, _device{other._device}
+    : _mr(other.get_upstream_resource()), _stream{other.stream()}, _device{other._device}
   {
   }

Based on learnings: Memory resources must maintain PMR (polymorphic memory resource) compatibility; follow standard PMR interface contracts.
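
The general requirement behind this comment, that an allocator rebound to another value type should carry the same upstream state as the original, can be sketched as follows. `upstream_handle` and `rebindable_allocator` are hypothetical names; the sketch does not settle whether resource() or get_upstream_resource() is the right accessor in RMM, which depends on the actual class.

```cpp
#include <cassert>

// Hypothetical handle standing in for an upstream memory resource reference.
struct upstream_handle {
  int id;
};

template <typename T>
class rebindable_allocator {
 public:
  using value_type = T;

  explicit rebindable_allocator(upstream_handle mr) : _mr(mr) {}

  // Converting ("rebind") constructor: builds an allocator for T from an
  // allocator for U, copying the same upstream state through the public
  // accessor so both allocators refer to one resource.
  template <typename U>
  rebindable_allocator(rebindable_allocator<U> const& other)
    : _mr(other.get_upstream_resource())
  {
  }

  upstream_handle get_upstream_resource() const { return _mr; }

 private:
  upstream_handle _mr;
};
```

Containers rely on this kind of constructor when they rebind an allocator to an internal node or proxy type, so a mismatch between the original and the rebound state would show up as allocations going to a different resource.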

Suggested change
-RMM_EXEC_CHECK_DISABLE
-template <typename U>
-thrust_allocator(thrust_allocator<U> const& other)
-: _mr(other.resource()), _stream{other.stream()}, _device{other._device}
+RMM_EXEC_CHECK_DISABLE
+template <typename U>
+thrust_allocator(thrust_allocator<U> const& other)
+: _mr(other.get_upstream_resource()), _stream{other.stream()}, _device{other._device}
+{
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/rmm/mr/thrust_allocator_adaptor.hpp` around lines 122 - 125, The
templated rebind constructor thrust_allocator(thrust_allocator<U> const& other)
incorrectly initializes _mr from other.resource() instead of copying the
allocator's underlying PMR pointer; change the initialization to copy other._mr
(or call other.get_upstream_resource() / other._mr) so the rebound allocator
preserves the exact upstream RMM resource used by non-templated constructors and
avoid drifting to a different resource; update the initializer list for
thrust_allocator<U> to use other._mr (or other.get_upstream_resource()) while
keeping stream() and _device as-is.

@bdice
Collaborator Author

bdice commented Apr 1, 2026

/merge

@rapids-bot rapids-bot bot merged commit 6833f72 into rapidsai:main Apr 1, 2026
87 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in RMM Project Board Apr 1, 2026
@bdice bdice mentioned this pull request Apr 3, 2026
3 tasks
bdice added a commit to bdice/rmm that referenced this pull request Apr 4, 2026
bdice added a commit to bdice/rmm that referenced this pull request Apr 4, 2026

Labels

improvement: Improvement / enhancement to an existing function
non-breaking: Non-breaking change

Projects

Status: Done

2 participants