NIXL EP: Use VMM API for device memory allocation. #1415
ofirfarjun7 wants to merge 29 commits into ai-dynamo:main from
Conversation
👋 Hi ofirfarjun7! Thank you for contributing to ai-dynamo/nixl. Your PR reviewers will review your contribution then trigger the CI to test your changes. 🚀
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the CodeRabbit settings, and reviews can be managed with the usual review commands.
📝 Walkthrough: Adds CUDA Driver VMM-backed allocation support.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/device/ep/csrc/nixl_ep.hpp`:
- Around line 66-77: The two calls to cuDeviceGetAttribute (checking
CU_DEVICE_ATTRIBUTE_GPU_DIRECT_RDMA_WITH_CUDA_VMM_SUPPORTED and
CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED) do not check their CUresult
return values; update the code around the variables rdma_vmm_supported and
fabric_supported to capture the CUresult, test it against CUDA_SUCCESS, and on
failure throw or log a runtime_error that includes the cuGetErrorString result
and context (which attribute failed and for which device); ensure you only rely
on rdma_vmm_supported/fabric_supported after the call succeeds so you don't act
on zero-initialized values.
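The error-check pattern this comment asks for can be sketched as follows. This is a minimal illustration, not the repository's code: the CUDA types are stubbed inline so the snippet is self-contained, and in the real code `CUresult` and the error text would come from `<cuda.h>` and `cuGetErrorString()`.

```cpp
#include <stdexcept>
#include <string>

// Stubs standing in for the CUDA driver types (illustration only; the real
// definitions come from <cuda.h>).
enum CUresult { CUDA_SUCCESS = 0, CUDA_ERROR_INVALID_DEVICE = 101 };

static const char *cu_error_name(CUresult r) {
    return r == CUDA_SUCCESS ? "CUDA_SUCCESS" : "CUDA_ERROR_INVALID_DEVICE";
}

// Wrap each driver call so a failed attribute query throws with context,
// instead of the caller acting on zero-initialized rdma_vmm_supported /
// fabric_supported values.
static void check_cu(CUresult res, const std::string &what) {
    if (res != CUDA_SUCCESS) {
        throw std::runtime_error(what + " failed: " + cu_error_name(res));
    }
}
```

In the real code each `cuDeviceGetAttribute()` call would be wrapped, e.g. `check_cu(cuDeviceGetAttribute(&rdma_vmm_supported, ...), "RDMA-with-VMM attribute query for device N");`, so the flags are only read after the query is known to have succeeded.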
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8bc85385-7d61-405a-90d0-e86c5ca8956c
📒 Files selected for processing (2): examples/device/ep/csrc/nixl_ep.cpp, examples/device/ep/csrc/nixl_ep.hpp
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/device/ep/csrc/nixl_ep.hpp`:
- Around line 102-110: The destructor ~cuda_allocator() currently unmaps and
releases VMM state without waiting for GPU work; fix by calling
cudaDeviceSynchronize() at the start of ~cuda_allocator() before any
cuMemUnmap/cuMemAddressFree/cuMemRelease calls so all in-flight
kernels/transfers are fenced; additionally, ensure allocator creation paths
cannot bypass that fence on exception by either making explicitly_destroy
default to false or wrapping allocator construction in the init paths
(_nixl_agent_init(), _nixl_ep_init(), or any init() that creates the allocator)
with a try/catch that calls cudaDeviceSynchronize() before rethrowing so
stack-unwound destructors run safely; keep references to the methods destroy() and
~cuda_allocator() when making changes.
- Around line 56-64: The allocator currently queries the ambient CUDA context
via cuCtxGetDevice() which is unsafe; change cuda_allocator to accept an
explicit CUdevice (or device_id) parameter, set the context explicitly inside
the constructor using cuCtxSetCurrent() and handle errors (throw on failure),
then update Buffer::init call sites to pass the tracked device_id into
cuda_allocator so allocations are bound to the correct device regardless of
external context switches; ensure you remove the cuCtxGetDevice() usage in
cuda_allocator and propagate failures with clear runtime_error messages.
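The destructor-fencing order from the first comment above can be sketched like this. The CUDA calls are stubbed (names prefixed `fake_`) purely to make the ordering observable; the real code would call `cudaDeviceSynchronize()` and then `cuMemUnmap`/`cuMemAddressFree`/`cuMemRelease`.

```cpp
#include <stdexcept>

// Stubs standing in for the CUDA runtime/driver calls (illustration only).
static bool g_device_fenced = false;
static void fake_cudaDeviceSynchronize() { g_device_fenced = true; }
static void fake_cuMemUnmap() {
    if (!g_device_fenced) throw std::runtime_error("unmap before fence");
}

// Destructor ordering suggested above: fence all in-flight GPU work first,
// then tear down the VMM mappings.
struct cuda_allocator {
    ~cuda_allocator() {
        fake_cudaDeviceSynchronize(); // fence in-flight kernels/transfers
        fake_cuMemUnmap();            // then unmap / free address / release
    }
};
```

Putting the fence first inside the destructor also covers the stack-unwinding paths the comment mentions, since a destructor invoked during exception propagation runs the same body.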
Run ID: d14696d0-b771-4297-bed3-a7596c6c4643
📒 Files selected for processing (1): examples/device/ep/csrc/nixl_ep.hpp
examples/device/ep/csrc/nixl_ep.hpp (Outdated)

        throw std::runtime_error("Failed to get CUDA allocation granularity");
    }

    init_vmm(size, device, prop, granularity);
This seems to be the only thing we need to do per allocation, which makes me believe we don't need this cuda_allocator class; we can just verify once that we can allocate using VMM, get the granularity, etc., and then call init_vmm directly.
Can change it. Do you want to get rid of the class, or make it static?
/build
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/device/ep/csrc/nixl_ep.cpp`:
- Around line 70-127: The static cuda_alloc_ctx ctx binds device-specific fields
(prop.location.id, granularity, fabric support) to whichever device was active
at first cuCtxGetDevice() call, causing wrong-device VMM allocations; change
initialization so driver/version checks remain global but device-specific
queries (cuCtxGetDevice(), cuDeviceGetAttribute(),
cuMemGetAllocationGranularity()) are performed per-call or cached per-device
(keyed by device ID) instead of in the static cuda_alloc_ctx constructor—either
remove static cuda_alloc_ctx ctx and build a ctx per vmm_init()/allocation (with
a device-ID cache), or split cuda_alloc_ctx into a static global verifier and a
per-device struct populated on each allocation using the current context.
Run ID: 120c00f8-eacc-4630-9658-e92390364a9c
📒 Files selected for processing (2): examples/device/ep/csrc/nixl_ep.cpp, examples/device/ep/csrc/nixl_ep.hpp
/build
Actionable comments posted: 3
♻️ Duplicate comments (1)
examples/device/ep/csrc/nixl_ep.cpp (1)
70-121: ⚠️ Potential issue | 🟠 Major — Do not cache device-specific VMM state in a function-local static.
static cuda_alloc_ctx ctx(device); is initialized only on the first vmm_init() call, so every later allocation reuses that first device's prop.location.id, granularity, and fallback decision. In a multi-GPU process, buffers allocated on GPU 1 can end up using GPU 0's VMM properties, which defeats the multi-device support this change is introducing.
Suggested direction:
- static cuda_alloc_ctx ctx(device);
+ const cuda_alloc_ctx ctx(device);
If the repeated driver/version probe is a concern, keep that part in a separate one-time helper and build/cache the device-specific state per CUdevice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/device/ep/csrc/nixl_ep.cpp` around lines 70 - 121, The device-specific VMM context is incorrectly cached in a function-local static (static cuda_alloc_ctx ctx(device);) causing all subsequent calls to reuse the first device's prop.location.id, granularity and fallback decision; change this by removing the function-local static and either (a) create a per-call cuda_alloc_ctx instance (e.g., cuda_alloc_ctx ctx(device);) so each device is probed correctly, or (b) implement a per-device cache keyed by CUdevice (e.g., std::unordered_map<CUdevice,cuda_alloc_ctx>) and look up/create the cuda_alloc_ctx for the specific device, while extracting any global-only driver/version probe into a separate one-time helper function to avoid repeated work.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/device/ep/csrc/nixl_ep.cpp`:
- Around line 124-129: The code incorrectly passes a CUdeviceptr* to cudaMalloc;
change the allocation to use a temporary void* (e.g., void* tmp = nullptr), call
cudaMalloc(&tmp, size), check the return, and then assign region.ptr =
reinterpret_cast<CUdeviceptr>(tmp) (or static_cast if appropriate) so that
vmm_region.region.ptr receives the allocated device pointer without violating
the CUDA Runtime API contract.
- Around line 101-104: Replace the throw when rdma_vmm_supported is false with
an early return so the function can continue to the existing fallback path (the
later check of fabric_supported that falls back to cudaMalloc); specifically,
remove the std::runtime_error throw and return (keeping the function's normal
flow) when rdma_vmm_supported == false to match the behavior used for the CUDA
version and fabric support checks and allow the cudaMalloc fallback to execute.
In `@examples/device/ep/csrc/nixl_ep.hpp`:
- Around line 53-58: The vmm_region fields are left uninitialized causing
vmm_free() to operate on garbage values; update the vmm_region definition so its
members are zero-initialized by default (e.g., initialize CUdeviceptr ptr,
size_t size, and CUmemGenericAllocationHandle handle to zero or provide a
default ctor that sets them to 0) so that Buffer's members (m_rdma_alloc,
m_mask_alloc, m_sync_alloc, m_sync_count_alloc, m_workspace_alloc) are safe if
Buffer::~Buffer()/destroy() runs before init(); ensure the guard in vmm_free()
will reliably detect an unused region.
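The zero-initialization the comment asks for can be sketched as below. The CUDA handle types are stubbed as integer aliases so the snippet stands alone; `vmm_free()` here only demonstrates the guard, not the real unmap/release sequence.

```cpp
#include <cstddef>
#include <cstdint>

// Stubs for the CUDA driver types used by the real vmm_region (illustration).
using CUdeviceptr = std::uintptr_t;
using CUmemGenericAllocationHandle = unsigned long long;

// Zero-initialized by default, so a Buffer destroyed before init() hands
// vmm_free() an all-zero region it can safely skip.
struct vmm_region {
    CUdeviceptr ptr = 0;
    size_t size = 0;
    CUmemGenericAllocationHandle handle = 0;
};

// Guard: an unused region is detected reliably and no unmap/release runs.
static bool vmm_free(vmm_region &r) {
    if (r.ptr == 0 && r.size == 0) {
        return false; // nothing was allocated; nothing to tear down
    }
    // ... cuMemUnmap / cuMemAddressFree / cuMemRelease in the real code ...
    r = vmm_region{};
    return true;
}
```

With in-class default initializers, members such as m_rdma_alloc and m_workspace_alloc are safe to destroy even when Buffer::~Buffer() runs before init().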
---
Duplicate comments:
In `@examples/device/ep/csrc/nixl_ep.cpp`:
- Around line 70-121: The device-specific VMM context is incorrectly cached in a
function-local static (static cuda_alloc_ctx ctx(device);) causing all
subsequent calls to reuse the first device's prop.location.id, granularity and
fallback decision; change this by removing the function-local static and either
(a) create a per-call cuda_alloc_ctx instance (e.g., cuda_alloc_ctx
ctx(device);) so each device is probed correctly, or (b) implement a per-device
cache keyed by CUdevice (e.g., std::unordered_map<CUdevice,cuda_alloc_ctx>) and
look up/create the cuda_alloc_ctx for the specific device, while extracting any
global-only driver/version probe into a separate one-time helper function to
avoid repeated work.
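Option (b) from the comment above, a per-device cache keyed by CUdevice, can be sketched as follows. `CUdevice` is stubbed as an int and `probe_device()` is a hypothetical stand-in for the real `cuDeviceGetAttribute()` / `cuMemGetAllocationGranularity()` calls.

```cpp
#include <cstddef>
#include <unordered_map>

using CUdevice = int; // stub; the real type comes from <cuda.h>

// Device-specific state that must not be frozen at the first device probed.
struct cuda_alloc_ctx {
    CUdevice device_id{};
    size_t granularity{};
    bool fabric_supported{};
};

// Hypothetical probe; the real code would query the driver for this
// specific device here (attributes, allocation granularity, fabric support).
static cuda_alloc_ctx probe_device(CUdevice dev) {
    return {dev, size_t{2} << 20, true};
}

// One cached entry per device, created on first use, so GPU 1 never
// inherits GPU 0's granularity or fallback decision.
static const cuda_alloc_ctx &get_ctx(CUdevice dev) {
    static std::unordered_map<CUdevice, cuda_alloc_ctx> cache;
    auto it = cache.find(dev);
    if (it == cache.end()) {
        it = cache.emplace(dev, probe_device(dev)).first;
    }
    return it->second;
}
```

Any global-only driver-version probe would stay in a separate one-time helper, so only the cheap per-device queries repeat.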
Run ID: cb93105b-51b9-4789-b86a-242031b99df3
📒 Files selected for processing (2): examples/device/ep/csrc/nixl_ep.cpp, examples/device/ep/csrc/nixl_ep.hpp
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/device/ep/csrc/nixl_ep.hpp`:
- Around line 53-58: The functions vmm_init and vmm_free currently have external
linkage but no header declarations; decide intended scope and fix accordingly:
if they are internal implementation details, add the static keyword to their
definitions in nixl_ep.cpp (i.e., mark vmm_init and vmm_free static) to restrict
linkage; if they are part of the API, add forward declarations for vmm_init and
vmm_free to the header alongside vmm_region so callers can see the prototypes
and linkage is explicit. Ensure the chosen change is applied consistently for
both functions.
Run ID: 7af83afb-0d9f-4ed5-bc2c-d16193eb3605
📒 Files selected for processing (1): examples/device/ep/csrc/nixl_ep.hpp
size_t size_ = 0;
CUmemGenericAllocationHandle handle_ = 0;
bool is_cuda_malloc_ = false;
bool vmm_addr_reserved_ = false;
vmm_addr_reserved_ can be removed.
examples/device/ep/csrc/vmm.cpp (Outdated)

if (!ctx.fabric_supported) {
    size_ = size;
    is_cuda_malloc_ = true;
    if (cudaMalloc(reinterpret_cast<void **>(&ptr_), size) != cudaSuccess) {
cudaMalloc -> cuMemAlloc and cudaFree -> cuMemFree, to avoid the cast; then #include <cuda_runtime.h> can be removed from vmm.hpp.
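The point of the suggestion can be sketched as below: the driver API's `cuMemAlloc` takes a `CUdeviceptr *` directly, so no `reinterpret_cast` is needed. The driver types and the allocator itself are stubbed (`fake_cuMemAlloc`, fake address) so the sketch runs without a GPU.

```cpp
#include <cstddef>

// Stubs mirroring the driver-API shapes (the real ones come from <cuda.h>).
using CUdeviceptr = unsigned long long;
enum CUresult { CUDA_SUCCESS = 0 };

// Stubbed cuMemAlloc: takes a CUdeviceptr* directly, unlike cudaMalloc's
// void** parameter, so the member pointer needs no cast.
static CUresult fake_cuMemAlloc(CUdeviceptr *dptr, size_t size) {
    *dptr = size ? 0x1000 : 0; // fake device address for illustration
    return CUDA_SUCCESS;
}

// Fallback path from the diff, rewritten per the suggestion: no cast, and
// no dependency on <cuda_runtime.h>.
static CUdeviceptr alloc_fallback(size_t size) {
    CUdeviceptr ptr = 0;
    if (fake_cuMemAlloc(&ptr, size) != CUDA_SUCCESS) {
        return 0;
    }
    return ptr;
}
```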
examples/device/ep/csrc/vmm.hpp (Outdated)

[[nodiscard]] size_t
size() const noexcept {
    return size_;
}

[[nodiscard]] CUmemGenericAllocationHandle
handle() const noexcept {
    return handle_;
}
These can be removed, as they are unused.
examples/device/ep/csrc/vmm.hpp (Outdated)

[[nodiscard]] CUdeviceptr
ptr() const noexcept {
    return ptr_;
Maybe do the reinterpret_cast here and return void *, as it is only used to get the pointer.
examples/device/ep/csrc/vmm.cpp (Outdated)

access_desc.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
access_desc.location.id = device;
access_desc.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
It seems this should also be done only once, in cuda_alloc_ctx.
examples/device/ep/csrc/vmm.cpp (Outdated)

prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
prop.location.id = dev;
In this implementation cuda_alloc_ctx is initialized only once. So you can remove CUdevice device from the parameter list, call cuCtxGetDevice, and throw if it returns an error (which means the device should be set before constructing a vmm_region). Also remove const CUdevice cu_dev = static_cast<CUdevice>(device_id); from nixl_ep.cpp.
if (size == 0) {
    throw std::invalid_argument("vmm_region: size must be non-zero");
}

Is it really needed? I guess it can be removed.

I think cudaMalloc returns success even for size == 0 (and nullptr for the pointer), so we will need to check it if we remove this.

But it's safe to call cudaFree for 0 / NULL / nullptr. So I think it's not really an exceptional case for this class.

I see this class as an abstraction; don't you think we should hint to the user if they call the ctor with zero?

In my opinion, no, we shouldn't. Since this is an abstraction over various methods of memory allocation, it is in general not forbidden to pass a zero size to allocators.

AFAIK cuMemCreate will fail if we pass zero. If that's true, don't you think it is strange that vmm will fail with zero on some systems and not fail on others? I don't think the user should care which API vmm used; it should get the same behavior.
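One way to resolve this thread, sketched under assumptions (the `region`/`make_region` names are hypothetical, not the PR's API): normalize size == 0 before choosing a backend, so cuMemCreate's rejection of zero never leaks through on systems where the plain-allocation path would have accepted it.

```cpp
#include <cstddef>

// Hypothetical region type standing in for vmm_region (illustration only).
struct region {
    unsigned long long ptr = 0; // stands in for CUdeviceptr
    size_t size = 0;
};

// Treat size == 0 as a valid empty region for every backend, giving the
// caller identical behavior regardless of which allocation API is used.
static region make_region(size_t size, bool use_vmm) {
    region r;
    if (size == 0) {
        return r; // empty region; freeing it is a no-op on either backend
    }
    r.size = size;
    r.ptr = 0x2000; // placeholder for the real cuMemCreate/cuMemAlloc result
    (void)use_vmm;  // backend selection elided in this sketch
    return r;
}
```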
examples/device/ep/csrc/vmm.cpp (Outdated)

@@ -0,0 +1,151 @@
/*
 * SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Suggested change:
- * SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
examples/device/ep/csrc/vmm.hpp (Outdated)

@@ -0,0 +1,62 @@
/*
 * SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Suggested change:
- * SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
@@ -0,0 +1,48 @@
/*
 * SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Suggested change:
- * SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#include <cuda_runtime.h>
#include <cstddef>

class vmm_region {

Please use the existing namespace nixl_ep.
What?
Use VMM API for device memory allocation in nixl_ep
Why?
To support multi-node NVLink.
How?