flashinfer/comm/trtllm_mnnvl_ar.py
Lines changed: 23 additions & 7 deletions
@@ -61,11 +61,11 @@ def __init__(
Args:
mapping: Mapping configuration containing rank info
- buffer_size_in_bytes: The size in bytes for each lamport buffer. The actual allocation size will be NUM_LAMPORT_BUFFERS * buffer_size_in_bytes.
+ buffer_size_in_bytes: The requested size in bytes for each lamport buffer. The actual allocation size may be larger due to alignment requirements. The actual usable size will be NUM_LAMPORT_BUFFERS * actual_buffer_size_per_lamport_buffer.
f"[MNNVL Allreduce] Actual allocated size: {allocated_size} bytes, Actual buffer size per lamport buffer: {self.buffer_size_bytes} bytes, total workspace: {self.workspace_size_bytes} bytes."
+ )
+
# We use FP32 for sentinel value regardless of the real dtype
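
Reviewer note: the updated docstring separates the requested per-buffer size from the size actually usable after alignment. The sketch below shows one way that arithmetic could work. It is not the code in this PR: the helper names `round_up` and `plan_lamport_workspace`, the `granularity` parameter, and the value chosen for `NUM_LAMPORT_BUFFERS` are assumptions made for illustration only. Only the three reported quantities mirror the names in the log message above.

```python
# Hypothetical sketch; not taken from trtllm_mnnvl_ar.py.
NUM_LAMPORT_BUFFERS = 3  # assumed value, for illustration only


def round_up(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def plan_lamport_workspace(requested_bytes: int, granularity: int):
    """Return (allocated_size, buffer_size_bytes, workspace_size_bytes).

    Assumes the allocator rounds the total request up to `granularity`,
    and that the allocation is then split evenly across the buffers.
    """
    # The total request may grow because of the alignment requirement.
    allocated_size = round_up(requested_bytes * NUM_LAMPORT_BUFFERS, granularity)
    # Each buffer gets an equal share of what was actually allocated.
    buffer_size_bytes = allocated_size // NUM_LAMPORT_BUFFERS
    # Usable workspace is the per-buffer size times the buffer count.
    workspace_size_bytes = buffer_size_bytes * NUM_LAMPORT_BUFFERS
    return allocated_size, buffer_size_bytes, workspace_size_bytes


allocated, per_buffer, workspace = plan_lamport_workspace(1000, 4096)
print(allocated, per_buffer, workspace)
```

Under these assumptions the per-buffer size can exceed the requested size, and the usable workspace can be slightly smaller than the allocation, which is why it helps for the log message to report all three values.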