Commit 9367328

Superjomn authored and nv-lschneider committed
[https://nvbugs/5351244][fix] CHERRY-PICK test_mpi_session (#7501) (#7900)
Signed-off-by: Yan Chunwei <[email protected]>
Signed-off-by: Ludwig Schneider <[email protected]>

pre-commit changes
Signed-off-by: Ludwig Schneider <[email protected]>

clang formatting
Signed-off-by: Ludwig Schneider <[email protected]>

safe guarding NCCL 2.27 build
Signed-off-by: Ludwig Schneider <[email protected]>

fixing precommit formatting
Signed-off-by: Ludwig Schneider <[email protected]>

most of code rabbit comments
Signed-off-by: Ludwig Schneider <[email protected]>

adding missing semi-colon
Signed-off-by: Ludwig Schneider <[email protected]>

removing unused comment lines
Signed-off-by: Ludwig Schneider <[email protected]>

Clarifying the test on how to compare residual chunked and unchunked.
Signed-off-by: Ludwig Schneider <[email protected]>

fixing pre-commit
Signed-off-by: Ludwig Schneider <[email protected]>

fixing pre-commit
Signed-off-by: Ludwig Schneider <[email protected]>

fixing missing variable, rebase complete and tested
Signed-off-by: Ludwig Schneider <[email protected]>

using a grid stride loop with fewer blocks launched for large message sizes
Signed-off-by: Ludwig Schneider <[email protected]>

using functioning grid stride loop for NCCL_DEVICE. It helps with better performance at larger message sizes
Signed-off-by: Ludwig Schneider <[email protected]>

initial oneshot implementation
Signed-off-by: Ludwig Schneider <[email protected]>

minor tweaks to include one shot fixes
Signed-off-by: Ludwig Schneider <[email protected]>

enabling grid stride loop, but no perf benefit.
Signed-off-by: Ludwig Schneider <[email protected]>

addressing review feedback
Signed-off-by: Ludwig Schneider <[email protected]>

fix formatting
Signed-off-by: Ludwig Schneider <[email protected]>
1 parent 56bf9d0 commit 9367328

16 files changed, +1305 -925 lines changed

.codespellignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+commIter

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -1442,7 +1442,7 @@ repos:
         additional_dependencies:
           - tomli
         # add ignore words list
-        args: ["-L", "Mor,ans,thirdparty", "--skip", "ATTRIBUTIONS-*.md,*.svg", "--skip", "security_scanning/*"]
+        args: ["-L", "Mor,ans,thirdparty", "--skip", "ATTRIBUTIONS-*.md,*.svg", "--skip", "security_scanning/*", "-I", ".codespellignore"]
   - repo: https://github.com/astral-sh/ruff-pre-commit
     rev: v0.9.4
     hooks:
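
For reference, the hook arguments in the hunk above correspond roughly to the manual codespell invocation sketched below. It assumes codespell is installed locally and is run from the repository root; the pre-commit hook itself may pass additional file arguments that are not shown here. `-L` gives an inline list of words to ignore, `--skip` excludes matching paths, and `-I` names a file holding one ignored word per line (here `.codespellignore`, which this commit adds with the single entry `commIter`).

```shell
# Sketch only: mirrors the args list from .pre-commit-config.yaml.
# Assumes codespell is on PATH and the current directory is the repo root.
codespell \
  -L "Mor,ans,thirdparty" \
  --skip "ATTRIBUTIONS-*.md,*.svg" \
  --skip "security_scanning/*" \
  -I .codespellignore
```

Keeping project-specific identifiers such as `commIter` in a separate ignore file means the inline `-L` list does not need to grow each time a new identifier resembles a common misspelling.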
Lines changed: 15 additions & 21 deletions
@@ -1,34 +1,28 @@
-# CMakeLists.txt for nccl_device
-# This directory contains CUDA kernels and host launcher code
+# CMakeLists.txt for nccl_device This directory contains CUDA kernels and host
+# launcher code

 # Enable CUDA
 enable_language(CUDA)

 # Create CUDA library
-add_library(tensorrt_llm_nccl_device
-  config.cu
-)
+add_library(tensorrt_llm_nccl_device config.cu)

 # Set properties for the CUDA library
-set_target_properties(tensorrt_llm_nccl_device PROPERTIES
-  CUDA_STANDARD 17
-  CUDA_SEPARABLE_COMPILATION ON
-  POSITION_INDEPENDENT_CODE ON
-)
+set_target_properties(
+  tensorrt_llm_nccl_device
+  PROPERTIES CUDA_STANDARD 17 CUDA_SEPARABLE_COMPILATION ON
+  POSITION_INDEPENDENT_CODE ON)

 # Include directories
-target_include_directories(tensorrt_llm_nccl_device PUBLIC
-  ${CMAKE_CURRENT_SOURCE_DIR}
-  ${CMAKE_CURRENT_SOURCE_DIR}/../..
-)
+target_include_directories(
+  tensorrt_llm_nccl_device PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}
+  ${CMAKE_CURRENT_SOURCE_DIR}/../..)

 # Link libraries
-target_link_libraries(tensorrt_llm_nccl_device
-  tensorrt_llm_common
-)
+target_link_libraries(tensorrt_llm_nccl_device tensorrt_llm_common)

 # Install target
-install(TARGETS tensorrt_llm_nccl_device
-  LIBRARY DESTINATION lib
-  ARCHIVE DESTINATION lib
-)
+install(
+  TARGETS tensorrt_llm_nccl_device
+  LIBRARY DESTINATION lib
+  ARCHIVE DESTINATION lib)
