Merged
35 commits
bcfc5e7
adding a new nccl window buffer manager that goes with the lifetime o…
nv-lschneider Nov 13, 2025
a801e29
namespace for test fixing
nv-lschneider Nov 14, 2025
0a37759
linking stuff for tests
nv-lschneider Nov 14, 2025
71c5f27
moving NCCL helper
nv-lschneider Nov 14, 2025
3a00d39
using new NCCL util
nv-lschneider Nov 17, 2025
8f92df4
remove NCCLUBAllocator
nv-lschneider Nov 17, 2025
005b299
Changing default strategy to NCCL_SYMMETRIC
nv-lschneider Nov 18, 2025
869fe5b
cleaning up and adding tests before PR
nv-lschneider Nov 18, 2025
6faa5ae
fixing python tests
nv-lschneider Nov 18, 2025
0bdd976
fixing problems
nv-lschneider Nov 18, 2025
e5ab0e6
test hardening
nv-lschneider Nov 18, 2025
773c0e5
redesign tests
nv-lschneider Nov 18, 2025
ec851a9
one test after the other
nv-lschneider Nov 18, 2025
8f1f69a
one test after the other
nv-lschneider Nov 18, 2025
6ce2b69
asdf
nv-lschneider Nov 18, 2025
c91f468
updating test
nv-lschneider Nov 19, 2025
ac10eca
fixing test pickling
nv-lschneider Nov 19, 2025
8565156
wrapping
nv-lschneider Nov 19, 2025
2886a79
fix remaining tests
nv-lschneider Nov 19, 2025
1dc7b28
adding simple arithmetic to the test
nv-lschneider Nov 19, 2025
bc07a49
add AR to the test
nv-lschneider Nov 19, 2025
cb863ac
rename test sensibly
nv-lschneider Nov 19, 2025
1450363
addressing coderabbit
nv-lschneider Nov 19, 2025
684f623
more code rabbit comments
nv-lschneider Nov 19, 2025
ed59495
fixes
nv-lschneider Nov 19, 2025
d38942d
addressing review comments
nv-lschneider Nov 21, 2025
5096fe3
adding empirical model explanation
nv-lschneider Nov 21, 2025
a8b0b85
removing ncclWindowTensor python interface
nv-lschneider Nov 25, 2025
fe95686
registering the new test for CI
nv-lschneider Nov 25, 2025
e184326
removing ncclWindowTensor from build system
nv-lschneider Nov 25, 2025
0ee6a9b
querying MNNVL to determine if it is worth it to copy data
nv-lschneider Nov 25, 2025
6903bd8
fixes
nv-lschneider Nov 25, 2025
1677026
cosmetic fixes
nv-lschneider Nov 25, 2025
d91623f
fix functional to include NCCL_SYMMETRIC workspace, to fix test
nv-lschneider Dec 1, 2025
38a9c5d
treating NCCL_SYMMETRIC like NCCL
nv-lschneider Dec 2, 2025
5 changes: 2 additions & 3 deletions cpp/tensorrt_llm/common/customAllReduceUtils.h
@@ -81,7 +81,6 @@ inline AllReduceStrategyType SelectStrategyLP(size_t seq_len, size_t hidden_size
{
return AllReduceStrategyType::ONESHOT;
}
return AllReduceStrategyType::NCCL;
}

// use 1D vector to store the best strategy instead of a map for each sm version
@@ -143,15 +142,15 @@ inline AllReduceStrategyType selectStrategyLookUpTable(
sm_version = 100;
}

- // Check if the entry is out of bounds, otherwise return NCCL as fallback
+ // Check if the entry is out of bounds, otherwise return NCCL_SYMMETRIC as fallback
if (AllReduceBestStrategyTable.find(sm_version) == AllReduceBestStrategyTable.end()
|| tp_index >= AllReduceBestStrategyTable.at(sm_version).size()
|| fusion_op_index >= AllReduceBestStrategyTable.at(sm_version).at(tp_index).size()
|| hidden_size_index >= AllReduceBestStrategyTable.at(sm_version).at(tp_index).at(fusion_op_index).size()
|| num_token_index
>= AllReduceBestStrategyTable.at(sm_version).at(tp_index).at(fusion_op_index).at(hidden_size_index).size())
{
- return AllReduceStrategyType::NCCL;
+ return AllReduceStrategyType::NCCL_SYMMETRIC;
}

return static_cast<AllReduceStrategyType>(
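Note on the second hunk: the pattern it changes is a bounds-checked lookup into a nested table that falls back to a default strategy when any index is out of range. The sketch below illustrates that pattern only. The `Strategy` enum, the `Table` alias, the two-level table shape, and `lookupOrFallback` are hypothetical stand-ins, not the actual TensorRT-LLM types or the real layout of `AllReduceBestStrategyTable`, which has more levels.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for AllReduceStrategyType; only names seen in the diff are mirrored.
enum class Strategy
{
    NCCL,
    ONESHOT,
    TWOSHOT,
    NCCL_SYMMETRIC
};

// Hypothetical simplified table: sm_version -> [tp_index][num_token_index].
using Table = std::unordered_map<int, std::vector<std::vector<Strategy>>>;

// Returns the tuned entry if every index is in range, otherwise the fallback.
// The || chain short-circuits, so no level is indexed before its bound is checked.
inline Strategy lookupOrFallback(
    Table const& table, int smVersion, std::size_t tpIndex, std::size_t numTokenIndex)
{
    auto const it = table.find(smVersion);
    if (it == table.end() || tpIndex >= it->second.size()
        || numTokenIndex >= it->second[tpIndex].size())
    {
        // Assumed fallback, matching the value this PR switches to.
        return Strategy::NCCL_SYMMETRIC;
    }
    return it->second[tpIndex][numTokenIndex];
}
```

Looking the key up once with `find` and reusing the iterator avoids the repeated `at(sm_version)` calls seen in the diff. That is a stylistic choice in this sketch, not a change the PR makes.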