Refactor the topk parallelization part for the routing kernels #5567

ChristinaZ · 2025-06-28T15:37:24Z

Refactor the topk part for the routing kernels in the MoE TrtLLMGen backend

Description

This is the first pull request (PR) for refactoring the routing kernels in the MoE TrtLLMGen backend.
In this PR, I moved the topK parallelization logic from cpp/tensorrt_llm/kernels/trtllmGenKernels/blockScaleMoe/RoutingKernel.cu to a new CUDA header file: cpp/tensorrt_llm/kernels/trtllmGenKernels/blockScaleMoe/RoutingKernelTopK.cuh.

I also adjusted the namespace layout to make future refactoring easier.
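
For readers unfamiliar with this kind of split, here is a rough host-side C++ sketch of the pattern described above: the top-K selection sits in its own nested namespace (standing in for the new header), and the routing code calls into it. Everything in it is an assumption made for illustration; the names routing_sketch, topk, selectTopK and routeToken are hypothetical and do not come from the PR, and the real code is CUDA device code that parallelizes the selection across threads instead of sorting on the host.

```cpp
// Hypothetical sketch only: these names are illustrative, not the TensorRT-LLM API.
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace routing_sketch
{
// Part that would live in a standalone top-K header.
namespace topk
{
// Returns the K highest (score, expertIndex) pairs, best first.
// Ties are broken toward the lower expert index.
template <int K>
std::array<std::pair<float, int>, K> selectTopK(std::vector<float> const& scores)
{
    assert(scores.size() >= static_cast<std::size_t>(K));
    std::vector<std::pair<float, int>> candidates;
    candidates.reserve(scores.size());
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        candidates.emplace_back(scores[i], static_cast<int>(i));
    }
    // Only the first K entries need to be ordered.
    std::partial_sort(candidates.begin(), candidates.begin() + K, candidates.end(),
        [](auto const& a, auto const& b)
        { return a.first > b.first || (a.first == b.first && a.second < b.second); });
    std::array<std::pair<float, int>, K> result;
    std::copy_n(candidates.begin(), K, result.begin());
    return result;
}
} // namespace topk

// Part that would stay in the routing source file and reuse the header.
template <int K>
std::array<int, K> routeToken(std::vector<float> const& expertScores)
{
    auto const best = topk::selectTopK<K>(expertScores);
    std::array<int, K> experts;
    for (int i = 0; i < K; ++i)
    {
        experts[i] = best[i].second;
    }
    return experts;
}
} // namespace routing_sketch
```

With this layout, any other routing variant could include the same top-K code without duplicating it, which is presumably the motivation for the move.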

Test Coverage

cd cpp/build
make -j$(nproc) google-tests
./tests/unit_tests/kernels/routingKernelsTest

pytest -k test_moe_fp4 tests/unittest/_torch/thop/test_moe.py

ChristinaZ · 2025-06-28T15:37:39Z

/bot run

tensorrt-cicd · 2025-06-28T15:42:47Z

PR_Github #10205 [ run ] triggered by Bot

ChristinaZ · 2025-06-28T16:31:47Z

/bot run

tensorrt-cicd · 2025-06-28T16:36:47Z

PR_Github #10209 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-28T16:36:49Z

PR_Github #10205 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #7534 completed with status: 'FAILURE'

tensorrt-cicd · 2025-06-28T18:48:48Z

PR_Github #10209 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7538 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

MatthiasKohl

LGTM, just added one minor comment/note

cpp/tensorrt_llm/kernels/trtllmGenKernels/blockScaleMoe/RoutingKernel.h

ChristinaZ · 2025-07-06T09:42:55Z

/bot run

Signed-off-by: Christina Zhang <[email protected]>

ChristinaZ · 2025-07-06T14:05:31Z

/bot kill

tensorrt-cicd · 2025-07-06T14:11:02Z

PR_Github #11055 [ kill ] triggered by Bot

tensorrt-cicd · 2025-07-06T14:11:03Z

PR_Github #11055 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit 70dcb88

ChristinaZ · 2025-07-07T03:19:58Z

/bot run

tensorrt-cicd · 2025-07-07T03:24:50Z

PR_Github #11096 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-07T06:59:51Z

PR_Github #11096 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8203 completed with status: 'SUCCESS'

ChristinaZ requested review from MatthiasKohl, byshiue and nekorobov June 28, 2025 15:37

ChristinaZ force-pushed the refactor_topK_in_routing_trtllmgen branch from 9e0bf1e to 224e42f on June 28, 2025 16:31

MatthiasKohl approved these changes Jul 1, 2025

ChristinaZ force-pushed the refactor_topK_in_routing_trtllmgen branch from 224e42f to 68ddd58 on July 6, 2025 09:42

Refactor the topk parallelization

70dcb88

Signed-off-by: Christina Zhang <[email protected]>

ChristinaZ force-pushed the refactor_topK_in_routing_trtllmgen branch from 68ddd58 to 70dcb88 on July 6, 2025 10:10

byshiue approved these changes Jul 7, 2025

byshiue merged commit 12d8c7d into NVIDIA:main Jul 7, 2025
3 checks passed

zhou-yuxin pushed a commit to zhou-yuxin/TensorRT-LLM that referenced this pull request Jul 15, 2025

Refactor the topk parallelization part for the routing kernels (NVIDI…

f4dc29e

…A#5567) Signed-off-by: Christina Zhang <[email protected]> Signed-off-by: Yuxin <[email protected]>

4 participants