[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM class for pytorch backend #5752

Superjomn · 2025-07-04T07:16:09Z

PR title

Please write the PR title using the following template:

[JIRA ticket link/nvbug link/github issue link][fix/feat/doc/infra/...] <summary of this PR>

For example, a PR that adds a new cache manager feature tracked by Jira ticket TRTLLM-1000 would be titled:

[TRTLLM-1000][feat] Support a new feature about cache manager
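The title template above can be checked mechanically. The following is a minimal sketch, not the project's actual validation: the regular expression, the function name `parse_pr_title`, and the group names `ticket`, `kind`, and `summary` are all assumptions made for illustration. It only checks the overall shape of two bracketed fields followed by a summary.

```python
import re

# Hypothetical pattern for "[<ticket>][<type>] <summary>".
# This is an illustrative assumption, not the check the repository uses.
TITLE_RE = re.compile(
    r"^\[(?P<ticket>[^\[\]]+)\]"   # first bracket: ticket / bug / issue reference
    r"\[(?P<kind>[^\[\]]+)\]"      # second bracket: fix / feat / doc / infra / ...
    r"\s+(?P<summary>\S.*)$"       # free-form summary, must be non-empty
)


def parse_pr_title(title: str):
    """Return the three title fields as a dict, or None if the shape is wrong."""
    match = TITLE_RE.match(title.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_pr_title("[TRTLLM-1000][feat] Support a new feature about cache manager"))
    print(parse_pr_title("Support a new feature about cache manager"))
```

Under this sketch, the title of this PR would parse with `BREAKING CHANGE` as the second field and `refactor: unify KvCacheConfig in LLM class for pytorch backend` as the summary.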

Description

Please explain the issue and the solution in short.

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provides a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
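The subcommands and flags listed in the help text above map naturally onto a subcommand-style argument parser. The sketch below uses Python's standard `argparse` and `shlex` modules to parse a `/bot ...` comment. It is an assumption about how such a comment could be parsed, not the bot's real implementation, and the function names `build_parser` and `parse_bot_comment` are hypothetical.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented /bot subcommands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="Launch build/test pipelines.")
    # Boolean switches from the help text.
    for flag in (
        "--disable-fail-fast",
        "--skip-test",
        "--add-multi-gpu-test",
        "--only-multi-gpu-test",
        "--disable-multi-gpu-test",
        "--post-merge",
    ):
        run.add_argument(flag, action="store_true")
    # Options that take a quoted, comma-separated value.
    for opt in ("--stage-list", "--gpu-type", "--extra-stage"):
        run.add_argument(opt, default=None)

    sub.add_parser("kill", help="Kill all running builds for the pull request.")

    skip = sub.add_parser("skip", help="Skip testing for the latest commit.")
    skip.add_argument("--comment", required=True)  # the help text marks this as required

    sub.add_parser("reuse-pipeline", help="Reuse a previous pipeline.")
    return parser


def parse_bot_comment(comment: str) -> argparse.Namespace:
    """Parse a PR comment such as '/bot run --disable-fail-fast'."""
    tokens = shlex.split(comment)  # keeps quoted values like "A10-1, xxx" as one token
    if not tokens or tokens[0] != "/bot":
        raise ValueError("not a /bot command")
    return build_parser().parse_args(tokens[1:])


if __name__ == "__main__":
    args = parse_bot_comment('/bot run --disable-fail-fast --gpu-type "A30, H100_PCIe"')
    print(args.command, args.disable_fail_fast, args.gpu_type)
```

In this sketch a `skip` comment without `--comment` is rejected by `argparse` itself, which matches the help text's statement that the reason is required.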

Superjomn · 2025-07-04T09:30:31Z

/bot run --disable-fail-fast

tensorrt-cicd · 2025-07-04T09:36:04Z

PR_Github #10985 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-04T10:41:50Z

PR_Github #10985 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8114 completed with status: 'FAILURE'

Superjomn · 2025-07-06T00:06:26Z

/bot run

tensorrt-cicd · 2025-07-06T00:12:13Z

PR_Github #11041 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-06T01:18:50Z

PR_Github #11041 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8162 completed with status: 'FAILURE'

Superjomn · 2025-07-06T07:05:48Z

/bot run --disable-fail-fast

Superjomn · 2025-07-07T02:50:00Z

/bot run --disable-fail-fast

tensorrt-cicd · 2025-07-07T02:55:05Z

PR_Github #11089 [ run ] triggered by Bot

Superjomn · 2025-07-07T02:59:10Z

/bot kill

tensorrt-cicd · 2025-07-07T03:04:04Z

PR_Github #11091 [ kill ] triggered by Bot

tensorrt-cicd · 2025-07-07T03:04:05Z

PR_Github #11089 [ run ] completed with state ABORTED

tensorrt-cicd · 2025-07-07T03:04:36Z

PR_Github #11091 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit f9c7310

Superjomn · 2025-07-07T06:37:38Z

/bot run --disable-fail-fast

tensorrt-cicd · 2025-07-07T06:42:39Z

PR_Github #11112 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-07T09:49:46Z

PR_Github #11112 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8218 completed with status: 'FAILURE'

Superjomn · 2025-07-07T14:14:04Z

/bot run --disable-fail-fast

tensorrt-cicd · 2025-07-07T14:19:15Z

PR_Github #11158 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-07T17:35:24Z

PR_Github #11158 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8251 completed with status: 'FAILURE'

Superjomn · 2025-07-16T01:59:41Z

/bot run --only-multi-gpu-test

tensorrt-cicd · 2025-07-16T02:05:17Z

PR_Github #11990 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-16T02:26:20Z

PR_Github #11990 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #8901 (Partly Tested) completed with status: 'FAILURE'

Superjomn · 2025-07-16T03:12:43Z

/bot run --only-multi-gpu-test

tensorrt-cicd · 2025-07-16T03:18:18Z

PR_Github #12005 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-16T03:37:21Z

PR_Github #12005 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #8914 (Partly Tested) completed with status: 'FAILURE'

Superjomn · 2025-07-16T03:54:00Z

/bot run --only-multi-gpu-test

tensorrt-cicd · 2025-07-16T03:59:00Z

PR_Github #12017 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-16T06:13:02Z

PR_Github #12017 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8924 (Partly Tested) completed with status: 'SUCCESS'

Superjomn · 2025-07-16T06:24:40Z

/bot run

tensorrt-cicd · 2025-07-16T06:31:46Z

PR_Github #12034 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-16T08:42:46Z

PR_Github #12034 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8936 completed with status: 'SUCCESS'


Superjomn requested a review from a team as a code owner July 4, 2025 07:16

Superjomn changed the title ~~[BREAKING CHANGE]: unify KvCacheConfig in LLM class for pytorch backend~~ [TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM class for pytorch backend Jul 4, 2025

Superjomn force-pushed the change-kvcache-config branch from 5df018b to d0f6cb0 Compare July 6, 2025 00:06

Superjomn force-pushed the change-kvcache-config branch 2 times, most recently from 6abe190 to bf0839b Compare July 6, 2025 07:05

Superjomn force-pushed the change-kvcache-config branch from bf0839b to f9c7310 Compare July 7, 2025 02:59

Superjomn requested review from QiJune and lucaslie July 7, 2025 02:59

Superjomn force-pushed the change-kvcache-config branch from f9c7310 to 67a8191 Compare July 7, 2025 14:12

FrankD412 requested changes Jul 7, 2025

tensorrt_llm/bench/benchmark/utils/general.py (outdated)

Superjomn force-pushed the change-kvcache-config branch from 67a8191 to aff5c11 Compare July 8, 2025 07:20

Superjomn force-pushed the change-kvcache-config branch 4 times, most recently from 9da8f13 to 0b82b35 Compare July 16, 2025 01:56

Superjomn enabled auto-merge (squash) July 16, 2025 02:00

Superjomn force-pushed the change-kvcache-config branch from 0b82b35 to d6175c0 Compare July 16, 2025 03:12

Superjomn added 3 commits July 16, 2025 11:53

init

58d46ff

Signed-off-by: Superjomn <[email protected]>

hide quant_config from PyT

459f157

Signed-off-by: Superjomn <[email protected]>

fix

cbaf202

Signed-off-by: Superjomn <[email protected]>

Superjomn force-pushed the change-kvcache-config branch from d6175c0 to cbaf202 Compare July 16, 2025 03:53

Superjomn merged commit a02606a into NVIDIA:main Jul 16, 2025
3 checks passed

Superjomn deleted the change-kvcache-config branch July 16, 2025 08:54

evezhier pushed a commit to evezhier/TensorRT-LLM that referenced this pull request Jul 16, 2025

[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM c…

a56568a

…lass for pytorch backend (NVIDIA#5752) Signed-off-by: Superjomn <[email protected]>

FrankD412 mentioned this pull request Jul 16, 2025

[fix] Fixes KV Cache overrides in trtllm-bench #6103

Merged

yizhang-nv pushed a commit to yizhang-nv/TensorRT-LLM that referenced this pull request Jul 17, 2025

[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM c…

75e87f0

…lass for pytorch backend (NVIDIA#5752) Signed-off-by: Superjomn <[email protected]>

kaiyux mentioned this pull request Jul 21, 2025

Revert "[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig in LLM class for pytorch backend" #6208

Closed

kaiyux pushed a commit to kaiyux/TensorRT-LLM that referenced this pull request Jul 21, 2025

Revert "[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig …

d06a5c8

…in LLM class for pytorch backend (NVIDIA#5752)" This reverts commit a02606a.

kaiyux mentioned this pull request Jul 21, 2025

Revert "[TRTLLM-5530][BREAKING CHANGE] refactor: unify KvCacheConfig … #6209

Closed
