Skip to content

Conversation

@kaiyux
Copy link
Member

@kaiyux kaiyux commented Dec 2, 2025

Summary by CodeRabbit

Documentation

  • Documentation
    • Enhanced documentation with comprehensive Table of Contents for improved navigation
    • Added new Preparation section with step-by-step service setup guidance
    • Introduced dedicated Benchmarking section with updated examples and metrics
    • Added multimodal serving and benchmarking examples with configuration guidance
    • Improved cross-references and navigation to benchmarking resources
    • Standardized formatting and metric presentation throughout

✏️ Tip: You can customize this high-level summary in your review settings.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md
and the scripts/test_to_stage_mapping.py helper.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

@kaiyux kaiyux requested a review from a team as a code owner December 2, 2025 06:13
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

Update

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

Update

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
@kaiyux kaiyux force-pushed the user/kaiyu/update_doc branch from b02113c to 37b9665 Compare December 2, 2025 06:14
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

📝 Walkthrough

Walkthrough

Two documentation files were updated to improve benchmarking guidance structure. The changes add comprehensive tables of contents, introduce new preparation and multimodal serving sections, reorganize benchmarking instructions, standardize formatting, and enhance cross-references.

Changes

Cohort / File(s) Change Summary
Benchmarking Documentation Restructuring
docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md, docs/source/developer-guide/perf-benchmarking.md
Added comprehensive tables of contents and reorganized sections. Introduced new "Preparation" section with NGC container and trtllm-serve startup guidance. Renamed benchmark section to focus on tensorrt_llm.serve.scripts.benchmark_serving. Added dedicated "About extra_llm_api_options" section with YAML examples. Introduced explicit multimodal serving and benchmarking sections with example outputs and metrics. Standardized prose, code block formatting, and example output presentation. Added cross-references and navigation scaffolding.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

  • Documentation-only changes with homogeneous restructuring patterns across both files
  • Mostly additive content (new sections, tables of contents) and reorganization
  • Formatting standardization is repetitive and straightforward to verify
  • No code logic or functionality changes to evaluate

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is empty except for the template structure. Required sections like 'Description' and 'Test Coverage' are not filled in with actual content. Fill in the 'Description' section with a clear explanation of why the documentation was updated. Complete the 'Test Coverage' section documenting how the changes were validated.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and specifically identifies the main change: updating online benchmarking documentation for TensorRT-LLM.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md (1)

3-3: Fix spelling: "compatiable" → "compatible".

Line 3 uses the incorrect spelling "compatiable" when referring to OpenAI compatibility.

-TensorRT LLM provides the OpenAI-compatiable API via `trtllm-serve` command.
+TensorRT LLM provides the OpenAI-compatible API via `trtllm-serve` command.

Also applies to: 3-3

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 307afac and 37b9665.

📒 Files selected for processing (2)
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md (5 hunks)
  • docs/source/developer-guide/perf-benchmarking.md (2 hunks)
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: venkywonka
Repo: NVIDIA/TensorRT-LLM PR: 6029
File: .github/pull_request_template.md:45-53
Timestamp: 2025-08-27T17:50:13.264Z
Learning: For PR templates in TensorRT-LLM, avoid suggesting changes that would increase developer overhead, such as converting plain bullets to mandatory checkboxes. The team prefers guidance-style bullets that don't require explicit interaction to reduce friction in the PR creation process.
Learnt from: farshadghodsian
Repo: NVIDIA/TensorRT-LLM PR: 7101
File: docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md:36-36
Timestamp: 2025-08-21T00:16:56.457Z
Learning: TensorRT-LLM container release tags in documentation should only reference published NGC container images. The README badge version may be ahead of the actual published container versions.
📚 Learning: 2025-08-21T00:16:56.457Z
Learnt from: farshadghodsian
Repo: NVIDIA/TensorRT-LLM PR: 7101
File: docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md:36-36
Timestamp: 2025-08-21T00:16:56.457Z
Learning: TensorRT-LLM container release tags in documentation should only reference published NGC container images. The README badge version may be ahead of the actual published container versions.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
Repo: NVIDIA/TensorRT-LLM PR: 6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-09-09T09:40:45.658Z
Learnt from: fredricz-20070104
Repo: NVIDIA/TensorRT-LLM PR: 7645
File: tests/integration/test_lists/qa/llm_function_core.txt:648-648
Timestamp: 2025-09-09T09:40:45.658Z
Learning: In TensorRT-LLM test lists, it's common and intentional for the same test to appear in multiple test list files when they serve different purposes (e.g., llm_function_core.txt for comprehensive core functionality testing and llm_function_core_sanity.txt for quick sanity checks). This duplication allows tests to be run in different testing contexts.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-08-11T20:09:24.389Z
Learnt from: achartier
Repo: NVIDIA/TensorRT-LLM PR: 6763
File: tests/integration/defs/triton_server/conftest.py:16-22
Timestamp: 2025-08-11T20:09:24.389Z
Learning: In the TensorRT-LLM test infrastructure, the team prefers simple, direct solutions (like hard-coding directory traversal counts) over more complex but robust approaches when dealing with stable directory structures. They accept the maintenance cost of updating tests if the layout changes.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-08-06T13:58:07.506Z
Learnt from: galagam
Repo: NVIDIA/TensorRT-LLM PR: 6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-11-27T09:23:18.742Z
Learnt from: fredricz-20070104
Repo: NVIDIA/TensorRT-LLM PR: 9511
File: tests/integration/defs/examples/serve/test_serve.py:136-186
Timestamp: 2025-11-27T09:23:18.742Z
Learning: In TensorRT-LLM testing, when adding test cases based on RCCA commands, the command format should be copied exactly as it appears in the RCCA case, even if it differs from existing tests. For example, some RCCA commands for trtllm-serve may omit the "serve" subcommand while others include it.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-08-18T08:42:02.640Z
Learnt from: samuellees
Repo: NVIDIA/TensorRT-LLM PR: 6974
File: tensorrt_llm/serve/scripts/benchmark_dataset.py:558-566
Timestamp: 2025-08-18T08:42:02.640Z
Learning: In TensorRT-LLM's RandomDataset (tensorrt_llm/serve/scripts/benchmark_dataset.py), when using --random-token-ids option, sequence length accuracy is prioritized over semantic correctness for benchmarking purposes. The encode/decode operations should use skip_special_tokens=True and add_special_tokens=False to ensure exact target token lengths.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-08-26T09:49:04.956Z
Learnt from: pengbowang-nv
Repo: NVIDIA/TensorRT-LLM PR: 7192
File: tests/integration/test_lists/test-db/l0_dgx_b200.yml:56-72
Timestamp: 2025-08-26T09:49:04.956Z
Learning: In TensorRT-LLM test configuration files, the test scheduling system handles wildcard matching with special rules that prevent duplicate test execution even when the same tests appear in multiple yaml files with overlapping GPU wildcards (e.g., "*b200*" and "*gb200*").

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
📚 Learning: 2025-08-09T02:04:49.623Z
Learnt from: Fridah-nv
Repo: NVIDIA/TensorRT-LLM PR: 6760
File: tensorrt_llm/_torch/auto_deploy/models/quant_config_reader.py:81-98
Timestamp: 2025-08-09T02:04:49.623Z
Learning: In TensorRT-LLM's auto_deploy module, torch.dtype values in configuration dictionaries must be stored as string representations (e.g., "float16" instead of torch.float16) because OmegaConf.merge does not support torch.dtype types. These string representations are converted to actual torch.dtype objects in downstream code.

Applied to files:

  • docs/source/developer-guide/perf-benchmarking.md
📚 Learning: 2025-08-13T16:20:37.987Z
Learnt from: dcampora
Repo: NVIDIA/TensorRT-LLM PR: 6867
File: tensorrt_llm/_torch/pyexecutor/sampler.py:67-72
Timestamp: 2025-08-13T16:20:37.987Z
Learning: In TensorRT-LLM sampler code, performance is prioritized over additional validation checks. The beam_width helper method intentionally returns the first request's beam_width without validating consistency across all requests to avoid performance overhead from iterating through the entire batch.

Applied to files:

  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-08-01T15:14:45.673Z
Learnt from: yibinl-nvidia
Repo: NVIDIA/TensorRT-LLM PR: 6506
File: examples/models/core/mixtral/requirements.txt:3-3
Timestamp: 2025-08-01T15:14:45.673Z
Learning: In TensorRT-LLM, examples directory can have different dependency versions than the root requirements.txt file. Version conflicts between root and examples dependencies are acceptable because examples are designed to be standalone and self-contained.

Applied to files:

  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-09-23T14:58:05.372Z
Learnt from: nv-lschneider
Repo: NVIDIA/TensorRT-LLM PR: 7910
File: cpp/tensorrt_llm/kernels/nccl_device/config.cu:42-49
Timestamp: 2025-09-23T14:58:05.372Z
Learning: In TensorRT-LLM NCCL device kernels (cpp/tensorrt_llm/kernels/nccl_device/), the token partitioning intentionally uses ceil-like distribution (same token_per_rank for all ranks) to ensure all ranks launch the same number of blocks. This is required for optimal NCCL device API barrier performance, even though it may launch extra blocks for non-existent tokens on later ranks. Runtime bounds checking in the kernel (blockID validation) handles the overshoot cases.

Applied to files:

  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
📚 Learning: 2025-08-20T07:43:36.447Z
Learnt from: ChristinaZ
Repo: NVIDIA/TensorRT-LLM PR: 7068
File: cpp/tensorrt_llm/kernels/moeTopKFuncs.cuh:169-172
Timestamp: 2025-08-20T07:43:36.447Z
Learning: In TensorRT-LLM MOE kernels, when processing up to 128 experts across 32 threads, each thread handles at most 4 experts (N < 5 constraint), where N represents candidates per thread rather than total system capacity.

Applied to files:

  • docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md
🪛 LanguageTool
docs/source/developer-guide/perf-benchmarking.md

[grammar] ~481-~481: Ensure spelling is correct
Context: ...rking TensorRT LLM provides the OpenAI-compatiable API via trtllm-serve command, and `te...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~481-~481: Ensure spelling is correct
Context: ... measures the performance of the OpenAI-compatiable server launched by trtllm-serve. To ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)
docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md

24-24: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


26-26: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)


185-185: Multiple headings with the same content

(MD024, no-duplicate-heading)

🔇 Additional comments (2)
docs/source/commands/trtllm-serve/run-benchmark-with-trtllm-serve.md (1)

48-48: Verify NGC container image tag placeholder.

Line 48 uses a placeholder container tag nvcr.io/nvidia/tensorrt-llm/release:x.y.z. Ensure this is intentional documentation guidance or should be replaced with a specific published NGC container version. Based on learnings, TensorRT-LLM documentation should reference only published NGC container images.

Confirm whether x.y.z is intentional as a template for users to substitute, or if a specific version (e.g., latest, 1.1.0, etc.) should be documented instead.

Also applies to: 48-48

docs/source/developer-guide/perf-benchmarking.md (1)

12-29: Approve Table of Contents structure.

The new TOC clearly organizes the benchmarking documentation with proper nested structure and distinct sections. No structural issues identified here.

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Copy link
Collaborator

@QiJune QiJune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kaiyux kaiyux enabled auto-merge (squash) December 2, 2025 07:16
@kaiyux
Copy link
Member Author

kaiyux commented Dec 2, 2025

/bot skip --comment "doc changes"

@tensorrt-cicd
Copy link
Collaborator

PR_Github #26552 [ skip ] triggered by Bot. Commit: 8dc16d6

@tensorrt-cicd
Copy link
Collaborator

PR_Github #26552 [ skip ] completed with state SUCCESS. Commit: 8dc16d6
Skipping testing for commit 8dc16d6

@kaiyux kaiyux merged commit 6f7804f into NVIDIA:release/1.1 Dec 2, 2025
6 of 7 checks passed
@kaiyux kaiyux deleted the user/kaiyu/update_doc branch December 2, 2025 09:09
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 12, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 12, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 13, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 13, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 13, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 14, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 15, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 15, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 16, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit to mikeiovine/TensorRT-LLM that referenced this pull request Dec 16, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
mikeiovine pushed a commit that referenced this pull request Dec 16, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
codego7250 pushed a commit to codego7250/TensorRT-LLM that referenced this pull request Dec 19, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
JunyiXu-nv pushed a commit to JunyiXu-nv/TensorRT-LLM that referenced this pull request Dec 30, 2025
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants