[refactor] Clean up drafter/resource manager creation logic #5805

mikeiovine · 2025-07-07T19:31:50Z

Description

The purpose of this small refactor is to prepare for the upcoming migration of EAGLE3/DRAFT_TARGET to the new Drafter interface.

We want all of the resource managers to be created before the Drafter because the drafter might rely on them. The current setup is confusing: the drafter depends on the spec resource manager, yet the drafter has to be created before the spec resource manager.

In this PR:

1. Move code around so that the Drafter is always created after all resource managers.
2. For user-provided spec decode, the resource manager must be provided via the SpecConfig.

Even though it cleans up our internal code, I realize that (2) makes the user-facing API fairly clunky for the ngram use case. I have a bit of logic in the UserProvidedSpecConfig to clean things up: if a drafter is provided and the drafter has a spec_resource_manager attribute, resource_manager will default to drafter.spec_resource_manager, so you don't need to specify it twice.
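A minimal sketch of the defaulting rule described above, assuming a dataclass-style config. `UserProvidedSpecConfigSketch` and `FakeNGramDrafter` are hypothetical stand-ins written for illustration and are not the classes in this PR; only the `drafter`, `resource_manager` and `spec_resource_manager` names come from the description.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UserProvidedSpecConfigSketch:
    """Hypothetical stand-in for the user-provided spec config."""

    drafter: Optional[Any] = None
    resource_manager: Optional[Any] = None

    def __post_init__(self) -> None:
        # Only fill in a default when the user did not pass a manager explicitly.
        if self.resource_manager is None and self.drafter is not None:
            # Drafters without the attribute simply leave the field as None.
            self.resource_manager = getattr(
                self.drafter, "spec_resource_manager", None)


class FakeNGramDrafter:
    """Hypothetical drafter that carries its own resource manager."""

    def __init__(self, manager: Any) -> None:
        self.spec_resource_manager = manager


manager = object()
# The manager is passed once, through the drafter, and picked up by the config.
config = UserProvidedSpecConfigSketch(drafter=FakeNGramDrafter(manager))
```

An explicitly passed `resource_manager` takes precedence in this sketch, so the default never overrides what the user asked for.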

Test Coverage

Existing tests.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

mikeiovine · 2025-07-07T19:31:55Z

/bot run

tensorrt-cicd · 2025-07-07T19:37:53Z

PR_Github #11177 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-07T22:16:32Z

PR_Github #11177 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8268 completed with status: 'FAILURE'

mikeiovine · 2025-07-08T13:48:19Z

/bot run

tensorrt-cicd · 2025-07-08T13:53:46Z

PR_Github #11311 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-09T01:55:18Z

PR_Github #11311 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8365 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

wili-65535

LGTM!
For a user-provided method, the user needs to provide both drafter and resource_manager in the spec_config for methods that need one (e.g. NGram), or only drafter if no resource_manager is needed (e.g. a drafter whose draft_token_ids is always [2,2,2,2]).
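The two cases in this comment can be sketched as follows. Everything here is illustrative: the `ConstantDrafter` and `PoolBackedDrafter` classes, the `propose` method name, and the dict-shaped configs are assumptions, not the Drafter interface or config classes from this PR.

```python
from typing import Dict, List


class ConstantDrafter:
    """Hypothetical drafter that always proposes the same draft tokens."""

    def __init__(self, draft_token_ids: List[int]) -> None:
        self.draft_token_ids = list(draft_token_ids)

    def propose(self) -> List[int]:
        # No per-request state is kept, so no resource manager is needed.
        return list(self.draft_token_ids)


class PoolBackedDrafter:
    """Hypothetical drafter that reads state owned by a resource manager."""

    def __init__(self, pool: Dict[int, List[int]]) -> None:
        self.pool = pool


# Case 1: only a drafter, because nothing has to be tracked between steps.
stateless_spec_config = {"drafter": ConstantDrafter([2, 2, 2, 2])}

# Case 2: drafter plus the manager it relies on (a plain dict stands in here).
pool: Dict[int, List[int]] = {}
stateful_spec_config = {
    "drafter": PoolBackedDrafter(pool),
    "resource_manager": pool,
}
```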

Copilot

Pull Request Overview

This refactor streamlines the creation order of speculative decoding components by ensuring resource managers are instantiated before drafters and by surfacing a resource_manager field in user-provided spec configs.

Add resource_manager field and defaulting logic to user-provided decoding/config classes.
Simplify get_spec_resource_manager and get_spec_drafter to use mode-based logic rather than passing the drafter.
Reorder creation in create_py_executor so resource managers are set up before drafters.
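As a rough illustration of the mode-based lookup and the new creation order, here is a self-contained sketch. The `SpecMode` enum, the `*_sketch` functions, the dict-based config and the `max_draft_len` key are assumptions made for this example; the real signatures in `utils.py` and `py_executor_creator.py` are not shown in this conversation.

```python
from enum import Enum, auto
from typing import Any, Dict, Optional, Tuple


class SpecMode(Enum):
    """Hypothetical subset of speculative decoding modes."""
    NGRAM = auto()
    USER_PROVIDED = auto()
    NONE = auto()


def get_spec_resource_manager_sketch(mode: SpecMode,
                                     spec_config: Dict[str, Any]) -> Optional[Any]:
    # The choice depends only on the mode and config, never on a drafter.
    if mode is SpecMode.NGRAM:
        return {"kind": "ngram_pool", "max_draft_len": spec_config["max_draft_len"]}
    if mode is SpecMode.USER_PROVIDED:
        return spec_config.get("resource_manager")
    return None


def get_spec_drafter_sketch(mode: SpecMode, spec_config: Dict[str, Any],
                            spec_resource_manager: Optional[Any]) -> Optional[Any]:
    # The drafter may use the manager because it already exists at this point.
    if mode is SpecMode.NGRAM:
        return {"kind": "ngram_drafter", "manager": spec_resource_manager}
    if mode is SpecMode.USER_PROVIDED:
        return spec_config.get("drafter")
    return None


def create_components(mode: SpecMode,
                      spec_config: Dict[str, Any]) -> Tuple[Dict[str, Any], Optional[Any]]:
    # Step 1: build every resource manager first.
    resources = {
        "spec_resource_manager": get_spec_resource_manager_sketch(mode, spec_config)
    }
    # Step 2: build the drafter last, handing it what step 1 produced.
    drafter = get_spec_drafter_sketch(
        mode, spec_config, resources["spec_resource_manager"])
    return resources, drafter


resources, drafter = create_components(SpecMode.NGRAM, {"max_draft_len": 4})
```

Because the manager lookup no longer takes a drafter argument, there is no circular ordering left to work around.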

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tensorrt_llm/llmapi/llm_args.py	Added `resource_manager` to `UserProvidedDecodingConfig` and updated instantiation.
tensorrt_llm/_torch/speculative/utils.py	Removed unused drafter param, split `get_spec_resource_manager` and `get_spec_drafter`.
tensorrt_llm/_torch/speculative/user_provided.py	Added `resource_manager` attribute and defaulting logic in `UserProvidedConfig`.
tensorrt_llm/_torch/speculative/ngram.py	Removed base-class init params; assign `spec_resource_manager` directly.
tensorrt_llm/_torch/speculative/drafter.py	Dropped the now-unnecessary `__init__` from `Drafter`.
tensorrt_llm/_torch/pyexecutor/py_executor_creator.py	Reordered resource manager creation before drafter instantiation.

Comments suppressed due to low confidence (3)

tensorrt_llm/_torch/speculative/utils.py:99

Add unit tests for the ngram branch in get_spec_resource_manager to verify that an NGramPoolManager is returned with the correct parameters.

    if spec_dec_mode.is_ngram():

tensorrt_llm/_torch/speculative/utils.py:101

Add unit tests for the user_provided branch in get_spec_resource_manager to ensure spec_config.resource_manager is returned when configured.

    if spec_dec_mode.is_user_provided():

tensorrt_llm/_torch/speculative/user_provided.py:23

Add tests for UserProvidedConfig.__post_init__ to verify that resource_manager correctly defaults to drafter.spec_resource_manager when available.

    resource_manager: Optional[BaseResourceManager] = None

mikeiovine · 2025-07-11T16:49:01Z

/bot run

tensorrt-cicd · 2025-07-11T16:54:57Z

PR_Github #11669 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-11T19:52:50Z

PR_Github #11669 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8641 completed with status: 'FAILURE'

mikeiovine · 2025-07-14T17:35:33Z

/bot run

tensorrt-cicd · 2025-07-14T17:40:50Z

PR_Github #11836 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-14T22:04:43Z

PR_Github #11836 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8772 completed with status: 'FAILURE'

mikeiovine · 2025-07-15T14:58:13Z

/bot run

tensorrt-cicd · 2025-07-15T15:03:39Z

PR_Github #11955 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-15T17:34:59Z

PR_Github #11955 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8872 completed with status: 'FAILURE'

mikeiovine · 2025-07-15T17:36:15Z

/bot run

tensorrt-cicd · 2025-07-15T17:41:20Z

PR_Github #11965 [ run ] triggered by Bot

tensorrt-cicd · 2025-07-15T21:45:57Z

PR_Github #11965 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8883 completed with status: 'SUCCESS'

mikeiovine requested review from Funatiq and wili-65535 July 7, 2025 19:31

mikeiovine requested review from a team as code owners July 7, 2025 19:31

mikeiovine requested review from achartier, byshiue and yuxianq July 7, 2025 19:31

Funatiq reviewed Jul 8, 2025

tensorrt_llm/_torch/speculative/utils.py (outdated, resolved)

tensorrt_llm/_torch/speculative/utils.py (resolved)

mikeiovine force-pushed the drafter-changes branch from 25197f8 to 39f1e6b Compare July 8, 2025 13:48

mikeiovine requested a review from Funatiq July 9, 2025 14:34

Funatiq approved these changes Jul 10, 2025

wili-65535 approved these changes Jul 10, 2025

Funatiq requested a review from Copilot July 10, 2025 13:39

Copilot AI reviewed Jul 10, 2025

tensorrt_llm/_torch/speculative/utils.py (resolved)

tensorrt_llm/_torch/speculative/ngram.py (outdated, resolved)

mikeiovine force-pushed the drafter-changes branch 2 times, most recently from 22990d0 to 0e0fd50 Compare July 11, 2025 16:48

[refactor] Clean up drafter/resource manager creation logic

3c6a609

Signed-off-by: Mike Iovine <[email protected]>

mikeiovine force-pushed the drafter-changes branch from 0e0fd50 to 3c6a609 Compare July 14, 2025 17:35

mikeiovine requested a review from a team as a code owner July 14, 2025 17:35

mikeiovine requested a review from syuoni July 14, 2025 17:35

mikeiovine mentioned this pull request Jul 14, 2025

[TRTLLM-6352][feat] Migrate EAGLE3 and draft/target speculation to Drafter #6007

Merged

syuoni approved these changes Jul 15, 2025

Merge branch 'main' into drafter-changes

8f7a4ee

schetlur-nv approved these changes Jul 16, 2025

schetlur-nv merged commit fa34cb7 into NVIDIA:main Jul 16, 2025
3 checks passed

yizhang-nv pushed a commit to yizhang-nv/TensorRT-LLM that referenced this pull request Jul 17, 2025

[refactor] Clean up drafter/resource manager creation logic (NVIDIA#5805)

f93f63f

Signed-off-by: Mike Iovine <[email protected]>

mikeiovine deleted the drafter-changes branch July 23, 2025 18:01

6 participants