Draft: [TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench #9250
Signed-off-by: Frank Di Natale <[email protected]>
📝 Walkthrough

This pull request adds dataset preparation infrastructure to the TensorRT LLM benchmarking suite, introducing CLI commands to generate synthetic datasets and prepare real HuggingFace datasets, along with supporting data models and utilities. It also updates the main bench CLI to register the new prepare-dataset command and adds .DS_Store to .gitignore.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant PrepareDataset as prepare_dataset<br/>(CLI Group)
    participant RootArgs as RootArgs<br/>(Validation)
    participant Subcommands as Subcommands<br/>(real-dataset,<br/>token_norm_dist,<br/>token_unif_dist)
    participant Utils as Dataset Utils<br/>(Serialization &<br/>Sampling)
    participant Output as Output File
    User->>PrepareDataset: Run CLI with options
    PrepareDataset->>RootArgs: Parse CLI args & build model
    activate RootArgs
    RootArgs->>RootArgs: Load & validate tokenizer
    RootArgs-->>PrepareDataset: Return validated args
    deactivate RootArgs
    PrepareDataset->>Subcommands: Invoke subcommand<br/>(real-dataset or synthetic)
    activate Subcommands
    alt Real Dataset Path
        Subcommands->>Subcommands: Load HF dataset<br/>& DatasetConfig
        Subcommands->>Utils: Generate samples<br/>(text or multimodal)
    else Synthetic Dataset Path
        Subcommands->>Utils: Sample lengths from<br/>distribution
        Subcommands->>Utils: Generate random tokens
    end
    Subcommands->>Utils: Serialize Workload<br/>(TextSample/MultimodalSample)
    Utils->>Output: Write JSON to file
    Subcommands-->>User: Complete
    deactivate Subcommands
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings) · ✅ Passed checks (1 passed)
Actionable comments posted: 6
🧹 Nitpick comments (8)
tensorrt_llm/bench/dataset/prepare_dataset.py (1)
34-49: Improve exception chaining for better debugging. The exception handling would benefit from using `raise ... from e` to preserve the original traceback context. Apply this diff:

```diff
     except EnvironmentError as e:
-        raise ValueError(
+        raise ValueError(
             "Cannot find a tokenizer from the given string because of "
             f"{e}\nPlease set tokenizer to the directory that contains "
             "the tokenizer, or set to a model name in HuggingFace."
-        )
+        ) from e
```

tensorrt_llm/bench/dataset/prepare_synthetic_data.py (3)
15-38: Add stacklevel to warning for better debugging. The warning should include `stacklevel=2` to point to the caller's code rather than this helper function. Apply this diff:

```diff
     if use_task_ids and not use_lora:
         warnings.warn(
-            "Task IDs require LoRA directory. Use --lora-dir or omit task IDs.", UserWarning
+            "Task IDs require LoRA directory. Use --lora-dir or omit task IDs.",
+            UserWarning,
+            stacklevel=2
         )
```
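As a standalone illustration of what `stacklevel=2` buys (function names here are illustrative, not from this PR), the recorded warning is attributed to the caller's line rather than the helper's:

```python
import warnings

def deprecated_helper():
    # stacklevel=2 attributes the warning to the line that called this
    # helper, so users see their own code in the report, not this file.
    warnings.warn("use the new API instead", UserWarning, stacklevel=2)

def caller():
    deprecated_helper()  # the warning is reported against this line

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    caller()

print(caught[0].category.__name__, str(caught[0].message))  # → UserWarning use the new API instead
```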
50-52: Remove unused variable initializations. These variables are immediately overwritten and never used in their initial state. Apply this diff:

```diff
-    input_ids = []
-    input_lens = []
-    output_lens = []
     input_lens = get_norm_dist_lengths(
```
110-112: Remove unused variable initializations. These variables are immediately overwritten and never used in their initial state. Apply this diff:

```diff
-    input_ids = []
-    input_lens = []
-    output_lens = []
     input_lens = get_unif_dist_lengths(
```

tensorrt_llm/bench/dataset/prepare_real_data.py (2)
21-32: Use Click's BadParameter exception for better error reporting. Using `click.BadParameter` provides better integration with Click's error handling and gives users clearer feedback about which parameter failed. Apply this diff:

```diff
 def validate_output_len_dist(ctx, param, value):
     """Validate the --output-len-dist option."""
     if value is None:
         return value
     m = re.match(r"(\d+),(\d+)", value)
     if m:
         return int(m.group(1)), int(m.group(2))
     else:
-        raise AssertionError(
+        raise click.BadParameter(
             "Incorrect specification for --output-len-dist. Correct format: "
             "--output-len-dist <output_len_mean>,<output_len_stdev>"
         )
```
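A Click-free sketch of the same `<mean>,<stdev>` parsing logic (the function name is hypothetical; in the real callback the failure branch would raise `click.BadParameter` instead of `ValueError`, and `fullmatch` is used here to reject trailing garbage):

```python
import re

def parse_len_dist(value):
    # Hypothetical standalone version of the --output-len-dist parser.
    if value is None:
        return None
    m = re.fullmatch(r"(\d+),(\d+)", value)
    if m:
        return int(m.group(1)), int(m.group(2))
    raise ValueError(
        "Incorrect specification for --output-len-dist. Correct format: "
        "--output-len-dist <output_len_mean>,<output_len_stdev>"
    )

print(parse_len_dist("128,16"))  # → (128, 16)
```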
55-61: Use ValueError instead of AssertionError in validator. Validators should raise `ValueError` for invalid data rather than `AssertionError`, which is typically reserved for internal invariants. Apply this diff:

```diff
 @model_validator(mode="after")
 def check_prompt(self) -> "DatasetConfig":
     if self.prompt_key and self.prompt:
-        raise AssertionError("--prompt-key and --prompt cannot be set at the same time.")
+        raise ValueError("--prompt-key and --prompt cannot be set at the same time.")
     if (not self.prompt_key) and (not self.prompt):
-        raise AssertionError("Either --prompt-key or --prompt must be set.")
+        raise ValueError("Either --prompt-key or --prompt must be set.")
     return self
```

tensorrt_llm/bench/dataset/utils.py (2)
29-40: Consider using Pydantic's standard patterns. The custom `__init__` could be replaced with a `@model_validator(mode='after')` or `model_post_init` hook for better alignment with Pydantic conventions. Example using `@model_validator`:

```diff
 class Workload(BaseModel):
     metadata: dict
     samples: List[Union[TextSample, MultimodalSample]] = []

-    def __init__(self, **kwargs) -> None:
-        super().__init__(**kwargs)
-        self.setup_workload_name()
-
-    def setup_workload_name(self):
+    @model_validator(mode='after')
+    def setup_workload_name(self):
         # Keys to ignore
         ignore_keys = ["tokenizer"]
         # Create a string by concatenating keys and values with "__"
         workload_name = "__".join(
             f"{key}:{value}" for key, value in self.metadata.items() if key not in ignore_keys
         )
         self.metadata.setdefault("workload_name", workload_name)
+        return self
```
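For readers without Pydantic at hand, the same post-construction hook pattern can be sketched with a stdlib dataclass, `__post_init__` playing the role `@model_validator(mode='after')` or `model_post_init` plays in Pydantic. This `Workload` is a simplified stand-in, not the PR's model:

```python
from dataclasses import dataclass, field

@dataclass
class Workload:
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Derive a workload name from the metadata after construction,
        # skipping noisy keys, mirroring setup_workload_name in the PR.
        ignore_keys = {"tokenizer"}
        name = "__".join(
            f"{key}:{value}"
            for key, value in self.metadata.items()
            if key not in ignore_keys
        )
        self.metadata.setdefault("workload_name", name)

w = Workload(metadata={"tokenizer": "llama", "num_requests": 100})
print(w.metadata["workload_name"])  # → num_requests:100
```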
96-104: Add strict parameter to zip for safety. Without `strict=True`, mismatched list lengths would be silently truncated, potentially hiding bugs. Apply this diff:

```diff
 def print_multimodal_dataset(multimodal_texts, multimodal_image_paths, output_lens):
-    for i, (text, image_paths) in enumerate(zip(multimodal_texts, multimodal_image_paths)):
+    for i, (text, image_paths) in enumerate(zip(multimodal_texts, multimodal_image_paths, strict=True)):
         d = {
             "task_id": i,
             "prompt": text,
             "media_paths": image_paths,
             "output_tokens": output_lens[i],
         }
         yield json.dumps(d, separators=(",", ":"), ensure_ascii=False)
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- .gitignore (1 hunks)
- tensorrt_llm/bench/dataset/prepare_dataset.py (1 hunks)
- tensorrt_llm/bench/dataset/prepare_real_data.py (1 hunks)
- tensorrt_llm/bench/dataset/prepare_synthetic_data.py (1 hunks)
- tensorrt_llm/bench/dataset/utils.py (1 hunks)
- tensorrt_llm/commands/bench.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-18T08:42:02.640Z
Learnt from: samuellees
Repo: NVIDIA/TensorRT-LLM PR: 6974
File: tensorrt_llm/serve/scripts/benchmark_dataset.py:558-566
Timestamp: 2025-08-18T08:42:02.640Z
Learning: In TensorRT-LLM's RandomDataset (tensorrt_llm/serve/scripts/benchmark_dataset.py), when using --random-token-ids option, sequence length accuracy is prioritized over semantic correctness for benchmarking purposes. The encode/decode operations should use skip_special_tokens=True and add_special_tokens=False to ensure exact target token lengths.
Applied to files:
- tensorrt_llm/bench/dataset/prepare_dataset.py
- tensorrt_llm/bench/dataset/utils.py
- tensorrt_llm/bench/dataset/prepare_synthetic_data.py
🧬 Code graph analysis (4)
tensorrt_llm/commands/bench.py (1)
- tensorrt_llm/bench/dataset/prepare_dataset.py (1)
  - prepare_dataset (73-84)

tensorrt_llm/bench/dataset/prepare_dataset.py (4)
- tensorrt_llm/bench/dataset/prepare_real_data.py (1)
  - real_dataset (209-314)
- tensorrt_llm/bench/dataset/prepare_synthetic_data.py (2)
  - token_norm_dist (48-90)
  - token_unif_dist (108-150)
- tensorrt_llm/builder.py (1)
  - default (45-50)
- tensorrt_llm/bench/benchmark/__init__.py (1)
  - checkpoint_path (75-76)

tensorrt_llm/bench/dataset/prepare_real_data.py (1)
- tensorrt_llm/bench/dataset/utils.py (3)
  - get_norm_dist_lengths (122-126)
  - print_multimodal_dataset (96-104)
  - print_text_dataset (79-93)

tensorrt_llm/bench/dataset/prepare_synthetic_data.py (1)
- tensorrt_llm/bench/dataset/utils.py (5)
  - gen_random_tokens (136-157)
  - get_norm_dist_lengths (122-126)
  - get_unif_dist_lengths (129-133)
  - print_text_dataset (79-93)
  - text_dataset_dump (43-57)
🪛 Ruff (0.14.5)
tensorrt_llm/bench/dataset/prepare_dataset.py
41-45: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
41-45: Avoid specifying long messages outside the exception class
(TRY003)
tensorrt_llm/bench/dataset/utils.py
97-97: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
142-142: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
tensorrt_llm/bench/dataset/prepare_real_data.py
21-21: Unused function argument: ctx
(ARG001)
21-21: Unused function argument: param
(ARG001)
29-32: Avoid specifying long messages outside the exception class
(TRY003)
58-58: Avoid specifying long messages outside the exception class
(TRY003)
60-60: Avoid specifying long messages outside the exception class
(TRY003)
109-115: Avoid specifying long messages outside the exception class
(TRY003)
150-150: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
247-247: Avoid specifying long messages outside the exception class
(TRY003)
268-268: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
tensorrt_llm/bench/dataset/prepare_synthetic_data.py
21-21: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
30-30: No explicit stacklevel keyword argument found
Set stacklevel=2
(B028)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (6)
.gitignore (1)
88-90: ✓ Sensible addition to ignore macOS system files. Adding `.DS_Store` to `.gitignore` is a standard best practice that prevents macOS Finder metadata from being committed. The comment clearly delineates the macOS-specific section, and the placement is logical.

tensorrt_llm/commands/bench.py (1)
9-9: LGTM! Clean integration of the new dataset preparation command. The import and command registration follow the existing pattern consistently. Also applies to: 61-61

tensorrt_llm/bench/dataset/prepare_dataset.py (2)
tensorrt_llm/bench/dataset/prepare_dataset.py (2)
25-32: LGTM! Well-structured data model.The RootArgs model cleanly captures all necessary configuration parameters for dataset preparation.
87-89: LGTM! Command registration follows Click conventions.tensorrt_llm/bench/dataset/utils.py (2)
43-76: LGTM! Clean serialization functions.Both dump functions properly handle directory creation and JSON serialization.
136-157: LGTM! Robust random token generation.The function correctly handles EOS token exclusion and uses deterministic seeding for reproducibility. Based on learnings, this aligns with TensorRT-LLM's approach of prioritizing sequence length accuracy over semantic correctness for benchmarking.
```python
def prepare_dataset(ctx, **kwargs):
    """Prepare dataset for benchmarking with trtllm-bench."""
    model = ctx.obj.model or ctx.obj.checkpoint_path
    ctx.obj = RootArgs(
        tokenizer=model,
        output=kwargs["output"],
        random_seed=kwargs["random_seed"],
        task_id=kwargs["task_id"],
        rand_task_id=kwargs["rand_task_id"],
        lora_dir=kwargs["lora_dir"],
        trust_remote_code=kwargs["trust_remote_code"],
    )
```
Critical: Missing std_out field in RootArgs model.
The prepare_synthetic_data.py commands reference root_args.std_out (lines 70 and 130), but the RootArgs model doesn't define this field. This will cause an AttributeError at runtime.
Add the missing field to the RootArgs model:
```diff
 class RootArgs(BaseModel):
     tokenizer: str
     output: str
     random_seed: int
     task_id: int
     trust_remote_code: bool = False
     rand_task_id: Optional[Tuple[int, int]]
     lora_dir: Optional[str] = None
+    std_out: bool = False
```

And add the corresponding CLI option:
```diff
 @click.option(
     "--trust-remote-code",
     is_flag=True,
     default=False,
     envvar="TRUST_REMOTE_CODE",
     help="Trust remote code.",
 )
+@click.option(
+    "--std-out",
+    is_flag=True,
+    default=False,
+    help="Output to stdout instead of file.",
+)
 @click.pass_context
 def prepare_dataset(ctx, **kwargs):
```

And pass it to RootArgs:
```diff
 ctx.obj = RootArgs(
     tokenizer=model,
     output=kwargs["output"],
     random_seed=kwargs["random_seed"],
     task_id=kwargs["task_id"],
     rand_task_id=kwargs["rand_task_id"],
     lora_dir=kwargs["lora_dir"],
     trust_remote_code=kwargs["trust_remote_code"],
+    std_out=kwargs["std_out"],
 )
```

Committable suggestion skipped: line range outside the PR's diff.
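An argparse sketch of the missing flag wiring (hypothetical, since the real CLI uses Click's `is_flag=True`): a boolean option with `default=False` guarantees the config attribute always exists, whether or not the flag was passed:

```python
import argparse

parser = argparse.ArgumentParser()
# store_true with default=False mirrors Click's is_flag=True behavior:
# the attribute is always present on the parsed namespace.
parser.add_argument("--std-out", dest="std_out", action="store_true",
                    default=False, help="Output to stdout instead of file.")

defaults = parser.parse_args([])
enabled = parser.parse_args(["--std-out"])
print(defaults.std_out, enabled.std_out)  # → False True
```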
```python
        except ValueError as e:
            if "Config" in e:
                e += "\n Please add the config name to the dataset config yaml."
            elif "split" in e:
                e += "\n Please specify supported split in the dataset config yaml."
            raise ValueError(e)
```
Critical: Fix exception handling logic.
The code attempts to concatenate strings to an exception object, which will fail. The error message should be constructed properly and re-raised with proper exception chaining.
Apply this diff:
```diff
     except ValueError as e:
+        error_msg = str(e)
         if "Config" in str(e):
-            e += "\n Please add the config name to the dataset config yaml."
+            error_msg += "\n Please add the config name to the dataset config yaml."
         elif "split" in str(e):
-            e += "\n Please specify supported split in the dataset config yaml."
-        raise ValueError(e)
+            error_msg += "\n Please specify supported split in the dataset config yaml."
+        raise ValueError(error_msg) from e
```
+ raise ValueError(error_msg) from e🧰 Tools
🪛 Ruff (0.14.5)
150-150: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
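The chaining idiom in isolation (toy function, not the PR's code): exceptions don't support `+=`, so the friendlier message is built from `str(e)`, and `raise ... from e` stores the original error on `__cause__` so tracebacks show both layers:

```python
def load_config(raw):
    try:
        return int(raw)
    except ValueError as e:
        # Build a new message string; concatenating onto the exception
        # object itself would raise a TypeError.
        msg = str(e) + "\n Please pass a numeric config value."
        raise ValueError(msg) from e

caught = None
try:
    load_config("Config=oops")
except ValueError as err:
    caught = err

print(type(caught.__cause__).__name__)  # → ValueError
```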
```python
    if any(key in req for key in ["image", "image_1", "video"]):
        # multimodal input
        if "video" in req and req["video"] is not None:
            assert "Not supported yet"
        assert kwargs["output_len_dist"] is not None, (
            "Output length distribution must be set for multimodal requests."
        )
```
Replace assertions with proper exceptions.
Using bare assert statements for input validation is incorrect: `assert "Not supported yet"` always passes because a non-empty string is truthy, and assertions are stripped entirely when Python runs with -O. These checks should be proper exceptions.
Apply this diff:
```diff
     if any(key in req for key in ["image", "image_1", "video"]):
         # multimodal input
         if "video" in req and req["video"] is not None:
-            assert "Not supported yet"
-        assert kwargs["output_len_dist"] is not None, (
-            "Output length distribution must be set for multimodal requests."
-        )
+            raise NotImplementedError("Video modality is not supported yet")
+        if kwargs["output_len_dist"] is None:
+            raise ValueError("Output length distribution must be set for multimodal requests.")
```
+ raise ValueError("Output length distribution must be set for multimodal requests.")🤖 Prompt for AI Agents
In tensorrt_llm/bench/dataset/prepare_real_data.py around lines 224 to 230,
replace the bare assert statements used for input validation with proper
exceptions: when "video" is present and not supported, raise
NotImplementedError("Video input not supported yet") instead of assert "Not
supported yet"; and when kwargs["output_len_dist"] is required but missing or
None, raise ValueError("Output length distribution must be set for multimodal
requests.") (also guard with kwargs.get("output_len_dist") is None to avoid
KeyError). Ensure the exceptions provide the same messages and stop execution
cleanly.
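The assert-free validation pattern, sketched as a runnable whole (the function and dict keys are illustrative, not the PR's exact code): unsupported modalities raise `NotImplementedError`, missing required options raise `ValueError`, and both survive `python -O`:

```python
def validate_multimodal_request(req, output_len_dist):
    # Unlike bare asserts, these checks run even under python -O and
    # raise exception types callers can handle selectively.
    if req.get("video") is not None:
        raise NotImplementedError("Video modality is not supported yet")
    if output_len_dist is None:
        raise ValueError(
            "Output length distribution must be set for multimodal requests."
        )

# Valid request: no exception.
validate_multimodal_request({"image": "cat.png", "video": None}, (128, 16))

video_rejected = False
try:
    validate_multimodal_request({"video": "clip.mp4"}, (128, 16))
except NotImplementedError:
    video_rejected = True
print(video_rejected)  # → True
```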
```python
@click.option(
    "--input-min", required=True, type=int, help="uniform dist (inclusive) min for input tokens"
)
@click.option(
    "--input-max", required=True, type=int, help="normal dist (inclusive) max for input tokens"
)
@click.option(
    "--output-min", required=True, type=int, help="normal dist (inclusive) min for output tokens"
)
@click.option(
    "--output-max", required=True, type=int, help="normal dist (inclusive) max for output tokens"
)
```
Fix copy-paste errors in help text.
The help text incorrectly says "normal dist" instead of "uniform dist" for the uniform distribution command.
Apply this diff:
```diff
 @click.option(
     "--input-min", required=True, type=int, help="uniform dist (inclusive) min for input tokens"
 )
 @click.option(
-    "--input-max", required=True, type=int, help="normal dist (inclusive) max for input tokens"
+    "--input-max", required=True, type=int, help="uniform dist (inclusive) max for input tokens"
 )
 @click.option(
-    "--output-min", required=True, type=int, help="normal dist (inclusive) min for output tokens"
+    "--output-min", required=True, type=int, help="uniform dist (inclusive) min for output tokens"
 )
 @click.option(
-    "--output-max", required=True, type=int, help="normal dist (inclusive) max for output tokens"
+    "--output-max", required=True, type=int, help="uniform dist (inclusive) max for output tokens"
 )
```
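To make the inclusive-bounds wording concrete, here is a stdlib sketch of uniform length sampling; the PR's `get_unif_dist_lengths` is assumed to behave similarly, though its actual implementation may differ:

```python
import random

def get_unif_dist_lengths(min_len, max_len, num_reqs, seed):
    # random.Random.randint is inclusive on BOTH ends, matching the
    # "uniform dist (inclusive)" phrasing in the corrected help text.
    rng = random.Random(seed)
    return [rng.randint(min_len, max_len) for _ in range(num_reqs)]

lengths = get_unif_dist_lengths(16, 32, 1000, seed=0)
print(len(lengths), all(16 <= n <= 32 for n in lengths))  # → 1000 True
```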
```python
def get_list_of_delays(delay_dist, mean_time_bet_reqs, num_reqs, random_seed):
    if delay_dist == "constant":
        delays = [mean_time_bet_reqs] * num_reqs
    elif delay_dist == "exponential_dist":
        delays = get_exponential_dist_delays(mean_time_bet_reqs, num_reqs, random_seed)

    return delays
```
Handle unsupported delay_dist values.
If delay_dist is neither "constant" nor "exponential_dist", the function falls through to `return delays` with `delays` never assigned, raising UnboundLocalError at runtime.
Apply this diff:
```diff
 def get_list_of_delays(delay_dist, mean_time_bet_reqs, num_reqs, random_seed):
     if delay_dist == "constant":
         delays = [mean_time_bet_reqs] * num_reqs
     elif delay_dist == "exponential_dist":
         delays = get_exponential_dist_delays(mean_time_bet_reqs, num_reqs, random_seed)
+    else:
+        raise ValueError(f"Unsupported delay_dist: {delay_dist}. Must be 'constant' or 'exponential_dist'.")
     return delays
```
return delays📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def get_list_of_delays(delay_dist, mean_time_bet_reqs, num_reqs, random_seed): | |
| if delay_dist == "constant": | |
| delays = [mean_time_bet_reqs] * num_reqs | |
| elif delay_dist == "exponential_dist": | |
| delays = get_exponential_dist_delays(mean_time_bet_reqs, num_reqs, random_seed) | |
| return delays | |
| def get_list_of_delays(delay_dist, mean_time_bet_reqs, num_reqs, random_seed): | |
| if delay_dist == "constant": | |
| delays = [mean_time_bet_reqs] * num_reqs | |
| elif delay_dist == "exponential_dist": | |
| delays = get_exponential_dist_delays(mean_time_bet_reqs, num_reqs, random_seed) | |
| else: | |
| raise ValueError(f"Unsupported delay_dist: {delay_dist}. Must be 'constant' or 'exponential_dist'.") | |
| return delays |
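The fixed dispatch, sketched as a runnable whole (`get_exponential_dist_delays` is a stand-in here, assumed to draw exponential inter-arrival times; the real helper lives in the PR's utils module):

```python
import random

def get_exponential_dist_delays(mean_time_bet_reqs, num_reqs, random_seed):
    # Stand-in: expovariate(1/mean) yields delays with the given mean.
    rng = random.Random(random_seed)
    return [rng.expovariate(1.0 / mean_time_bet_reqs) for _ in range(num_reqs)]

def get_list_of_delays(delay_dist, mean_time_bet_reqs, num_reqs, random_seed):
    if delay_dist == "constant":
        return [mean_time_bet_reqs] * num_reqs
    if delay_dist == "exponential_dist":
        return get_exponential_dist_delays(mean_time_bet_reqs, num_reqs, random_seed)
    # Fail loudly instead of falling through to an UnboundLocalError.
    raise ValueError(
        f"Unsupported delay_dist: {delay_dist!r}. "
        "Must be 'constant' or 'exponential_dist'."
    )

print(get_list_of_delays("constant", 0.5, 3, random_seed=0))  # → [0.5, 0.5, 0.5]
```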
Summary by CodeRabbit
New Features
Chores
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
`/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...`

Provide a user friendly way for developers to interact with a Jenkins server.

Run `/bot [-h|--help]` to print this help message.

See details below for each supported subcommand.

run

`run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]`

Launch build/test pipelines. All previously running jobs will be killed.

- `--reuse-test (optional)pipeline-id` (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.
- `--disable-reuse-test` (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.
- `--disable-fail-fast` (OPTIONAL) : Disable fail fast on build/tests/infra failures.
- `--skip-test` (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
- `--stage-list "A10-PyTorch-1, xxx"` (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.
- `--gpu-type "A30, H100_PCIe"` (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
- `--test-backend "pytorch, cpp"` (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.
- `--only-multi-gpu-test` (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
- `--disable-multi-gpu-test` (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
- `--add-multi-gpu-test` (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.
- `--post-merge` (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
- `--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx"` (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".
- `--detailed-log` (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.
- `--debug` (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the `stage-list` parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see `docs/source/reference/ci-overview.md` and the `scripts/test_to_stage_mapping.py` helper.

kill

`kill`

Kill all running builds associated with pull request.

skip

`skip --comment COMMENT`

Skip testing for latest commit on pull request. `--comment "Reason for skipping build/test"` is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

`reuse-pipeline`

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.