
Commit 03b5a8f

Merge branch 'main' into split_tests

Signed-off-by: Yanchao Lu <[email protected]>

2 parents 45910cc + 0680566

File tree: 6 files changed, +9 -3 lines changed

tensorrt_llm/_torch/auto_deploy/custom_ops/attention_interface.py

Lines changed: 0 additions & 3 deletions
@@ -476,9 +476,6 @@ def update_input_ids_with_new_tokens(
         idx = self.previous_batch_indices_cuda[: len(previous_batch_indices)]
         idx.copy_(host_idx, non_blocking=True)
 
-        # sort them so that masked_scatter_ lines up correctly
-        idx, _ = idx.sort()
-
         # gather the exact values you want to write
         src = new_tokens[0, idx, 0]

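For context only (not part of the commit): the deleted comment refers to torch.Tensor.masked_scatter_, which consumes its source tensor in ascending order of the True positions in the mask, so a source gathered with unsorted indices has to be sorted to line up; direct index assignment instead writes each value to the exact index given and needs no sorting. A minimal sketch of that difference, with illustrative tensor names that are not taken from the repository:

import torch

dst = torch.zeros(6, dtype=torch.long)
idx = torch.tensor([4, 1, 3])      # unsorted write positions
src = torch.tensor([40, 10, 30])   # values intended for positions 4, 1, 3

# Direct index assignment honors the index order: no sorting needed.
out_a = dst.clone()
out_a[idx] = src                   # -> [0, 10, 0, 30, 40, 0]

# masked_scatter_ fills True positions in ascending index order (1, 3, 4),
# so an unsorted source lands in the wrong slots unless idx/src are sorted.
mask = torch.zeros(6, dtype=torch.bool)
mask[idx] = True
out_b = dst.clone()
out_b.masked_scatter_(mask, src)   # -> [0, 40, 0, 10, 30, 0]

print(out_a.tolist(), out_b.tolist())
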
tests/integration/test_lists/test-db/l0_b200.yml

Lines changed: 2 additions & 0 deletions
@@ -69,6 +69,8 @@ l0_b200:
   - unittest/_torch/modeling -k "modeling_deepseek"
   - unittest/_torch/modeling -k "modeling_gpt_oss"
   - unittest/_torch/auto_deploy/unit/singlegpu -k "not test_trtllm_bench_backend_comparison"
+  # ------------- AutoDeploy tests ---------------
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
 - condition:
     ranges:
       system_gpu_count:

tests/integration/test_lists/test-db/l0_dgx_b200.yml

Lines changed: 1 addition & 0 deletions
@@ -103,3 +103,4 @@ l0_dgx_b200:
   - accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_4gpus[dp4-TRITON]
   - disaggregated/test_disaggregated.py::test_disaggregated_benchmark_on_diff_backends[llama-v3-8b-hf]
   - disaggregated/test_disaggregated.py::test_disaggregated_benchmark_on_diff_backends[DeepSeek-V3-Lite-fp8]
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype

tests/integration/test_lists/test-db/l0_dgx_h100.yml

Lines changed: 2 additions & 0 deletions
@@ -61,6 +61,8 @@ l0_dgx_h100:
   - test_e2e.py::test_ptp_quickstart_advanced_bs1
   - test_e2e.py::test_ptp_quickstart_advanced_deepseek_v3_lite_4gpus_adp_balance[DeepSeek-V3-Lite-FP8-DeepSeek-V3-Lite/fp8]
   - unittest/_torch/modeling/test_modeling_pixtral.py::test_tensor_parallelism
+  # ------------- AutoDeploy tests ---------------
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
 - condition:
     ranges:
       system_gpu_count:

tests/integration/test_lists/test-db/l0_dgx_h200.yml

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ l0_dgx_h200:
   - unittest/_torch/multi_gpu_modeling/test_llama4.py::test_llama4[pp1-ep1-disable_adp-enable_graph-tp8-trtllm-scout]
   - unittest/_torch/multi_gpu_modeling/test_llama4.py::test_llama4[pp1-ep4-enable_adp-enable_graph-tp8-trtllm-scout]
   - unittest/llmapi/test_llm_pytorch.py::test_nemotron_nas_lora
+  # ------------- AutoDeploy tests ---------------
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
 - condition:
     ranges:
       system_gpu_count:

tests/integration/test_lists/test-db/l0_h100.yml

Lines changed: 2 additions & 0 deletions
@@ -102,6 +102,8 @@ l0_h100:
   - test_e2e.py::test_trtllm_bench_request_rate_and_concurrency[enable_concurrency-enable_request_rate] # negative test
   - test_e2e.py::test_trtllm_bench_help_sanity[meta-llama/Llama-3.1-8B]
   - test_e2e.py::test_ptp_quickstart_multimodal[gemma-3-27b-it-gemma/gemma-3-27b-it-image-True]
+  # ------------- AutoDeploy tests ---------------
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
 - condition:
     ranges:
       system_gpu_count:
