Commit 3492391

[None][chore] AutoDeploy: clean up accuracy test configs (#8134)
Signed-off-by: Lucas Liebenwein <[email protected]>
1 parent 98b3af4 commit 3492391

6 files changed: 9 additions, 7 deletions

tests/integration/defs/accuracy/test_llm_api_autodeploy.py

Lines changed: 3 additions & 1 deletion
@@ -66,11 +66,13 @@ def get_default_sampling_params(self):
                               use_beam_search=beam_width > 1)
 
     @pytest.mark.skip_less_device_memory(32000)
-    def test_auto_dtype(self):
+    @pytest.mark.parametrize("world_size", [1, 2, 4])
+    def test_auto_dtype(self, world_size):
         kwargs = self.get_default_kwargs()
         sampling_params = self.get_default_sampling_params()
         with AutoDeployLLM(model=self.MODEL_PATH,
                            tokenizer=self.MODEL_PATH,
+                           world_size=world_size,
                            **kwargs) as llm:
             task = CnnDailymail(self.MODEL_NAME)
             task.evaluate(llm)
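Reviewer note: parametrizing `test_auto_dtype` changes its pytest node ID, which is why every test list in this commit now carries a `[1]`, `[2]`, or `[4]` suffix. The sketch below only rebuilds those IDs from values visible in the diff. It is not part of the commit; `node_id` and `entries` are hypothetical helper names, and the file-to-world-size mapping is read off the YAML changes rather than taken from any repository API.

```python
# Illustrative only: rebuilds the node IDs that appear in the test lists of this commit.
TEST_NODE = "accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype"

# Values from @pytest.mark.parametrize("world_size", [1, 2, 4]) in the diff.
WORLD_SIZES = [1, 2, 4]

# world_size selected by each test list after this commit (read off the YAML diffs).
SELECTED = {
    "l0_b200.yml": 1,
    "l0_h100.yml": 1,
    "l0_dgx_h100.yml": 2,
    "l0_dgx_h200.yml": 4,
}


def node_id(world_size: int) -> str:
    """Hypothetical helper: pytest appends str(value) in brackets for an int parameter."""
    if world_size not in WORLD_SIZES:
        raise ValueError(f"world_size {world_size} is not parametrized")
    return f"{TEST_NODE}[{world_size}]"


def entries() -> dict:
    """Map each test-list file to the entry it should contain."""
    return {name: node_id(ws) for name, ws in SELECTED.items()}


if __name__ == "__main__":
    for name, entry in entries().items():
        print(f"{name}: - {entry}")
```

An entry without the bracket suffix, as in the removed lines, no longer names a single test case once the parametrization is in place, so each list has to pick one world size explicitly.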

tests/integration/test_lists/test-db/l0_b200.yml

Lines changed: 2 additions & 0 deletions
@@ -74,6 +74,8 @@ l0_b200:
   - unittest/_torch/modeling -k "modeling_llama"
   - unittest/_torch/modeling -k "modeling_mixtral"
   - unittest/_torch/modeling -k "modeling_gpt_oss"
+  # ------------- AutoDeploy tests ---------------
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[1]
   - unittest/_torch/auto_deploy/unit/singlegpu
 - condition:
     ranges:

tests/integration/test_lists/test-db/l0_dgx_b200.yml

Lines changed: 0 additions & 2 deletions
@@ -181,5 +181,3 @@ l0_dgx_b200:
   - accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_4gpus[dp4-cutlass-auto]
   - accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_4gpus[dp4-triton-auto]
   - disaggregated/test_disaggregated.py::test_disaggregated_benchmark_on_diff_backends[llama-v3-8b-hf]
-  # ------------- AutoDeploy tests ---------------
-  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype

tests/integration/test_lists/test-db/l0_dgx_h100.yml

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ l0_dgx_h100:
   - accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_auto_dtype[True-True-False]
   - accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_auto_dtype[True-True-True]
   # ------------- AutoDeploy tests ---------------
-  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[2]
 - condition:
     ranges:
       system_gpu_count:

tests/integration/test_lists/test-db/l0_dgx_h200.yml

Lines changed: 2 additions & 2 deletions
@@ -34,8 +34,6 @@ l0_dgx_h200:
   - unittest/_torch/multi_gpu_modeling/test_llama4.py::test_llama4[pp1-ep1-disable_adp-enable_graph-tp8-trtllm-scout]
   - unittest/_torch/multi_gpu_modeling/test_llama4.py::test_llama4[pp1-ep4-enable_adp-enable_graph-tp8-trtllm-scout]
   - unittest/llmapi/test_llm_pytorch.py::test_nemotron_nas_lora
-  # ------------- AutoDeploy tests ---------------
-  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
 - condition:
     ranges:
       system_gpu_count:
@@ -121,6 +119,8 @@ l0_dgx_h200:
   - test_e2e.py::test_trtllm_bench_llmapi_launch[pytorch_backend-llama-v3-llama3-8b]
   - test_e2e.py::test_trtllm_bench_mgmn
   - unittest/_torch/multi_gpu -m "post_merge" TIMEOUT (90)
+  # ------------- AutoDeploy tests ---------------
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[4]
 - condition:
     ranges:
       system_gpu_count:

tests/integration/test_lists/test-db/l0_h100.yml

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ l0_h100:
   - test_e2e.py::test_ptp_quickstart_multimodal[gemma-3-27b-it-gemma/gemma-3-27b-it-image-True] TIMEOUT (90)
   - test_e2e.py::test_trtllm_benchmark_serving[llama-3.1-model/Meta-Llama-3.1-8B]
   # ------------- AutoDeploy tests ---------------
-  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype
+  - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[1]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronH::test_auto_dtype
 - condition:
     ranges:
