Commit 60e4d3a

[test] Add accuracy regression test for Mistral3.1 (#6322)
Signed-off-by: William Zhang <[email protected]>
1 parent 4904473 commit 60e4d3a

5 files changed: +21 -0 lines changed

tests/integration/defs/accuracy/references/cnn_dailymail.yaml

Lines changed: 2 additions & 0 deletions
@@ -188,6 +188,8 @@ mistralai/Mistral-7B-Instruct-v0.3:
     accuracy: 31.457
   - quant_algo: W4A8_AWQ
     accuracy: 31.201
+mistralai/Mistral-Small-3.1-24B-Instruct-2503:
+  - accuracy: 29.20
 mistralai/Mistral-Nemo-Base-2407:
   - quant_algo: FP8
     kv_cache_quant_algo: FP8

tests/integration/defs/accuracy/references/gsm8k.yaml

Lines changed: 2 additions & 0 deletions
@@ -122,5 +122,7 @@ mistralai/Ministral-8B-Instruct-2410:
   - quant_algo: FP8
     kv_cache_quant_algo: FP8
     accuracy: 78.35
+mistralai/Mistral-Small-3.1-24B-Instruct-2503:
+  - accuracy: 89.23
 microsoft/Phi-4-multimodal-instruct:
   - accuracy: 81.19

tests/integration/defs/accuracy/references/mmlu.yaml

Lines changed: 2 additions & 0 deletions
@@ -95,6 +95,8 @@ mistralai/Mixtral-8x7B-Instruct-v0.1:
 mistralai/Mixtral-8x22B-v0.1:
   - quant_algo: FP8
     accuracy: 77.63
+mistralai/Mistral-Small-3.1-24B-Instruct-2503:
+  - accuracy: 81.7
 google/gemma-2-9b-it:
   - accuracy: 73.05
 google/gemma-3-27b-it:
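
The three reference files above share one shape: a model name maps to a list of entries, each with an `accuracy` and optional quantization fields. The sketch below illustrates how an entry with no `quant_algo` (such as the one this commit adds) might be selected. The function `lookup_reference` and the `REFERENCES` dict are hypothetical names for illustration only; this is not the accuracy harness's actual lookup code, and the data is copied by hand from the hunks above rather than loaded from YAML.

```python
# Hypothetical sketch: NOT the harness's real lookup logic.
# Data is transcribed from the diff hunks, keyed by task name.
MISTRAL_SMALL = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"

REFERENCES = {
    "cnn_dailymail": {
        "mistralai/Mistral-7B-Instruct-v0.3": [
            {"quant_algo": "W4A8_AWQ", "accuracy": 31.201},
        ],
        MISTRAL_SMALL: [{"accuracy": 29.20}],
    },
    "gsm8k": {MISTRAL_SMALL: [{"accuracy": 89.23}]},
    "mmlu": {MISTRAL_SMALL: [{"accuracy": 81.7}]},
}


def lookup_reference(task, model_name, quant_algo=None, kv_cache_quant_algo=None):
    """Return the reference accuracy whose quantization fields match exactly.

    An entry that omits a field is treated as having that field set to None,
    so the unquantized entry is found when both arguments are left as None.
    """
    for entry in REFERENCES[task].get(model_name, []):
        if (entry.get("quant_algo") == quant_algo
                and entry.get("kv_cache_quant_algo") == kv_cache_quant_algo):
            return entry["accuracy"]
    raise KeyError(f"no reference for {model_name!r} on {task!r}")


if __name__ == "__main__":
    for task in ("cnn_dailymail", "gsm8k", "mmlu"):
        print(task, lookup_reference(task, MISTRAL_SMALL))
```

Under this assumed matching rule, a model without a reference entry raises `KeyError` instead of silently passing, which is one reason a new test would need the YAML entries added alongside it.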

tests/integration/defs/accuracy/test_llm_api_pytorch.py

Lines changed: 14 additions & 0 deletions
@@ -521,6 +521,20 @@ def test_auto_dtype(self):
             task.evaluate(llm)
 
 
+class TestMistralSmall24B(LlmapiAccuracyTestHarness):
+    MODEL_NAME = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"
+    MODEL_PATH = f"{llm_models_root()}/Mistral-Small-3.1-24B-Instruct-2503"
+
+    def test_auto_dtype(self):
+        with LLM(self.MODEL_PATH) as llm:
+            task = CnnDailymail(self.MODEL_NAME)
+            task.evaluate(llm)
+            task = MMLU(self.MODEL_NAME)
+            task.evaluate(llm)
+            task = GSM8K(self.MODEL_NAME)
+            task.evaluate(llm)
+
+
 class TestMinistral8BInstruct(LlmapiAccuracyTestHarness):
     MODEL_NAME = "mistralai/Ministral-8B-Instruct-2410"
     MODEL_PATH = f"{llm_models_root()}/Ministral-8B-Instruct-2410"

tests/integration/test_lists/test-db/l0_h100.yml

Lines changed: 1 addition & 0 deletions
@@ -192,6 +192,7 @@ l0_h100:
   - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_fp8_block_scales[mtp=vanilla-fp8kv=False-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False]
   - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_no_kv_cache_reuse[quant_dtype=none-mtp_nextn=2-fp8kv=False-attention_dp=True-cuda_graph=True-overlap_scheduler=True]
   - accuracy/test_llm_api_pytorch.py::TestGemma3_27BInstruct::test_auto_dtype
+  - accuracy/test_llm_api_pytorch.py::TestMistralSmall24B::test_auto_dtype
   - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_fp8_block_scales[latency]
   - accuracy/test_llm_api_pytorch.py::TestLlama3_1_8BInstruct::test_guided_decoding[llguidance]
   - test_e2e.py::test_ptp_quickstart_multimodal[mistral-small-3.1-24b-instruct-Mistral-Small-3.1-24B-Instruct-2503-image-True]
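
The entries in this test list follow the pytest node ID convention: `path::Class::function[params]`, with the class and the bracketed parameter ID both optional. As a rough illustration of how the new entry maps onto the class and method added in `test_llm_api_pytorch.py`, here is a minimal sketch that splits such a string into its parts. `NodeId` and `parse_node_id` are hypothetical helpers written for this example; they are not part of pytest or of the test-db tooling.

```python
from typing import NamedTuple, Optional


class NodeId(NamedTuple):
    """Parts of a pytest-style node ID (illustrative helper, not a pytest API)."""
    path: str
    cls: Optional[str]
    func: Optional[str]
    params: Optional[str]


def parse_node_id(node_id: str) -> NodeId:
    # Split off the parametrization suffix first, since it may contain
    # characters such as '=' or '-' that are irrelevant to the '::' split.
    base, bracket, rest = node_id.partition("[")
    params = rest[:-1] if bracket and rest.endswith("]") else None

    parts = base.split("::")
    path = parts[0]
    cls = parts[1] if len(parts) == 3 else None   # path::Class::func
    func = parts[-1] if len(parts) > 1 else None  # path::func or path::Class::func
    return NodeId(path, cls, func, params)


if __name__ == "__main__":
    added = "accuracy/test_llm_api_pytorch.py::TestMistralSmall24B::test_auto_dtype"
    print(parse_node_id(added))
```

For the entry added in this commit, the parsed class is `TestMistralSmall24B` and the function is `test_auto_dtype`, matching the new test class; the neighbouring `test_e2e.py` entry has no class and carries its model name in the parameter ID instead.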
