# Qwen/Qwen2.5-VL-7B-Instruct

**vLLM Version**: vLLM: 0.10.0 ([6d8d0a2](https://github.com/vllm-project/vllm/commit/6d8d0a2))
**vLLM Ascend Version**: v0.10.0rc1 ([4604882](https://github.com/vllm-project/vllm-ascend/commit/4604882))
**Software Environment**: CANN: 8.2.RC1, PyTorch: 2.7.1, torch-npu: 2.7.1.dev20250724
**Hardware Environment**: Atlas A2 Series
**Datasets**: mmmu_val
**Parallel Mode**: TP
**Execution Mode**: ACLGraph

**Command**:

```bash
export MODEL_ARGS='pretrained=Qwen/Qwen2.5-VL-7B-Instruct,tensor_parallel_size=1,dtype=auto,trust_remote_code=False,max_model_len=8192'
lm_eval --model vllm-vlm --model_args $MODEL_ARGS --tasks mmmu_val \
--apply_chat_template True --fewshot_as_multiturn True \
--limit None --batch_size auto
```

| Task     | Metric   |    Value | Stderr   |
|----------|----------|---------:|----------|
| mmmu_val | acc,none | ✅ 0.5211 | ± 0.0162 |
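As a rough sanity check on the reported uncertainty, the stderr can be approximated from the accuracy and the sample count. This is a minimal sketch assuming a plain binomial standard error, sqrt(p(1-p)/n), and the standard 900-question mmmu_val split; lm-eval pools per-subject stderrs for MMMU, so the naive figure is only close to, not equal to, the reported value:

```python
import math

def binomial_stderr(p: float, n: int) -> float:
    """Naive binomial standard error of an accuracy p measured over n samples."""
    return math.sqrt(p * (1 - p) / n)

# mmmu_val: reported acc 0.5211 over 900 validation questions (assumed split size)
se = binomial_stderr(0.5211, 900)
print(f"± {se:.4f}")  # in the neighborhood of the reported ± 0.0162
```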
# Qwen/Qwen3-30B-A3B

**vLLM Version**: vLLM: 0.10.0 ([6d8d0a2](https://github.com/vllm-project/vllm/commit/6d8d0a2))
**vLLM Ascend Version**: v0.10.0rc1 ([4604882](https://github.com/vllm-project/vllm-ascend/commit/4604882))
**Software Environment**: CANN: 8.2.RC1, PyTorch: 2.7.1, torch-npu: 2.7.1.dev20250724
**Hardware Environment**: Atlas A2 Series
**Datasets**: gsm8k
**Parallel Mode**: TP
**Execution Mode**: ACLGraph

**Command**:

```bash
export MODEL_ARGS='pretrained=Qwen/Qwen3-30B-A3B,tensor_parallel_size=2,dtype=auto,trust_remote_code=False,max_model_len=4096,gpu_memory_utilization=0.6,enable_expert_parallel=True'
lm_eval --model vllm --model_args $MODEL_ARGS --tasks gsm8k \
--apply_chat_template False --fewshot_as_multiturn False --num_fewshot 5 \
--limit None --batch_size auto
```

| Task  | Metric                       |    Value | Stderr   |
|-------|------------------------------|---------:|----------|
| gsm8k | exact_match,strict-match     | ✅ 0.8939 | ± 0.0085 |
| gsm8k | exact_match,flexible-extract | ✅ 0.8476 | ± 0.0099 |
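The same binomial check reproduces the gsm8k stderrs to the reported precision, since gsm8k is a single task. A quick sketch, where the 1,319-question test-split size and the binomial formula are assumptions rather than figures taken from this report:

```python
import math

N_GSM8K = 1319  # assumed size of the gsm8k test split

# reported accuracies from the table above
for name, acc in [("strict-match", 0.8939), ("flexible-extract", 0.8476)]:
    se = math.sqrt(acc * (1 - acc) / N_GSM8K)
    print(f"{name}: ± {se:.4f}")
# → strict-match: ± 0.0085, flexible-extract: ± 0.0099, matching the table
```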
# Qwen/Qwen3-8B-Base

**vLLM Version**: vLLM: 0.10.0 ([6d8d0a2](https://github.com/vllm-project/vllm/commit/6d8d0a2))
**vLLM Ascend Version**: v0.10.0rc1 ([4604882](https://github.com/vllm-project/vllm-ascend/commit/4604882))
**Software Environment**: CANN: 8.2.RC1, PyTorch: 2.7.1, torch-npu: 2.7.1.dev20250724
**Hardware Environment**: Atlas A2 Series
**Datasets**: gsm8k
**Parallel Mode**: TP
**Execution Mode**: ACLGraph

**Command**:

```bash
export MODEL_ARGS='pretrained=Qwen/Qwen3-8B-Base,tensor_parallel_size=1,dtype=auto,trust_remote_code=False,max_model_len=4096'
lm_eval --model vllm --model_args $MODEL_ARGS --tasks gsm8k \
--apply_chat_template True --fewshot_as_multiturn True --num_fewshot 5 \
--limit None --batch_size auto
```

| Task  | Metric                       |    Value | Stderr   |
|-------|------------------------------|---------:|----------|
| gsm8k | exact_match,strict-match     | ✅ 0.8279 | ± 0.0104 |
| gsm8k | exact_match,flexible-extract | ✅ 0.8294 | ± 0.0104 |
# Accuracy Report

:::{toctree}
:caption: Accuracy Report
:maxdepth: 1
Qwen2.5-VL-7B-Instruct
Qwen3-30B-A3B
Qwen3-8B-Base
:::