Skip to content

[Bugfix]Fix accuracy test#953

Closed
zhangxinyuehfad wants to merge 1 commit intovllm-project:mainfrom
zhangxinyuehfad:zxy_accuracy_workflow
Closed

[Bugfix]Fix accuracy test#953
zhangxinyuehfad wants to merge 1 commit intovllm-project:mainfrom
zhangxinyuehfad:zxy_accuracy_workflow

Conversation

@zhangxinyuehfad
Copy link
Copy Markdown
Collaborator

@zhangxinyuehfad zhangxinyuehfad commented May 26, 2025

What this PR does / why we need it?

Fix accuracy test

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually

@zhangxinyuehfad zhangxinyuehfad force-pushed the zxy_accuracy_workflow branch 2 times, most recently from 0636582 to b4f8d25 Compare May 26, 2025 03:50
Comment thread .github/workflows/accuracy_test.yaml Outdated
@@ -28,6 +28,10 @@ on:
description: 'what vllm-ascend version to accuracy test?'
required: true
type: string
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

options:

Comment thread .github/workflows/accuracy_test.yaml Outdated
@@ -28,6 +28,10 @@ on:
description: 'what vllm-ascend version to accuracy test?'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vLLM version:

vLLM Ascend version:

Model:

@zhangxinyuehfad zhangxinyuehfad force-pushed the zxy_accuracy_workflow branch from b4f8d25 to 8503890 Compare May 27, 2025 11:31
required: true
type: string
type: choice
options:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't support all vllm version. Please just list the version vllm-ascend support:

type: string
type: choice
options:
- main
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.7.1 is out-of-date, no need to test it.

@zhangxinyuehfad zhangxinyuehfad force-pushed the zxy_accuracy_workflow branch from 8503890 to 53bf706 Compare May 29, 2025 01:16
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
@zhangxinyuehfad zhangxinyuehfad force-pushed the zxy_accuracy_workflow branch from 53bf706 to 6aa72a5 Compare May 29, 2025 07:23
@Yikun Yikun added ready read for review and removed ready read for review labels May 30, 2025
@Yikun
Copy link
Copy Markdown
Member

Yikun commented May 31, 2025

Could we add a PR triggered event to make sure the PR work?

Copy link
Copy Markdown
Member

@Yikun Yikun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do pr test first

@Yikun
Copy link
Copy Markdown
Member

Yikun commented Jun 1, 2025

Replaced by #1040

@Yikun Yikun closed this in #1040 Jun 3, 2025
Yikun added a commit that referenced this pull request Jun 3, 2025
### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete


In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: #953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete

In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete

In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete

In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete

In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 4, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete

In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 4, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete

In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete


In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
…ct#1040)

### What this PR does / why we need it?
This PR enable accuracy test for PR labeled with "*accuracy-test" and
workflow_dispatch.

Only one model test running for each type test to reduce excution time.

- The dense test costs about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test costs about `40mins` to complete


In futute, we might consider enable all job test as nightly schedule
job.

Below is mainly changes:
- the dense/vl accuracy test will be triggered by lableling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by lableling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by lableling `vl-accuracy-test`
and `ready-for-test`
- accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL

For PR test we also generate summary in test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: vllm-project#953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants