Implement integration tests in CI pipeline (won't be merged) by meiyeh123 · Pull Request #532 · vllm-project/tpu-inference

meiyeh123 · 2025-08-21T04:11:09Z

Description

Based on the Integration Test requirements, this change adds a new TPU accuracy test to the CI pipeline. The test covers the Llama-3.1-8B-Instruct and Llama-3.1-70B-Instruct models and is modified to compare the measured result against EXPECTED_VALUE. It also lets users pass tensor-parallel-size and model-names parameters for greater flexibility during execution.
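A minimal sketch of how the two parameters and the EXPECTED_VALUE comparison could fit together. It is not the code from this PR: the option spellings, the `TOLERANCE` constant, the helper names, and the placeholder scores in `EXPECTED_VALUES` are all assumptions made for illustration.

```python
import argparse

# Assumed absolute tolerance; the PR description does not state one.
TOLERANCE = 0.03

# Placeholder scores for illustration only, not measured results.
EXPECTED_VALUES = {
    "Llama-3.1-8B-Instruct": 0.5,
    "Llama-3.1-70B-Instruct": 0.5,
}


def parse_args(argv=None):
    """Parse the two user-facing options named in the description."""
    parser = argparse.ArgumentParser(description="Accuracy test options (illustrative)")
    parser.add_argument("--tensor-parallel-size", type=int, default=1)
    parser.add_argument(
        "--model-names",
        # Accept a comma-separated list and drop empty entries.
        type=lambda s: [m.strip() for m in s.split(",") if m.strip()],
        default=list(EXPECTED_VALUES),
    )
    return parser.parse_args(argv)


def within_tolerance(measured, expected, tolerance=TOLERANCE):
    """True when the measured score is strictly within tolerance of the expected one."""
    return abs(measured - expected) < tolerance


def check_models(measured_by_model, model_names):
    """Return the names of models whose measured score is outside tolerance."""
    return [
        name
        for name in model_names
        if not within_tolerance(measured_by_model[name], EXPECTED_VALUES[name])
    ]
```

In a real run, `measured_by_model` would be filled from the evaluation harness output for each requested model; here it is just a dictionary supplied by the caller.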

Based on the description of the vllm PR at vllm-project/vllm#18800, we have changed the lm_eval version used to git+https://github.com/EleutherAI/lm-evaluation-harness.git@206b7722158f58c35b7ffcd53b035fdbdda5126d#egg=lm-eval[api].

In the future, we plan to collect expected values on GPUs and compare them with results from TPUs. To do this, both the GPU and TPU steps will read and write expected-value JSON files that are shared between Buildkite steps. We've already started implementing some of these files.
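The planned JSON hand-off could look roughly like the sketch below, assuming one step writes a flat mapping of model name to score and a later step reads it back and compares. The file layout, function names, and tolerance are hypothetical; how the file travels between Buildkite steps (for example as a build artifact) is not specified in this PR.

```python
import json
from pathlib import Path


def write_expected(path, values):
    """Store baseline scores (e.g. from a GPU step) as a JSON mapping of model name to score."""
    Path(path).write_text(json.dumps(values, indent=2, sort_keys=True))


def load_expected(path):
    """Load the baseline scores written by an earlier step."""
    return json.loads(Path(path).read_text())


def compare(expected, measured, tolerance=0.03):
    """Return {model: (expected, measured)} for every model that is missing or out of tolerance."""
    failures = {}
    for model, want in expected.items():
        got = measured.get(model)
        if got is None or abs(got - want) >= tolerance:
            failures[model] = (want, got)
    return failures
```

An empty result from `compare` means every model in the baseline file was measured and landed within the assumed tolerance.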

The testing logic is based on the source code at https://github.com/vllm-project/vllm/blob/839ab00/tests/entrypoints/llm/test_accuracy.py.

Tests

Tested on the Buildkite agent.

Checklist

Before submitting this PR, please make sure:

I have performed a self-review of my code.
I have necessary comments in my code, particularly in hard-to-understand areas.

github-actions · 2025-08-21T04:11:19Z

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

why is this change being made,
the problem being solved and any relevant context,
why this is a good solution,
some information about the specific implementation,
shortcomings of the solution and possible future improvements.

If the change fixes a bug or a GitHub issue, please include a link, e.g.:
FIXES: b/123456
FIXES: #123456

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure:

I have performed a self-review of my code.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have made or will make corresponding changes to any relevant documentation.


meiyeh123 · 2025-10-16T06:39:42Z

This PR is deprecated.

CienetStingLin force-pushed the sting_test branch 4 times, most recently from 0e2c5af to f643160 Compare September 4, 2025 03:35

CienetStingLin changed the title ~~Testing CI script, won't be merged (intergration test )~~ Implement integration tests in CI pipeline (won't be merged) Sep 4, 2025

CienetStingLin force-pushed the sting_test branch from 9c63264 to af34594 Compare September 17, 2025 09:34

squash 32 commit for next dev

ecd93ad

CienetStingLin force-pushed the sting_test branch from 0b3273c to ecd93ad Compare September 17, 2025 09:51

CienetStingLin added 20 commits September 23, 2025 16:58

clean for test

e78c465

test new dynamic

86f4667

test

6603df2

test ssh

89ef6c4

test git ssh

f0720f3

test

7060e48

test

a337edb

test

6d96a0a

test

3d7dcd8

test

6adad37

test

7615b6e

test post command

942b25f

test

d5c417c

test all post

2c74587

test to check_results

a09184c

test

736df21

fix

e15f755

fix

1a9e636

test

5f93dd2

test

2d53e70

CienetStingLin added 16 commits September 24, 2025 15:32

test

ea6d7c8

add test models

99e502f

fix for test

3005ea2

test

6a6af3e

test

689d12b

test

281d1a8

remove gz

43352a8

ready for test

b7eed10

test accuracy

3b203be

test

f87e0c4

test

df4fd87

fix

66cada4

fix

2c7e75b

fix

277559e

test

597e943

fix

cc8f5ef

meiyeh123 closed this Oct 16, 2025

meiyeh123 wants to merge 38 commits into main from sting_test