[Core] Add multi-step support to LLMEngine #7789

alexm-neuralmagic · 2024-08-22T17:00:28Z

This PR adds multi-step support to the LLMEngine class. The multi-step methods that were originally added to async_llm_engine are moved to llm_engine, since async_llm_engine inherits from llm_engine.
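
To illustrate why moving the methods into the base class is enough, here is a toy sketch. It is not vLLM's code: `ToyEngine`, `ToyAsyncEngine`, `_schedule`, and `_run_multi_step` are hypothetical names, and the "one scheduling call followed by several iterations" behavior is a simplified assumption about what multi-step means here.

```python
import asyncio


class ToyEngine:
    """Hypothetical stand-in for a synchronous engine (not vLLM's LLMEngine)."""

    def __init__(self, num_scheduler_steps: int = 1):
        self.num_scheduler_steps = num_scheduler_steps
        self.schedule_calls = 0

    def _schedule(self) -> None:
        # Pretend to run the scheduler once.
        self.schedule_calls += 1

    def _run_multi_step(self) -> list[int]:
        # Assumed behavior: schedule once, then run several iterations.
        self._schedule()
        return list(range(self.num_scheduler_steps))

    def step(self) -> list[int]:
        return self._run_multi_step()


class ToyAsyncEngine(ToyEngine):
    """Inherits the multi-step helper instead of duplicating it."""

    async def step_async(self) -> list[int]:
        await asyncio.sleep(0)  # yield to the event loop once
        return self._run_multi_step()


sync_engine = ToyEngine(num_scheduler_steps=4)
sync_out = sync_engine.step()

async_engine = ToyAsyncEngine(num_scheduler_steps=4)
async_out = asyncio.run(async_engine.step_async())
```

Because the helper lives in the base class, both the synchronous and the asynchronous entry points share one implementation.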

github-actions · 2024-08-22T17:00:38Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build on the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

Comment /ready on the PR
Add ready label to the PR
Enable auto-merge.

🚀

alexm-neuralmagic · 2024-08-22T17:01:06Z

@robertgshaw2-neuralmagic @SolitaryThinker @comaniac

alexm-neuralmagic · 2024-08-22T17:11:12Z

/ready

comaniac

LGTM. cc @SolitaryThinker

comaniac · 2024-08-22T17:26:18Z

benchmarks/benchmark_throughput.py

+        num_scheduler_steps=8,
+        use_v2_block_manager=True,


Make it configurable.

Good catch, added as arg params
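
A minimal sketch of what "make it configurable" could look like, assuming the benchmark script uses `argparse`. The flag names, defaults, and the `engine_kwargs` helper are assumptions chosen to mirror the two keyword arguments in the diff above; they are not taken from the merged change.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Throughput benchmark (sketch)")
    # Hypothetical flags mirroring the hard-coded values in the diff.
    parser.add_argument(
        "--num-scheduler-steps",
        type=int,
        default=1,
        help="Number of steps to run per scheduler call.",
    )
    parser.add_argument(
        "--use-v2-block-manager",
        action="store_true",
        help="Enable the v2 block manager.",
    )
    return parser


def engine_kwargs(args: argparse.Namespace) -> dict:
    # Collect the values that would be forwarded to the engine constructor.
    return {
        "num_scheduler_steps": args.num_scheduler_steps,
        "use_v2_block_manager": args.use_v2_block_manager,
    }


parser = build_parser()
default_kwargs = engine_kwargs(parser.parse_args([]))
custom_kwargs = engine_kwargs(
    parser.parse_args(["--num-scheduler-steps", "8", "--use-v2-block-manager"])
)
```

With this shape, the values from the original diff (8 steps, v2 block manager on) become an opt-in command-line choice rather than a hard-coded default.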

comaniac · 2024-08-22T17:26:27Z

examples/offline_inference.py

@@ -11,7 +11,7 @@
 sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

 # Create an LLM.
-llm = LLM(model="facebook/opt-125m")
+llm = LLM(model="facebook/opt-125m", tensor_parallel_size=2)


Revert this

comaniac · 2024-08-22T17:26:53Z

tests/multi_step/test_correctness.py

+async def test_multi_step_async_llm(example_prompts, model: str, tp_size: int,
+                                    pp_size: int, eager_mode: int,
+                                    num_scheduler_steps: int,
+                                    num_prompts: int):


Revert this as unrelated

mgoin

Excellent, great for offline throughput

SolitaryThinker

LGTM, thanks @alexm-neuralmagic!

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Aug 22, 2024

comaniac approved these changes Aug 22, 2024

mgoin approved these changes Aug 22, 2024

SolitaryThinker approved these changes Aug 22, 2024

alexm-neuralmagic added 10 commits August 23, 2024 13:41

add multi-step support to llm_engine

8b90add

format

5dca055

Cody's comments

6f1dd10

force spawn for multi-step

6ef3633

ray fix

7f39d6a

revert spawn changes - they are not necessary

9d250ee

format

70a3ca8

fix tests

9ec05ab

ruff

df9ae8d

fix tests yaml

ec2c589

alexm-neuralmagic force-pushed the multi_step_llm_engine branch from a2ee9d3 to ec2c589 Compare August 23, 2024 13:42

fix lora test

a31444f

comaniac merged commit 9db93de into vllm-project:main Aug 23, 2024
64 checks passed

comaniac mentioned this pull request Aug 23, 2024

[Tracking issue] [Help wanted]: Multi-step scheduling follow-ups #7528

Open

17 tasks

omrishiv pushed a commit to omrishiv/vllm that referenced this pull request Aug 26, 2024

[Core] Add multi-step support to LLMEngine (vllm-project#7789)

b63d237

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[Core] Add multi-step support to LLMEngine (vllm-project#7789)

cdaced9

Signed-off-by: Alvant <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Core] Add multi-step support to LLMEngine (vllm-project#7789)

2183519
