
Conversation


@gshtras gshtras commented Jul 31, 2024

Fix for the issue where a method from the multiprocessing GPU executor was being called on a regular GPU executor.

Also adds toggles to tweak the sync OpenAI server request batching, so that decode is not interrupted by prefill too often.
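The batching toggle can be pictured as a cap on how many new (prefill) requests are admitted into the running batch per scheduling step, so in-flight decodes are not starved. A minimal sketch; the names `admit_new_requests` and `max_new_seqs_per_step` are illustrative, not the actual parameters added in this PR:

```python
from collections import deque

def admit_new_requests(waiting: deque, running: list,
                       max_new_seqs_per_step: int) -> list:
    """Move at most max_new_seqs_per_step waiting (prefill) requests into
    the running batch, leaving the rest queued so ongoing decodes keep
    making progress. Hypothetical sketch, not vLLM's actual scheduler."""
    admitted = []
    while waiting and len(admitted) < max_new_seqs_per_step:
        admitted.append(waiting.popleft())
    running.extend(admitted)
    return admitted
```

Raising the cap favors throughput on new requests; lowering it favors decode latency for requests already in flight.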

```diff
 logger.info("No unfinished requests. Waiting...")
 (request_id, prompt, sampling_params) = self.input_queue.get()
-if self.need_restart:
+if self.need_restart and isinstance(
```
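The fix amounts to an `isinstance` guard: `need_restart` is only acted on when the executor is the multiprocessing variant, since a plain single-GPU executor has no worker processes to restart. A minimal sketch of the pattern, assuming hypothetical class and method names rather than the actual vLLM API:

```python
class GPUExecutor:
    """Single-GPU executor: has no worker processes to restart."""

class MultiprocessingGPUExecutor(GPUExecutor):
    """Multi-GPU executor that owns worker processes."""
    def __init__(self):
        self.restarted = False

    def restart_workers(self):
        self.restarted = True

class Engine:
    def __init__(self, executor: GPUExecutor):
        self.executor = executor
        self.need_restart = False

    def maybe_restart(self):
        # Only the multiprocessing executor exposes restart_workers();
        # guarding with isinstance avoids calling a method that does not
        # exist on a plain GPUExecutor.
        if self.need_restart and isinstance(self.executor,
                                            MultiprocessingGPUExecutor):
            self.executor.restart_workers()
        self.need_restart = False
```

With a plain `GPUExecutor` the flag is simply cleared; with the multiprocessing variant the restart actually happens.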


Would suggest syncing this logic with line 101, or at least a comment clarifying that `need_restart` is produced but not consumed in the single-GPU `GPUExecutor` case.

Collaborator Author


Not having the `if isinstance` where this value is set is a micro-optimization :p


Comments have no performance impact 😼 Will leave this convo up in the event some confused person noses around this part of the code.


@shajrawi shajrawi left a comment


ship it

@gshtras gshtras merged commit 3e480e9 into main Aug 2, 2024
@gshtras gshtras deleted the greg/server_tweaks branch August 2, 2024 18:26
gshtras added a commit that referenced this pull request Aug 13, 2024
…r request batching parameters (#114)

* Fixed single GPU issue without setting up mp. Added toggles for server request batching parameters

* Adding HTTP headers
