
Fix echo/logprob OpenAI completion bug #3441

Merged

simon-mo merged 11 commits into vllm-project:main from dylanwhawk:echo_logprob_bug on Apr 11, 2024

Conversation

@dylanwhawk
Contributor

Fixes #2703 and allows echo to be used with a list of tokens

@simon-mo self-assigned this Mar 16, 2024
@dylanwhawk
Contributor Author

@ywang96 Thanks for the recommendations!

Comment on lines 101 to 105
if top_logprobs is None or top_logprobs[i] is None:
    token = self.tokenizer.decode(token_id)
    logprobs.tokens.append(token)
    logprobs.token_logprobs.append(None)
    logprobs.top_logprobs.append(None)
Collaborator

Can you explain when this would be the case and why the decode is needed?

Contributor Author

This is the case for the first prompt token when echo == True and logprobs > 0, because there is no sampling metadata (top_logprobs[i]) for it. The decode is needed to add the token text to the logprobs token list, which normally comes from the sampling metadata.
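
A minimal sketch of the situation being described, using hypothetical data and a plain transformers tokenizer rather than the actual vLLM serving code: with echo on and logprobs requested, the per-position prompt logprobs have no entry for the very first prompt token, so its text has to be recovered via tokenizer.decode.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt_token_ids = tokenizer.encode("Hello world")
# Per-position prompt logprobs; the first entry is None because there is
# nothing to condition on for the first token.
prompt_logprobs = [None] + [
    {tid: (-1.0, tokenizer.decode(tid))} for tid in prompt_token_ids[1:]
]

tokens, token_logprobs = [], []
for token_id, top in zip(prompt_token_ids, prompt_logprobs):
    if top is None:
        # No sampling metadata for this position: decode the id directly.
        tokens.append(tokenizer.decode(token_id))
        token_logprobs.append(None)
    else:
        logprob, text = top[token_id]
        tokens.append(text)
        token_logprobs.append(logprob)

print(tokens, token_logprobs)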

Contributor Author

The top_logprobs is None case was actually never possible, so I removed that check.

Comment on lines +64 to +65
prompt_ids, prompt_text = self._validate_prompt_and_tokenize(
    request, prompt=prompt)
Collaborator

I looked through the diff but I can't find why prompt_text is needed.

Contributor Author

On line 79, prompt was whatever the user provided (either text or token IDs), but generate expected it to be text when handling echo.
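
To illustrate the point, a minimal sketch with a hypothetical helper mirroring _validate_prompt_and_tokenize (not the actual vLLM implementation): the prompt is normalized into both token ids and text, so echo can prepend the prompt text even when the client sent token ids.

from typing import List, Tuple, Union
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def validate_prompt_and_tokenize(
        prompt: Union[str, List[int]]) -> Tuple[List[int], str]:
    if isinstance(prompt, str):
        return tokenizer.encode(prompt), prompt
    # The client sent token ids; recover the text form for echo handling.
    return prompt, tokenizer.decode(prompt)

prompt_ids, prompt_text = validate_prompt_and_tokenize([15496, 995])
completion_text = " from vLLM"
echoed_output = prompt_text + completion_text  # echo needs the text form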

Collaborator

Ah can you add a comment

Collaborator

@simon-mo left a comment

plz fix merge conflict

Comment on lines +102 to +118
if step_top_logprobs is None:
    token = self.tokenizer.decode(token_id)
    logprobs.tokens.append(token)
    logprobs.token_logprobs.append(None)
    logprobs.top_logprobs.append(None)
else:
    token_logprob = step_top_logprobs[token_id].logprob
    token = step_top_logprobs[token_id].decoded_token
    logprobs.tokens.append(token)
    logprobs.token_logprobs.append(token_logprob)

if num_output_top_logprobs:
    logprobs.top_logprobs.append({
        p.decoded_token: p.logprob
        for i, p in step_top_logprobs.items()
    } if step_top_logprobs else None)

Collaborator

I don't think we need to call the tokenizer here? cc @Yard1

Collaborator

@simon-mo left a comment

This does fix the bug, so I'm merging it for now. But our excessive use of the tokenizer here is not good. (FYI @Yard1 and @njhill, we should get ready to refactor the server for performance soon.)

@simon-mo enabled auto-merge (squash) April 11, 2024 21:29
@simon-mo merged commit 95e7d4a into vllm-project:main Apr 11, 2024
Member

@njhill left a comment

The tokenizer used is also not necessarily correct, which I have a pending fix for in #3512

Comment on lines +195 to +196
input_text = prompt if prompt is not None else self.tokenizer.decode(
    prompt_ids)
Member

I don't think we should be unconditionally detokenizing the prompt ids in this case. It would be better to do this if/where it's actually needed. At minimum only if echo is specified.
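
A minimal sketch of the suggestion, with hypothetical names rather than the actual serving code: defer detokenization of the prompt ids until the text is actually needed, i.e. only when echo was requested.

def get_input_text(prompt, prompt_ids, echo, tokenizer):
    # Prefer the text the client already supplied.
    if prompt is not None:
        return prompt
    if echo:
        # Text is only needed to prepend the prompt to the completion.
        return tokenizer.decode(prompt_ids)
    # Common path: skip the detokenization cost entirely.
    return None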

andy-neuma pushed a commit to neuralmagic/nm-vllm that referenced this pull request Apr 12, 2024
Co-authored-by: Dylan Hawk <dylanwawk@gmail.com>
@DarkLight1337
Member

DarkLight1337 commented Apr 12, 2024

I've noticed flaky behaviour in buildkite/ci/pr/entrypoints-test and buildkite/ci/pr/lora-test today. Looking back at the history of the main branch, it appears that this PR may be to blame.

Edit: Hopefully #3512 would fix this issue.

z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request Apr 22, 2024
Co-authored-by: Dylan Hawk <dylanwawk@gmail.com>

Development

Successfully merging this pull request may close these issues.

[BUG]: Streaming logprob & echo combo.

5 participants