
Fix echo/logprob OpenAI completion bug #3441

Merged

simon-mo merged 11 commits into vllm-project:main from dylanwhawk:echo_logprob_bug on Apr 11, 2024

Conversation

@dylanwhawk
Contributor

Fixes #2703 and allows echo to be used with a list of tokens

@simon-mo self-assigned this Mar 16, 2024
@dylanwhawk
Contributor Author

@ywang96 Thanks for the recommendations!

Comment on lines 101 to 105
if top_logprobs is None or top_logprobs[i] is None:
    token = self.tokenizer.decode(token_id)
    logprobs.tokens.append(token)
    logprobs.token_logprobs.append(None)
    logprobs.top_logprobs.append(None)
Collaborator

Can you explain when this would be the case and why the decode is needed?

Contributor Author

This is the case for the first prompt token when echo == True and logprobs > 0, because there is no sampling metadata (top_logprobs[i]) for it. The decode is needed to add the token text to the logprobs token list, which normally comes from the sampling metadata.
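
A minimal sketch of the situation being described, using hypothetical data and a plain transformers tokenizer rather than the actual vLLM serving code: with echo on and logprobs requested, the per-position prompt logprobs have no entry for the very first prompt token, so its text has to be recovered via tokenizer.decode.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt_token_ids = tokenizer.encode("Hello world")
# Per-position prompt logprobs; the first entry is None because there is
# nothing to condition on for the first token.
prompt_logprobs = [None] + [
    {tid: (-1.0, tokenizer.decode(tid))} for tid in prompt_token_ids[1:]
]

tokens, token_logprobs = [], []
for token_id, top in zip(prompt_token_ids, prompt_logprobs):
    if top is None:
        # No sampling metadata for this position: decode the id directly.
        tokens.append(tokenizer.decode(token_id))
        token_logprobs.append(None)
    else:
        logprob, text = top[token_id]
        tokens.append(text)
        token_logprobs.append(logprob)

print(tokens, token_logprobs)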

Contributor Author

The top_logprobs is None case was actually never possible, so I removed that check.

Comment on lines +64 to +65
prompt_ids, prompt_text = self._validate_prompt_and_tokenize(
    request, prompt=prompt)
Collaborator

I looked through the diff but I can't find why prompt_text is needed.

Contributor Author

On line 79, prompt was whatever the user provided (either text or token IDs), but generate expected it to be text when handling echo.
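
To illustrate the point, a minimal sketch with a hypothetical helper mirroring _validate_prompt_and_tokenize (not the actual vLLM implementation): the prompt is normalized into both token ids and text, so echo can prepend the prompt text even when the client sent token ids.

from typing import List, Tuple, Union
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def validate_prompt_and_tokenize(
        prompt: Union[str, List[int]]) -> Tuple[List[int], str]:
    if isinstance(prompt, str):
        return tokenizer.encode(prompt), prompt
    # The client sent token ids; recover the text form for echo handling.
    return prompt, tokenizer.decode(prompt)

prompt_ids, prompt_text = validate_prompt_and_tokenize([15496, 995])
completion_text = " from vLLM"
echoed_output = prompt_text + completion_text  # echo needs the text form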

Collaborator

Ah can you add a comment

Collaborator

@simon-mo left a comment

plz fix merge conflict

Comment on lines +102 to +118
if step_top_logprobs is None:
    token = self.tokenizer.decode(token_id)
    logprobs.tokens.append(token)
    logprobs.token_logprobs.append(None)
    logprobs.top_logprobs.append(None)
else:
    token_logprob = step_top_logprobs[token_id].logprob
    token = step_top_logprobs[token_id].decoded_token
    logprobs.tokens.append(token)
    logprobs.token_logprobs.append(token_logprob)

if num_output_top_logprobs:
    logprobs.top_logprobs.append({
        p.decoded_token: p.logprob
        for i, p in step_top_logprobs.items()
    } if step_top_logprobs else None)

Collaborator

I don't think we need to call the tokenizer here? cc @Yard1

Collaborator

@simon-mo left a comment

This does fix the bug, so I'm merging it for now. But our excessive use of the tokenizer here is not good. (FYI @Yard1 and @njhill, we should get ready to refactor the server for performance soon.)

@simon-mo enabled auto-merge (squash) April 11, 2024 21:29
@simon-mo merged commit 95e7d4a into vllm-project:main Apr 11, 2024
Member

@njhill left a comment

The tokenizer used is also not necessarily correct, which I have a pending fix for in #3512

Comment on lines +195 to +196
input_text = prompt if prompt is not None else self.tokenizer.decode(
    prompt_ids)
Member

I don't think we should be unconditionally detokenizing the prompt ids in this case. It would be better to do this if/where it's actually needed. At minimum only if echo is specified.
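
A minimal sketch of the suggestion, with hypothetical names rather than the actual serving code: defer detokenization of the prompt ids until the text is actually needed, i.e. only when echo was requested.

def get_input_text(prompt, prompt_ids, echo, tokenizer):
    # Prefer the text the client already supplied.
    if prompt is not None:
        return prompt
    if echo:
        # Text is only needed to prepend the prompt to the completion.
        return tokenizer.decode(prompt_ids)
    # Common path: skip the detokenization cost entirely.
    return None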

andy-neuma pushed a commit to neuralmagic/nm-vllm that referenced this pull request Apr 12, 2024
Co-authored-by: Dylan Hawk <dylanwawk@gmail.com>
@DarkLight1337
Member

DarkLight1337 commented Apr 12, 2024

I've noticed flaky behaviour in buildkite/ci/pr/entrypoints-test and buildkite/ci/pr/lora-test today. Looking back at the history of the main branch, it appears that this PR may be to blame.

Edit: Hopefully #3512 would fix this issue.

z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request Apr 22, 2024
Co-authored-by: Dylan Hawk <dylanwawk@gmail.com>

Development

Successfully merging this pull request may close these issues.

[BUG]: Streaming logprob & echo combo.

5 participants