
Fix output token drop issue #9

Merged
merged 3 commits into from
Mar 14, 2024
Conversation

JoeZijunZhou
Copy link
Collaborator

@JoeZijunZhou JoeZijunZhou commented Mar 10, 2024

  • The decode thread sends the complete signal as soon as it finishes generating all tokens, but possibly before the gRPC channel has consumed every token in the return_channel queue of the ActiveRequest. As a result, the gRPC server exits the response stream before all generated tokens have been streamed back (some tokens may still be left in the return_channel queue of the ActiveRequest).
  • Adding an empty check on return_channel resolves the issue.
  • Use generated_token_list directly in the benchmark script; tokenizing the joined list produced a slightly different token count.
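The race described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the class and function names (ActiveRequest, stream_response) mirror the names mentioned in the description, but the structure is assumed. The key point is that the streaming loop must exit only when the decode thread has signaled completion *and* the return_channel queue is empty, rather than on the completion signal alone.

```python
import queue
import threading


class ActiveRequest:
    """Hypothetical sketch of a request with a token return channel."""

    def __init__(self):
        self.return_channel = queue.Queue()  # tokens awaiting streaming
        self.complete = threading.Event()    # set by the decode thread


def stream_response(request: ActiveRequest):
    """Yield tokens until generation is complete AND the queue is drained."""
    while True:
        # The added empty check: exit only when the decode thread has
        # finished *and* no tokens remain buffered in the return_channel.
        # Checking request.complete alone would drop any still-queued tokens.
        if request.complete.is_set() and request.return_channel.empty():
            break
        try:
            token = request.return_channel.get(timeout=0.01)
        except queue.Empty:
            continue
        yield token
```

In the buggy scenario, the complete signal fires while tokens are still queued; with the combined check, the loop drains the queue before exiting, so no generated tokens are lost.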

@FanhaiLu1 FanhaiLu1 self-requested a review March 11, 2024 02:48
@JoeZijunZhou JoeZijunZhou merged commit 2b9db52 into main Mar 14, 2024
3 checks passed
@JoeZijunZhou JoeZijunZhou deleted the zijun/token-drop branch March 14, 2024 21:04
@JoeZijunZhou JoeZijunZhou restored the zijun/token-drop branch March 14, 2024 21:04