
Fix token usage with jump forward #174

Merged
comaniac merged 3 commits into main from cody/usage on Feb 10, 2024

@comaniac
Contributor

@comaniac comaniac commented Feb 9, 2024

close #173

This PR fixes the incorrect token usage reported when jump forward is enabled. Specifically, we introduce a new field, orig_prompt_tokens, which is set when the first jump forward happens so that we know the original number of prompt tokens. When returning a response (a chunk in streaming mode or a complete response), we use the following equations to correct the token usage:

completion_tokens = curr_prompt_token - orig_prompt_tokens + completion_tokens
prompt_tokens = orig_prompt_tokens
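
The two equations above can be sketched in Python as follows. This is only an illustration, not the code from this PR: the `Usage` dataclass and the `correct_usage` function are hypothetical names, and the parameter `curr_prompt_tokens` stands in for `curr_prompt_token` in the equation. The sketch assumes that the current prompt count is never smaller than the original one, so the difference is the number of tokens that should be reported as completion tokens instead.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical container for the reported token usage."""
    prompt_tokens: int
    completion_tokens: int


def correct_usage(
    curr_prompt_tokens: int,
    completion_tokens: int,
    orig_prompt_tokens: int,
) -> Usage:
    """Apply the correction from the PR description (illustrative only).

    Tokens counted in the current prompt beyond the original prompt are
    moved to the completion side; the prompt side is reset to the
    original prompt length.
    """
    corrected_completion = curr_prompt_tokens - orig_prompt_tokens + completion_tokens
    return Usage(
        prompt_tokens=orig_prompt_tokens,
        completion_tokens=corrected_completion,
    )


# Example: the prompt started at 10 tokens and now counts 15,
# with 3 tokens generated since the last jump forward.
usage = correct_usage(curr_prompt_tokens=15, completion_tokens=3, orig_prompt_tokens=10)
```

When no jump forward has happened, `curr_prompt_tokens` equals `orig_prompt_tokens` and the correction leaves both counts unchanged.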

@comaniac comaniac requested review from hnyls2002 and merrymercy and removed request for merrymercy February 9, 2024 18:59
@hnyls2002
Collaborator

@comaniac Hi, I suggest initializing orig_prompt_tokens when constructing the Req, so that we can simplify the code.
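
A rough sketch of what this suggestion could look like, for illustration only. `SketchReq` and its `extend_prompt` method are hypothetical stand-ins and do not reflect the real `Req` class; the only point is that recording the original prompt length once, at construction, removes the need to detect the first jump forward later.

```python
from typing import List


class SketchReq:
    """Hypothetical, simplified stand-in for a request object."""

    def __init__(self, input_ids: List[int]) -> None:
        self.input_ids = list(input_ids)
        # Recorded once at construction, so no "first jump forward"
        # special case is needed later.
        self.orig_prompt_tokens = len(self.input_ids)

    def extend_prompt(self, extra_ids: List[int]) -> None:
        """Assumed model of a jump forward: the prompt grows in place."""
        self.input_ids.extend(extra_ids)


req = SketchReq([101, 102, 103])
req.extend_prompt([104, 105])
# The current prompt length changed, the original length did not.
prompt_lengths = (len(req.input_ids), req.orig_prompt_tokens)
```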

@comaniac
Contributor Author

Thanks, that makes sense. Meanwhile, could you add back some comments explaining why we need to calculate the token usage this way?

@comaniac comaniac merged commit 4d303c4 into main Feb 10, 2024
@comaniac comaniac deleted the cody/usage branch February 10, 2024 04:06
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025
lujangus pushed a commit to tails-mpt/sglang that referenced this pull request Mar 31, 2026

Development

Successfully merging this pull request may close these issues.

Incorrect token usage with jump forward

2 participants