Is your feature request related to a problem? Please describe.
Since create_completion may yield text chunks comprising multiple tokens per yield (e.g. in the case of multi-byte Unicode characters), counting the number of yields may not equal the number of tokens actually generated by the model. To accurately get the usage statistics of a streamed completion, one currently has to run the final text through the tokenizer again, even though create_completion already tracks the number of tokens it generates.
Describe the solution you'd like
When stream=True is set in create_completion, the final chunk yielded should include the usage statistics in the 'usage' key.
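For illustration, the final streamed chunk could look something like the sketch below. The exact layout is an assumption on my part, modeled on the 'usage' object that the non-streaming create_completion response already returns; it is not current behavior:

```python
# Hypothetical final chunk when stream=True (field values are placeholders;
# the "usage" shape is assumed to mirror the non-streaming response).
{
    "id": "cmpl-...",
    "object": "text_completion",
    "created": 1715000000,
    "model": "model.gguf",
    "choices": [{"text": "", "index": 0, "logprobs": None, "finish_reason": "stop"}],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 87,
        "total_tokens": 99,
    },
}
```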
Describe alternatives you've considered
Saving the full generated text and running it through the tokenizer again (seems wasteful; see the sketch after this list)
Counting the number of yields and hoping there are no multi-byte characters (hacky and fragile)
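A rough sketch of that first workaround, assuming a local GGUF model (the model path, prompt, and max_tokens here are placeholders):

```python
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder path

prompt = "Q: Name the planets in the solar system. A:"
pieces = []
for chunk in llm.create_completion(prompt, max_tokens=256, stream=True):
    pieces.append(chunk["choices"][0]["text"])

completion_text = "".join(pieces)

# Re-tokenize the text we just generated, solely to recover a token count
# that the model already tracked while sampling.
completion_tokens = len(llm.tokenize(completion_text.encode("utf-8"), add_bos=False))
prompt_tokens = len(llm.tokenize(prompt.encode("utf-8")))

usage = {
    "prompt_tokens": prompt_tokens,
    "completion_tokens": completion_tokens,
    "total_tokens": prompt_tokens + completion_tokens,
}
print(usage)
```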
Additional context
The OpenAI API has recently added similar support in their streaming API with the stream_options key: https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options
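For comparison, with the OpenAI Python SDK that option is used roughly like this (model name chosen arbitrarily); when include_usage is set, the last chunk arrives with an empty choices list and carries the aggregated token counts in its usage field:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    # The final chunk has no choices but carries the usage statistics.
    if chunk.usage is not None:
        print("\n", chunk.usage)
```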