Is your feature request related to a problem? Please describe.
Since create_completion may yield text chunks comprising multiple tokens per yield (e.g. in the case of multi-byte Unicode characters), counting the number of yields may not equal the number of tokens actually generated by the model. To accurately get the usage statistics of a streamed completion, one currently has to run the final text through the tokenizer again, even though create_completion already tracks the number of tokens it generates.
Describe the solution you'd like
When stream=True is set in create_completion, the final chunk yielded should include the usage statistics in the 'usage' key.
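For illustration, the final streamed chunk could look something like the sketch below. The exact layout is an assumption on my part, modeled on the 'usage' object that the non-streaming create_completion response already returns; it is not current behavior:

```python
# Hypothetical final chunk when stream=True (field values are placeholders;
# the "usage" shape is assumed to mirror the non-streaming response).
{
    "id": "cmpl-...",
    "object": "text_completion",
    "created": 1715000000,
    "model": "model.gguf",
    "choices": [{"text": "", "index": 0, "logprobs": None, "finish_reason": "stop"}],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 87,
        "total_tokens": 99,
    },
}
```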
Describe alternatives you've considered
Saving the full generated text and running it through the tokenizer again (seems wasteful; see the sketch after this list)
Counting the number of yields and hoping there are no multi-byte characters (hacky and fragile)
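A rough sketch of that first workaround, assuming a local GGUF model (the model path, prompt, and max_tokens here are placeholders):

```python
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder path

prompt = "Q: Name the planets in the solar system. A:"
pieces = []
for chunk in llm.create_completion(prompt, max_tokens=256, stream=True):
    pieces.append(chunk["choices"][0]["text"])

completion_text = "".join(pieces)

# Re-tokenize the text we just generated, solely to recover a token count
# that the model already tracked while sampling.
completion_tokens = len(llm.tokenize(completion_text.encode("utf-8"), add_bos=False))
prompt_tokens = len(llm.tokenize(prompt.encode("utf-8")))

usage = {
    "prompt_tokens": prompt_tokens,
    "completion_tokens": completion_tokens,
    "total_tokens": prompt_tokens + completion_tokens,
}
print(usage)
```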
Additional context
The OpenAI API has recently added similar support in their streaming API with the stream_options key: https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options
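For comparison, with the OpenAI Python SDK that option is used roughly like this (model name chosen arbitrarily); when include_usage is set, the last chunk arrives with an empty choices list and carries the aggregated token counts in its usage field:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    # The final chunk has no choices but carries the usage statistics.
    if chunk.usage is not None:
        print("\n", chunk.usage)
```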