Thanks for reporting @cordawyn! I've applied the suggested change in #2458 and fixed the tests accordingly. FYI, to (re-)record a cassette you can delete the file and relaunch the test locally. These cassettes are not ideal (we missed a bug because of them), but that will be for another PR.
Describe the bug
When using AsyncInferenceClient.chat_completion(stream=True), the client expects the endpoint to emit a [DONE]\n token at the end of the stream (which TGI does). However, due to the differences in the implementations of token parsing between InferenceClient and AsyncInferenceClient, the line at huggingface_hub/src/huggingface_hub/inference/_common.py (line 347 in e9cd695) receives either [DONE] (regular client) or [DONE]\n (async client). Naturally, the async client fails the check (due to the extra "\n") and submits "[DONE]\n" for further JSON parsing (which fails).

The suggested fix is trivial: call .rstrip() on byte_payload in huggingface_hub/src/huggingface_hub/inference/_common.py (line 358 in e9cd695).

Unfortunately, I cannot provide a merge request for this, because the unit tests involve auto-generated (?) cassettes that do not have [DONE] tokens. It doesn't look like a good idea to edit them manually (although I tried it, and that reproduced the bug).

Reproduction
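The mismatch described above can also be illustrated offline, without an endpoint. The following is a minimal sketch, not the library's actual code: parse_line and its strip flag are hypothetical names, and the "data: [DONE]" wire format is an assumption based on the description. It only shows why an exact comparison against the end-of-stream marker fails when a trailing newline is present, and how .rstrip() avoids that.

```python
import json
from typing import Optional

# Assumed wire format of the end-of-stream marker (not taken from the library source).
DONE = b"data: [DONE]"


def parse_line(byte_payload: bytes, strip: bool) -> Optional[dict]:
    """Hypothetical stand-in for the stream parser.

    Returns the decoded JSON chunk, or None when the end-of-stream marker is seen.
    """
    if strip:
        # The suggested fix: drop trailing whitespace before comparing.
        byte_payload = byte_payload.rstrip()
    if byte_payload == DONE:
        return None
    text = byte_payload.decode("utf-8")
    # Drop the "data:" prefix and parse the remainder as JSON.
    return json.loads(text[len("data:"):])


if __name__ == "__main__":
    sync_line = b"data: [DONE]"     # what the regular client is reported to receive
    async_line = b"data: [DONE]\n"  # what the async client is reported to receive

    assert parse_line(sync_line, strip=False) is None
    try:
        parse_line(async_line, strip=False)
    except json.JSONDecodeError:
        print("async line without rstrip: JSON parsing fails")
    assert parse_line(async_line, strip=True) is None
```

Regular chunks are unaffected by the extra .rstrip(), since json.loads already tolerates surrounding whitespace; only the exact-match check on the marker is sensitive to the newline.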
Edit https://github.com/huggingface/huggingface_hub/blob/e9cd695d7bd9e81b4eceb8f4da557a0cfa387b99/tests/cassettes/test_async_chat_completion_with_stream.yaml by adding "[DONE]" to the end of the body, then run the test in https://github.com/huggingface/huggingface_hub/blob/e9cd695d7bd9e81b4eceb8f4da557a0cfa387b99/tests/test_inference_async_client.py (test_async_chat_completion_with_stream will fail).

Logs
No response
System info