Passing a HF endpoint URL to client.chat_completion() doesn't seem to work anymore #2484
Comments
Hi @MoritzLaurer, thanks for reporting. It should work if you pass the URL as …
In the meantime you can just do: `client = InferenceClient(base_url=API_URL)` or `client = InferenceClient(API_URL + "/v1/chat/completions")`
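A minimal sketch of that workaround (the endpoint URL is a placeholder, and this assumes a `huggingface_hub` version where `InferenceClient` accepts `base_url`):

```python
from huggingface_hub import InferenceClient

# Hypothetical endpoint URL; replace with your own Inference Endpoint.
API_URL = "https://my-endpoint.eu-west-1.aws.endpoints.huggingface.cloud"

# Both forms route chat requests to <API_URL>/v1/chat/completions:
client = InferenceClient(base_url=API_URL)
# client = InferenceClient(API_URL + "/v1/chat/completions")

def chat(prompt: str):
    # Chat request against the configured base URL; no `model=` needed.
    return client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=100,
    )
```

With `base_url` set on the client, each `chat_completion` call targets the endpoint directly instead of resolving a model id on the Hub.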
I wanted to raise a related issue. I posted in langchain previously but got no response so far from the maintainers. I believe this is a huggingface-hub issue, so I'm posting it here again: langchain-ai/langchain#24720. I get the same 422 error when using the ChatHuggingFace and HuggingFaceEndpoint APIs in langchain-huggingface. The error occurs with huggingface-hub==0.24.6, but not when I downgrade to 0.24.0.
Also: langchain-ai/langchain#25675. There also seems to be a binding issue with parameters, but most likely a … And my nightmare with dealing with HuggingFace dedicated endpoints: langchain-ai/langchain#25675 (reply in thread). (It appears the JSON schema is sent as a tool option?) The …
I am getting the same error:
Hi @Saisri534, thanks for reporting this! Looks like you're running into a different bug than what this issue is about. Could you open up a new issue for that? 😄
Hi @hanouticelina, here is my code (the bodies were lost in the page capture):

```
messages = [ … ]
response_format = { … }
response = client.chat_completion( … )
print(response.choices[0].message.content)
```

output while using mistral: … error while using Falcon: …
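The commenter's actual schema and model are not captured above; the following is a hedged sketch of the structured-output pattern the truncated snippet appears to use. The `response_format` shape below is the TGI-style grammar dict, and the schema contents are an assumption:

```python
# Hypothetical schema; the commenter's actual response_format is not shown above.
response_format = {
    "type": "json",
    "value": {
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
}

messages = [
    {"role": "user", "content": "Reply with a JSON object containing an 'answer' key."},
]

# With an InferenceClient `client` (as in the snippets above), the call would be:
# response = client.chat_completion(
#     messages=messages,
#     response_format=response_format,
#     max_tokens=100,
# )
# print(response.choices[0].message.content)
```

Whether the backend supports this grammar constraint depends on the deployed model and TGI version, which may explain why one model works and another errors.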
@Saisri534 can you summarize this in a new separate issue please? Thanks! https://github.com/huggingface/huggingface_hub/issues/new
@Wauplin @hanouticelina Could you please help look into the issue that I described? I got this error when using huggingface-hub==0.24.6, but no such error when I downgrade to 0.24.0. Thanks!
Hi @minmin-intel, sorry for the unresponsiveness. This issue should have been fixed by #2540, which just got merged. I'm not sure your issue is related, so could you try installing from source and testing whether your error still happens? If it does, can you open a new issue on this repo? Thanks in advance!
Describe the bug
I could previously use the following code to call the inference client, and it worked (e.g. in this cookbook recipe for the HF endpoints):
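The original snippet was not captured in this page; below is a hedged sketch of the pattern the issue title describes, passing the Inference Endpoint URL via the `model` argument of `chat_completion`. The URL and token are placeholders, not the recipe's actual values:

```python
from huggingface_hub import InferenceClient

# Placeholders; the recipe's real endpoint URL and token are not shown here.
API_URL = "https://my-endpoint.eu-west-1.aws.endpoints.huggingface.cloud"
client = InferenceClient(token="hf_xxx")

def ask(prompt: str):
    # The endpoint URL is passed per-call via `model=` -- the usage this issue is about.
    return client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        model=API_URL,
        max_tokens=100,
    )
```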
This code now results in this error:
(Additional observation: if the endpoint is scaled to zero, the code initially works, waking the endpoint up, but once the endpoint is running, the error is thrown.)
I still get correct outputs via raw HTTP requests, so it doesn't seem to be an issue with the endpoint or my token.
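For reference, a plain-HTTP sketch of the kind of request that still works (stdlib only; the URL and token are placeholders, and the `/v1/chat/completions` route is assumed to be the endpoint's OpenAI-compatible chat route):

```python
import json
import urllib.request

# Hypothetical values; substitute your endpoint URL and token.
API_URL = "https://my-endpoint.eu-west-1.aws.endpoints.huggingface.cloud"
HF_TOKEN = "hf_xxx"

payload = {
    "model": "tgi",  # dedicated endpoints serve a single model, so this value is nominal
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100,
}

def query():
    # POST the chat payload directly, bypassing InferenceClient entirely.
    req = urllib.request.Request(
        API_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {HF_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

If this request succeeds while the client call fails, the regression is in how the client builds the request, not in the endpoint itself.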
Reproduction
No response
Logs
No response
System info