Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: 'name' field on messages not working when using Mistral.AI's API #2748

Closed
marklysze opened this issue May 21, 2024 · 6 comments
Closed
Labels
models Pertains to using alternate, non-GPT, models (e.g., local models, llama, etc.)

Comments

@marklysze
Copy link
Collaborator

Describe the bug

Mistral.AI's API is strict on what keys are allowed on the messages.

When running a GroupChat, the name key on each message is causing an exception when using the Mistral.AI API through the standard client:

openai.UnprocessableEntityError: Error code: 422 - {'object': 'error', 'message': {'detail': [{'type': 'extra_forbidden', 'loc': ['body', 'messages', 1, 'user', 'name'], 'msg': 'Extra inputs are not permitted', 'input': 'User_proxy'}]}, 'type': 'invalid_request_error', 'param': None, 'code': None}

If the name key is removed from each message no exception is thrown.

Mistral.AI API's documentation on message:
image

Creating this issue in relation to testing from PR #2635, specific messages 1 and 2


Would be good to have an endpoint that developers could use for Mistral.AI, and other non-OpenAI inference, where the name isn't accepted/used.

On a side-note, my testing shows that the name field isn't being used for Ollama, LiteLLM+Ollama, or together.ai.

Steps to reproduce

No response

Model Used

No response

Expected Behavior

No response

Screenshots and logs

No response

Additional Information

No response

@marklysze marklysze added the bug label May 21, 2024
@ekzhu ekzhu added the models Pertains to using alternate, non-GPT, models (e.g., local models, llama, etc.) label May 22, 2024
@ekzhu
Copy link
Collaborator

ekzhu commented May 22, 2024

A possible solution is to have a built-in model client for Mistral AI's API. It's a thin client that does a simple pop of the name key. See example of model client: https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-anthropic

@marklysze
Copy link
Collaborator Author

A possible solution is to have a built-in model client for Mistral AI's API. It's a thin client that does a simple pop of the name key. See example of model client: https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-anthropic

Thanks @ekzhu, for Mistral, they have their own library, like the Anthropic one, so would you see this using that library or the openai library?

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

@ekzhu
Copy link
Collaborator

ekzhu commented May 22, 2024

I think using their library as an optional dependency makes sense. So when api_type=mistral_ai, it is going to check if the mistral client library has been installed if not then raise runtime error.

@marklysze
Copy link
Collaborator Author

I think using their library as an optional dependency makes sense. So when api_type=mistral_ai, it is going to check if the mistral client library has been installed if not then raise runtime error.

Okay, I've started creating a Mistral client model class.

@marklysze marklysze mentioned this issue Jun 8, 2024
3 tasks
@marklysze
Copy link
Collaborator Author

Hey @ekzhu, I've created a Mistral Client class. This class does cater for the removal of the name field. PR #2892.

@marklysze
Copy link
Collaborator Author

Yay, Mistral client class merged - please use api_type='mistral' in your configs to use it :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
models Pertains to using alternate, non-GPT, models (e.g., local models, llama, etc.)
Projects
None yet
Development

No branches or pull requests

2 participants