
[Problem]: Groupchat fails to run with Mistral models due the name field of message not accepted. #2457

Closed
Tracked by #2012
rrrozhd opened this issue Apr 19, 2024 · 10 comments
Labels
group chat/teams group-chat-related issues

Comments

@rrrozhd

rrrozhd commented Apr 19, 2024

Describe the bug

Apparently I cannot use Mistral models in a group chat setting. When a Mistral model is set as the manager itself (using 'auto' speaker selection), it fails with openai.BadRequestError: Error code: 400 'Expected last role to be one of: [user, tool] but got system'. When it is instead used as one of the agents inside a GroupChat class, it raises openai.UnprocessableEntityError: Error code: 422 'Extra inputs are not permitted'. After some digging, I found that the Mistral API does not accept any keys in the messages dictionaries other than 'role' and 'content', whereas OpenAI supports an optional 'name' parameter. After commenting out these lines of code in the GroupChat class, it no longer errors out with 422; however, I still haven't gotten around the first error. Question: is the 'name' parameter safe to remove if I wish to keep using the Mistral API? Thanks in advance!
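
As a rough illustration of the workaround described above, a pre-send filter can drop any message keys the Mistral API rejects. This is only a sketch: the helper name `strip_unsupported_keys` and the allowed-key set are hypothetical, not part of AutoGen.

```python
# Sketch of a pre-send filter for Mistral's stricter message schema.
# Mistral's chat endpoint rejects OpenAI's optional 'name' key with a
# 422 error, so only keys known to be accepted are kept here.
# The helper and key set are hypothetical, not AutoGen API.
MISTRAL_ALLOWED_KEYS = {"role", "content", "tool_calls", "tool_call_id"}

def strip_unsupported_keys(messages):
    """Return a copy of messages with only Mistral-accepted keys kept."""
    return [
        {k: v for k, v in msg.items() if k in MISTRAL_ALLOWED_KEYS}
        for msg in messages
    ]

cleaned = strip_unsupported_keys(
    [{"role": "user", "content": "hi", "name": "society_of_mind_coder"}]
)
# 'name' is dropped; 'role' and 'content' survive
```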

Steps to reproduce

import autogen 
import tempfile
from autogen.coding import LocalCommandLineCodeExecutor
from autogen.agentchat.contrib.society_of_mind_agent import SocietyOfMindAgent  


temp_dir = tempfile.TemporaryDirectory()


executor = LocalCommandLineCodeExecutor(
    timeout=10, 
    work_dir='coding', 
)

config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
)

llm_config = {'config_list': config_list}

 
code_writer = autogen.AssistantAgent(
    "Code_Writer",
    system_message='''
    You are the Code_Writer agent, specializing in crafting solutions through code. Follow these steps:
    1. Briefly analyze the task at hand.
    2. Develop a step-by-step plan outlining how to tackle the task. This plan should be conceptual and not include any code.
    3. Transform your plan into executable Python code. Include comments within your code to explain your thought process and any specific implementation details.
    4. If the task involves generating visualizations or files, ensure your code saves these to a specified path instead of attempting to display or output them directly.
    End your response with the code block, clearly separating your planning notes from the actual code using comments. Example:

    # Step-by-Step Plan:
    # 1. Load the data from source X
    # 2. Process data to format Y
    # 3. Analyze data for Z
    # 4. Save output to path A

    ***Python code here***
    ''',
    llm_config=llm_config,
    is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)

code_executor = autogen.UserProxyAgent(
    "Code_Executor",
    human_input_mode="NEVER",
    code_execution_config={"executor": executor},
    default_auto_reply="",
    llm_config=False,
    # is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)

presenter = autogen.AssistantAgent(
    "Presenter",
    system_message='''
    You are the Presenter agent. Your primary role is to interpret and summarize the results produced by executing code. Follow these guidelines:
    1. Review the output from the Code_Executor carefully.
    2. Provide a clear, concise summary of what the executed code accomplished, including key results and any file outputs. Highlight important findings and insights derived from the code execution.
    3. Your summary should be understandable even to those not familiar with the technical details of the implementation.
    4. Conclude your summary with "TERMINATE" to indicate that your presentation of results is complete and ready for the user.

    Remember, your goal is to bridge the gap between code execution and user understanding, presenting the information in an accessible and informative manner.
    ''',
    human_input_mode="NEVER",
    code_execution_config={"executor": executor},
    default_auto_reply="",
    is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)

def state_transition(last_speaker, groupchat):
    messages = groupchat.messages

    if last_speaker is code_writer:
        return code_executor
    elif last_speaker is code_executor:
        if messages[-1]["content"] == "exitcode: 0":
            return presenter
        if messages[-1]["content"] == "exitcode: 1":
            return code_executor
    elif last_speaker is presenter:
        return None
    else:
        return 'auto'


coder_groupchat = autogen.GroupChat(
    agents=[code_writer, code_executor, presenter],
    messages=[],
    speaker_selection_method=state_transition,
    allow_repeat_speaker=False,
    max_round=8,
)

coder_group_manager = autogen.GroupChatManager(
    name='Manager',
    groupchat=coder_groupchat,
    is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
    llm_config=llm_config,
)


task = "Plot the returns of Microsoft vs Nvidia stock"

som_coder = SocietyOfMindAgent(
    "society_of_mind_coder",
    chat_manager=coder_group_manager,
    llm_config=llm_config,
)


user_proxy = autogen.UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    default_auto_reply="",
    is_termination_msg=lambda x: True,
)

user_proxy.initiate_chat(som_coder, message=task)

Model Used

mistral-large-latest

Expected Behavior

No response

Screenshots and logs

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/contrib/society_of_mind_agent.py", line 191, in generate_inner_monologue_reply
self.initiate_chat(self.chat_manager, message=messages[-1], clear_history=False)
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py", line 973, in initiate_chat
self.send(msg2send, recipient, silent=silent)
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py", line 620, in send
recipient.receive(message, self, request_reply, silent)
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py", line 779, in receive
reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py", line 1862, in generate_reply
final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/groupchat.py", line 614, in run_chat
speaker = groupchat.select_speaker(speaker, self)
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/groupchat.py", line 430, in select_speaker
final, name = selector.generate_oai_reply(messages)
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py", line 1261, in generate_oai_reply
extracted_response = self._generate_oai_reply_from_client(
File "/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py", line 1280, in _generate_oai_reply_from_client
response = llm_client.create(
File "/usr/local/lib/python3.10/dist-packages/autogen/oai/client.py", line 625, in create
response = client.create(params)
File "/usr/local/lib/python3.10/dist-packages/autogen/oai/client.py", line 278, in create
response = completions.create(**params)
File "/usr/local/lib/python3.10/dist-packages/openai/_utils/_utils.py", line 275, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/openai/resources/chat/completions.py", line 581, in create
return self._post(
File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1234, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 922, in request
return self._request(
File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1014, in _request
raise self._make_status_error_from_response(err.response) from None
openai.UnprocessableEntityError: Error code: 422 - {'object': 'error', 'message': {'detail': [{'type': 'extra_forbidden', 'loc': ['body', 'messages', 1, 'user', 'name'], 'msg': 'Extra inputs are not permitted', 'input': 'society_of_mind_coder', 'url': 'https://errors.pydantic.dev/2.6/v/extra_forbidden'}]}, 'type': 'invalid_request_error', 'param': None, 'code': None}

Additional Information

pyautogen=='0.2.26'

@rrrozhd rrrozhd added the bug label Apr 19, 2024
@ekzhu
Collaborator

ekzhu commented Apr 20, 2024

Thanks for the issue. For the second error regarding system role see: https://microsoft.github.io/autogen/docs/topics/non-openai-models/best-tips-for-nonopenai-models#chat-template

For the first error regarding the "name" field, can you update this issue's title to indicate that this is the problem? Are you using the Mistral AI API? You could open a PR to add an option to the GroupChat class to customize how the name field is handled when creating the select-speaker messages: (1) use the OpenAI-style name field, (2) ignore it, (3) prepend it to each message, etc.
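
The three options could be sketched as a single message transform. Everything below (the function name, the strategy labels) is hypothetical and not an existing AutoGen parameter:

```python
def format_speaker_name(messages, strategy="openai"):
    """Hypothetical sketch of the three name-handling options:
    'openai'  - keep the OpenAI-style name field as-is,
    'ignore'  - drop the name field entirely,
    'prepend' - fold the name into the content, then drop it."""
    out = []
    for msg in messages:
        msg = dict(msg)  # shallow copy so the caller's list is untouched
        name = msg.get("name")
        if strategy == "ignore":
            msg.pop("name", None)
        elif strategy == "prepend" and name:
            msg["content"] = f"{name}: {msg.get('content', '')}"
            msg.pop("name", None)
        out.append(msg)
    return out

msgs = [{"role": "user", "content": "hi", "name": "Code_Writer"}]
```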

@marklysze would like your insight into this first error.

@ekzhu ekzhu added the group chat/teams group-chat-related issues label Apr 20, 2024
@marklysze
Collaborator

Hi @turgor111 and @ekzhu...

Okay, I've run through the code with the Mistral AI and am experiencing the issues as noted.

Here is my understanding of the issues.

"system" role

The Mistral AI API expects the role of the last message to be either user or tool.

Adding role_for_select_speaker_messages='user' as a parameter when instantiating the GroupChat object does fix this for the group chat's select-speaker step.

coder_groupchat = autogen.GroupChat(
    agents=[code_writer, code_executor, presenter],
    messages=[],
    speaker_selection_method=state_transition,
    allow_repeat_speaker=False,
    max_round=8,
    role_for_select_speaker_messages='user',
)

Unfortunately, the SocietyOfMindAgent class in society_of_mind_agent.py uses system as the role in a couple of places and that's causing the API crash.

Looking through the codebase, there are quite a few locations where "system" is used so I think to address this issue we need to expand the scope beyond GroupChat.

I'm wondering whether a higher-level, client/LLM-config-based setting could replace 'system' roles with a user-defined value or function?
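
One possible shape for such a setting is a hook that rewrites roles just before the request goes out. This is a sketch under that assumption; `remap_roles` and its mapping are illustrative, not an existing config option:

```python
def remap_roles(messages, role_map=None):
    """Rewrite message roles per a user-defined mapping, e.g. so that
    'system' messages become 'user' messages for APIs that reject a
    trailing system role. Hook name and mapping are hypothetical."""
    role_map = role_map or {"system": "user"}
    return [
        {**msg, "role": role_map.get(msg.get("role"), msg.get("role"))}
        for msg in messages
    ]

fixed = remap_roles([{"role": "system", "content": "You are a helper."}])
```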

"name" key on messages

Googling around, I found Mistral AI's changelog entry from January noting that they used to silently ignore unsupported parameters but now throw an error:

Jan. 11, 2024

We have enhanced the API's strictness. Previously the API would silently ignores unsupported parameters in the requests, but it now strictly enforces the validity of all parameters. If you have unsupported parameters in your request, you will see the error message "Extra inputs are not permitted".

I like @ekzhu's suggestion of implementing a change that would give some flexibility around the use of the "name" key. I do think it would be worth considering putting this at a higher level, since every outgoing message could need this handling.

On a side note, this raises an important point about the "name" parameter: it's not clear to me whether it's actually used outside of OpenAI. In other testing I found that prepending the agent's name to the content helped with speaker selection. So the implementation ideas @ekzhu suggested may also improve effectiveness with non-OpenAI models.

@rrrozhd rrrozhd changed the title [Bug]: Groupchat fails to run with Mistral models [Probelm]: Groupchat fails to run with Mistral models Apr 20, 2024
@rrrozhd rrrozhd changed the title [Probelm]: Groupchat fails to run with Mistral models [Problem]: Groupchat fails to run with Mistral models Apr 20, 2024
@rrrozhd
Author

rrrozhd commented Apr 20, 2024

(quoting @ekzhu's reply above in full)

Hi @marklysze, @ekzhu, I am using the Mistral API. Do the Agent or GroupChat objects hold some reference to the model name under the hood? Otherwise, this logic may require introducing additional properties or arguments to the method itself...

@rrrozhd
Author

rrrozhd commented Apr 20, 2024

(quoting @marklysze's comment above in full)

Thanks for the feedback @marklysze. I've read their change log, but the funny thing is that I ran the exact same code about two weeks ago and everything worked as intended. My only guess is that they've started enforcing parameters a little later than reflected in their logs :D

@ekzhu
Collaborator

ekzhu commented Apr 21, 2024

@turgor111 are you interested in creating a PR to add option for how to specify speaker name in message?

@rrrozhd
Author

rrrozhd commented Apr 22, 2024

@turgor111 are you interested in creating a PR to add option for how to specify speaker name in message?

Hey there @ekzhu ! Unfortunately I don't have a lot of time to implement massive fixes, however I'll share the fix I'm using anyway and if it suits your requirements I'd be happy to :)

I'm now using this option in my project. I'm not sure what your needs are in terms of efficiency (it may not be the most efficient solution), but looking through the codebase, I've found it's the quickest and easiest fix to implement. When it comes to the GroupChat and ConversableAgent objects (for instance, if we wanted an additional property to implement conditional logic inside append), there is, as far as I can see, no way to retrieve definite information about the model being used for the completion. On top of that, additional logic would be needed to check which model is the current receiver of the messages list, to avoid making a bad request. That's why I think the best thing to do here (at least for now) is to check the model type at the time of making the request, and clean the messages list of any extra parameters depending on the model being called.
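
The request-time check described above might look like the following. The dispatch on the model-name prefix and the helper itself are illustrative assumptions, not the poster's actual patch:

```python
def clean_messages_for_model(model, messages):
    """At request time, drop message keys the target API rejects.
    Only Mistral models are special-cased here; matching on the
    model-name prefix is an assumption for illustration."""
    if model.startswith("mistral"):
        allowed = {"role", "content"}
        return [
            {k: v for k, v in msg.items() if k in allowed}
            for msg in messages
        ]
    return messages

mistral_msgs = clean_messages_for_model(
    "mistral-large-latest",
    [{"role": "user", "content": "hi", "name": "Manager"}],
)
openai_msgs = clean_messages_for_model(
    "gpt-4",
    [{"role": "user", "content": "hi", "name": "Manager"}],
)
```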

Another thing to note is the first error I've encountered is still in place, but I've just switched to an OpenAI model for the group chat manager, which fixes it.

Also, as far as I've seen, Mistral is currently one of the only providers, if not the only one, that strictly enforces parameters on the chat completion request, so in the meantime I might look into other providers altogether to avoid dealing with these interoperability issues.

@ekzhu
Collaborator

ekzhu commented Apr 22, 2024

Thanks @turgor111. I think if we were to add it to the release, the option should be surfaced at the GroupChat constructor level.

@ekzhu ekzhu changed the title [Problem]: Groupchat fails to run with Mistral models [Problem]: Groupchat fails to run with Mistral models due the name field of message not accepted. Apr 22, 2024
@marklysze
Collaborator

An update - the Mistral client class is now available (see #2892). And I have added an issue to discuss the impact and ideas to solve the absence of the name field, see #2989.

I'll close this for now, please try the new Mistral client class and continue the discussion on the name field in #2989.

@jhachirag7

openai.BadRequestError: Error code: 400 - {'type': 'urn:inference-service:problem-details:bad-request', 'title': 'Bad Request', 'status': 400, 'detail': "[{'type': 'extra_forbidden', 'loc': ('body', 'tools'), 'msg': 'Extra inputs are not permitted', 'input': [{'type': 'function', 'function': {'description': 'research about a given topic, return the research material including reference links', 'name': 'research', 'parameters': {'type': 'object', 'properties': {'query': {'type': 'string', 'description': 'The topic to be researched about'}}, 'required': ['query']}}}, {'type': 'function', 'function': {'description': 'rite content based on the given research material & topic', 'name': 'write_content', 'parameters': {'type': 'object', 'properties': {'research_material': {'type': 'string', 'description': 'research material of a given topic, including reference links when available'}, 'topic': {'type': 'string', 'description': 'The topic of the content'}}, 'required': ['research_material', 'topic']}}}], 'url': 'https://errors.pydantic.dev/2.6/v/extra_forbidden'}]", 'instance': '/v2/nvcf/pexec/functions/767b5b9a-3f9d-4c1d-86e8-fa861988cee7', 'requestId': 'a489f3b1-c962-45cb-af41-833f4f281a45'}

I got a similar error; the model I am using is mistralai/mistral-large.

@marklysze
Collaborator

i got similar error, model i am using is mistralai/mistral-large

If you haven't tried the new AutoGen Mistral AI client class, please give it a try:

https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-mistralai/
