
[Issue]: Unable to run notebook: Agent Chat with Multimodal Models - GPT-4V #965

Closed
antoan opened this issue Dec 13, 2023 · 4 comments
@antoan

antoan commented Dec 13, 2023

Describe the issue

When executing cell 3 of https://github.com/microsoft/autogen/blob/main/notebook/agentchat_lmm_gpt-4v.ipynb


User_proxy (to image-explainer):

What's the breed of this dog?
<image>.

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
/home/antoan/dev/ai/anomaly-detection/agentchat_lmm_gpt-4v.ipynb Cell 7 line 1
      7 user_proxy = autogen.UserProxyAgent(
      8     name="User_proxy",
      9     system_message="A human admin.",
     10     human_input_mode="NEVER", # Try between ALWAYS or NEVER
     11     max_consecutive_auto_reply=0
     12 )
     14 # Ask the question with an image
---> 15 user_proxy.initiate_chat(image_agent,
     16                          message="""What's the breed of this dog?
     17 <img https://th.bing.com/th/id/R.422068ce8af4e15b0634fe2540adea7a?rik=y4OcXBE%2fqutDOw&pid=ImgRaw&r=0>.""")

File ~/dev/ai/anomaly-detection/.venv/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py:556, in ConversableAgent.initiate_chat(self, recipient, clear_history, silent, **context)
    542 """Initiate a chat with the recipient agent.
    543
    544 Reset the consecutive auto reply counter.
   (...)
    553         "message" needs to be provided if the `generate_init_message` method is not overridden.
    554 """
    555 self._prepare_chat(recipient, clear_history)
--> 556 self.send(self.generate_init_message(**context), recipient, silent=silent)

File ~/dev/ai/anomaly-detection/.venv/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py:354, in ConversableAgent.send(self, message, recipient, request_reply, silent)
...
--> 898     raise self._make_status_error_from_response(err.response) from None
    899 except httpx.TimeoutException as err:
    900     if response is not None:

BadRequestError: Error code: 400 - {'error': {'message': 'Invalid content type. image_url is only supported by certain models.', 'type': 'invalid_request_error', 'param': 'messages.[1].content.[1].type', 'code': None}}

The BadRequestError seems to suggest that I do not have access to the OpenAI gpt-4-vision-preview model API. However, I was able to successfully get a response directly via the OpenAI API with the following code (using a custom image):

import base64
from openai import OpenAI

client = OpenAI()


def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


encoded_string = encode_image("anomaly.png")

system_prompt = "You are an expert at analyzing images."
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "system",
            "content": [
                {"type": "text", "text": system_prompt},
            ],
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{encoded_string}"},
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this art style."},
            ],
        },
    ],
    max_tokens=1000,
)

print(response.choices[0].message.content)
print(response.usage.model_dump())

Please advise.

Steps to reproduce

Execute cells 1-3

Screenshots and logs

Screenshot from 2023-12-13 00-23-20
Screenshot from 2023-12-13 00-20-02

Additional Information

Python 3.10.12
"pyautogen[lmm]==0.2.0b4" , "pyautogen[lmm]==0.2.0 and "pyautogen[lmm]==0.2.2 all yield the same error.
Ubuntu 20.04

@julianakiseleva
Contributor

@BeibinLi can you please take a look?

@BeibinLi
Collaborator

Thanks! Let me take a deeper look. I tried to rerun the notebook but did not encounter any problems.

One possibility is the "llm_config" parameter for the image_agent, which should use the "config_list_4v" OAI setting. After cell 3 fails, can you try running the following code?

print(image_agent.llm_config)
assert image_agent.llm_config["config_list"][0]["model"] == 'gpt-4-vision-preview'

If the output of the above code looks OK, can you try creating image_agent in the following way?

image_agent = MultimodalConversableAgent(
    name="image-explainer",
    max_consecutive_auto_reply=10,
    llm_config={
        "config_list": config_list_4v,
        "temperature": 0.5,
        "max_tokens": 300,
        "model": "gpt-4-vision-preview",
    },
)

@julianakiseleva
Contributor

Thanks @BeibinLi! @antoan, did it help?

@antoan
Author

antoan commented Dec 15, 2023

@BeibinLi @julianakiseleva

Thank you very much for the suggestion:

The output from cell 3, after I added the first code block (the print and assert statements), is:

[autogen.oai.client: 12-15 22:59:50] {74} WARNING - openai client was provided with an empty config_list, which may not be intended.
{'model': 'gpt-4', 'config_list': [], 'temperature': 0.5, 'max_tokens': 300}
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/home/antoan/dev/ai/anomaly-detection/agentchat_lmm_gpt-4v.ipynb Cell 7 line 8
      1 image_agent = MultimodalConversableAgent(
      2     name="image-explainer",
      3     max_consecutive_auto_reply=10,
      4     llm_config={"config_list": config_list_4v, "temperature": 0.5, "max_tokens": 300, }
      5 )
      7 print(image_agent.llm_config)
----> 8 assert image_agent.llm_config["config_list"][0]["model"] == 'gpt-4-vision-preview'
     10 user_proxy = autogen.UserProxyAgent(
     11     name="User_proxy",
     12     system_message="A human admin.",
     13     human_input_mode="NEVER", # Try between ALWAYS or NEVER
     14     max_consecutive_auto_reply=0
     15 )
     17 # Ask the question with an image

IndexError: list index out of range

It seems that the config is not being picked up from the OAI_CONFIG_LIST file I have in my notebook project. I'm not sure why, as this has worked in my previous experiments with autogen; perhaps it is something to do with my Jupyter setup.

I was able to resolve this by adding "model": "gpt-4-vision-preview" to the llm_config, as suggested in your second code block.
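
For anyone hitting the same empty config_list, another possible workaround is to build the list inline instead of relying on the OAI_CONFIG_LIST file lookup. A minimal sketch, assuming the API key is exported as an environment variable:

import os
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent

# Hypothetical inline config; bypasses OAI_CONFIG_LIST file discovery entirely.
config_list_4v = [
    {
        "model": "gpt-4-vision-preview",
        "api_key": os.environ["OPENAI_API_KEY"],  # assumes the key is set in the environment
    }
]

image_agent = MultimodalConversableAgent(
    name="image-explainer",
    max_consecutive_auto_reply=10,
    llm_config={"config_list": config_list_4v, "temperature": 0.5, "max_tokens": 300},
)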

Thanks again.
