
[Issue]: Unable to run notebook: Agent Chat with Multimodal Models - GPT-4V #965

Closed
antoan opened this issue Dec 13, 2023 · 4 comments
@antoan

antoan commented Dec 13, 2023

Describe the issue

When executing cell 3 of https://github.com/microsoft/autogen/blob/main/notebook/agentchat_lmm_gpt-4v.ipynb


User_proxy (to image-explainer):

What's the breed of this dog?
<image>.

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
/home/antoan/dev/ai/anomaly-detection/agentchat_lmm_gpt-4v.ipynb Cell 7 line 1
      7 user_proxy = autogen.UserProxyAgent(
      8     name="User_proxy",
      9     system_message="A human admin.",
     10     human_input_mode="NEVER", # Try between ALWAYS or NEVER
     11     max_consecutive_auto_reply=0
     12 )
     14 # Ask the question with an image
---> 15 user_proxy.initiate_chat(image_agent,
     16                          message="""What's the breed of this dog?
     17 <img https://th.bing.com/th/id/R.422068ce8af4e15b0634fe2540adea7a?rik=y4OcXBE%2fqutDOw&pid=ImgRaw&r=0>.""")

File ~/dev/ai/anomaly-detection/.venv/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py:556, in ConversableAgent.initiate_chat(self, recipient, clear_history, silent, **context)
    542 """Initiate a chat with the recipient agent.
    543
    544 Reset the consecutive auto reply counter.
   (...)
    553         "message" needs to be provided if the `generate_init_message` method is not overridden.
    554 """
    555 self._prepare_chat(recipient, clear_history)
--> 556 self.send(self.generate_init_message(**context), recipient, silent=silent)

File ~/dev/ai/anomaly-detection/.venv/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py:354, in ConversableAgent.send(self, message, recipient, request_reply, silent)
...
--> 898     raise self._make_status_error_from_response(err.response) from None
    899 except httpx.TimeoutException as err:
    900     if response is not None:

BadRequestError: Error code: 400 - {'error': {'message': 'Invalid content type. image_url is only supported by certain models.', 'type': 'invalid_request_error', 'param': 'messages.[1].content.[1].type', 'code': None}}

The BadRequestError seems to suggest that I do not have access to the OpenAI gpt-4-vision-preview model API. However, I was able to successfully get a response directly via the OpenAI API with the following code (using a custom image):

import base64
from openai import OpenAI

client = OpenAI()


def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


encoded_string = encode_image("anomaly.png")

system_prompt = "You are an expert at analyzing images."
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "system",
            "content": [
                {"type": "text", "text": system_prompt},
            ],
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{encoded_string}"},
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this art style."},
            ],
        },
    ],
    max_tokens=1000,
)

print(response.choices[0].message.content)
print(response.usage.model_dump())

Please advise.

Steps to reproduce

Execute cells 1-3

Screenshots and logs

Screenshot from 2023-12-13 00-23-20
Screenshot from 2023-12-13 00-20-02

Additional Information

Python 3.10.12
"pyautogen[lmm]==0.2.0b4" , "pyautogen[lmm]==0.2.0 and "pyautogen[lmm]==0.2.2 all yield the same error.
Ubuntu 20.04

@julianakiseleva
Contributor

@BeibinLi can you please take a look?

@BeibinLi
Collaborator

Thanks! Let me take a deeper look. I tried to rerun the notebook but did not encounter any problems.

One possibility is the "llm_config" parameter for the image_agent, which should use the "config_list_4v" OAI setting. After cell 3 fails, can you try running the following code?

print(image_agent.llm_config)
assert image_agent.llm_config["config_list"][0]["model"] == 'gpt-4-vision-preview'

If the output of the above code looks OK, can you try creating image_agent in the following way?

image_agent = MultimodalConversableAgent(
    name="image-explainer",
    max_consecutive_auto_reply=10,
    llm_config={
        "config_list": config_list_4v,
        "temperature": 0.5,
        "max_tokens": 300,
        "model": "gpt-4-vision-preview",
    },
)

@julianakiseleva
Contributor

Thanks @BeibinLi! @antoan, did it help?

@antoan
Author

antoan commented Dec 15, 2023

@BeibinLi @julianakiseleva

Thank you very much for the suggestion:

The output from cell 3, after I added the first code block (the print and assert statements), is:

[autogen.oai.client: 12-15 22:59:50] {74} WARNING - openai client was provided with an empty config_list, which may not be intended.
{'model': 'gpt-4', 'config_list': [], 'temperature': 0.5, 'max_tokens': 300}
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/home/antoan/dev/ai/anomaly-detection/agentchat_lmm_gpt-4v.ipynb Cell 7 line 8
      1 image_agent = MultimodalConversableAgent(
      2     name="image-explainer",
      3     max_consecutive_auto_reply=10,
      4     llm_config={"config_list": config_list_4v, "temperature": 0.5, "max_tokens": 300, }
      5 )
      7 print(image_agent.llm_config)
----> 8 assert image_agent.llm_config["config_list"][0]["model"] == 'gpt-4-vision-preview'
     10 user_proxy = autogen.UserProxyAgent(
     11     name="User_proxy",
     12     system_message="A human admin.",
     13     human_input_mode="NEVER", # Try between ALWAYS or NEVER
     14     max_consecutive_auto_reply=0
     15 )
     17 # Ask the question with an image

IndexError: list index out of range

It seems that the config is not being picked up from the OAI_CONFIG_LIST file I have in my notebook project. I'm not sure why, as this has worked in my previous experiments with autogen; perhaps it is something to do with my Jupyter setup.

I was able to resolve this by adding "model": "gpt-4-vision-preview" to the llm_config, as suggested in your second code block.
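
For anyone hitting the same empty config_list, another possible workaround is to build the list inline instead of relying on the OAI_CONFIG_LIST file lookup. A minimal sketch, assuming the API key is exported as an environment variable:

import os
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent

# Hypothetical inline config; bypasses OAI_CONFIG_LIST file discovery entirely.
config_list_4v = [
    {
        "model": "gpt-4-vision-preview",
        "api_key": os.environ["OPENAI_API_KEY"],  # assumes the key is set in the environment
    }
]

image_agent = MultimodalConversableAgent(
    name="image-explainer",
    max_consecutive_auto_reply=10,
    llm_config={"config_list": config_list_4v, "temperature": 0.5, "max_tokens": 300},
)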

Thanks again.
