PIL Image conversion issues with Gemini API Parts #5033

Closed
xuefei-wang opened this issue Jan 14, 2025 · 7 comments
@xuefei-wang

xuefei-wang commented Jan 14, 2025

What happened?

I ran into a type error when passing images to the Gemini API. The images are converted to PIL Images but never to Part objects, so building the Gemini message fails.

I created this conversion function (see below) that works for my use case and added it to autogen/oai/gemini.py. Just wanted to post it in case anyone needs it.

What did you expect to happen?

Error message:

TypeError: Parameter to MergeFrom() must be instance of same class: expected Part got PIL.PngImagePlugin.PngImageFile.

How can we reproduce it (as minimally and precisely as possible)?

from dotenv import load_dotenv

load_dotenv()

import os
from autogen import UserProxyAgent
from autogen.agentchat.contrib.multimodal_conversable_agent import (
    MultimodalConversableAgent,
)


visual_critic_agent = MultimodalConversableAgent(
    "visual_critic_agent",
    llm_config={
        "config_list": [
            {
                "model": "gemini-1.5-flash",
                "api_key": os.environ["GEMINI_API_KEY"],
                "api_type": "google",
            }
        ],
        "cache_seed": None,
    },
)

user_agent = UserProxyAgent(
    "user_agent", human_input_mode="ALWAYS", max_consecutive_auto_reply=0
)


user_agent.initiate_chat(
    visual_critic_agent,
    message="""Please tell me what is in this image?
    <img https://goldenmeadowsretrievers.com/wp-content/uploads/2023/08/golden-retriever-dog-with-newborn-golden-retriever.jpg>
""",
)

AutoGen version

0.4.1

Which package was this bug in

Core

Model used

gemini

Python version

No response

Operating system

No response

Any additional info you think would be helpful for fixing this bug

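from io import BytesIO

# Assumed import path for Blob and Part (google-ai-generativelanguage package).
from google.ai.generativelanguage import Blob, Part
from PIL import Image


def _pil_to_part(image: Image.Image) -> Part:
    """Serialize a PIL image and wrap the bytes in a Gemini Part."""
    # Keep the image's original format when known; fall back to PNG otherwise.
    image_format = image.format or "PNG"
    byte_arr = BytesIO()
    image.save(byte_arr, format=image_format)

    blob = Blob(
        mime_type=f"image/{image_format.lower()}",
        data=byte_arr.getvalue(),
    )
    return Part(inline_data=blob)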


def _convert_pil_images_in_parts(curr_parts):
    """
    Converts any PIL Images in a list of parts to Part objects while preserving other parts.
    
    Args:
        curr_parts: List of mixed content (PIL Images and Parts)
        
    Returns:
        List where all PIL Images have been converted to Parts
    """
    updated_parts = []
    for part in curr_parts:
        if isinstance(part, Image.Image):
            updated_parts.append(_pil_to_part(part))
        else:
            updated_parts.append(part)
    return updated_parts
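
One caveat with building the MIME type as "image/" plus the lowercased format name: it may not be a registered MIME type for every format PIL can report. A minimal sketch of a more defensive lookup, using only the standard library, is below. The helper name `mime_type_for_format` and the PNG default are assumptions for illustration, not part of autogen or the Gemini SDK.

```python
import mimetypes
from typing import Optional


def mime_type_for_format(image_format: Optional[str], default: str = "image/png") -> str:
    """Map a PIL-style format name (e.g. "JPEG", "PNG") to a MIME type.

    Hypothetical helper: falls back to `default` when the format is missing
    or the standard library does not know an image/* type for it.
    """
    if not image_format:
        return default
    # Let the stdlib table resolve the type from a fake filename extension.
    guessed, _ = mimetypes.guess_type(f"image.{image_format.lower()}")
    if guessed and guessed.startswith("image/"):
        return guessed
    return default
```

If the fallback is used, the image bytes should also be saved as PNG so that the declared MIME type matches the data.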
@ekzhu
Collaborator

ekzhu commented Jan 14, 2025

Thanks for the issue.

I believe this has already been fixed in 0.4.1. The code you are showing is using the 0.2 API.

Would you like to submit a fix to the 0.2 package?

Make sure you are using autogen-agentchat and autogen-ext. See readme.

@ekzhu added the awaiting-op-response label Jan 14, 2025
@xuefei-wang
Author

I went back and double-checked - I was using the following versions:

autogen                       0.4
autogen-agentchat             0.4.1
autogen-core                  0.4.1

Following your comment, I created a new virtual env with autogen version 0.4.1 and was able to reproduce the error. I will post the error message and my pip list below. Could you take another look? Thanks!

Error message:

Traceback (most recent call last):                                                                                                                                                                          
  File "/home/xwang3/Projects/test/main.py", line 31, in <module>                                                                                                                                           
    user_agent.initiate_chat(                                                                                                                                                                               
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py", line 1106, in initiate_chat                                                       
    self.send(msg2send, recipient, silent=silent)                                                                                                                                                           
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py", line 741, in send                                                                 
    recipient.receive(message, self, request_reply, silent)                                                                                                                                                 
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py", line 906, in receive                                                              
    reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)                                                                                                                         
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py", line 2060, in generate_reply                                                      
    final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])                                                                                                    
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/agentchat/contrib/multimodal_conversable_agent.py", line 120, in generate_oai_reply                                
    response = client.create(                                                                                                                                                                               
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/oai/client.py", line 832, in create                                                                                
    response = client.create(params)                                                                                                                                                                        
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/oai/gemini.py", line 205, in create                                                                                
    gemini_messages = self._oai_messages_to_gemini_messages(messages)                                                                                                                                       
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/autogen/oai/gemini.py", line 381, in _oai_messages_to_gemini_messages                                                      
    rst.append(Content(parts=curr_parts, role=role))                                                                                                                                                        
  File "/home/xwang3/Projects/test/test-autogen-ver/lib/python3.10/site-packages/proto/message.py", line 734, in __init__                                                                                   
    super().__setattr__("_pb", self._meta.pb(**params))                                                                                                                                                     
TypeError: Parameter to MergeFrom() must be instance of same class: expected <class 'Part'> got <class 'PIL.PngImagePlugin.PngImageFile'>. 

Pip list

Package                       Version
----------------------------- -----------
aioconsole                    0.8.1
annotated-types               0.7.0
anyio                         4.8.0
autogen                       0.4.1
autogen-agentchat             0.4.1
autogen-core                  0.4.1
autogen-ext                   0.4.1
cachetools                    5.5.0
certifi                       2024.12.14
charset-normalizer            3.4.1
Deprecated                    1.2.15
diskcache                     5.6.3
distro                        1.9.0
docker                        7.1.0
docstring_parser              0.16
exceptiongroup                1.2.2
FLAML                         2.3.3
google-ai-generativelanguage  0.6.10
google-api-core               2.24.0
google-api-python-client      2.159.0
google-auth                   2.37.0
google-auth-httplib2          0.2.0
google-cloud-aiplatform       1.71.1
google-cloud-bigquery         3.27.0
google-cloud-core             2.4.1
google-cloud-resource-manager 1.14.0
google-cloud-storage          2.19.0
google-crc32c                 1.6.0
google-generativeai           0.8.3
google-resumable-media        2.7.2
googleapis-common-protos      1.66.0
grpc-google-iam-v1            0.14.0
grpcio                        1.69.0
grpcio-status                 1.69.0
h11                           0.14.0
httpcore                      1.0.7
httplib2                      0.22.0
httpx                         0.28.1
idna                          3.10
importlib_metadata            8.5.0
jiter                         0.8.2
jsonref                       1.1.0
numpy                         1.26.4
openai                        1.59.7
opentelemetry-api             1.29.0
packaging                     24.2
pillow                        11.1.0
pip                           22.0.2
proto-plus                    1.25.0
protobuf                      5.29.3
pyasn1                        0.6.1
pyasn1_modules                0.4.1
pydantic                      2.10.5
pydantic_core                 2.27.2
pyparsing                     3.2.1
python-dateutil               2.9.0.post0
python-dotenv                 1.0.1
regex                         2024.11.6
requests                      2.32.3
rsa                           4.9
setuptools                    59.6.0
shapely                       2.0.6
six                           1.17.0
sniffio                       1.3.1
termcolor                     2.5.0
tiktoken                      0.8.0
tqdm                          4.67.1
typing_extensions             4.12.2
uritemplate                   4.1.1
urllib3                       2.3.0
vertexai                      1.71.1
wrapt                         1.17.2
zipp                          3.21.0

@github-actions bot removed the awaiting-op-response label Jan 14, 2025
@jackgerrits
Member

We do not publish the autogen package. We publish the autogen-* packages.

For the code you are referencing, you need to use autogen-agentchat~=0.2, or you can upgrade to 0.4, which is the new architecture.

@xuefei-wang
Author

I am confused. My current version is 0.4.1 for autogen as well as the autogen-* packages, as stated above, and I was able to run it and get the same error. Are you saying that the code snippet I use is outdated and that its classes/functions are no longer supported in 0.4? Could you elaborate?

@ekzhu
Collaborator

ekzhu commented Jan 15, 2025

@xuefei-wang, your code is using the autogen package's classes. They are identical to our autogen-agentchat for version 0.2.x. Please, go to our README.md and look at our quick start examples for the actual imports and classes we provide.

autogen is published by a fork of AutoGen, and they have been bumping their version number very quickly recently, so it is quite confusing.

**Anyone can publish a package to PyPI and use our import namespaces. So, it is very important to check what you actually installed.**

My suggestion is:

  1. Delete the autogen package: pip uninstall autogen
  2. Install the autogen-agentchat and autogen-ext packages, with updates: pip install -U autogen-agentchat "autogen-ext[openai]"
  3. Go to our README and try the quick start examples. Follow the tutorial documentation to get started.

@rysweet
Collaborator

rysweet commented Jan 15, 2025

@xuefei-wang please see #4217
The package "autogen" (no dash) is not affiliated with this project, unfortunately. It has been co-opted by a fork.

@xuefei-wang
Author

Got it. Thanks for the clarification, everyone! It was very helpful.
I will close the issue :)
