[Issue]: ragproxyagent always return InvalidCollectionException: Collection autogen-docs does not exist. #3551

Closed
inoue0426 opened this issue Sep 20, 2024 · 5 comments · Fixed by #3557

inoue0426 commented Sep 20, 2024

Describe the issue

I am following this sample https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/ and it always returns the error below. Could you tell me how to resolve this?

[Sep 20] This page also produces the same error: https://microsoft.github.io/autogen/docs/topics/retrieval_augmentation/

ragproxyagent.initiate_chat(assistant, message=ragproxyagent.message_generator, problem="What is autogen?")
Trying to create collection.
---------------------------------------------------------------------------
InvalidCollectionException                Traceback (most recent call last)
Cell In[10], line 1
----> 1 ragproxyagent.initiate_chat(assistant, message=ragproxyagent.message_generator, problem="What is autogen?")

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py:1084, in ConversableAgent.initiate_chat(self, recipient, clear_history, silent, cache, max_turns, summary_method, summary_args, message, **kwargs)
   1082 self._prepare_chat(recipient, clear_history)
   1083 if isinstance(message, Callable):
-> 1084     msg2send = message(_chat_info["sender"], _chat_info["recipient"], kwargs)
   1085 else:
   1086     msg2send = self.generate_init_message(message, **kwargs)

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/autogen/agentchat/contrib/retrieve_user_proxy_agent.py:636, in RetrieveUserProxyAgent.message_generator(sender, recipient, context)
    633 n_results = context.get("n_results", 20)
    634 search_string = context.get("search_string", "")
--> 636 sender.retrieve_docs(problem, n_results, search_string)
    637 sender.problem = problem
    638 sender.n_results = n_results

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/autogen/agentchat/contrib/retrieve_user_proxy_agent.py:560, in RetrieveUserProxyAgent.retrieve_docs(self, problem, n_results, search_string)
    558 if not self._collection or not self._get_or_create:
    559     print("Trying to create collection.")
--> 560     self._init_db()
    561     self._collection = True
    562     self._get_or_create = True

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/autogen/agentchat/contrib/retrieve_user_proxy_agent.py:339, in RetrieveUserProxyAgent._init_db(self)
    336 else:
    337     IS_TO_CHUNK = True
--> 339 self._vector_db.active_collection = self._vector_db.create_collection(
    340     self._collection_name, overwrite=self._overwrite, get_or_create=self._get_or_create
    341 )
    343 docs = None
    344 if IS_TO_CHUNK:

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/autogen/agentchat/contrib/vectordb/chromadb.py:86, in ChromaVectorDB.create_collection(self, collection_name, overwrite, get_or_create)
     84         collection = self.active_collection
     85     else:
---> 86         collection = self.client.get_collection(collection_name, embedding_function=self.embedding_function)
     87 except ValueError:
     88     collection = None

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/chromadb/api/client.py:142, in Client.get_collection(self, name, id, embedding_function, data_loader)
    132 @override
    133 def get_collection(
    134     self,
   (...)
    140     data_loader: Optional[DataLoader[Loadable]] = None,
    141 ) -> Collection:
--> 142     model = self._server.get_collection(
    143         id=id,
    144         name=name,
    145         tenant=self.tenant,
    146         database=self.database,
    147     )
    148     return Collection(
    149         client=self._server,
    150         model=model,
    151         embedding_function=embedding_function,
    152         data_loader=data_loader,
    153     )

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/chromadb/telemetry/opentelemetry/__init__.py:146, in trace_method.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    144 global tracer, granularity
    145 if trace_granularity < granularity:
--> 146     return f(*args, **kwargs)
    147 if not tracer:
    148     return f(*args, **kwargs)

File ~/miniconda3/envs/torch/lib/python3.10/site-packages/chromadb/api/segment.py:251, in SegmentAPI.get_collection(self, name, id, tenant, database)
    249     return existing[0]
    250 else:
--> 251     raise InvalidCollectionException(f"Collection {name} does not exist.")

InvalidCollectionException: Collection autogen-docs does not exist.

Steps to reproduce

The code is below. The same error also occurs with the OpenAI API.

import autogen
import litellm

from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

llm_config = {
    "config_list": [
        {
            "model": "NotRequired",  # Loaded with LiteLLM command
            "api_key": "NotRequired",  # Not needed
            "base_url": "http://0.0.0.0:4000/",  # Your LiteLLM URL
            "price": [0, 0],  # Put in price per 1K tokens [prompt, response] as free!
        }
    ],
    "cache_seed": None,  # Turns off caching, useful for testing different models
}

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config=llm_config,
)

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    retrieve_config={
        "task": "qa",
        "docs_path": "https://raw.githubusercontent.com/microsoft/autogen/main/README.md",
    }, 
    code_execution_config=False
)

assistant.reset()
ragproxyagent.initiate_chat(assistant, message=ragproxyagent.message_generator, problem="What is autogen?")

Additional Information

pyautogen 0.2.34
Python    3.10.0
macOS     13.2.1 (22D68)
@vijaygill

@inoue0426 - Coincidentally, I was trying the same thing (autogen RAG) around the time you posted this issue, and I ran into the same error.
I worked around it in a somewhat hacky way by creating the collection up front with the chromadb client. Code snippets are shown below:

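import chromadb
from chromadb.utils import embedding_functions
from autogen.agentchat.contrib.vectordb.chromadb import ChromaVectorDB

CHROMA_DB_PATH = "/app/tmp/chromadb"
CHROMA_COLLECTION = "autogen-docs-test"

# Create the collection up front so the later lookup finds it.
chroma_client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
collection = chroma_client.get_or_create_collection(name=CHROMA_COLLECTION)

# Embedding function backed by a local Ollama server.
ollama_ef = embedding_functions.OllamaEmbeddingFunction(
    url="http://<my local ollama host and port>/api/embeddings",
    model_name="mxbai-embed-large",
)
# Point autogen's Chroma wrapper at the same on-disk database.
vector_db = ChromaVectorDB(path=CHROMA_DB_PATH, embedding_function=ollama_ef)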

Then pass the vector DB and the collection name when creating the agent:

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    llm_config=llm_config,
    code_execution_config=False,
    retrieve_config={
        "model": config_list[0]["model"],
        "task": "qa",
        "update_context": True,
        "n_results": 3,
        "docs_path": [
            "./qa.txt",
        ],
        "get_or_create": True,
        "overwrite": False,
        "vector_db": vector_db,
        "collection_name": CHROMA_COLLECTION,
        "embedding_function": ollama_ef,
    },
)

@inoue0426
Author

@vijaygill This works! Thanks.

@thinkall
Collaborator

Hi @inoue0426, @vijaygill, thanks a lot for reporting the issue. It is caused by a change in the error type raised inside chromadb. For now, you can also downgrade chromadb to <=0.5.0 (some versions >0.5.0 may work, but I didn't try all of them). I'll raise a PR to fix it.
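
To illustrate the kind of failure described above, here is a minimal, self-contained sketch. It does not use chromadb or autogen: the exception class and the lookup functions are hypothetical stand-ins written for this example. It assumes, as the traceback suggests, that the newer error type is not a subclass of ValueError, so a handler written as `except ValueError` lets it propagate. The second handler is one possible way to tolerate both error types and is not necessarily what the eventual fix does.

```python
# Hypothetical stand-ins: these only mimic the shape of the problem and are
# not the real chromadb or autogen code.
class InvalidCollectionException(Exception):
    """Stand-in for the newer error type; it does NOT subclass ValueError."""


def get_collection(name, existing):
    """Stand-in lookup that raises the newer exception when the name is missing."""
    if name in existing:
        return existing[name]
    raise InvalidCollectionException(f"Collection {name} does not exist.")


def lookup_catching_value_error(name, existing):
    """Mirrors the `except ValueError` pattern visible in the traceback."""
    try:
        return get_collection(name, existing)
    except ValueError:
        return None  # not reached when InvalidCollectionException is raised


def lookup_catching_both(name, existing):
    """Tolerates either error type, so a missing collection becomes None."""
    try:
        return get_collection(name, existing)
    except (ValueError, InvalidCollectionException):
        return None


def escapes(func, name, existing):
    """Return True if calling func lets InvalidCollectionException propagate."""
    try:
        func(name, existing)
    except InvalidCollectionException:
        return True
    return False


if __name__ == "__main__":
    print(escapes(lookup_catching_value_error, "autogen-docs", {}))
    print(escapes(lookup_catching_both, "autogen-docs", {}))
```

With the first handler the exception escapes to the caller, which matches the symptom in the report; with the second, the missing collection is reported as None and the caller can go on to create it.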

@vijaygill

@thinkall - Thanks! I am working on a POC application, so this hack works for me for now and I will revisit this part later. Maybe a new version with your PR will already be available by then!
Thanks for all the great work!

@inoue0426
Author

@thinkall Thanks! I will try to use it!
