Commit dc79b15

ranfysvalle02, Fabian Valle, Hk669, thinkall, and caseyclements authored and committed
+mdb atlas vectordb [clean_final] (#3000)
* +mdb atlas
* Update test/agentchat/contrib/vectordb/test_mongodb.py

  Co-authored-by: HRUSHIKESH DOKALA <[email protected]>
* update test_mongodb.py; we dont need to do the assert .collection_name vs .name
* Try fix mongodb service
* Try fix mongodb service
* Update username and password
* Update autogen/agentchat/contrib/vectordb/mongodb.py
* closer --- but im not super thrilled about the solution...
* PYTHON-4506 Expanded tests and simplified vector search pipelines
* Update mongodb.py
* Update mongodb.py - Casey
* search_index_magic index_name change; keeping track of lucene indexes is tricky
* Fix format
* Fix tests
* hacking trying to figure this out
* Streamline checks for indexes in construction and restructure tests
* Add tests for score_threshold, embedding inclusion, and multiple query tests
* refactored create_collection to meet base object requirements
* lint
* change the localhost port to 27017
* add test to check that no embedding is there unless explicitly provided
* Update logger
* Add test get docs with ids=None
* Rename and update notebook
* have index management include waiting behaviors
* Adds further optional waits or users and tests. Cleans up upsert.
* ensure the embedding size for multiple embedding inputs is equal to dimensions
* fix up tests and add configuration to ensure documents and indexes are READY for querying
* fix import failure
* adjust typing for 3.9
* fix up the notebook output
* changed language to communicate time taken on first init_chat call
* replace environment variable usage

---------

Co-authored-by: Fabian Valle <[email protected]>
Co-authored-by: HRUSHIKESH DOKALA <[email protected]>
Co-authored-by: Li Jiang <[email protected]>
Co-authored-by: Casey Clements <[email protected]>
Co-authored-by: Jib <[email protected]>
Co-authored-by: Jib <[email protected]>
Co-authored-by: Cozypet <[email protected]>
1 parent 1d86c89 commit dc79b15

6 files changed, +1561 -2 lines changed


.github/workflows/contrib-tests.yml

+7
@@ -87,6 +87,10 @@ jobs:
           --health-retries 5
         ports:
           - 5432:5432
+      mongodb:
+        image: mongodb/mongodb-atlas-local:latest
+        ports:
+          - 27017:27017
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python ${{ matrix.python-version }}
@@ -104,6 +108,9 @@ jobs:
       - name: Install pgvector when on linux
         run: |
           pip install -e .[retrievechat-pgvector]
+      - name: Install mongodb when on linux
+        run: |
+          pip install -e .[retrievechat-mongodb]
       - name: Install unstructured when python-version is 3.9 and on linux
         if: matrix.python-version == '3.9'
         run: |
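
In the hunk shown, the `mongodb` service publishes port 27017 but carries no health-check options, so a test could start before the container accepts connections. A minimal sketch of a readiness wait using only the standard library follows. The helper name `wait_for_port` and its timeout defaults are illustrative assumptions and are not part of this commit.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper for illustration; not taken from the commit.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test fixture could call `wait_for_port("localhost", 27017)` before connecting. An open port only shows that the server is listening; it says nothing about whether search indexes are ready, which the commit message addresses separately through its index "waiting behaviors".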

autogen/agentchat/contrib/vectordb/base.py

+7 -2
@@ -186,7 +186,8 @@ def get_docs_by_ids(
             ids: List[ItemID] | A list of document ids. If None, will return all the documents. Default is None.
             collection_name: str | The name of the collection. Default is None.
             include: List[str] | The fields to include. Default is None.
-                If None, will include ["metadatas", "documents"], ids will always be included.
+                If None, will include ["metadatas", "documents"], ids will always be included. This may differ
+                depending on the implementation.
             kwargs: dict | Additional keyword arguments.
 
         Returns:
@@ -200,7 +201,7 @@ class VectorDBFactory:
     Factory class for creating vector databases.
     """
 
-    PREDEFINED_VECTOR_DB = ["chroma", "pgvector", "qdrant"]
+    PREDEFINED_VECTOR_DB = ["chroma", "pgvector", "mongodb", "qdrant"]
 
     @staticmethod
     def create_vector_db(db_type: str, **kwargs) -> VectorDB:
@@ -222,6 +223,10 @@ def create_vector_db(db_type: str, **kwargs) -> VectorDB:
             from .pgvectordb import PGVectorDB
 
             return PGVectorDB(**kwargs)
+        if db_type.lower() in ["mdb", "mongodb", "atlas"]:
+            from .mongodb import MongoDBAtlasVectorDB
+
+            return MongoDBAtlasVectorDB(**kwargs)
         if db_type.lower() in ["qdrant", "qdrantdb"]:
             from .qdrant import QdrantVectorDB
 
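
The new branch in `create_vector_db` lowercases `db_type` and accepts three spellings, while `PREDEFINED_VECTOR_DB` lists only the canonical `"mongodb"`. The alias matching can be sketched in isolation as below. `ALIASES` and `resolve_db_type` are hypothetical names that do not exist in autogen; the real method imports the backend lazily and instantiates it instead of returning a string.

```python
from typing import Dict, List

# Canonical backend name -> accepted aliases (hypothetical table for illustration).
# The "mongodb" entry mirrors the aliases added in the diff; the "qdrant" entry
# mirrors the context line that follows it.
ALIASES: Dict[str, List[str]] = {
    "mongodb": ["mdb", "mongodb", "atlas"],
    "qdrant": ["qdrant", "qdrantdb"],
}


def resolve_db_type(db_type: str) -> str:
    """Map a user-supplied db_type to its canonical name, case-insensitively."""
    lowered = db_type.lower()
    for canonical, names in ALIASES.items():
        if lowered in names:
            return canonical
    raise ValueError(f"Unsupported vector database type: {db_type!r}")
```

With the factory itself, any of the three spellings should select `MongoDBAtlasVectorDB`, for example `VectorDBFactory.create_vector_db(db_type="atlas", **kwargs)`. The keyword arguments that constructor accepts are not shown in this excerpt.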
