-
Notifications
You must be signed in to change notification settings - Fork 15.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Community: Adding bulk_size as a setable param for OpenSearchVectorSearch #28325
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Skipped Deployment
|
@@ -618,7 +615,6 @@ def add_embeddings( | |||
text_embeddings: Iterable[Tuple[str, List[float]]], | |||
metadatas: Optional[List[dict]] = None, | |||
ids: Optional[List[str]] = None, | |||
bulk_size: int = 500, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
breaking change - can we keep this, and use the passed-in as an override? can still default to the self.bulk_size
behavior if it's None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more breaking changes to the interface
could you revert these to something like bulk_size: Optional[int] = None
, and replace usage of bulk size with bulk_size if bulk_size is not None else self.bulk_size
?
@@ -596,7 +594,6 @@ async def aadd_texts( | |||
texts: Iterable[str], | |||
metadatas: Optional[List[dict]] = None, | |||
ids: Optional[List[str]] = None, | |||
bulk_size: int = 500, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break
@@ -1085,7 +1081,6 @@ def from_texts( | |||
texts: List[str], | |||
embedding: Embeddings, | |||
metadatas: Optional[List[dict]] = None, | |||
bulk_size: int = 500, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break
@@ -1150,7 +1145,6 @@ async def afrom_texts( | |||
texts: List[str], | |||
embedding: Embeddings, | |||
metadatas: Optional[List[dict]] = None, | |||
bulk_size: int = 500, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break
Hi @efriis , thanks for reviewing my request, really appreciate it. :) |
Description:
When using langchain.retrievers.parent_document_retriever.py with vectorstore is OpenSearchVectorSearch, I found that the bulk_size param I passed into OpenSearchVectorSearch class did not work on my ParentDocumentRetriever.add_documents() function correctly, it will be overwrite with int 500 the function which OpenSearchVectorSearch class had (e.g., add_texts(), add_embeddings()...).
So I made this PR requset to fix this, thanks!