Community: Adding bulk_size as a setable param for OpenSearchVectorSearch #28325

Merged
merged 5 commits into from
Dec 12, 2024

manukychen
Contributor

Description:
When using langchain.retrievers.parent_document_retriever.py with OpenSearchVectorSearch as the vectorstore, I found that the bulk_size param I passed into the OpenSearchVectorSearch class did not take effect in my ParentDocumentRetriever.add_documents() call. It was overwritten by the default of int 500 declared on the OpenSearchVectorSearch functions (e.g., add_texts(), add_embeddings()...).

So I made this PR to fix this, thanks!
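
A minimal sketch of the behavior described above, using a hypothetical ToyStore class and a hypothetical add_documents wrapper rather than the real OpenSearchVectorSearch or ParentDocumentRetriever code. It assumes the method declares its own bulk_size: int = 500 default, as the diff hunks in this PR show; a caller that does not forward bulk_size then always gets 500, whatever was passed to the constructor.

```python
from typing import List


class ToyStore:
    """Hypothetical stand-in for a vector store; not the real OpenSearchVectorSearch."""

    def __init__(self, bulk_size: int = 500) -> None:
        # The value the user configured at construction time.
        self.bulk_size = bulk_size

    def add_texts(self, texts: List[str], bulk_size: int = 500) -> int:
        # The method-level default shadows self.bulk_size whenever the
        # caller omits the argument. Return the limit actually applied.
        if len(texts) > bulk_size:
            raise ValueError(f"{len(texts)} texts exceed bulk_size={bulk_size}")
        return bulk_size


def add_documents(store: ToyStore, texts: List[str]) -> int:
    # Hypothetical wrapper that, like a retriever, does not forward bulk_size.
    return store.add_texts(texts)


store = ToyStore(bulk_size=2000)
applied = add_documents(store, ["a", "b", "c"])
print(applied)  # prints 500: the constructor's 2000 is ignored
```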

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. community Related to langchain-community Ɑ: vector store Related to vector store module labels Nov 24, 2024
@@ -618,7 +615,6 @@ def add_embeddings(
text_embeddings: Iterable[Tuple[str, List[float]]],
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
bulk_size: int = 500,
Member

breaking change - can we keep this, and use the passed-in value as an override? it can still default to the self.bulk_size behavior if it's None

Member

@efriis efriis left a comment

more breaking changes to the interface

could you revert these to something like bulk_size: Optional[int] = None, and replace usage of bulk size with bulk_size if bulk_size is not None else self.bulk_size?
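
The suggestion above could be sketched roughly as follows. ToyStore is a hypothetical stand-in, not the merged OpenSearchVectorSearch code; only the bulk_size: Optional[int] = None signature and the "bulk_size if bulk_size is not None else self.bulk_size" expression come from the review comment.

```python
from typing import List, Optional


class ToyStore:
    """Hypothetical stand-in for a vector store; not the real OpenSearchVectorSearch."""

    def __init__(self, bulk_size: int = 500) -> None:
        # Instance-level setting, used when a call does not override it.
        self.bulk_size = bulk_size

    def add_texts(self, texts: List[str], bulk_size: Optional[int] = None) -> int:
        # An explicit argument wins; None means "use the instance setting".
        bulk_size = bulk_size if bulk_size is not None else self.bulk_size
        if len(texts) > bulk_size:
            raise ValueError(f"{len(texts)} texts exceed bulk_size={bulk_size}")
        # Return the limit actually applied so the behavior is observable.
        return bulk_size


store = ToyStore(bulk_size=2000)
from_instance = store.add_texts(["a", "b"])              # falls back to 2000
from_override = store.add_texts(["a", "b"], bulk_size=10)  # per-call override
```

Keeping the keyword on each method means existing callers that pass bulk_size explicitly are unaffected, while callers that omit it now pick up the constructor value. The "is not None" check, rather than "bulk_size or self.bulk_size", keeps an explicit 0 from silently falling back to the instance setting.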

@@ -596,7 +594,6 @@ async def aadd_texts(
texts: Iterable[str],
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
bulk_size: int = 500,
Member

break

@@ -1085,7 +1081,6 @@ def from_texts(
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
bulk_size: int = 500,
Member

break

@@ -1150,7 +1145,6 @@ async def afrom_texts(
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
bulk_size: int = 500,
Member

break

@efriis efriis self-assigned this Dec 9, 2024
@manukychen
Contributor Author

Hi @efriis , thanks for reviewing my request, really appreciate it. :)
I've followed your suggestions and modified the code, unless I misinterpreted anything.
If you have any new questions or suggestions, please let me know. Thanks again!

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Dec 12, 2024
@efriis efriis enabled auto-merge (squash) December 12, 2024 01:43
@efriis efriis merged commit ba9b95c into langchain-ai:master Dec 12, 2024
19 checks passed
Labels
community Related to langchain-community lgtm PR looks good. Use to confirm that a PR is ready for merging. size:S This PR changes 10-29 lines, ignoring generated files. Ɑ: vector store Related to vector store module
Projects
Status: Done
2 participants