feat: Add OpenSearch VectorStore Component with Ingest and Search Capabilities #3799

Merged 8 commits on Oct 1, 2024

Conversation

@joaoguilhermeS (Collaborator) commented Sep 13, 2024

This PR adds a new OpenSearch VectorStore component that supports data ingestion and search.

Fixes #3735

Key features include:

  • Authentication and SSL Certificate Support: Configurable options for secure connections, including username/password authentication and SSL certificate verification.
  • Data Ingestion: Ability to ingest documents into the OpenSearch vector store, converting data objects into LangChain documents.
  • Search Capabilities: Supports both similarity and Maximal Marginal Relevance (MMR) search types with configurable parameters like the number of results and score thresholds.
  • Error Handling and Logging: Clearer error messages and logging for easier troubleshooting.
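
As a rough illustration of the authentication and SSL options listed above, the sketch below assembles keyword arguments for an OpenSearch connection. The helper name `build_connection_kwargs` and its parameters are hypothetical and not taken from this PR. `http_auth`, `use_ssl`, and `verify_certs` are keyword arguments accepted by the opensearch-py client, but how the component actually wires them up is an assumption.

```python
from typing import Any, Dict, Optional


def build_connection_kwargs(
    username: Optional[str] = None,
    password: Optional[str] = None,
    use_ssl: bool = True,
    verify_certs: bool = False,
) -> Dict[str, Any]:
    """Assemble client keyword arguments (hypothetical helper, not from the PR)."""
    kwargs: Dict[str, Any] = {"use_ssl": use_ssl, "verify_certs": verify_certs}
    # Only include credentials when both a username and a password are given.
    if username and password:
        kwargs["http_auth"] = (username, password)
    return kwargs
```

The resulting dictionary could then be passed as extra keyword arguments when constructing the vector store, assuming the store forwards them to the underlying client.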

@dosubot added the size:L and enhancement labels Sep 13, 2024
autofix-ci bot (Contributor) commented Sep 13, 2024

Hi! I'm autofix.ci, a bot that automatically fixes trivial issues such as code formatting in pull requests.

I would like to apply some automated changes to this pull request, but it looks like I don't have the necessary permissions to do so. To get this pull request into a mergeable state, please do one of the following two things:

  1. Allow edits by maintainers for your pull request, and then re-trigger CI (for example by pushing a new commit).
  2. Manually fix the issues identified for your pull request (see the GitHub Actions output for details on what I would like to change).

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-3799.dmtpw4p5recq1.amplifyapp.com

@jordanrfrazier (Collaborator) left a comment

Looks awesome, thanks for this. A couple of questions about the search functionality.

    if query:
        search_type = self.search_type.lower()
        if search_type == "similarity":
            results = vector_store.similarity_search_with_score(query, **search_kwargs)

I don't believe this should be the default. LangChain's pattern is to handle three main types: similarity, similarity_score_threshold, and mmr. The second handles the score.

https://github.com/langchain-ai/langchain/blob/5346c7b27ec50ddf156c6aff15854185a62a1af4/libs/core/langchain_core/vectorstores/base.py#L315

Unless OpenSearch behaves differently here, is there any reason we can't use the default search implementation?
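
For context, the three-way pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code that was merged: `run_search` is a hypothetical helper, and it assumes `vector_store` exposes the standard LangChain `VectorStore` methods `similarity_search`, `similarity_search_with_relevance_scores`, and `max_marginal_relevance_search`.

```python
from typing import Any, List


def run_search(vector_store: Any, query: str, search_type: str, **search_kwargs: Any) -> List[Any]:
    """Dispatch on the three search types named in the review (hypothetical helper)."""
    search_type = search_type.lower()
    if search_type == "similarity":
        return vector_store.similarity_search(query, **search_kwargs)
    if search_type == "similarity_score_threshold":
        # The scored variant returns (document, score) pairs; keep only the documents.
        scored = vector_store.similarity_search_with_relevance_scores(query, **search_kwargs)
        return [doc for doc, _score in scored]
    if search_type == "mmr":
        return vector_store.max_marginal_relevance_search(query, **search_kwargs)
    raise ValueError(f"Unsupported search_type: {search_type!r}")
```

With this shape, plain similarity search returns documents only, and scores come into play only when the caller explicitly asks for the threshold variant.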

@joaoguilhermeS (Collaborator, Author) replied:

It was wrong. I also fixed it to match the default implementation.

@dosubot added the lgtm label Oct 1, 2024
@jordanrfrazier enabled auto-merge (squash) October 1, 2024 21:39
@jordanrfrazier merged commit b9bcb09 into langflow-ai:main Oct 1, 2024
30 checks passed
Labels: enhancement (New feature or request), lgtm (This PR has been approved by a maintainer), size:L (This PR changes 100-499 lines, ignoring generated files)
Linked issue (successfully merging this pull request may close it): OpenSearch Component
3 participants