Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community: fixed bug in model2vec embedding code #28670

Merged
merged 1 commit into from
Dec 11, 2024

Conversation

Pringled
Copy link
Contributor

This PR fixes a bug with the current implementation for Model2Vec embeddings where embed_documents does not work as expected.

  • Description: the current implementation uses encode_as_sequence for encoding documents. This is incorrect, as encode_as_sequence creates token embeddings and not mean embeddings. The normal encode function handles both single and batched inputs and should be used instead. The return type was also incorrect, as encode returns a NumPy array. This PR converts the embedding to a list so that the output is consistent with the Embeddings ABC.

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Dec 11, 2024
Copy link

vercel bot commented Dec 11, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Dec 11, 2024 3:55pm

@dosubot dosubot bot added community Related to langchain-community Ɑ: embeddings Related to text embedding models module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Dec 11, 2024
Copy link
Member

@efriis efriis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Dec 11, 2024
@efriis efriis merged commit ee640d6 into langchain-ai:master Dec 11, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature community Related to langchain-community Ɑ: embeddings Related to text embedding models module lgtm PR looks good. Use to confirm that a PR is ready for merging. size:XS This PR changes 0-9 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants