
loadEvaluator does not pass the embedding option to PairwiseEmbeddingDistanceEvalChain #7738

Open · 5 tasks done
sflanker opened this issue Feb 21, 2025 · 1 comment
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments


sflanker commented Feb 21, 2025

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { HuggingFaceInferenceEmbeddings } from '@langchain/community/embeddings/hf'
import { loadEvaluator } from 'langchain/evaluation'

const embedding = new HuggingFaceInferenceEmbeddings({
  apiKey: process.env['HUGGING_FACE_API_KEY'],
  endpointUrl: process.env['HUGGING_FACE_ENDPOINT_URL']
});

const chain = await loadEvaluator('pairwise_embedding_distance', { embedding });

// This fixes the issue
// ;(chain as any).embedding = embedding

process.stdin.on('data', async (data) => {
  const line = data.toString().trim();
  const result = await chain.evaluateStringPairs({
    prediction: 'This is a test',
    predictionB: line
  })
  console.log(`Similarity: ${result.score}`)
})

Error Message and Stack Trace (if applicable)

OpenAIError: The OPENAI_API_KEY environment variable is missing or empty; either provide it, or instantiate the OpenAI client with an apiKey option, like new OpenAI({ apiKey: 'My API Key' }).
    at new OpenAI (/home/runner/workspace/node_modules/openai/src/index.ts:227:13)
    at OpenAIEmbeddings.embeddingWithRetry (file:///home/runner/workspace/node_modules/@langchain/openai/dist/embeddings.js:177:27)
    at file:///home/runner/workspace/node_modules/@langchain/openai/dist/embeddings.js:125:25
    at Array.map (<anonymous>)
    at OpenAIEmbeddings.embedDocuments (file:///home/runner/workspace/node_modules/@langchain/openai/dist/embeddings.js:117:39)
    at PairwiseEmbeddingDistanceEvalChain._call (file:///home/runner/workspace/node_modules/langchain/dist/evaluation/embedding_distance/base.js:151:46)
    at PairwiseEmbeddingDistanceEvalChain.invoke (/home/runner/workspace/node_modules/langchain/dist/chains/base.js:64:24)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async PairwiseEmbeddingDistanceEvalChain._evaluateStringPairs (file:///home/runner/workspace/node_modules/langchain/dist/evaluation/embedding_distance/base.js:138:24)

Description

I'm trying to use the PairwiseEmbeddingDistanceEvalChain with a custom embedding implementation, as documented here: https://js.langchain.com/v0.1/docs/guides/evaluation/comparison/pairwise_embedding_distance/. However, loadEvaluator incorrectly ignores the embedding option. (Note: I'm using the latest version of LangChain.js, but the new docs no longer mention loadEvaluator at all.)

The bug is plainly visible in loadEvaluator, which constructs the chain with an empty options object:

evaluator = new PairwiseEmbeddingDistanceEvalChain({});

For embedding_distance the embedding option is applied, but for pairwise_embedding_distance it is not, even though PairwiseEmbeddingDistanceEvalChain supports it, along with distanceMetric. A sketch of the expected behaviour follows.
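
For reference, here is a minimal sketch of what forwarding the options could look like, mirroring how the embedding_distance case applies them. The helper name and exact plumbing are illustrative assumptions, not the actual loader source; embedding and distanceMetric are the fields the chain already supports, and the metric names shown are the common ones (see the chain's own typing for the full set).

import { PairwiseEmbeddingDistanceEvalChain } from "langchain/evaluation";
import type { Embeddings } from "@langchain/core/embeddings";

// Illustrative helper only: construct the chain with the caller-supplied
// options instead of an empty object, which is what loadEvaluator should do.
function buildPairwiseEmbeddingEvaluator(options?: {
  embedding?: Embeddings;
  distanceMetric?: "cosine" | "euclidean" | "manhattan" | "chebyshev";
}) {
  return new PairwiseEmbeddingDistanceEvalChain({
    embedding: options?.embedding,
    distanceMetric: options?.distanceMetric,
  });
}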

System Info

[email protected] | MIT | deps: 12 | versions: 314
Typescript bindings for langchain
https://github.com/langchain-ai/langchainjs/tree/main/langchain/

keywords: llm, ai, gpt3, chain, prompt, prompt engineering, chatgpt, machine learning, ml, openai, embeddings, vectorstores

dist
.tarball: https://registry.npmjs.org/langchain/-/langchain-0.3.19.tgz
.shasum: a0329036c4c870018897b91af2ac442c1f07294b
.integrity: sha512-aGhoTvTBS5ulatA67RHbJ4bcV5zcYRYdm5IH+hpX99RYSFXG24XF3ghSjhYi6sxW+SUnEQ99fJhA5kroVpKNhw==
.unpackedSize: 4.8 MB

dependencies:
@langchain/openai: >=0.1.0 <0.5.0        openapi-types: ^12.1.3                   
@langchain/textsplitters: >=0.0.0 <0.2.0 p-retry: 4                               
js-tiktoken: ^1.0.12                     uuid: ^10.0.0                            
js-yaml: ^4.1.0                          yaml: ^2.2.1                             
jsonpointer: ^5.0.1                      zod-to-json-schema: ^3.22.3              
langsmith: >=0.2.8 <0.4.0                zod: ^3.22.4                             

maintainers:
- nfcampos <[email protected]>
- jacoblee93 <[email protected]>
- andrewnguonly <[email protected]>
- davidduong <[email protected]>
- hwchase17 <[email protected]>
- basproul <[email protected]>

dist-tags:
latest: 0.3.19                             tag-for-publishing-older-releases: 0.2.20  
next: 0.3.2-rc.0                           

published 2 days ago by basproul <[email protected]>

node version: v20.18.1

uname: Linux 60d905d1af20 6.2.16 #1-NixOS x86_64 GNU/Linux


dosubot bot commented Feb 21, 2025

Hey there, @sflanker! I'm here to help you out with any bugs, questions, or contributions you have. Let's tackle this together!

The issue you're encountering is indeed due to the loadEvaluator function not passing the embedding option to the PairwiseEmbeddingDistanceEvalChain. This is a bug in the LangChain.js codebase. The PairwiseEmbeddingDistanceEvalChain does support the embedding option, but the current implementation of loadEvaluator does not utilize it when initializing this evaluator type [1].

To work around this issue, you can manually set the embedding property on the evaluator instance after it is created, as you've done in your example code:

(chain as any).embedding = embedding;

This workaround should allow you to use your custom embedding implementation until the bug is fixed in the codebase.
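
For completeness, a self-contained sketch of that workaround (the property name is taken from the chain's own field; the cast is needed because the property is not exposed on loadEvaluator's return type):

import { loadEvaluator } from "langchain/evaluation";
import { HuggingFaceInferenceEmbeddings } from "@langchain/community/embeddings/hf";

const embedding = new HuggingFaceInferenceEmbeddings({
  apiKey: process.env["HUGGING_FACE_API_KEY"],
});

const chain = await loadEvaluator("pairwise_embedding_distance", { embedding });

// Workaround: assign the ignored option directly on the chain instance so the
// custom embeddings are used instead of the default OpenAIEmbeddings.
(chain as any).embedding = embedding;

const result = await chain.evaluateStringPairs({
  prediction: "This is a test",
  predictionB: "This is also a test",
});
console.log(result.score);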

To continue talking to Dosu, mention @dosu.



@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Feb 21, 2025