OpenAIEmbeddings - embed_documents - chunk size bug #29759

Closed
5 tasks done
AndreiDumitrescu26 opened this issue Feb 12, 2025 · 0 comments
Labels
🤖:bug: Related to a bug, vulnerability, unexpected error with an existing feature
investigate: Flagged for investigation.

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

No example code needed. The bug is in the current version of the repo.

Error Message and Stack Trace (if applicable)

No response

Description

In the following code:

    chunk_size_ = chunk_size or self.chunk_size
    if not self.check_embedding_ctx_length:
        embeddings: List[List[float]] = []
        for i in range(0, len(texts), self.chunk_size):
            response = self.client.create(
                input=texts[i : i + chunk_size_], **self._invocation_params
            )
            if not isinstance(response, dict):
                response = response.dict()
            embeddings.extend(r["embedding"] for r in response["data"])

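At line 576, the loop should step by chunk_size_ rather than self.chunk_size. As written, passing a chunk_size argument that differs from self.chunk_size makes the loop stride and the slice width disagree, so some texts are skipped (smaller chunk_size) or sent more than once (larger chunk_size).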
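To make the effect concrete, here is a minimal sketch that reproduces only the batching logic on a plain list, with no API calls. The function names `batch_buggy` and `batch_fixed` and the `default_chunk_size` parameter are hypothetical stand-ins for illustration (the latter plays the role of self.chunk_size); they are not part of langchain_openai.

```python
from typing import List, Optional


def batch_buggy(
    texts: List[str], default_chunk_size: int, chunk_size: Optional[int] = None
) -> List[List[str]]:
    """Mimic the reported loop: stride by the default, slice by the override."""
    chunk_size_ = chunk_size or default_chunk_size
    return [
        texts[i : i + chunk_size_]
        for i in range(0, len(texts), default_chunk_size)  # stride ignores the override
    ]


def batch_fixed(
    texts: List[str], default_chunk_size: int, chunk_size: Optional[int] = None
) -> List[List[str]]:
    """Proposed behavior: stride and slice both use the effective chunk size."""
    chunk_size_ = chunk_size or default_chunk_size
    return [
        texts[i : i + chunk_size_]
        for i in range(0, len(texts), chunk_size_)
    ]


texts = ["a", "b", "c", "d", "e", "f"]

# Override smaller than the default: the buggy loop skips "c" and "d".
buggy_small = batch_buggy(texts, default_chunk_size=4, chunk_size=2)
fixed_small = batch_fixed(texts, default_chunk_size=4, chunk_size=2)

# Override larger than the default: the buggy loop produces overlapping batches.
buggy_large = batch_buggy(texts, default_chunk_size=2, chunk_size=4)
fixed_large = batch_fixed(texts, default_chunk_size=2, chunk_size=4)

print(buggy_small, fixed_small)
print(buggy_large, fixed_large)
```

When no override is passed, both functions behave identically, which may be why the mismatch goes unnoticed in typical usage.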

System Info

System Information

OS: Linux
OS Version: #17~22.04.1-Ubuntu SMP Sat Mar 9 04:50:38 UTC 2024
Python Version: 3.12.8 (main, Jan 5 2025, 05:33:15) [Clang 19.1.6 ]

Package Information

langchain_core: 0.3.34
langchain: 0.3.17
langchain_community: 0.3.16
langsmith: 0.3.6
langchain_chroma: 0.2.1
langchain_openai: 0.3.5
langchain_text_splitters: 0.3.6
langgraph_sdk: 0.1.51

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.11.12
async-timeout: Installed. No version info available.
chromadb: 0.6.3
dataclasses-json: 0.6.7
fastapi: 0.115.8
httpx: 0.28.1
httpx-sse: 0.4.0
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-core<1.0.0,>=0.3.34: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith<0.4,>=0.1.125: Installed. No version info available.
numpy: 1.26.4
openai<2.0.0,>=1.58.1: Installed. No version info available.
orjson: 3.10.15
packaging<25,>=23.2: Installed. No version info available.
pydantic: 2.10.6
pydantic-settings: 2.7.1
pydantic<3.0.0,>=2.5.2;: Installed. No version info available.
pydantic<3.0.0,>=2.7.4;: Installed. No version info available.
pytest: 8.3.4
PyYAML: 6.0.2
PyYAML>=5.3: Installed. No version info available.
requests: 2.32.3
requests-toolbelt: 1.0.0
rich: 13.9.4
SQLAlchemy: 2.0.38
tenacity: 9.0.0
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken<1,>=0.7: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0

@langcarl langcarl bot added the investigate Flagged for investigation. label Feb 12, 2025
@dosubot dosubot bot added the 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature label Feb 12, 2025
chaymaeelaattabi added a commit to chaymaeelaattabi/langchain that referenced this issue Feb 12, 2025
@ccurme ccurme closed this as completed in 4b08a7e Feb 13, 2025