I am trying to ingest data from a PDF file into Astra DB using Azure OpenAI embeddings. The result is the same whether I use the hosted Langflow on the DataStax domain or run it locally via docker-compose.
docker:
datastax online:
Reproduction
Open the RAG flow from the Langflow template.
Focus on the ingestion part at the bottom.
Replace the OpenAI Embeddings component with Azure OpenAI Embeddings.
Run the flow.
The build fails with: Error building Component Astra DB: Error initializing AstraDBVectorStore: Unsupported data type
I checked the Docker logs, but there is nothing there except:
ValueError: Error initializing AstraDBVectorStore: Unsupported data type
Expected behavior
The flow builds successfully and the content is sent to Astra DB.
Hey @josefpolach! 👋 I'm here to help you with any bugs, questions, or contributions you have. Let's work together to resolve this issue.
I found a similar issue that was discussed and resolved. The issue was related to Astra DB Vectorize not allowing for Bring-Your-Own-Embedding (BYOE), which was addressed in a pull request that introduced changes to handle custom embeddings [1].
To resolve the issue you're facing with Azure OpenAI embeddings not being accepted by the Astra DB component, ensure that your configuration settings are correct. Specifically, you need to:
Set the embedding_service to "Embedding Model".
Configure the provider to "Azure OpenAI".
Choose an appropriate model name, such as "text-embedding-3-small" or "text-embedding-ada-002".
Provide the necessary API key and authentication parameters.
These settings are crucial for the AstraVectorStoreComponent to correctly handle Azure OpenAI embeddings [2][3]. If the error persists, it might be due to a data type mismatch, which should be logged using the loguru logger for further debugging [4].
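One way to narrow down a possible data type mismatch is to check what the embedding object returns before it reaches the vector store. The snippet below is only a sketch under assumptions: `check_embedding_output`, `FakeEmbeddings`, and `BrokenEmbeddings` are hypothetical names made up for illustration, and the only thing borrowed from a real library is the `embed_query` method name from LangChain's embeddings interface. It does not reproduce what the Astra DB component does internally.

```python
from numbers import Real


def check_embedding_output(model, probe="hello"):
    """Hypothetical helper: embed a probe string and verify the result is a
    flat, non-empty list (or tuple) of numbers. Returns the vector length."""
    vector = model.embed_query(probe)
    if not isinstance(vector, (list, tuple)):
        raise TypeError(f"expected a list of floats, got {type(vector).__name__}")
    if not vector:
        raise TypeError("embedding is empty")
    for i, value in enumerate(vector):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(f"element {i} is {type(value).__name__}, not a number")
    return len(vector)


class FakeEmbeddings:
    """Stand-in for a real embeddings client (assumption: no network call)."""

    def embed_query(self, text):
        return [0.1, 0.2, 0.3]


class BrokenEmbeddings:
    """Stand-in that returns the wrong shape, to show the failure path."""

    def embed_query(self, text):
        return {"vector": [0.1, 0.2, 0.3]}
```

With a real embeddings object in place of `FakeEmbeddings`, a `TypeError` here would suggest the problem sits on the embedding side rather than in the Astra DB connection settings.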
Thank you for your contribution! This issue will be closed. If you have any questions or encounter another problem, please open a new issue and we will be ready to help you.
Who can help?
No response
Operating System
Docker Compose
Langflow Version
image: langflowai/langflow:latest
Python Version
None
Screenshot
No response
Flow File
https://astra.datastax.com/langflow/ebf1ade5-1471-4ecb-948c-1055de0f9e9a/flow/be5c72e1-802f-4be8-959c-404f5ccb3055/folder/469dd6bc-8ac4-4bc3-8cea-756c61cedb09
OpenAI Astra Ingestion (1).json