[Question]: How to improve RAG Accuracy with RAGFlow? #1337

Open
BennisonDevadoss opened this issue Jul 1, 2024 · 1 comment
Labels
question Further information is requested

Comments

@BennisonDevadoss

Describe your problem

I've been using RAGFlow as my RAG system for the past few months, and I have a couple of questions based on my usage so far.

Question 1:
When querying a database that stores document embeddings (e.g., Elasticsearch), retrieving specific information can be challenging if the query terms do not explicitly match the document keywords. For instance, searching a resume for a candidate's name might fail if the resume does not explicitly contain terms like 'candidate' or 'name'. The challenge here is how to extract relevant information from the vector database in such cases.

Example Scenario:

  • File Upload: A resume is uploaded and stored as embeddings in a vector database like Elasticsearch.
  • Query: A user queries the database with, "What is the candidate's name?"
  • Challenge: The resume may not explicitly mention 'candidate' or 'name', complicating retrieval from the vector database.

In such scenarios, how can we improve RAGFlow's accuracy?
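
To make the challenge concrete, here is a minimal sketch of term-overlap matching. The resume text and the helper names `tokens` and `overlap_score` are made up for illustration; this is a toy stand-in for keyword scoring, not how RAGFlow or Elasticsearch actually ranks documents.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a toy stand-in for a real text analyzer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(query: str, chunk: str) -> int:
    """Count the distinct terms shared by the query and the chunk."""
    return len(tokens(query) & tokens(chunk))

# Hypothetical resume chunk: the name is present, but the words
# "candidate" and "name" never appear in the text.
resume_chunk = "Jane Doe. Senior Backend Engineer. Python, Go, Elasticsearch. Chennai, India."
query = "What is the candidate's name?"

print(overlap_score(query, resume_chunk))  # prints 0
```

The chunk contains the answer, yet the query shares no terms with it, so a purely term-based match has nothing to rank it on.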

Question 2:
Does RAGFlow store documents in both Elasticsearch and MinIO? If so, why is it necessary to store user-uploaded files in both systems?


@BennisonDevadoss BennisonDevadoss added the question Further information is requested label Jul 1, 2024
@KevinHuSh
Collaborator

A resume is actually structured data, even though it looks like a block of unstructured text.
So, try the demo. It applies a resume parser to turn the resume into structured data, which is then retrieved with SQL.
The SQL is generated from the user's question by an LLM.
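
The idea described above can be sketched roughly as follows. This is only an illustration under assumptions: the `resumes` table, its columns, the sample record, and `question_to_sql` are hypothetical, and the LLM step is replaced by a hard-coded query. It is not RAGFlow's actual parser or schema.

```python
import sqlite3

# Hypothetical output of a resume parser: one flat record per resume.
parsed_resume = {
    "name": "Jane Doe",
    "title": "Senior Backend Engineer",
    "city": "Chennai",
}

# Store the structured fields in an in-memory table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resumes (name TEXT, title TEXT, city TEXT)")
conn.execute(
    "INSERT INTO resumes (name, title, city) VALUES (:name, :title, :city)",
    parsed_resume,
)

def question_to_sql(question: str) -> str:
    """Stand-in for the LLM step.

    A real system would prompt a model with the table schema and the
    question; here the result is hard-coded for illustration only.
    """
    return "SELECT name FROM resumes"

sql = question_to_sql("What is the candidate's name?")
rows = conn.execute(sql).fetchall()
print(rows)  # prints [('Jane Doe',)]
```

Because the name lives in its own column, the question no longer depends on the words "candidate" or "name" appearing in the resume text.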

2 participants