This project is a Retrieval-Augmented Generation (RAG) chatbot that answers questions about PDF documents using the OpenAI API through LangChain. LangGraph defines the structure of the chain and FAISS provides vector search over the PDF content, so you can query your documents in natural language.
📄 Medium Article: How to Create a RAG-based PDF Chatbot with LangChain
🤖 Live Demo: RAG PDF Chatbot
📂 Project Structure:
- LangChain: LangChain is used to process the information within the PDF files and pass it to the language model. The user's questions are enriched with relevant sections from the PDF to produce more accurate answers.
- Vectorization: The PDF text is converted to embedding vectors and indexed with FAISS, which allows efficient similarity search over the document.
- OpenAI API: The OpenAI API is used to generate responses to the user's questions based on the information extracted from the PDF files.
- LangGraph: LangGraph is used to generate the graph structure of the chain, which is then used to enrich the user's questions with relevant information from the PDF files.
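The retrieval flow described in the list above can be sketched without any external services. This is a toy illustration and not the project's code: a word-count vector stands in for a real embedding model, a linear cosine-similarity scan stands in for the FAISS index, and the names `split_into_chunks`, `embed`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 8) -> list[str]:
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (linear scan, not FAISS)."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Enrich the question with the retrieved chunks before sending it to the model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In the real application the prompt produced in the last step would be sent to the language model. The sketch stops before that call so it can run offline.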
🕸️ Graph Map
🎯 Use Cases:
- Research: Quickly find relevant information from research papers and articles.
- Education: Get answers to questions from textbooks and study materials.
- Business: Extract data from reports and documents for analysis and decision-making.
- Legal: Search for specific information in legal documents and contracts.
- Healthcare: Retrieve information from medical journals and reports.
- Finance: Extract data from financial reports and documents.
- Customer Support: Provide quick and accurate answers to customer queries.
- General Knowledge: Get answers to general questions from a wide range of sources.
- And more...
📦 Installation:
- Clone the repository:
git clone https://github.com/mesutdmn/Chat-With-Your-PDF.git
cd Chat-With-Your-PDF
- Install the required libraries:
pip install -r requirements.txt
📚 Requirements:
- Python 3.12+
- OpenAI API Key (Get it from OpenAI)
- PDF files to query
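A quick preflight check for the first two requirements might look like the sketch below. It assumes the key is supplied through the `OPENAI_API_KEY` environment variable, which is the variable `langchain-openai` reads by default. The app may instead ask for the key in its interface, in which case only the Python version check applies. The function name `check_environment` is hypothetical.

```python
import os
import sys


def check_environment(env=None, version=None) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)

    problems = []
    if version < (3, 12):
        problems.append(f"Python 3.12+ is required, found {version[0]}.{version[1]}")
    # Assumption: the key is provided via the environment rather than the UI.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `check_environment()` with no arguments inspects the current interpreter and environment. The parameters exist only so the check can be exercised with made-up values.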
📋 Used Libraries:
faiss-cpu==1.8.0.post1
langchain==0.3.1
langchain-community==0.3.1
langchain-core==0.3.6
langchain-openai==0.2.1
langchain-text-splitters==0.3.0
langgraph==0.2.28
langgraph-checkpoint==1.0.12
pypdf==5.0.1
streamlit==1.38.0
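If the app misbehaves after installation, it can help to confirm that the installed packages match the pins listed above. The following is a minimal sketch using only the standard library. It assumes the pins are written in `name==version` form, as in the list above, and the helper names `parse_pins` and `find_mismatches` are hypothetical.

```python
from importlib import metadata


def parse_pins(lines) -> dict[str, str]:
    """Parse `name==version` lines, skipping blanks, comments, and unpinned entries."""
    pins = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version) -> dict[str, tuple]:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements.txt")))` returns an empty dictionary when every pinned package is installed at the expected version.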
🚀 Running the Project:
- Start the Streamlit server:
streamlit run app.py
- Open the browser and go to http://localhost:8501 to access the chatbot interface.
- Upload the PDF file you want to query and start chatting with the chatbot.
- Ask questions related to the content of the PDF file, and the chatbot will provide answers based on the information in the document.
- Enjoy interacting with the RAG PDF Chatbot!
📝 Note: The chatbot is still in development, and improvements are being made to enhance its performance and capabilities. If you encounter any issues or have suggestions for improvement, please feel free to open an issue, submit a pull request, or contact me on LinkedIn.
👨💻 Developed by: Mesut Duman
📄 License: This project is licensed under the Apache License 2.0.