Welcome to this simple demonstration of how to integrate various LLMs and embeddings into a Streamlit-powered application. This project shows how you can upload and parse documents, store them in a vector database, and query them with different language models.
- **Main Goal:** Retrieve reasoning tokens from the DeepSeek API, which arrive on a separate channel from the primary content, and display them in a clean UI while streaming (see the sketch after this list).
- **Streaming Fix:** Work around the DeepSeek API streaming issue by overriding `ChatOpenAI` for LangChain integration, where reasoning-token streaming is not yet standardized.
- **Multi-turn Retrieval:** Support multi-turn conversations with retrieval logic.
- **File-based Persistence:** Enable uploads of multiple file types (PDF, DOCX, TXT) and persist embeddings on the file system.
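To illustrate the separate channel, here is a minimal sketch that streams from DeepSeek's OpenAI-compatible endpoint and splits `reasoning_content` from `content` per chunk. It follows DeepSeek's published API (model name, base URL, and the `reasoning_content` delta field), but it is not this project's exact code:

```python
# Minimal sketch: streaming reasoning tokens from DeepSeek (not the project's exact code).
# DeepSeek exposes an OpenAI-compatible endpoint; reasoning tokens arrive in
# `delta.reasoning_content`, separately from the final answer in `delta.content`.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

stream = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)

reasoning, answer = [], []
for chunk in stream:
    delta = chunk.choices[0].delta
    # Reasoning tokens come through their own field, not `content`.
    if getattr(delta, "reasoning_content", None):
        reasoning.append(delta.reasoning_content)
    elif delta.content:
        answer.append(delta.content)

print("Reasoning:", "".join(reasoning))
print("Answer:", "".join(answer))
```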
This project focuses on a conversational interface that:
- Allows users to upload `.docx`, `.pdf`, or `.txt` files.
- Splits the document content into manageable chunks and stores them in a Chroma vectorstore (a minimal sketch follows this list).
- Provides an interactive chat interface where users can ask questions that leverage the stored content.
- Integrates with various models, including:
  - DeepSeek (both Chat and Reasoning flavors)
  - Ollama for local inference
  - OpenAI (e.g., GPT-4 variants)
  - Groq for specialized deployments
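For orientation, here is a minimal sketch of that ingestion flow using LangChain community loaders, a `RecursiveCharacterTextSplitter`, and a persistent Chroma store. The loader mapping, chunk sizes, and embedding choice are illustrative assumptions, not the app's exact configuration:

```python
# Minimal ingestion sketch (illustrative; not the app's exact code).
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Pick a loader by file extension (assumed mapping).
LOADERS = {".pdf": PyPDFLoader, ".docx": Docx2txtLoader, ".txt": TextLoader}

def ingest(path: str, suffix: str) -> Chroma:
    docs = LOADERS[suffix](path).load()
    # Chunk size/overlap are example values; tune for your documents.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(docs)
    # Persist embeddings under ./db, the README's default location.
    return Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="./db")
```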
- **Clone the Repository:**

  ```bash
  git clone https://github.com/yigit353/DeepSeekRAGChat.git
  cd DeepSeekRAGChat
  ```
- **Create and Activate a Virtual Environment (Optional but Recommended):**

  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```
- **Install Dependencies:**

  ```bash
  pip install -r requirements.txt
  ```
- **Set Up Environment Variables:**

  - Create a `.env` file in your project root containing the required variables (a loading sketch follows these setup steps):

    ```env
    OPENAI_API_KEY=your_openai_api_key
    DEEPSEEK_API_KEY=your_deepseek_api_key
    GROQ_API_KEY=your_groq_api_key
    EMBEDDING_MODEL=openai  # or 'modernbert'
    ```

  - Adjust these values based on your desired configuration.
- **Chroma Persistence Directory:**

  - By default, the vector database is persisted at `./db`. If you wish to change this location, edit the code accordingly.
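How the app consumes these settings may differ, but a typical pattern, assuming `python-dotenv` is used, looks like this:

```python
# Sketch: loading .env settings (assumes python-dotenv; not the app's exact code).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root into os.environ

deepseek_key = os.environ["DEEPSEEK_API_KEY"]
# EMBEDDING_MODEL switches between 'openai' and 'modernbert' embeddings.
embedding_model = os.getenv("EMBEDDING_MODEL", "openai")
```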
- **Start the Streamlit App:**

  ```bash
  streamlit run chat.py
  ```
- **Open the App:**

  - By default, Streamlit apps open at `http://localhost:8501`.
- **Upload Documents:**

  - In the web interface, upload `.docx`, `.pdf`, or `.txt` files.
  - The application extracts text using the appropriate loader, splits it into chunks, and inserts those chunks into the Chroma vectorstore.
- **Chat With Your Data:**

  - Choose which model you want to use for question answering.
  - Enter your queries in the chat input box.
  - The app retrieves the most relevant documents to provide context.
  - The chosen model generates an answer based on that context (see the sketch after these steps).
- **View Reasoning (Optional):**

  - When the model responds, an expander titled "View Reasoning Process" will appear.
  - Click the expander to see the step-by-step reasoning tokens as they stream from the model.
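To make the flow concrete, here is a minimal Streamlit sketch of retrieval plus streaming, with reasoning tokens routed into an expander. The retriever setup, model wiring, and delta fields mirror the DeepSeek streaming example above; they are assumptions, not the app's exact code:

```python
# Sketch: retrieval + streaming chat with a reasoning expander (illustrative only).
import os
import streamlit as st
from openai import OpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Reopen the persisted vectorstore from ./db (the README's default path).
vectorstore = Chroma(persist_directory="./db", embedding_function=OpenAIEmbeddings())
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")

if question := st.chat_input("Ask about your documents"):
    # Retrieve the most relevant chunks to use as context.
    docs = vectorstore.similarity_search(question, k=4)
    context = "\n\n".join(d.page_content for d in docs)

    stream = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question},
        ],
        stream=True,
    )

    reasoning_box = st.expander("View Reasoning Process").empty()
    answer_box = st.empty()
    reasoning, answer = "", ""
    for chunk in stream:
        delta = chunk.choices[0].delta
        # Route reasoning tokens to the expander, answer tokens to the main view.
        if getattr(delta, "reasoning_content", None):
            reasoning += delta.reasoning_content
            reasoning_box.markdown(reasoning)
        elif delta.content:
            answer += delta.content
            answer_box.markdown(answer)
```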
The app integrates the following model clients (see the instantiation sketch below):

- `DeepseekChatOpenAI`: A specialized wrapper for DeepSeek's chat endpoint.
- `DeepseekChatOpenAI` (reasoner): Similar to chat but optimized for more advanced reasoning.
- `OllamaLLM`: Uses your local Ollama models.
- `ChatOpenAI`: GPT-4o model for conversational chat.
- `ChatGroq`: A specialized Groq client serving a 70B DeepSeek reasoning model with very fast generation.
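As a rough illustration, here is a sketch of instantiating the third-party clients with their current LangChain packages. The model names are example identifiers, and `DeepseekChatOpenAI` is defined in this repo rather than in a library, so it is omitted here:

```python
# Sketch: instantiating the supported models (names are illustrative).
from langchain_openai import ChatOpenAI
from langchain_ollama import OllamaLLM
from langchain_groq import ChatGroq

models = {
    "openai": ChatOpenAI(model="gpt-4o"),
    # Local inference through an Ollama server (model name is an example).
    "ollama": OllamaLLM(model="llama3.1"),
    # Groq-hosted 70B DeepSeek reasoning model (example model id).
    "groq": ChatGroq(model="deepseek-r1-distill-llama-70b"),
}
# DeepseekChatOpenAI from this repo would be registered the same way.

response = models["openai"].invoke("Hello!")
```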
To customize the app:

- **Chunk Size / Overlap:** Modify the `RecursiveCharacterTextSplitter` parameters to suit your document size and retrieval needs.
- **Retrieval:** Explore LangChain's retrieval features to implement advanced logic.
- **Model Configuration:** Adjust temperature, max tokens, etc., when instantiating your models (examples after this list).
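For instance, these knobs might look like the following; the specific values are placeholders, not the project's defaults:

```python
# Sketch: common tuning knobs (placeholder values, not project defaults).
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI

# Smaller chunks with more overlap can improve recall on dense documents.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

# Lower temperature for more deterministic answers; cap response length.
llm = ChatOpenAI(model="gpt-4o", temperature=0.2, max_tokens=1024)
```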
Contributions are welcome! Feel free to open an issue or submit a pull request.
MIT License. See LICENSE for more information.
If you have any questions or need further help:
- Email: [email protected]
- GitHub Issues: Issues Page
Enjoy chatting with your documents!