- Overview
- Demo
- Technical Flow Chart
- Key Features
- Tech Stack
- Installation and Setup
- Usage
- Contributions
- License
- Citing
- Contact
The Multi-Agent Medical Assistant is an AI-powered chatbot designed to assist with medical diagnosis, research, and patient interactions.
🚀 Powered by Multi-Agent Intelligence, this system integrates:
- 🤖 Large Language Models (LLMs)
- 🖼️ Computer Vision Models for medical imaging analysis
- 📚 Retrieval-Augmented Generation (RAG) leveraging vector databases
- 🌐 Real-time Web Search for up-to-date medical insights
- 👨‍⚕️ Human-in-the-Loop Validation to verify AI-based medical image diagnoses
🔹 👨‍💻 Multi-Agent Orchestration with structured graph workflows
🔹 🔍 Advanced RAG Techniques – hybrid retrieval, semantic chunking, and vector search
🔹 ⚡ Confidence-Based Routing & Agent-to-Agent Handoff
🔹 🔒 Scalable, Production-Ready AI with Modularized Code & Robust Exception Handling
📂 For learners: Check out agents/README.md for a detailed breakdown of the agentic workflow! 🎯
Multi-Agent-Medical-Assistant-v1-with-voiceover-compressed.mp4
If you like what you see and would like to support the project's developer, you can sponsor the project! :)
📂 For an even more detailed demo video, check out Multi-Agent-Medical-Assistant-v1.9. 📽️
- 🤖 **Multi-Agent Architecture**: Specialized agents working in harmony to handle diagnosis, information retrieval, reasoning, and more.
- 🔍 **Advanced RAG Retrieval System**:
  - Unstructured.io parsing to extract and embed text along with tables from PDFs.
  - Semantic chunking with structural boundary awareness.
  - Qdrant hybrid search combining BM25 sparse keyword search with dense embedding vector search.
  - Query expansion with related terms to enhance search results.
  - Metadata enrichment to add context and improve search accuracy.
  - Input-output guardrails to ensure safe and relevant responses.
  - Confidence-based agent-to-agent handoff between RAG and web search to prevent hallucinations.
  - Supported file types for RAG ingestion and retrieval: .txt, .csv, .json, .pdf.
- 🏥 **Medical Imaging Analysis**:
  - Brain Tumor Detection (TBD)
  - Chest X-ray Disease Classification
  - Skin Lesion Segmentation
- 🌐 **Real-time Research Integration**: Web search agent that retrieves the latest medical research papers and findings.
- 📊 **Confidence-Based Verification**: Log probability analysis ensures high accuracy in medical recommendations.
- 🎙️ **Voice Interaction Capabilities**: Seamless speech-to-text and text-to-speech powered by the Eleven Labs API.
- 👩‍⚕️ **Expert Oversight System**: Human-in-the-loop verification by medical professionals before finalizing outputs.
- ⚔️ **Input & Output Guardrails**: Ensures safe, unbiased, and reliable medical responses while filtering out harmful or misleading content.
- 💻 **Intuitive User Interface**: Designed for healthcare professionals with minimal technical expertise.
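The confidence-based verification and the RAG-to-web-search handoff both hinge on the LLM's token log probabilities. The following is only a minimal sketch of how such routing could work; the threshold value and agent names are illustrative, not the repository's actual implementation:

```python
import math

def average_confidence(token_logprobs):
    """Mean per-token probability derived from the LLM's log probabilities."""
    if not token_logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)

def route(token_logprobs, threshold=0.85):
    """Keep the RAG answer when confidence is high; hand off to web search otherwise."""
    confidence = average_confidence(token_logprobs)
    return "RAG_AGENT" if confidence >= threshold else "WEB_SEARCH_AGENT"
```

High log probabilities (close to 0) indicate the model is confident the retrieved context supports its answer; very negative values suggest the answer is uncertain and fresher web results should be consulted instead.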
Note
Upcoming features:
- Brain Tumor Medical Computer Vision model integration.
- Streaming of LLM responses to UI.
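The semantic chunking with structural boundary awareness listed under Key Features can be approximated in a few lines. This sketch packs paragraphs (split at blank lines) into size-bounded chunks without ever cutting a paragraph in half; the 500-character limit is illustrative, and the real pipeline uses embedding-aware boundaries on top of this idea:

```python
def semantic_chunks(text, max_chars=500):
    """Group paragraphs into chunks, splitting only at structural boundaries."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Respecting paragraph boundaries keeps each retrieved chunk self-contained, which improves both embedding quality and the readability of cited context.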
| Component | Technologies |
|---|---|
| 🔹 Backend Framework | FastAPI, Flask |
| 🔹 Agent Orchestration | LangGraph |
| 🔹 Knowledge Storage | Qdrant Vector Database |
| 🔹 Medical Imaging | Computer Vision Models:<br>• Brain Tumor: Object Detection (PyTorch)<br>• Chest X-ray: Image Classification (PyTorch)<br>• Skin Lesion: Semantic Segmentation (PyTorch) |
| 🔹 Guardrails | LangChain |
| 🔹 Speech Processing | Eleven Labs API |
| 🔹 Frontend | HTML, CSS, JavaScript |
| 🔹 Deployment | Docker, CI/CD Pipeline |
```bash
git clone https://github.com/souvikmajumder26/Multi-Agent-Medical-Assistant.git
cd Multi-Agent-Medical-Assistant
```
- Create a `.env` file and add the following API keys:
Note

You may use any LLM and embedding model of your choice...
- If using Azure OpenAI, no modification is required.
- If using direct OpenAI, modify the LLM and embedding model definitions in 'config.py' and provide the appropriate env variables.
- If using local models, appropriate code changes might be required throughout the codebase, especially in 'agents'.

Warning

If all necessary env variables are not provided, errors will be thrown in the console.
```
# LLM Configuration (Azure OpenAI - gpt-4o used in development)
# If using any other LLM API key or local LLM, appropriate code modification is required
deployment_name =
model_name = gpt-4o
azure_endpoint =
openai_api_key =
openai_api_version =

# Embedding Model Configuration (Azure OpenAI - text-embedding-ada-002 used in development)
# If using any other embedding model, appropriate code modification is required
embedding_deployment_name =
embedding_model_name = text-embedding-ada-002
embedding_azure_endpoint =
embedding_openai_api_key =
embedding_openai_api_version =

# Speech API Key (Free credits available with new Eleven Labs Account)
ELEVEN_LABS_API_KEY =

# Web Search API Key (Free credits available with new Tavily Account)
TAVILY_API_KEY =

# Hugging Face Token - using reranker model "ms-marco-TinyBERT-L-6"
HUGGINGFACE_TOKEN =

# (OPTIONAL) If using Qdrant server version, local does not require API key
QDRANT_URL =
QDRANT_API_KEY =
```
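As the Warning above notes, missing env variables surface as console errors at runtime. One way to fail fast with a clear message is a validation pattern like the following; the variable names mirror the template above, but the function itself is an illustrative sketch, not the repository's actual config.py:

```python
import os

# Names taken from the .env template; extend as needed for your setup.
REQUIRED_VARS = [
    "deployment_name", "model_name", "azure_endpoint",
    "openai_api_key", "openai_api_version",
    "ELEVEN_LABS_API_KEY", "TAVILY_API_KEY",
]

def load_settings():
    """Raise one clear error listing every missing variable, instead of
    failing later with a cryptic runtime exception."""
    missing = [v for v in REQUIRED_VARS if not os.getenv(v)]
    if missing:
        raise EnvironmentError(
            f"Missing required env variables: {', '.join(missing)}"
        )
    return {v: os.getenv(v) for v in REQUIRED_VARS}
```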
```bash
docker-compose up -d
```
This will start two services:
- `fastapi-backend`: Runs the FastAPI backend on port 8000
- `main-app`: Runs the main application (`app.py`)
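The repository ships its own docker-compose.yml; purely for orientation, a file wiring these two services together might look roughly like this (the build contexts, commands, and port mapping are assumptions based on the description above, not the actual file):

```yaml
# Illustrative sketch only - see the repository's docker-compose.yml
services:
  fastapi-backend:
    build: .
    command: uvicorn api.fastapi_backend:app --host 0.0.0.0 --port 8000
    ports:
      - "8000:8000"
    env_file: .env
  main-app:
    build: .
    command: python app.py
    depends_on:
      - fastapi-backend
    env_file: .env
```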
```bash
# To ingest a single file
docker-compose run --rm fastapi-backend ingest --file ./data/raw/your_file.pdf

# To ingest all files in a directory
docker-compose run --rm fastapi-backend ingest --dir ./data/raw
```
The application will be available at: http://localhost:8000
```bash
docker-compose down
```
```bash
git clone https://github.com/souvikmajumder26/Multi-Agent-Medical-Assistant.git
cd Multi-Agent-Medical-Assistant
```
- If using conda:

```bash
conda create --name <environment-name> python=3.11
conda activate <environment-name>
```

- If using python venv:

```bash
python -m venv <environment-name>
source <environment-name>/bin/activate   # For Mac/Linux
<environment-name>\Scripts\activate      # For Windows
```
Important
- ffmpeg is required for speech service to work.
- Poppler and Tesseract OCR are essential for table extraction from PDFs using Unstructured.IO.
- To install poppler and tesseract OCR for Ubuntu/Debian/macOS:
```bash
# if on Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y poppler-utils tesseract-ocr

# if on macOS
brew install poppler tesseract
```
- Install Poppler for Windows:
  - Download the latest Poppler release for Windows from: https://github.com/oschwartz10612/poppler-windows/releases/
  - Extract the ZIP file to a location on your computer (e.g., 'C:\Program Files\poppler')
  - Add the bin directory to your PATH environment variable (e.g., 'C:\Program Files\poppler\bin')
- Install Tesseract OCR for Windows:
  - Download the Tesseract installer from: https://github.com/UB-Mannheim/tesseract/wiki
  - Run the installer and complete the installation; by default, it installs to 'C:\Program Files\Tesseract-OCR'
  - Make sure to add it to your PATH during installation, or add it manually afterward
- Verify your installation:
  - Open a new command prompt (to ensure it has the updated PATH)
  - Run 'tesseract --version' to verify Tesseract is properly installed
  - Run 'pdfinfo -h' or 'pdftoppm -h' to verify Poppler is properly installed
- If using conda:

```bash
conda install -c conda-forge ffmpeg
pip install -r requirements.txt
```

- If using python venv:

```bash
winget install ffmpeg
pip install -r requirements.txt
```

- Might be required:

```bash
pip install "unstructured[pdf]"
```
- Create a `.env` file and add the required API keys as shown in Option 1.
- Run the following commands one after another in separate terminal windows, using the same directory and virtual environment. Keep both running simultaneously.
```bash
uvicorn api.fastapi_backend:app --reload
python app.py
```
- Run either of the following commands as required: the first ingests one document at a time, the second ingests all documents from a directory.
```bash
python ingest_rag_data.py --file ./data/raw/brain_tumors_ucni.pdf
python ingest_rag_data.py --dir ./data/raw
```
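The ingestion script's command-line interface follows a common argparse pattern. Here is an illustrative sketch of it: the `--file`/`--dir` flags and supported extensions match the usage above, but `collect_paths` and the parser details are assumptions, not the actual ingest_rag_data.py:

```python
import argparse
from pathlib import Path

def build_parser():
    """CLI mirroring the two ingestion modes: single file or whole directory."""
    parser = argparse.ArgumentParser(
        description="Ingest documents into the RAG vector store."
    )
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--file", type=Path, help="Path to a single document")
    group.add_argument("--dir", type=Path, help="Directory of documents to ingest")
    return parser

def collect_paths(args):
    """Resolve the CLI arguments to a list of ingestible file paths."""
    supported = {".txt", ".csv", ".json", ".pdf"}  # from the supported-types list
    if args.file:
        return [args.file]
    return [p for p in sorted(args.dir.iterdir()) if p.suffix.lower() in supported]
```

The mutually exclusive group ensures exactly one of the two modes is chosen, so `python ingest_rag_data.py` with neither flag (or with both) exits with a usage error.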
The vector database data is stored in Docker volumes:
- `vector-db-processed`: Contains data from the `data/processed` directory
- `vector-db-qdrant`: Contains data from the `data/qdrantdb` directory
- `upload-data`: Contains uploaded files in the `uploads` directory
This ensures your data persists even if you remove the containers.
- If the containers aren't starting properly, check the logs:

```bash
docker-compose logs fastapi-backend
docker-compose logs main-app
```
- Make sure all required environment variables are set in the `.env` file
- To completely clean up and restart:

```bash
docker-compose down -v
docker-compose up -d --build
```
Note

- The first run may be slow and throw errors - be patient and check the console for ongoing downloads and installations.
- On the first run, several models are downloaded: the YOLO layout model used alongside Tesseract OCR, the computer vision agent models, the cross-encoder reranker model, etc.
- Once the downloads complete, retry. Everything should then work seamlessly, as all of it is thoroughly tested.
- Upload medical images for AI-based diagnosis by the task-specific Computer Vision agents - try it out with images from the 'sample_images' folder.
- Ask medical queries: the assistant answers via retrieval-augmented generation (RAG) when the information is in its knowledge base, or via web search to retrieve the latest information.
- Use voice-based interaction (speech-to-text and text-to-speech).
- Review AI-generated insights with human-in-the-loop verification.
Contributions are welcome! Please check the issues tab for feature requests and improvements.
This project is licensed under the Apache-2.0 License. See the LICENSE file for details.
```bibtex
@misc{Souvik2025,
  Author = {Souvik Majumder},
  Title = {Multi Agent Medical Assistant},
  Year = {2025},
  Publisher = {GitHub},
  Journal = {GitHub repository},
  Howpublished = {\url{https://github.com/souvikmajumder26/Multi-Agent-Medical-Assistant}}
}
```
For any questions or collaboration inquiries, reach out to Souvik Majumder on:
🔗 LinkedIn: https://www.linkedin.com/in/souvikmajumder26
🔗 GitHub: https://github.com/souvikmajumder26