3 changes: 2 additions & 1 deletion .cursor/rules/frontend-rule.mdc
@@ -10,4 +10,5 @@ alwaysApply: false
- When asking the user for sensitive information, you must use password-style text-entry boxes in the UI.
- You should use Next.js as it works best with Vercel.
- This frontend will ultimately be deployed on Vercel, but it should be possible to test locally.
- Always provide users with a way to run the created UI once you have created it.
- I want the theme colors to be #9E72C3, #924DBF, #7338A0, #4A2574, #0F0529
7 changes: 5 additions & 2 deletions .cursor/rules/general-rule.mdc
@@ -5,7 +5,10 @@ alwaysApply: true
---
## Rules to Follow

- You must always commit your changes whenever you update code.
- You must always try to write code that is well documented (self-documenting or commented is fine).
- You must only work on a single feature at a time.
- You must explain your decisions thoroughly to the user.

You always prefer branch-based development. Before writing any code, you create a feature branch to hold those changes.

After you are done, provide instructions in a "MERGE.md" file explaining how to merge the changes back to main with both a GitHub PR route and a GitHub CLI route.
141 changes: 141 additions & 0 deletions .gitignore
@@ -163,3 +163,144 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.vercel

# =============================================================================
# Node.js / Frontend Dependencies
# =============================================================================

# Dependency directories
node_modules/
jspm_packages/

# NPM package lock files (keep package-lock.json for reproducible builds)
# Uncomment the line below if you want to ignore package-lock.json
# package-lock.json

# Yarn lock file (if using Yarn)
yarn.lock

# PNPM lock file (if using PNPM)
pnpm-lock.yaml

# NPM debug logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Coverage directory used by tools like istanbul
coverage/
*.lcov

# nyc test coverage
.nyc_output

# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# Node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Snowpack dependency directory (https://snowpack.dev/)
web_modules/

# TypeScript cache
*.tsbuildinfo

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Microbundle cache
.rpt2_cache/
.rts2_cache_cjs/
.rts2_cache_es/
.rts2_cache_umd/

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# parcel-bundler cache (https://parceljs.org/)
.cache
.parcel-cache

# Next.js build output
.next/
out/

# Nuxt.js build / generate output
.nuxt
dist

# Gatsby files
.cache/
public

# Vuepress build output
.vuepress/dist

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# TernJS port file
.tern-port

# Stores VSCode versions used for testing VSCode extensions
.vscode-test

# Temporary folders
tmp/
temp/

# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# Editor directories and files
.vscode/
.idea/
*.swp
*.swo
*~

# =============================================================================
# Vercel
# =============================================================================
.vercel
*.vercel.app
.vercel/
161 changes: 161 additions & 0 deletions MERGE.md
@@ -0,0 +1,161 @@
# RAG PDF Chat Implementation - Merge Instructions

## Overview
This feature branch implements a complete transformation of the chat application into a RAG (Retrieval-Augmented Generation) system for PDF question-answering.

## Changes Made

### Backend (`api/app.py`)
- **Complete rewrite** from a simple OpenAI chat to a RAG system
- **New endpoints**:
- `POST /api/upload-pdf` - Upload and store PDF files
- `POST /api/chat` - RAG-based question answering
- `GET /api/status` - Check PDF upload and processing status
- `DELETE /api/pdf` - Clear current PDF
- **New dependencies**: PyPDF2, numpy, python-dotenv
- **RAG implementation** using aimakerspace library components (sketched after this list):
- PDFLoader for text extraction
- CharacterTextSplitter for chunking (1000 chars, 200 overlap)
- VectorDatabase with OpenAI embeddings
- Context-only response system
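
A minimal sketch of how these pieces could fit together at ingestion time. The module paths and method names (`PDFLoader`, `CharacterTextSplitter.split_texts`, `VectorDatabase.abuild_from_list`) are assumed from aimakerspace conventions and may not match `api/app.py` exactly:

```python
# Sketch only: module paths and method names are assumed, not copied from api/app.py.
import asyncio

from aimakerspace.text_utils import CharacterTextSplitter, PDFLoader
from aimakerspace.vectordatabase import VectorDatabase


async def build_index(pdf_path: str) -> VectorDatabase:
    # 1. Extract raw text from the uploaded PDF.
    documents = PDFLoader(pdf_path).load_documents()

    # 2. Split into overlapping chunks (1000 characters, 200 overlap).
    splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_texts(documents)

    # 3. Embed each chunk with OpenAI embeddings and store them in the vector DB.
    vector_db = VectorDatabase()
    return await vector_db.abuild_from_list(chunks)


if __name__ == "__main__":
    asyncio.run(build_index("example.pdf"))
```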

### Frontend (`frontend/app/components/ChatInterface.tsx`)
- **Complete interface redesign** for PDF-based chat
- **New features**:
- Drag-and-drop PDF upload area
- PDF status indicator with chunk count
- Single question input (replaced dual developer/user inputs)
- Visual processing indicators
- Source count display for answers
- **Enhanced UX**:
- Real-time status updates
- Upload progress indicators
- Context-aware placeholders and help text

### Dependencies (`api/requirements.txt`)
- Added PyPDF2==3.0.1
- Added numpy==1.24.3
- Added python-dotenv==1.0.0

## Key Features Implemented

### ✅ RAG System
- PDF content is chunked and embedded using OpenAI embeddings
- Vector similarity search retrieves top 3 relevant chunks per question
- The LLM responds based only on the context retrieved from the PDF

### ✅ Context-Only Responses
- Strict prompt engineering ensures answers come only from PDF content
- Returns "I am not sure." when no relevant context is found
- No general knowledge responses allowed (see the prompt sketch below)
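
The exact prompt text lives in `api/app.py` and is not reproduced in this file, but the intent can be illustrated with a hypothetical template like this:

```python
# Illustrative only: the real prompt wording in api/app.py may differ.
RAG_SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer ONLY using the provided context. "
    "If the context does not contain the answer, reply exactly: I am not sure."
)


def build_messages(context_chunks: list[str], question: str) -> list[dict]:
    """Assemble a context-only chat prompt from retrieved chunks."""
    context = "\n\n".join(context_chunks)
    return [
        {"role": "system", "content": RAG_SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```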

### ✅ PDF Management
- Single-PDF replacement system (uploading a new PDF replaces the previous one)
- Temporary in-memory storage during session
- Lazy loading (PDF processed on first question)

### ✅ Professional UI/UX
- Clean, modern interface with visual status indicators
- Drag-and-drop upload with file validation
- Real-time processing status
- Source attribution in responses

## Merge Instructions

### Option 1: GitHub Pull Request
1. Push the feature branch to GitHub:
```bash
git push origin feature/rag-pdf-chat
```

2. Create a Pull Request:
- Go to your GitHub repository
- Click "New Pull Request"
- Select `feature/rag-pdf-chat` → `main`
- Title: "feat: Transform chat app into RAG PDF question-answering system"
- Add description with overview of changes
- Review changes and merge when ready

### Option 2: GitHub CLI
1. Push the feature branch:
```bash
git push origin feature/rag-pdf-chat
```

2. Create and merge PR using GitHub CLI:
```bash
# Create PR
gh pr create --title "feat: Transform chat app into RAG PDF question-answering system" \
--body "Implements complete RAG system for PDF-based Q&A using aimakerspace library. See MERGE.md for details." \
--base main --head feature/rag-pdf-chat

# Review and merge (optional)
gh pr merge --merge --delete-branch
```

### Option 3: Direct Git Merge (Local)
```bash
# Switch to main branch
git checkout main

# Merge feature branch
git merge feature/rag-pdf-chat

# Push to origin
git push origin main

# Clean up feature branch (optional)
git branch -d feature/rag-pdf-chat
git push origin --delete feature/rag-pdf-chat
```

## Testing the Implementation

### 1. Backend Setup
```bash
cd api
pip install -r requirements.txt
python app.py
```

### 2. Frontend Setup
```bash
cd frontend
npm install
npm run dev
```

### 3. Testing Workflow
1. Set OpenAI API key
2. Upload a PDF file (drag-and-drop or click to browse)
3. Wait for processing (status will show "Ready for questions")
4. Ask questions about the PDF content
5. Verify responses are context-only

### 4. Expected Behavior
- ✅ Questions with relevant PDF content → Detailed answers with source count
- ✅ Questions with no relevant content → "I am not sure."
- ✅ General knowledge questions → "I am not sure."

## Architecture Notes

### Data Flow
1. **PDF Upload** → Extract text → Split into chunks → Generate embeddings → Store in vector DB
2. **User Question** → Generate question embedding → Search vector DB → Retrieve top 3 chunks → Generate context-aware response (sketched below)
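
Assuming the aimakerspace `VectorDatabase` exposes a `search_by_text(query, k, return_as_text=...)` helper (an assumption, not confirmed by this PR) and reusing the hypothetical `build_messages` from the prompt sketch above, the question path looks roughly like:

```python
# Rough sketch of the question path; search_by_text is an assumed helper name.
def answer_question(vector_db, chat_model, question: str) -> str:
    # Embed the question and retrieve the 3 most similar chunks.
    top_chunks = vector_db.search_by_text(question, k=3, return_as_text=True)

    # Build the context-only prompt (build_messages from the sketch above) and ask the model.
    messages = build_messages(top_chunks, question)
    return chat_model.run(messages)
```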

### Security Considerations
- User-provided OpenAI API keys (no server-side key storage)
- File size limit: 50MB
- PDF-only file validation
- Temporary in-memory storage (no persistent file storage)

### Performance
- Lazy PDF processing (only on first question)
- Efficient vector similarity search
- Optimized chunk size for embedding context

---

**Branch**: `feature/rag-pdf-chat`
**Ready for merge**: ✅ All tests passing, no linting errors
**Breaking changes**: ⚠️ Complete API and UI redesign - not backward compatible
Empty file added aimakerspace/__init__.py
Empty file.
Empty file.
66 changes: 66 additions & 0 deletions aimakerspace/openai_utils/chatmodel.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
import os
from typing import Any, AsyncIterator, Iterable, List, MutableMapping

from dotenv import load_dotenv
from openai import AsyncOpenAI, OpenAI

load_dotenv()

ChatMessage = MutableMapping[str, Any]


class ChatOpenAI:
    """Thin wrapper around the OpenAI chat completion APIs."""

    def __init__(self, model_name: str = "gpt-4o-mini"):
        self.model_name = model_name
        self.openai_api_key = os.getenv("OPENAI_API_KEY")
        if self.openai_api_key is None:
            raise ValueError("OPENAI_API_KEY is not set")

        self._client = OpenAI()
        self._async_client = AsyncOpenAI()

    def run(
        self,
        messages: Iterable[ChatMessage],
        text_only: bool = True,
        **kwargs: Any,
    ) -> Any:
        """Execute a chat completion request.

        ``messages`` must be an iterable of ``{"role": ..., "content": ...}``
        dictionaries. When ``text_only`` is ``True`` (the default) only the
        completion text is returned; otherwise the full response object is
        provided.
        """

        message_list = self._coerce_messages(messages)
        response = self._client.chat.completions.create(
            model=self.model_name, messages=message_list, **kwargs
        )

        if text_only:
            return response.choices[0].message.content

        return response

    async def astream(
        self, messages: Iterable[ChatMessage], **kwargs: Any
    ) -> AsyncIterator[str]:
        """Yield streaming completion chunks as they arrive from the API."""

        message_list = self._coerce_messages(messages)
        stream = await self._async_client.chat.completions.create(
            model=self.model_name, messages=message_list, stream=True, **kwargs
        )

        async for chunk in stream:
            content = chunk.choices[0].delta.content
            if content is not None:
                yield content

    def _coerce_messages(self, messages: Iterable[ChatMessage]) -> List[ChatMessage]:
        if isinstance(messages, list):
            return messages
        return list(messages)
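
For reference, a minimal usage sketch of this wrapper based on the `run` and `astream` signatures above (it assumes `OPENAI_API_KEY` is set in the environment, as required by the constructor):

```python
# Minimal usage sketch for ChatOpenAI, based on the signatures shown above.
import asyncio

from aimakerspace.openai_utils.chatmodel import ChatOpenAI

chat = ChatOpenAI(model_name="gpt-4o-mini")

# Synchronous call: returns only the completion text by default (text_only=True).
print(chat.run([{"role": "user", "content": "Say hello in one word."}]))


async def main() -> None:
    # Streaming call: yields text chunks as they arrive from the API.
    async for token in chat.astream([{"role": "user", "content": "Count to three."}]):
        print(token, end="", flush=True)


asyncio.run(main())
```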