A sophisticated, enterprise-grade RAG (Retrieval-Augmented Generation) customer service system for electronics retail. It combines structured database queries with vector-based document retrieval to deliver intelligent, contextual responses in both Arabic and English, with real-time streaming capabilities.
This application implements a hybrid RAG architecture that intelligently combines multiple data sources:
- Structured Data Retrieval: Real-time product, service, and store information from PostgreSQL
- Unstructured Data Search: Document-based knowledge retrieval using Pinecone vector database
- AI Generation: GPT-4 powered response synthesis with streaming support
- Multilingual Processing: Native Arabic and English language detection and response generation
```mermaid
graph TB
    A[User Query] --> B[Language Detection]
    B --> C[Query Analysis & Intent Recognition]
    C --> D[Structured Data Query]
    C --> E[Vector Search]
    D --> F[PostgreSQL Database]
    E --> G[Pinecone Vector Store]
    F --> H[RAG Response Generator]
    G --> H
    H --> I[GPT-4 Synthesis]
    I --> J[Streaming Response]
    J --> K[Vue.js Frontend]
```
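The two retrieval paths in the diagram run in parallel before synthesis. A minimal sketch of that orchestration with `asyncio` (the `query_structured`, `search_vectors`, and `synthesize` names are illustrative placeholders, not the application's actual modules):

```python
import asyncio

async def query_structured(query: str) -> dict:
    # Placeholder for the PostgreSQL lookup (products, services, store info).
    return {"products": [f"rows matching {query!r}"]}

async def search_vectors(query: str) -> list:
    # Placeholder for the Pinecone similarity search.
    return [f"chunk similar to {query!r}"]

async def synthesize(structured: dict, unstructured: list) -> str:
    # Placeholder for the GPT-4 synthesis step.
    return f"answer built from {len(structured)} table(s) and {len(unstructured)} chunk(s)"

async def answer(query: str) -> str:
    # Run both retrieval paths concurrently, then fuse the results.
    structured, unstructured = await asyncio.gather(
        query_structured(query), search_vectors(query)
    )
    return await synthesize(structured, unstructured)

print(asyncio.run(answer("iPhone 15 price")))
```

The key design point is `asyncio.gather`: the database round-trip and the vector search do not depend on each other, so their latencies overlap instead of adding up.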
- Framework: FastAPI (Python 3.11+)
- AI/ML: OpenAI GPT-4, OpenAI Embeddings (text-embedding-3-large)
- Vector Database: Pinecone (serverless, cosine similarity)
- Traditional Database: PostgreSQL 16+ with SQLAlchemy ORM
- Caching: Redis (session management, optional)
- Document Processing: PyPDF, LangChain text splitters
- Framework: Vue.js 3 with Composition API
- Build Tool: Vite
- State Management: Pinia
- HTTP Client: Axios with interceptors
- Real-time Features: Server-Sent Events (SSE) for streaming
- Containerization: Docker & Docker Compose
- Database Migrations: SQLAlchemy with Alembic
- Environment Management: python-dotenv
- API Documentation: FastAPI automatic OpenAPI/Swagger
- Hybrid Retrieval: Combines structured database queries with vector similarity search
- Intent Recognition: GPT-4 powered query analysis and entity extraction
- Context Fusion: Intelligent merging of multiple data sources for comprehensive responses
- Confidence Scoring: Multi-factor confidence assessment based on data quality and retrieval success
- Token-level Streaming: Server-Sent Events (SSE) for real-time response generation
- Progressive Enhancement: Responses build progressively with typing indicators
- Stream Error Handling: Graceful fallback mechanisms for connection issues
- Language Detection: Automatic Arabic/English detection with customizable thresholds
- Bilingual Responses: Context-aware language selection for responses
- RTL Support: Full right-to-left text support in frontend
- Product Catalog: Complete inventory with specifications, pricing, and availability
- Service Offerings: Installation, support, and warranty services
- Supplier Relations: Supply chain and procurement information
- Document Intelligence: PDF ingestion with intelligent chunking and embedding
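The Arabic/English detection can be sketched as a Unicode-range heuristic (a simplification: the real system's thresholds are configurable, and the `0.3` default here is an illustrative assumption, not the shipped value):

```python
def detect_language(text: str, arabic_threshold: float = 0.3) -> str:
    """Classify text as 'ar' or 'en' by the share of Arabic letters.

    arabic_threshold is an illustrative default, not the shipped setting.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "en"  # fall back to English for empty or symbolic input
    # Arabic block U+0600-U+06FF plus Arabic Supplement U+0750-U+077F
    arabic = sum(
        1 for ch in letters
        if "\u0600" <= ch <= "\u06ff" or "\u0750" <= ch <= "\u077f"
    )
    return "ar" if arabic / len(letters) >= arabic_threshold else "en"

print(detect_language("ما سعر آيفون 15؟"))   # 'ar'
print(detect_language("What is the price?"))  # 'en'
```

Counting only alphabetic characters keeps digits and punctuation (which are shared between scripts) from diluting the ratio.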
- Python 3.11+
- Node.js 18+
- Docker & Docker Compose
- OpenAI API key
- Pinecone API key
git clone <repository-url>
cd techmart-ai-assistant
# Copy environment template
cp .env.example .env
# Edit .env with your API keys
OPENAI_API_KEY=your_openai_api_key_here
PINECONE_API_KEY=your_pinecone_api_key_here
PINECONE_CLOUD=aws
PINECONE_REGION=us-east-1# Start PostgreSQL and Redis
docker-compose up postgres redis -d
# Wait for databases to be ready
sleep 10# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Seed database with sample data
python scripts/database/seed_database.py
# Start FastAPI server
uvicorn app.main:app --reload# Navigate to frontend directory
cd frontend/store-assistant-ui
# Install dependencies
npm install
# Start development server
npm run dev- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- Language Detection: Automatic Arabic/English identification
- Intent Analysis: GPT-4 powered query understanding and entity extraction
- Dual Retrieval: Parallel structured and unstructured data retrieval
- Response Synthesis: Context-aware response generation with streaming
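The streaming in the synthesis step is delivered as Server-Sent Events. A minimal sketch of the event framing (the `delta`/`done` payload shape is an assumption for illustration, not the application's actual wire format):

```python
import json
from typing import Iterator

def sse_events(tokens: Iterator[str]) -> Iterator[str]:
    """Frame model tokens as Server-Sent Events.

    Each event is 'data: <json>\n\n'; a client EventSource handler appends
    each delta to the visible answer as it arrives.
    """
    for token in tokens:
        yield f"data: {json.dumps({'delta': token})}\n\n"
    # Signal completion so the client can close the connection cleanly.
    yield f"data: {json.dumps({'done': True})}\n\n"

for event in sse_events(iter(["Hello", ", ", "world"])):
    print(event, end="")
```

In FastAPI, a generator like this is typically wrapped in `StreamingResponse(gen, media_type="text/event-stream")` so tokens reach the browser as soon as the model emits them.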
```python
# Structured data from PostgreSQL
products_found = await query_products(entities, intent, db)
services_found = await query_services(entities, db)
store_info = await query_store_info(db)

# Unstructured data from Pinecone
vector_results = await vector_service.search_similar(
    query_text=enhanced_query,
    top_k=max_results,
    filter_dict=metadata_filter,
)

# Hybrid response generation
response = await generate_hybrid_response(
    structured_data, unstructured_data, query_analysis
)
```

- Embedding Model: OpenAI text-embedding-3-large (1024 dimensions)
- Similarity Metric: Cosine similarity
- Metadata Fields: source, language, category, document_type, chunk_index
- Chunking Strategy: RecursiveCharacterTextSplitter with overlap
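The overlap-based chunking can be sketched without LangChain as a sliding window (an illustrative simplification of `RecursiveCharacterTextSplitter`, which additionally prefers paragraph and sentence boundaries when choosing split points):

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 150) -> list[str]:
    """Split text into fixed-size chunks whose tails overlap.

    The overlap keeps sentences that straddle a boundary retrievable from
    both neighbouring chunks. Defaults mirror the CHUNK_SIZE and
    CHUNK_OVERLAP settings in the configuration section.
    """
    step = chunk_size - chunk_overlap
    return [
        text[i:i + chunk_size]
        for i in range(0, max(len(text) - chunk_overlap, 1), step)
    ]

chunks = chunk_text("x" * 2500)
print(len(chunks), [len(c) for c in chunks])  # 3 [1000, 1000, 800]
```

With a 1000-character window and 150-character overlap, each chunk starts 850 characters after the previous one, so the last 150 characters of one chunk reappear at the head of the next.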
```http
POST /channels/webchat/message
Content-Type: application/json

{
  "text": "What's the price of iPhone 15?",
  "session_id": "optional-session-id",
  "locale": "en"
}
```

```http
POST /channels/webchat/message/stream
Content-Type: application/json

{
  "text": "Tell me about your products",
  "session_id": "optional-session-id",
  "locale": "auto"
}
```

```http
POST /documents/upload
Content-Type: multipart/form-data

file: [PDF file]
```

- Products: SKU, specifications, pricing, inventory
- Services: Installation, support, warranty offerings
- Suppliers: Vendor information and relationships
- Documents: Metadata for ingested documents
- Conversations: Chat history and analytics
The system includes comprehensive sample data for a TechMart electronics store:
- 9 products across smartphones, laptops, home appliances
- 7 service offerings including installation and support
- 5 suppliers with realistic contact information
- Store location with operating hours and delivery zones
```bash
# Test Pinecone integration
python scripts/testing/test_pinecone.py

# Test document ingestion
python scripts/testing/test_document_ingestion.py

# Test complete RAG pipeline
python scripts/testing/test_rag_service.py

# Test web chat functionality
python scripts/testing/test_web_chat.py
```

- Arabic Language Debug: `scripts/testing/debug_arabic.py`
- RAG Step-by-Step Debug: `scripts/testing/debug_rag_issue.py`
- API Upload Testing: `scripts/testing/test_upload_api.py`
```bash
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here
MAX_TOKENS=4000

# Pinecone Configuration
PINECONE_API_KEY=your_pinecone_api_key_here
PINECONE_CLOUD=aws
PINECONE_REGION=us-east-1
PINECONE_INDEX=store-assistant
EMBED_DIM=1024

# Database Configuration
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/store_assistant
REDIS_URL=redis://localhost:6379/0

# RAG Settings
CHUNK_SIZE=1000
CHUNK_OVERLAP=150
SIMILARITY_THRESHOLD=0.75
```

```bash
# Production deployment
docker-compose up -d

# Access services
# PostgreSQL: localhost:5432
# Redis: localhost:6379
# PgAdmin: localhost:5050
```

- Vector Search: Pinecone serverless with auto-scaling
- Database: PostgreSQL with proper indexing on products and services
- Caching: Redis for session management and frequent queries
- Streaming: Chunked response delivery for improved UX
- Confidence Scoring: Multi-factor assessment of response quality
- Query Analytics: Intent classification and response time tracking
- Error Handling: Graceful fallbacks with detailed logging
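The multi-factor confidence assessment can be sketched as a weighted combination of retrieval signals (the factors and weights here are illustrative assumptions, not the application's shipped formula):

```python
def confidence_score(
    structured_hits: int,
    top_similarity: float,
    retrieval_ok: bool,
    weights: tuple[float, float, float] = (0.4, 0.4, 0.2),
) -> float:
    """Combine retrieval signals into a 0-1 confidence value.

    Factors (all illustrative): whether any structured rows were found,
    the best vector-similarity score, and whether both retrieval paths
    completed without error.
    """
    w_struct, w_sim, w_ok = weights
    score = (
        w_struct * (1.0 if structured_hits > 0 else 0.0)
        + w_sim * max(0.0, min(top_similarity, 1.0))  # clamp to [0, 1]
        + w_ok * (1.0 if retrieval_ok else 0.0)
    )
    return round(score, 3)

print(confidence_score(structured_hits=3, top_similarity=0.82, retrieval_ok=True))  # 0.928
```

A score like this can gate behaviour downstream, e.g. adding a hedging phrase to the generated answer when confidence falls below some cutoff.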
- Compute: 2+ CPU cores, 4GB+ RAM
- Storage: 20GB+ for databases and logs
- Network: HTTPS with proper CORS configuration
- Monitoring: Application logs and performance metrics
- API Keys: Secure storage of OpenAI and Pinecone credentials
- Database: Encrypted connections and proper access controls
- CORS: Configured for production domains
- Rate Limiting: Implemented at API gateway level
- Vector Search Returns No Results
  - Check Pinecone index status and API credits
  - Verify embedding model consistency
  - Review similarity threshold settings
- Arabic Language Detection Issues
  - Adjust language detection thresholds
  - Check system prompt Arabic content
  - Verify UTF-8 encoding support
- Database Connection Errors
  - Ensure PostgreSQL is running
  - Check connection string format
  - Verify network accessibility
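When vector search returns nothing, the similarity cutoff is often the culprit. A pure-Python sketch of how a `SIMILARITY_THRESHOLD`-style filter discards matches (illustrative; Pinecone applies cosine scoring server-side):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filter_matches(scores: dict[str, float], threshold: float = 0.75) -> dict[str, float]:
    """Keep only matches at or above the similarity threshold.

    Lowering the threshold admits more, but noisier, chunks; this is the
    first knob to try when queries come back empty.
    """
    return {doc: s for doc, s in scores.items() if s >= threshold}

scores = {"manual.pdf#3": 0.81, "faq.pdf#7": 0.74, "warranty.pdf#1": 0.62}
print(filter_matches(scores))       # only manual.pdf#3 survives at 0.75
print(filter_matches(scores, 0.6))  # all three survive
```

Note how a chunk scoring 0.74 is silently dropped at the default 0.75 cutoff: near-miss scores like this are what the "no results" symptom usually looks like in practice.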
```bash
# Check system status
python scripts/testing/debug_rag_issue.py

# Test language detection
python scripts/testing/debug_arabic.py

# Validate API endpoints
python scripts/testing/test_webhook_fix.py
```

This project is licensed under the MIT License - see the LICENSE file for details.