Skip to content

Latest commit

 

History

History
124 lines (105 loc) · 3.67 KB

TESTING.md

File metadata and controls

124 lines (105 loc) · 3.67 KB

Testing Documentation - Single Agent RAG Router

Overview

This document outlines the testing performed on the Single Agent RAG Router system to validate its core functionalities and demonstrate its capabilities. The system successfully implements a single-agent RAG architecture with routing capabilities to different tools based on query type.

Test Environment

  • Infrastructure: Docker containers
  • Services:
    • Qdrant (Vector Database)
    • Ollama (LLM Service)
    • FastAPI (Main Application)
  • Model: Llama2 (via Ollama)

Functional Testing

1. Query Routing

The system was tested with different types of queries to verify correct routing behavior:

Web Search Query

Query: "What are the latest developments in AI?"
Result:
- Correctly identified as requiring current information
- Routed to web search tool
- Successfully retrieved and synthesized current information
- Query Type: WEB_SEARCH
- Confidence: 0.9

Knowledge Retrieval Query

Query: "Can you tell me what is RAG and how it works?"
Result:
- Correctly identified as requiring knowledge base access
- Retrieved relevant information from technical collection
- Provided accurate explanation based on retrieved context
- Query Type: RETRIEVAL
- Confidence: 0.8

Direct Query

Query: "Can you tell me a funny joke?"
Result:
- Correctly identified as requiring direct LLM response
- Routed to direct response without retrieval
- Generated appropriate humorous response
- Query Type: DIRECT
- Confidence: 1.0

Calculation Query

Query: "What is 15*25?"
Result:
- Correctly identified as mathematical operation
- Routed to calculator tool
- Provided accurate calculation: 375
- Query Type: CALCULATION
- Confidence: 0.95

System Components Validation

Vector Store Integration

  • Successfully initialized technical and business collections
  • Proper embedding generation and storage
  • Effective similarity search functionality
  • Appropriate context retrieval

LLM Integration

  • Successful model initialization
  • Proper prompt handling
  • Consistent response generation
  • Appropriate context utilization

Tool Integration

  • Web Search: Successful DuckDuckGo integration
  • Calculator: Accurate mathematical operations
  • Vector Search: Proper collection routing and search
  • Direct Response: Appropriate handling of straightforward queries

Performance Observations

Response times were measured for different query types:

  • Web Search Queries: ~160-170s
  • Knowledge Retrieval: ~90-100s
  • Direct Queries: ~60s
  • Calculation Queries: ~45-50s

Note: These times are from a containerized environment running on CPU. Production deployment with GPU support would show significantly improved performance.

Error Handling

The system demonstrates robust error handling:

  • Graceful handling of service unavailability
  • Proper fallback mechanisms
  • Clear error messages
  • Request validation
  • Input sanitization

API Endpoints

All API endpoints were tested and verified:

  • /query: Main query endpoint
  • /health: System health check
  • /stores: Vector store management
  • /search/{store_name}: Direct store search
  • /upload/files: Document upload functionality
  • /upload/text: Text content upload

Future Improvements

While the PoC successfully demonstrates the core functionality, several areas could be enhanced:

  1. Response time optimization
  2. Query planning sophistication
  3. Enhanced caching mechanisms
  4. More comprehensive tool integration
  5. Advanced query decomposition

Conclusion

The Single Agent RAG Router PoC successfully demonstrates:

  1. Proper implementation of agentic RAG architecture
  2. Effective query routing
  3. Tool integration and orchestration
  4. Robust error handling
  5. Scalable design