An experimental multi-provider AI agent system built to learn async patterns, API integration, routing, and conversation memory.
💡 Seeking Microgrant: $400 to upgrade memory from TF-IDF to semantic embeddings.
See funding details →
I wanted to understand how AI agents actually work under the hood, not just call a single API.
This project began as a small "connect to Ollama" script and grew into a six-month learning journey involving async orchestration, memory, routing, and safe error handling.
Multi-Provider Support
- Ollama (local), Google Gemini, Perplexity, Azure OpenAI
- Async calls with a fallback chain (Gemini → Perplexity → Ollama); see the sketch after this list
- Basic rate-limit and timeout handling
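A minimal sketch of the async fallback pattern, assuming aiohttp; the endpoint URL, payload shape, and provider list are illustrative placeholders, not the project's actual API handler.

```python
import asyncio
import aiohttp

# Illustrative call order; each real provider has its own endpoint, auth, and payload.
PROVIDERS = ["gemini", "perplexity", "ollama"]

async def call_provider(session: aiohttp.ClientSession, name: str, prompt: str) -> str:
    # Placeholder endpoint; the real project wires up provider-specific clients.
    async with session.post(
        f"https://example.invalid/{name}",
        json={"prompt": prompt},
        timeout=aiohttp.ClientTimeout(total=30),
    ) as resp:
        resp.raise_for_status()
        data = await resp.json()
        return data["text"]

async def ask_with_fallback(prompt: str) -> str:
    """Try each provider in order; return the first successful response."""
    async with aiohttp.ClientSession() as session:
        last_error: Exception | None = None
        for name in PROVIDERS:
            try:
                return await call_provider(session, name, prompt)
            except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
                last_error = exc  # fall through to the next provider
        raise RuntimeError(f"All providers failed: {last_error!r}")
```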
Conversation Memory
- TF-IDF keyword search (no embeddings yet)
- Persistent memory via pickle
- Hybrid retrieval (TF-IDF keyword match + recent history); see the sketch after this list
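A minimal sketch of TF-IDF retrieval over pickled history, assuming scikit-learn; `KeywordMemory` and its file path are illustrative names, not the project's actual classes, and the recent-history half of the hybrid retrieval is omitted for brevity.

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class KeywordMemory:
    """Persist conversation turns to a pickle file and retrieve them by TF-IDF similarity."""

    def __init__(self, path: str = "memory.pkl"):
        self.path = path
        try:
            with open(path, "rb") as f:
                self.turns: list[str] = pickle.load(f)
        except FileNotFoundError:
            self.turns = []

    def add(self, text: str) -> None:
        self.turns.append(text)
        with open(self.path, "wb") as f:
            pickle.dump(self.turns, f)

    def search(self, query: str, k: int = 3) -> list[str]:
        if not self.turns:
            return []
        vectorizer = TfidfVectorizer()
        doc_matrix = vectorizer.fit_transform(self.turns)   # one row per stored turn
        query_vec = vectorizer.transform([query])
        scores = cosine_similarity(query_vec, doc_matrix).ravel()
        top = scores.argsort()[::-1][:k]                    # indices of best keyword matches
        return [self.turns[i] for i in top]
```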
Agent Routing
- 5 specialized agents (Assistant, Code, Data, Creative, Research)
- Simple keyword-based routing; see the sketch after this list
- Basic usage statistics per agent
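A minimal sketch of keyword routing across the five agents; the trigger words below are illustrative, not the project's actual keyword lists.

```python
# Illustrative trigger words per agent; the Assistant agent handles everything else.
ROUTES = {
    "code": {"code", "bug", "function", "debug", "refactor"},
    "data": {"data", "csv", "analyze", "plot"},
    "creative": {"story", "poem", "creative"},
    "research": {"research", "compare", "sources"},
}

def route(query: str) -> str:
    words = set(query.lower().split())
    for agent, keywords in ROUTES.items():
        if words & keywords:        # any trigger word present
            return agent
    return "assistant"              # default agent

assert route("fix this bug in my function") == "code"
assert route("tell me a joke") == "assistant"
```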
Error Handling
- Exponential backoff; see the sketch after this list
- Provider fallbacks
- Graceful timeout handling
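A minimal sketch of exponential backoff with jitter around an async call; the retry count, delays, and exception types are illustrative assumptions.

```python
import asyncio
import random

async def with_backoff(make_call, retries: int = 3, base_delay: float = 1.0):
    """Retry an async call, doubling the wait after each failure."""
    for attempt in range(retries):
        try:
            return await make_call()
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == retries - 1:
                raise                                   # out of retries: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            await asyncio.sleep(delay)

# Usage (with the hypothetical call_provider from the fallback sketch above):
# await with_backoff(lambda: call_provider(session, "gemini", prompt))
```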
```mermaid
graph TD
A[👤 User Query] --> B[Agent Router]
B -->|keyword: code| C1[Code Agent]
B -->|keyword: data| C2[Data Agent]
B -->|keyword: creative| C3[Creative Agent]
B -->|keyword: research| C4[Research Agent]
B -->|default| C5[Assistant Agent]
C1 --> D[Memory Search TF-IDF]
C2 --> D
C3 --> D
C4 --> D
C5 --> D
D --> E[Async API Handler]
E --> F1[Google Gemini]
E --> F2[Perplexity]
E --> F3[Ollama Local]
E --> F4[Azure OpenAI]
F1 -->|success| G[Response]
F1 -->|fail| F2
F2 -->|fail| F3
F3 -->|fail| F4
F2 -->|success| G
F3 -->|success| G
F4 -->|success| G
G --> H[Memory Update]
H --> I[Return to User]
style A fill:#64FFDA,stroke:#333,stroke-width:2px,color:#000
style I fill:#64FFDA,stroke:#333,stroke-width:2px,color:#000
style B fill:#FFA726,stroke:#333,stroke-width:2px
style D fill:#42A5F5,stroke:#333,stroke-width:2px
style E fill:#AB47BC,stroke:#333,stroke-width:2px
style G fill:#66BB6A,stroke:#333,stroke-width:2px
```
Key Files
- `agent.py` — main CLI agent (≈500 lines)
- `src/entaera/` — experimental modules (not fully integrated)
- `requirements-local-models.txt` — dependencies
- No vector embeddings — memory is TF-IDF keyword matching, not semantic.
- Keyword routing only — ambiguous prompts may route to the wrong agent.
- Pickle-based storage — not a real database; single-user only.
- Minimal tests — basic checks only, no unit tests for memory or routing.
- CLI only — no FastAPI service or UI yet.
- Manual .env setup — no validation for missing or invalid keys.
- Basic fallback logic — works, but not robust for edge cases.
Current Focus: Semantic memory upgrade (the goal of the microgrant request)
Next 8-10 Weeks
- Replace TF-IDF with sentence-transformers (see the sketch after this list)
- Add FAISS vector storage
- Build comprehensive test suite
- Deploy live demo for community testing
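A minimal sketch of what the planned semantic retrieval could look like, assuming sentence-transformers and faiss-cpu; the model name and function layout are illustrative, not committed design decisions.

```python
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Illustrative model choice; any compact embedding model would work for a first pass.
model = SentenceTransformer("all-MiniLM-L6-v2")

def build_index(texts: list[str]) -> faiss.IndexFlatIP:
    """Embed stored turns and index them for cosine search (inner product on unit vectors)."""
    vectors = model.encode(texts, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])
    index.add(np.asarray(vectors, dtype="float32"))
    return index

def semantic_search(index: faiss.IndexFlatIP, texts: list[str], query: str, k: int = 3) -> list[str]:
    query_vec = model.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(query_vec, dtype="float32"), k)
    return [texts[i] for i in ids[0] if i != -1]   # -1 marks "not enough results"
```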
See ROADMAP.md for detailed timeline and deliverables.
Future Possibilities (post-funding)
- REST API endpoint
- Multi-user support
- Streaming responses
Requirements
- Python 3.11+
- Running Ollama (`ollama serve`)
- API keys for Gemini, Perplexity, or Azure
Setup
```bash
git clone https://github.com/SaurabhCodesAI/ENTAERA.git
cd ENTAERA
pip install aiohttp python-dotenv

# Add your environment variables
cp .env.example .env
```
Run
```bash
ollama serve   # separate terminal
python agent.py
```
Commands
- `/agents` — list agents
- `/memory [n]` — show history
- `/search <query>` — TF-IDF search
- `/stats` — usage stats
- `/clear` — clear memory
- `/quit` — exit
Microgrant Request: $400
Funding will upgrade ENTAERA's memory from keyword-based (TF-IDF) search to semantic embeddings, significantly improving context retrieval.
Budget:
- $120 — API testing credits (Gemini Pro + Perplexity)
- $80 — Cloud deployment for live demo (3 months)
- $200 — Development time (open source rate)
Timeline: 8-10 weeks
Deliverable: Working semantic search with live demo anyone can test
Why fund this?
- Real working code, not tutorials
- Helps developers learn multi-provider AI orchestration
- Open source (MIT), benefits entire community
- Honest about scope and limitations
This is a learning project, so small, helpful contributions are welcome.
Good First Issues
- Improve docs
- Clean up error messages
- Add small tests
- Add type hints or logging improvements
Larger Contributions
- Adding a new provider
- Implementing embeddings
- Building FastAPI service
Not Looking For
- Full rewrites
- Large architectural changes
- Enterprise auth systems
See CONTRIBUTING.md.
Language: Python 3.11+
Core: asyncio, aiohttp, python-dotenv, pickle
Providers: Ollama, Gemini, Perplexity, Azure OpenAI
Memory: TF-IDF keyword search
Architecture: Single-file CLI with experimental modules
MIT License — see LICENSE.
Built by Saurabh Pareek (@SaurabhCodesAI)
Six months of debugging, learning, and real hands-on development.
Not perfect, but honest work.
Note: AI assistants (ChatGPT, Claude) were used for learning and debugging, not for producing code I don't understand.