# AgentX

AgentX is an advanced AI-powered assistant that integrates voice processing, document retrieval, conversational memory, and generative AI capabilities. This agent is designed to handle a wide range of tasks, from answering queries to processing voice commands.
## Features

- **Voice Recognition**
  - Converts speech to text using Vosk models.
  - Processes voice commands and generates audio responses via Google Text-to-Speech (gTTS).
- **Document Retrieval**
  - Utilizes FAISS for efficient document embedding and retrieval.
  - Supports multiple document types for context-aware responses.
- **Generative AI**
  - Powered by OpenAI GPT models for conversational capabilities.
  - Generates meaningful responses based on retrieved documents and context.
- **Conversational Memory**
  - Maintains context across interactions using LangChain's `ConversationBufferMemory` (see the sketch after this list).
- **Integration Ready**
  - Modular design for integrating additional tools such as the Whisper API, advanced LangChain tools, or scheduled tasks.
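To make the Conversational Memory and Generative AI features above concrete, here is a minimal sketch of wiring LangChain's `ConversationBufferMemory` to an OpenAI chat model. It is an illustration only, not the code in `core/agent.py` or `core/memory.py`; the model name and import paths are assumptions that vary between LangChain versions.

```python
# Minimal sketch -- not the actual AgentX implementation.
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI  # older versions: from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is an assumption
memory = ConversationBufferMemory()                   # keeps the full chat history in a buffer

chain = ConversationChain(llm=llm, memory=memory)

print(chain.predict(input="My name is Sam. What can AgentX do?"))
print(chain.predict(input="What did I say my name was?"))  # answered using the buffered history
```

Because the buffer keeps every turn verbatim, long sessions grow the prompt; swapping in a summarizing memory class is a common variation.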
## Tech Stack

| Tool | Purpose |
|---|---|
| LangChain | Conversational chains and memory handling |
| OpenAI GPT | Language generation and query answering |
| Vosk | Speech-to-text offline transcription |
| gTTS | Text-to-speech conversion |
| FAISS | Document embedding and vector retrieval |
| playsound | Audio playback |
| dotenv | Environment variable management |
| pytest | Unit testing framework |
| Logging | Structured logging for debugging |
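The sketch below shows how the Vosk, gTTS, and playsound entries above typically fit together for speech input and spoken output. The helper names and file paths are illustrative assumptions, not the contents of `core/voice_processor.py`.

```python
# Illustrative voice pipeline sketch; helper names and paths are assumptions.
import json
import wave

from gtts import gTTS
from playsound import playsound
from vosk import KaldiRecognizer, Model

VOSK_MODEL = Model("model/vosk-model-small-en-us-0.15")  # path taken from the project layout below


def transcribe(wav_path: str) -> str:
    """Offline speech-to-text with Vosk (expects a mono PCM WAV file)."""
    wf = wave.open(wav_path, "rb")
    recognizer = KaldiRecognizer(VOSK_MODEL, wf.getframerate())
    while True:
        chunk = wf.readframes(4000)
        if not chunk:
            break
        recognizer.AcceptWaveform(chunk)
    return json.loads(recognizer.FinalResult()).get("text", "")


def speak(text: str, out_path: str = "data/audio/response.mp3") -> None:
    """Convert text to speech with gTTS and play it back with playsound."""
    gTTS(text=text, lang="en").save(out_path)
    playsound(out_path)
```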
## Project Structure

```
.
├── config
│   ├── constants.py              # Reusable constants
│   └── settings.py               # Application settings and environment variables
├── core
│   ├── agent.py                  # Main agent logic
│   ├── document_handler.py       # Document loading and splitting
│   ├── memory.py                 # Conversational memory handler
│   └── voice_processor.py        # Voice processing logic
├── data
│   ├── documents                 # Sample documents for testing
│   └── audio                     # Audio files (input/output)
├── model
│   └── vosk-model-small-en-us-0.15  # Speech recognition model
├── scripts
│   ├── run_agent.sh              # Script to run the agent
│   └── setup_env.sh              # Script to set up the environment
├── tests
│   ├── test_agent.py             # Tests for agent.py
│   ├── test_memory.py            # Tests for memory handler
│   └── test_voice_processor.py   # Tests for voice processing
├── Dockerfile                    # Docker setup
├── README.md                     # Project documentation
├── requirements.txt              # Python dependencies
└── main.py                       # Entry point for the application
```
## Prerequisites

- Python 3.11 or later
- **Vosk Model**: Download and place the model in the `model` directory (a quick load check is sketched below).
- **API Keys**:
  - OpenAI API Key
  - Whisper API Key (optional)
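As a quick sanity check that the Vosk model is in the right place, the snippet below simply tries to load it. This is a hypothetical helper for verification, not part of the repository.

```python
# Hypothetical check that the Vosk model directory is in place.
from vosk import Model

try:
    Model("model/vosk-model-small-en-us-0.15")
    print("Vosk model loaded successfully.")
except Exception as exc:  # Vosk raises if the model directory is missing or incomplete
    print(f"Could not load the Vosk model: {exc}")
```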
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/sattyamjjain/AgentX.git
  cd AgentX
  ```

- Set up the environment:

  ```bash
  ./scripts/setup_env.sh
  ```

- Create a `.env` file in the project root (a sketch of how these variables are read follows these steps):

  ```env
  OPENAI_API_KEY=<your_openai_api_key>
  WHISPER_API_KEY=<your_whisper_api_key>
  DEBUG=True
  ```

- Run the application:

  ```bash
  python3 main.py
  ```
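For reference, here is a minimal sketch of how the `.env` values can be loaded with `python-dotenv`; the real `config/settings.py` may do this differently.

```python
# Minimal sketch of reading the .env values; config/settings.py is assumed to do something similar.
import os

from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the project root

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
WHISPER_API_KEY = os.getenv("WHISPER_API_KEY")  # optional
DEBUG = os.getenv("DEBUG", "False").lower() == "true"

if not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file.")
```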
## Running Tests

Run the test suite using `pytest`:

```bash
pytest tests/
```
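To illustrate the kind of test pytest will collect (this example is not taken from the repository's test suite), a check of LangChain's conversation buffer might look like this:

```python
# Illustrative test only -- the real tests in tests/ may look different.
from langchain.memory import ConversationBufferMemory


def test_buffer_memory_keeps_context():
    """Earlier turns should be retrievable from the conversation buffer."""
    memory = ConversationBufferMemory()
    memory.save_context({"input": "What is AgentX?"}, {"output": "A voice-enabled assistant."})

    history = memory.load_memory_variables({})["history"]
    assert "What is AgentX?" in history
```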
## Usage

- **Running the Agent**

  ```bash
  python3 main.py
  ```

- **Interactive Voice Commands**
  - Speak into the microphone, and AgentX will respond.
- **Document Retrieval**
  - Place `.txt` files in `data/documents` to make them available for querying (see the indexing sketch after this list).
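The sketch below shows one way the `.txt` files in `data/documents` could be embedded and queried with FAISS through LangChain. It follows the common LangChain pattern rather than the project's `core/document_handler.py`; the chunking parameters are arbitrary, import paths differ between LangChain versions, and it assumes `OPENAI_API_KEY` is set because `OpenAIEmbeddings` calls the OpenAI API.

```python
# Indexing and querying sketch; not the project's document_handler.py.
from pathlib import Path

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# Load every .txt file from the documents directory.
docs = []
for path in Path("data/documents").glob("*.txt"):
    docs.extend(TextLoader(str(path)).load())

# Split into overlapping chunks and build an in-memory FAISS index.
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)
index = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Retrieve the chunks most similar to a query.
for doc in index.similarity_search("What does AgentX do?", k=2):
    print(doc.page_content[:200])
```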
## Future Enhancements

- Integration with Whisper API for high-quality transcription.
- Multi-modal capabilities for image and video processing.
- Improved conversational flows with dynamic memory management.
## Author

- Sattyam Jain

## Contributing

Open to contributions! For issues or feature requests, please open an issue on the GitHub repository.