Super-Voice-Assistant is a full-stack web application that provides a customizable voice assistant. It combines a Django backend with a React frontend, allowing users to interact with an AI-powered assistant through audio recordings. Responses are generated by a configurable AI provider such as Gemini.
Caption: Super-Voice-Assistant in action, showing the voice recording interface and AI response.
- Audio recording and playback
- Speech-to-text conversion
- AI-powered responses
- Text-to-speech output
- Real-time chat interface
Before you begin, ensure you have met the following requirements:
- Python 3.x: Download and install from python.org
- Node.js and npm: Download and install from nodejs.org
- Git: Download and install from git-scm.com
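As a quick way to confirm the prerequisites above are installed, the following minimal sketch checks whether each tool is on your PATH. It is not part of the project; the command names are assumptions (on some systems Python is installed as `python3`), and `find_missing` is a hypothetical helper.

```python
import shutil

# Assumed command names; adjust for your system (e.g. "python3").
REQUIRED_TOOLS = ["python", "node", "npm", "git"]


def find_missing(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```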
- Clone the repository:
git clone https://github.com/hounfodji/Super-Voice-Assistant.git
cd Super-Voice-Assistant
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install the required packages:
pip install -r requirements.txt
- Set up your environment variables:
Create a .env file in the root directory and add your AI API key (for example, a Gemini API key):
GEMINI_API_KEY=your_api_key_here
- Run migrations:
python manage.py migrate
- Start the Django development server:
python manage.py runserver
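To illustrate how the key from the .env step might be read on the backend, here is a minimal standard-library sketch. How the project actually loads the file is not stated here (it may use a package such as python-dotenv), so `parse_env` and `get_api_key` are hypothetical helpers, not the project's code.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_text, environ=os.environ):
    """Prefer a real environment variable, then fall back to .env contents."""
    key = environ.get("GEMINI_API_KEY") or parse_env(env_text).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key
```

Failing early with a clear error when the key is missing tends to be easier to debug than a rejected request to the AI provider later on.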
- Navigate to the frontend directory:
cd frontend
- Install the required npm packages:
npm install
- Start the React development server:
npm run dev
- Open your web browser and go to http://localhost:5173 to access the React frontend.
- Click the "Start Recording" button to begin recording your voice.
- Speak your query or command.
- Click "Stop Recording" when you're done speaking.
- The application will process your audio, convert it to text, send it to the AI for processing, and display the response.
- The AI's response will be displayed in the chat interface and spoken aloud.
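The processing flow described in the steps above (audio to text, text to AI, response displayed and spoken) can be sketched as a small pipeline. This is an illustration only, not the project's implementation: `handle_recording`, `Turn`, and the three injected callables are hypothetical names standing in for whatever speech-to-text, AI, and text-to-speech components the application uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    """One exchange shown in the chat interface."""
    user_text: str
    ai_text: str


def handle_recording(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    ask_ai: Callable[[str], str],
    speak: Callable[[str], None],
) -> Turn:
    user_text = transcribe(audio)    # speech-to-text
    ai_text = ask_ai(user_text)      # AI-generated response
    speak(ai_text)                   # text-to-speech output
    return Turn(user_text, ai_text)  # returned for display in the chat


if __name__ == "__main__":
    # Stand-in components so the sketch runs without any external service.
    turn = handle_recording(
        b"fake-audio",
        transcribe=lambda audio: "hello",
        ask_ai=lambda text: "You said: " + text,
        speak=lambda text: None,
    )
    print(turn)
```

Passing the components in as functions keeps each stage swappable, which matches the idea of a chosen AI solution rather than a fixed one.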
- POST /api/record/: Accepts audio recordings and returns the transcribed text.
- POST /api/process/: Accepts text input and returns the AI-generated response.
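As a rough sketch of calling the text endpoint from a script, the example below builds a POST request to /api/process/ with the standard library. The request and response formats are not documented here, so the JSON key `"text"`, the JSON response, and the base URL (Django's default development address) are all assumptions; check the backend views before relying on them.

```python
import json
import urllib.request

# Assumption: the Django development server runs on its default address.
BASE_URL = "http://localhost:8000"


def build_process_request(text, base_url=BASE_URL):
    """Build a POST request for /api/process/. The "text" key is hypothetical."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/process/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the reply, assuming the server returns JSON."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(build_process_request("What is the weather like?"))` would return the decoded reply. A call to /api/record/ would additionally need the audio upload format, which is not specified here.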
We are constantly working to improve Super-Voice-Assistant. Here are some features we are planning to implement:
- Store messages in a PostgreSQL database for conversation history
- Add option to stop AI from speaking mid-response
- Implement functionality to upload audio files for processing
- Redesign the interface for a more beautiful and intuitive user experience
- Multi-language support for both speech recognition and AI responses
- User authentication and personal conversation history
- Customizable AI personalities or specialized knowledge domains
- Integration with external services (e.g., weather, news, calendar)
- Voice activity detection to automatically start/stop recording
- Sentiment analysis of user inputs for more empathetic AI responses
- Exportable conversation transcripts
- Mobile app version for iOS and Android
- Offline mode with basic functionality when internet is unavailable
We welcome contributions to help implement these features! Check our Contributing section to get started.
Contributions to the Super-Voice-Assistant project are welcome. Please follow these steps:
- Fork the repository.
- Create a new branch:
git checkout -b <branch_name>
- Make your changes and commit them:
git commit -m '<commit_message>'
- Push your branch:
git push origin <branch_name>
- Create the pull request. Alternatively, see the GitHub documentation on creating a pull request.
This project uses the following license: MIT License.
If you want to contact me, you can reach me at [email protected].