Grammar Assistant is a friendly bot that helps users improve their grammar and provides feedback. It utilizes Auto Speech Recognition (ASR), Natural Language Processing (NLP), and Text to Speech (TTS) technologies through the Hugging Face Inference API.
- Transcribe speech from microphone input or uploaded audio files
- Correct grammar in the transcribed text
- Provide audio feedback with corrected grammar
- Utilizes Hugging Face Inference API, eliminating the need to download models locally
- Easy-to-use web interface powered by Gradio
- Language Learning: Ideal for non-native English speakers to practice and improve their grammar in real-time.
- Professional Communication: Helps professionals refine their spoken English for presentations, meetings, or interviews.
- Academic Writing: Assists students in improving the grammatical accuracy of their spoken ideas before writing them down.
- Accessibility: Supports individuals with hearing impairments by providing text transcriptions and corrections.
- ASR: OpenAI Whisper (tiny model) for speech recognition
- NLP: Meta Llama 3.2 1B Instruct model for grammar correction
- TTS: Facebook MMS TTS (English) for converting corrected text to speech
- Frontend: Gradio for the user interface
- API: Hugging Face Inference API for model inference
- Clone the repository:
https://github.com/Kirushikesh/Grammar-Assistant.git
- Navigate to the project directory:
cd Grammar-Assistant
- Install dependencies using Poetry:
poetry install
- Run the application:
poetry run python main.py
- Open the provided URL in your web browser.
- Enter your Hugging Face API token when prompted.
- Choose between "Transcribe Microphone" or "Transcribe Audio File" tabs.
- Record or upload your audio for grammar correction.
- View the original transcription and listen to the corrected speech.
You can modify the models and settings in the config.py
file:
ASR_MODEL
: Model for Automatic Speech RecognitionGRAMMAR_MODEL
: Model for grammar correctionTTS_MODEL
: Model for Text-to-Speech conversionGRAMMAR_CORRECTION_SYSTEM_PROMPT
: System prompt for grammar correctionMAX_NEW_TOKENS
: Maximum number of new tokens for grammar correctionTEMPERATURE
: Temperature setting for the language modelSAMPLING_RATE
: Audio sampling rate
- No need to download large model files locally
- Reduced computational requirements on the user's machine
- Easy access to state-of-the-art AI models
- Simplified deployment and maintenance
Contributions to improve the project are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.
This project is licensed under the MIT License.