A modern web application built with SvelteKit that demonstrates two approaches to speech-to-text transcription: the browser-native Speech Recognition API and OpenAI's Whisper API. Users can record, transcribe, and manage voice notes with either transcription method.
- Dual transcription methods:
  - Browser's native Speech Recognition API for real-time transcription
  - OpenAI's Whisper API for high-accuracy transcription
- Voice recording controls (start, stop, pause, resume)
- Note management system (create, save, load, delete)
- Real-time transcription display
- Persistent storage of notes using localStorage
- Speech Handlers:
  - SpeechHandler: Manages browser-native speech recognition
  - SpeechHandlerOpenAi: Handles Whisper API integration
- State:
  - VoiceNotesHandler: Manages note storage and retrieval
  - Svelte context API for state sharing
- UI Components:
  - Recorder: Controls for voice recording
  - CreateDialog: Note creation interface
  - LoadNoteDialog: Note loading interface
// Uses the Web Speech API
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
this.recognition = new SpeechRecognition();
this.recognition.continuous = true;
this.recognition.interimResults = true;
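With `continuous` and `interimResults` both enabled, the recognizer delivers a growing list of results in which some entries are final and others are still changing. The sketch below shows one way to separate the two. The `ResultLike` types and the `splitTranscript` helper are hypothetical names that mirror the shape of `SpeechRecognitionEvent.results`; they are not taken from the app's SpeechHandler.

```typescript
// Structural types mirroring SpeechRecognitionResultList (illustrative names).
interface AlternativeLike {
  transcript: string;
  confidence: number;
}

interface ResultLike extends ArrayLike<AlternativeLike> {
  isFinal: boolean;
}

// Hypothetical helper: keeps finalized text and interim text apart.
function splitTranscript(results: ArrayLike<ResultLike>): {
  finalText: string;
  interimText: string;
} {
  let finalText = '';
  let interimText = '';
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    // With the default maxAlternatives of 1, index 0 is the only alternative.
    const best = result[0];
    if (result.isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText, interimText };
}
```

In the browser, a handler assigned to `recognition.onresult` could pass `event.results` to this helper and render the interim text in a lighter style until it is finalized.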
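// Wraps the recorded audio in a file and sends it to the Whisper-backed endpoint
async transcribeAudio(audioBlob: Blob): Promise<string> {
  const file = new File([audioBlob], 'recording.webm', { type: MIME_TYPE });
  const formData = new FormData();
  formData.append('file', file);
  const response = await fetch('/api/transcribe', {
    method: 'POST',
    body: formData
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  // Assumption: the endpoint responds with JSON of the form { text: string }
  const { text } = await response.json();
  return text;
}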
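The client code above posts to `/api/transcribe`, but the server side of that route is not shown. One way the route could forward the upload to OpenAI is sketched below. `buildWhisperRequest` is a hypothetical helper, and the endpoint URL and `whisper-1` model name come from OpenAI's public API rather than from this project.

```typescript
// Hypothetical helper: builds the request that a /api/transcribe route
// could send to OpenAI. Not taken from the app's source.
const OPENAI_TRANSCRIPTION_URL = 'https://api.openai.com/v1/audio/transcriptions';

interface WhisperRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: FormData;
  };
}

function buildWhisperRequest(audio: Blob, apiKey: string): WhisperRequest {
  const formData = new FormData();
  // The third argument gives the upload a file name, which carries the format.
  formData.append('file', audio, 'recording.webm');
  formData.append('model', 'whisper-1');
  return {
    url: OPENAI_TRANSCRIPTION_URL,
    init: {
      method: 'POST',
      // Content-Type is left unset so that fetch adds the multipart boundary.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: formData
    }
  };
}
```

A route handler could then call `fetch(url, init)` with the key read from `OPENAI_API_KEY` and return the `text` field of the JSON response to the client.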
The app provides a comprehensive set of recording controls:
- Start/Stop recording
- Pause/Resume recording
- Real-time transcription display
- Error handling and user feedback
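The controls listed above can be modeled as a small state machine, which also gives a natural place to surface errors for invalid actions. This is a minimal sketch: the type names and `nextState` are illustrative, and the three states are borrowed from the browser's `MediaRecorder.state` values on the assumption that the app records through MediaRecorder.

```typescript
// Hypothetical state model for the recording controls.
type RecorderState = 'inactive' | 'recording' | 'paused';
type RecorderAction = 'start' | 'stop' | 'pause' | 'resume';

// Allowed transitions; anything not listed here is rejected.
const TRANSITIONS: Record<RecorderState, Partial<Record<RecorderAction, RecorderState>>> = {
  inactive: { start: 'recording' },
  recording: { pause: 'paused', stop: 'inactive' },
  paused: { resume: 'recording', stop: 'inactive' }
};

// Returns the state that follows `action`, or throws if the action is invalid.
function nextState(state: RecorderState, action: RecorderAction): RecorderState {
  const next = TRANSITIONS[state][action];
  if (next === undefined) {
    throw new Error(`Cannot ${action} while ${state}`);
  }
  return next;
}
```

A UI component could use the same table to disable buttons whose action is not valid in the current state, instead of waiting for an error.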
Notes are managed through the VoiceNotesHandler class:
- Create new notes with titles and transcriptions
- Update existing notes
- Delete notes
- Load and display saved notes
- Persistent storage using localStorage
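The operations above map onto a small store that serializes notes to a single storage key. The sketch below is an assumption about how such a class might look, not the actual VoiceNotesHandler: the `VoiceNote` fields, the `NotesStore` name, and the `voice-notes` key are all hypothetical. The storage is injected so that `localStorage` can be passed in the browser and a stub in tests.

```typescript
// Illustrative note shape; the real app's fields may differ.
interface VoiceNote {
  id: string;
  title: string;
  transcription: string;
}

// Subset of the Web Storage interface, so localStorage can be passed directly.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class NotesStore {
  private storage: StorageLike;
  private key: string;

  // 'voice-notes' is an assumed storage key.
  constructor(storage: StorageLike, key = 'voice-notes') {
    this.storage = storage;
    this.key = key;
  }

  // Loads all saved notes, or an empty list if nothing has been stored yet.
  list(): VoiceNote[] {
    const raw = this.storage.getItem(this.key);
    return raw ? (JSON.parse(raw) as VoiceNote[]) : [];
  }

  // Creates the note if its id is new, otherwise replaces the existing one.
  save(note: VoiceNote): void {
    const others = this.list().filter((n) => n.id !== note.id);
    this.storage.setItem(this.key, JSON.stringify([...others, note]));
  }

  // Removes the note with the given id, if present.
  delete(id: string): void {
    const remaining = this.list().filter((n) => n.id !== id);
    this.storage.setItem(this.key, JSON.stringify(remaining));
  }
}
```

In the browser this would be constructed as `new NotesStore(localStorage)`. Because every call reads from and writes back to storage, notes survive a page reload without extra bookkeeping.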
- Starting a Recording:
  - Click the "Start Recording" button
  - Grant microphone permissions when prompted
  - Speak into your microphone
- Managing Recordings:
  - Use the pause/resume button to temporarily stop recording
  - Click "Stop Recording" to finish
  - Save the transcription as a note
- Managing Notes:
  - Create new notes with the "+" button
  - Load existing notes using the load dialog
  - Edit transcriptions directly in the textarea
  - Copy transcriptions to clipboard
  - Delete unwanted notes
- Clone the repository
- Install dependencies:
      pnpm install
- Set up environment variables:
      OPENAI_API_KEY=your_api_key_here
- Start the development server:
      pnpm dev
The app includes configurable parameters:
- MIN_CHUNK_SIZE: Minimum size for audio chunks
- DEFAULT_INTERVAL: Default recording interval
- DEFAULT_CONFIDENCE: Default confidence threshold for transcription
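To illustrate where each parameter would take effect, here is a hedged sketch. The numeric values and their units are placeholders chosen for the example, since the app's real defaults are not stated here, and `resolveConfig`, `isChunkLargeEnough`, and `isConfidentEnough` are hypothetical helpers.

```typescript
// Placeholder values for illustration only; not the app's real defaults.
const MIN_CHUNK_SIZE = 1024; // assumed unit: bytes
const DEFAULT_INTERVAL = 3000; // assumed unit: milliseconds
const DEFAULT_CONFIDENCE = 0.7; // assumed range: 0 to 1

interface TranscriptionConfig {
  minChunkSize: number;
  interval: number;
  confidence: number;
}

// Hypothetical helper: merges caller overrides with the defaults.
function resolveConfig(overrides: Partial<TranscriptionConfig> = {}): TranscriptionConfig {
  return {
    minChunkSize: MIN_CHUNK_SIZE,
    interval: DEFAULT_INTERVAL,
    confidence: DEFAULT_CONFIDENCE,
    ...overrides
  };
}

// Skips audio chunks too small to be worth sending for transcription.
function isChunkLargeEnough(chunk: { size: number }, config: TranscriptionConfig): boolean {
  return chunk.size >= config.minChunkSize;
}

// Accepts a recognition result only if it meets the confidence threshold.
function isConfidentEnough(confidence: number, config: TranscriptionConfig): boolean {
  return confidence >= config.confidence;
}
```

Keeping the three values in one config object makes it straightforward to override a single parameter, for example `resolveConfig({ confidence: 0.9 })`, without touching the others.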