Svelte Voice Notes Transcription

A web application built with SvelteKit that demonstrates two approaches to speech-to-text transcription: the browser-native Speech Recognition API and OpenAI's Whisper API. Users can record, transcribe, and manage voice notes with either method.

Demo

Watch the demo video

Features

Dual transcription methods:
- Browser's native Speech Recognition API for real-time transcription
- OpenAI's Whisper API for high-accuracy transcription
Voice recording controls (start, stop, pause, resume)
Note management system (create, save, load, delete)
Real-time transcription display
Persistent storage of notes using localStorage

Architecture

Core Components

Speech Handlers:
- SpeechHandler: Manages browser-native speech recognition
- SpeechHandlerOpenAi: Handles Whisper API integration
State:
- VoiceNotesHandler: Manages note storage and retrieval
- Svelte context API for state sharing
UI Components:
- Recorder: Controls for voice recording
- CreateDialog: Note creation interface
- LoadNoteDialog: Note loading interface

Implementation Details

Browser Speech Recognition

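```
// Uses the Web Speech API; some browsers only expose the webkit-prefixed constructor
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
this.recognition = new SpeechRecognition();
// Keep returning results instead of stopping after the first one
this.recognition.continuous = true;
// Also report interim (not yet final) results while the user is speaking
this.recognition.interimResults = true;
```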
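
With `interimResults` enabled, each result is either final or still subject to revision, so a real-time display usually needs to tell the two apart. The following is a minimal sketch of that separation, not the project's actual handler: `splitTranscript`, `ResultLike`, `AlternativeLike`, and `TranscriptParts` are hypothetical names, and the local types only mirror the shape of the Web Speech API result list.

```typescript
// Minimal structural types mirroring the shape of Web Speech API results.
// Declared locally so the sketch does not depend on browser typings.
interface AlternativeLike {
    readonly transcript: string;
    readonly confidence: number;
}

interface ResultLike {
    readonly isFinal: boolean;
    readonly [index: number]: AlternativeLike;
}

interface TranscriptParts {
    finalText: string;
    interimText: string;
}

// Hypothetical helper: separates finalized text from text that the
// recognizer may still revise.
function splitTranscript(results: ArrayLike<ResultLike>): TranscriptParts {
    let finalText = '';
    let interimText = '';
    for (let i = 0; i < results.length; i++) {
        const result = results[i];
        const first = result[0]; // first alternative of this result
        if (!first) continue;
        if (result.isFinal) {
            finalText += first.transcript;
        } else {
            interimText += first.transcript;
        }
    }
    return { finalText, interimText };
}
```

In a browser, this could be called from the `onresult` handler with `event.results`, showing `finalText` as committed text and `interimText` as a live preview.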

OpenAI Whisper Integration

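```
// Wraps the recorded audio in a multipart form and sends it to the app's own endpoint
async transcribeAudio(audioBlob: Blob): Promise<string> {
    const file = new File([audioBlob], 'recording.webm', { type: MIME_TYPE });
    const formData = new FormData();
    formData.append('file', file);

    const response = await fetch('/api/transcribe', {
        method: 'POST',
        body: formData
    });
    if (!response.ok) {
        throw new Error(`Transcription failed with status ${response.status}`);
    }
    // Assumption: the endpoint responds with JSON shaped like { text: string }
    const data = await response.json();
    return data.text;
}
```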
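
The `/api/transcribe` endpoint itself is not shown here. As a rough sketch of what its server side might do, assuming it forwards the upload to OpenAI's `POST /v1/audio/transcriptions` endpoint with the `whisper-1` model: `forwardToWhisper` and `FetchLike` are hypothetical names, and the `fetch` implementation is injectable only so the function can be exercised without network access.

```typescript
// Narrow view of fetch so that a fake can be injected in tests.
type FetchLike = (
    url: string,
    init: { method: string; headers: Record<string, string>; body: FormData }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const OPENAI_TRANSCRIPTIONS_URL = 'https://api.openai.com/v1/audio/transcriptions';

// Hypothetical server-side helper: forwards recorded audio to OpenAI and
// returns the transcribed text.
async function forwardToWhisper(
    audio: Blob,
    apiKey: string,
    fetchImpl: FetchLike = fetch
): Promise<string> {
    const formData = new FormData();
    formData.append('file', audio, 'recording.webm');
    formData.append('model', 'whisper-1');

    const response = await fetchImpl(OPENAI_TRANSCRIPTIONS_URL, {
        method: 'POST',
        // Content-Type is left unset so the multipart boundary is generated automatically.
        headers: { Authorization: `Bearer ${apiKey}` },
        body: formData
    });
    if (!response.ok) {
        throw new Error(`Whisper request failed with status ${response.status}`);
    }

    // The default JSON response format carries the transcript in a "text" field.
    const data = (await response.json()) as { text?: unknown };
    if (typeof data.text !== 'string') {
        throw new Error('Unexpected response shape: missing "text"');
    }
    return data.text;
}
```

Keeping this call on the server means the API key never reaches the browser; the client only talks to the app's own endpoint.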

Key Features Implementation

Recording Controls

The recorder provides the following:

Start/Stop recording
Pause/Resume recording
Real-time transcription display
Error handling and user feedback
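
One way to keep the start/stop and pause/resume buttons consistent is a small state machine. The sketch below is an illustration rather than the project's actual Recorder logic: the state names follow the standard `MediaRecorder.state` values (`inactive`, `recording`, `paused`), while `nextRecorderState`, `availableActions`, and `TRANSITIONS` are hypothetical names.

```typescript
type RecorderState = 'inactive' | 'recording' | 'paused';
type RecorderAction = 'start' | 'pause' | 'resume' | 'stop';

// Allowed transitions per state; an action missing from a state's entry is ignored.
const TRANSITIONS: Record<RecorderState, Partial<Record<RecorderAction, RecorderState>>> = {
    inactive: { start: 'recording' },
    recording: { pause: 'paused', stop: 'inactive' },
    paused: { resume: 'recording', stop: 'inactive' }
};

// Returns the state after applying an action, or the same state if the
// action is not valid in the current state.
function nextRecorderState(state: RecorderState, action: RecorderAction): RecorderState {
    return TRANSITIONS[state][action] ?? state;
}

// Lists the actions a UI could enable for the given state.
function availableActions(state: RecorderState): RecorderAction[] {
    return Object.keys(TRANSITIONS[state]) as RecorderAction[];
}
```

A component could use `availableActions` to decide which buttons to enable, so that, for example, "pause" is never offered while nothing is being recorded.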

Note Management

Notes are managed through the VoiceNotesHandler class:

Create new notes with titles and transcriptions
Update existing notes
Delete notes
Load and display saved notes
Persistent storage using localStorage
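
The operations above can be sketched as a small store backed by a `localStorage`-like object. This is a minimal illustration, not the actual `VoiceNotesHandler`: the `NotesStore` class, the `VoiceNote` fields, the `StorageLike` interface, and the `voice-notes` storage key are all assumptions.

```typescript
interface VoiceNote {
    id: string;
    title: string;
    transcription: string;
    updatedAt: number;
}

// The subset of the Web Storage API this sketch relies on.
interface StorageLike {
    getItem(key: string): string | null;
    setItem(key: string, value: string): void;
}

const STORAGE_KEY = 'voice-notes'; // hypothetical key

class NotesStore {
    private storage: StorageLike;
    private nextId = 0;

    constructor(storage: StorageLike) {
        this.storage = storage;
    }

    // Reads all notes; missing or corrupted data is treated as an empty list.
    list(): VoiceNote[] {
        const raw = this.storage.getItem(STORAGE_KEY);
        if (!raw) return [];
        try {
            const parsed: unknown = JSON.parse(raw);
            return Array.isArray(parsed) ? (parsed as VoiceNote[]) : [];
        } catch {
            return [];
        }
    }

    create(title: string, transcription: string): VoiceNote {
        const now = Date.now();
        // Simple id for illustration; a UUID would be more robust.
        const note: VoiceNote = { id: `${now}-${this.nextId++}`, title, transcription, updatedAt: now };
        this.save([...this.list(), note]);
        return note;
    }

    update(id: string, changes: Partial<Pick<VoiceNote, 'title' | 'transcription'>>): void {
        this.save(
            this.list().map((note) =>
                note.id === id ? { ...note, ...changes, updatedAt: Date.now() } : note
            )
        );
    }

    remove(id: string): void {
        this.save(this.list().filter((note) => note.id !== id));
    }

    private save(notes: VoiceNote[]): void {
        this.storage.setItem(STORAGE_KEY, JSON.stringify(notes));
    }
}
```

In the browser, `new NotesStore(window.localStorage)` would persist notes across page reloads; passing an in-memory object instead makes the store easy to test.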

Usage

Starting a Recording:
- Click the "Start Recording" button
- Grant microphone permissions when prompted
- Speak into your microphone
Managing Recordings:
- Use the pause/resume button to temporarily stop recording
- Click "Stop Recording" to finish
- Save the transcription as a note
Managing Notes:
- Create new notes with the "+" button
- Load existing notes using the load dialog
- Edit transcriptions directly in the textarea
- Copy transcriptions to clipboard
- Delete unwanted notes

Setup

Clone the repository
Install dependencies:
```
pnpm install
```
Set up environment variables (for example, in a `.env` file in the project root):
```
OPENAI_API_KEY=your_api_key_here
```
Start the development server:
```
pnpm dev
```

Configuration

The app includes configurable parameters:

MIN_CHUNK_SIZE: Minimum size for audio chunks
DEFAULT_INTERVAL: Default recording interval
DEFAULT_CONFIDENCE: Default confidence threshold for transcription
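
As a sketch of how parameters like these might be applied: the values, units, and helper names below (`resolveConfig`, `isChunkLargeEnough`, `meetsConfidence`, `RecorderConfig`) are placeholders chosen for illustration, not the project's actual settings.

```typescript
// Placeholder values for illustration only; the project's real values may differ.
const MIN_CHUNK_SIZE = 1024; // bytes (assumed unit)
const DEFAULT_INTERVAL = 3000; // milliseconds (assumed unit)
const DEFAULT_CONFIDENCE = 0.7; // 0 to 1, the scale used by Web Speech API confidence scores

interface RecorderConfig {
    minChunkSize: number;
    interval: number;
    confidence: number;
}

// Merges caller overrides on top of the defaults.
function resolveConfig(overrides: Partial<RecorderConfig> = {}): RecorderConfig {
    return {
        minChunkSize: MIN_CHUNK_SIZE,
        interval: DEFAULT_INTERVAL,
        confidence: DEFAULT_CONFIDENCE,
        ...overrides
    };
}

// Skips audio chunks that are too small to be worth a transcription request.
function isChunkLargeEnough(chunkSizeBytes: number, config: RecorderConfig): boolean {
    return chunkSizeBytes >= config.minChunkSize;
}

// Keeps only recognition results at or above the confidence threshold.
function meetsConfidence(confidence: number, config: RecorderConfig): boolean {
    return confidence >= config.confidence;
}
```

Centralizing the defaults in one function lets individual recordings override a single parameter, such as a longer interval, without repeating the others.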
