Quick Whisper Typer

Super simple python script to start recording sound, send it to whisper then have it type for you anywhere.

Can also modify text according to voice commands.
Latency is as low as I could (instant if deepgram is used, <1s for openai's whisper).
It can be seen as a minimalist alternative to AquaVoice and can be extended easily to replace Deepgram's Shortcut feature.t

The way each task works

write

starts recording
when you're done press shift (escape or spacebar to cancel)
whisper will transcribe your speech 4.a if --auto_paste is True: your current clipboard will be saved, replaced by the transcription, "ctrl+v" will automatically be pressed, then your old clipboard will replace again like nothing happened. 4.b if --auto_paste is False: your clipboard will be replaced by the transcription

transform_clipboard

starts recording
when you're done press shift (escape or spacebar to cancel)
whisper will transcribe your speech
the transcription will be interpreted as an instruction for --llm_model on how to transform the text found in your clipboard
the result will either be pasted or stored in the clipboard like for --task=write

new_voice_chat

starts recording
when you're done press shift (escape or spacebar to cancel)
whisper will transcribe your speech
the transcription will be interpreted as the first user message in a conversation with --llm_model
the result will either be pasted or stored in the clipboard like for --task=write, and optionaly read aloud if --voice_engine is set
To continue the conversation, use the task --task=continue_voice_chat

Examples

I want to write text: python quick_whisper_typer.py --task=write --auto_paste
I want to translate text: copy the text in to the clipboard then python quick_whisper_typer.py --task=transform_clipboard --auto_paste
I want to start a vocal conversation: python quick_whisper_typer.py --task="new_voice_chat" --voice_engine='openai'
I want to continue the conversation: python quick_whisper_typer.py --task="continue_voice_chat" --voice_engine='openai'
I want to call it from anywhere without setting up keybindings, use --loop then press shift key several times from anywhere and you'll see a notification appear to trigger the tasks.

Features

Supports any spoken languages supported by whisper
Supports both openai's whisper and deepgram's whisper
Supports for local transcription by supplying a custom URL.
- For example start whispercpp with ./server -m models/small_acft_q8_0.bin --threads 8 --audio-ctx 1500 -l fr --no-gpu --debug-mode --convert -p 1 (models from FUTO) and use --custom_transcription_url="http://127.0.0.1:8080/inference"
- You can set these environment variables for custom transcription:
  - CUSTOM_WHISPER_API_KEY: API key for the custom transcription server
  - CUSTOM_WHISPER_MODEL: Model name to use with the custom transcription server
Minimalist code
Low latency: it starts as fast as possible to be ready to listen to you
Four supported voice_engine: openai, piper, deepgram, espeak (fallback if any of the other fails)
Optional audio cleanup and long silence removal via sox
--loop to trigger the script from anywhere just by pressing shift multiple times. You can define any king of argument to customize your loop shortcuts by passing a dict to --loop_tasks
Support virtually any type of LLM (ChatGPT, Claude, Huggingface, Llama, etc) thanks to litellm.
Supposedly multiplatform, but I can't test it on anything else than Linux so please open an issue to tell me how it went!

How to

Make sure your environment contains the appropriate api keys (eg as OPENAI_API_KEY, MISTRAL_API_KEY, DEEPGRAM_API_KEY etc)
optional: add a keyboard shortcut to call this script. See my i3 bindings below.
If using deepgram: make sure you are on python 3.10+
create a venv: uv venv --python 3.10 and activate it source .venv/bin/activate
pip install -r requirements.txt
- if you have issues installing the python package playsound, try installing playsound3 instead.

Run in the background using systemd units

To always have quick_whisper_typer running in the background, you can do this after modifying the quick_whisper_typer_launcher.sh file and chmod +x it:

mkdir -p ~/.config/systemd/user/
echo "[Unit]
Description=quick_whisper_typer
After=graphical-session.target

[Service]
Type=simple
ExecStart=[YOUR_APPROPRIATE_PATH]/Quick_Whisper_Typer/quick_whisper_typer_launcher.sh
Restart=on-failure
RestartSec=10

[Install]
; WantedBy=graphical-session.target
WantedBy=default.target
" > ~/.config/systemd/user/quick_whisper_typer.service

systemctl --user enable quick_whisper_typer.service
systemctl --user start quick_whisper_typer.service

i3 bindings

mode "$mode_launch_microphone" {
    # enter text
    bindsym f exec /PATH/TO/quick_whisper_typer.py --task write, mode "default
    # edit clipboard
    bindsym e exec /PATH/TO/quick_whisper_typer.py --task=transform_clipboard, mode "default"
    bindsym v exec /PATH/TO/quick_whisper_typer.py --task=continue_voice_chat, mode "default"
    bindsym shift+V exec /PATH/TO/quick_whisper_typer.py --task=new_voice_chat, mode "default"

    bindsym Return mode "default"
    bindsym Escape mode "default"
    }

Credits

.ogg files were in my /usr/share/sounds/ubuntu/notifications folder.

Name		Name	Last commit message	Last commit date
Latest commit History 257 Commits
instructions		instructions
obsolete		obsolete
sounds		sounds
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
quick_whisper_typer.py		quick_whisper_typer.py
quick_whisper_typer_launcher.sh		quick_whisper_typer_launcher.sh
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Quick Whisper Typer

The way each task works

write

transform_clipboard

new_voice_chat

Examples

Features

How to

Run in the background using systemd units

i3 bindings

Credits

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

thiswillbeyourgithub/Quick-Whisper-Typer

Folders and files

Latest commit

History

Repository files navigation

Quick Whisper Typer

The way each task works

write

transform_clipboard

new_voice_chat

Examples

Features

How to

Run in the background using systemd units

i3 bindings

Credits

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages