The audio.whisper R package lets users run OpenAI's Whisper model (e.g., for automated transcription of audio files) from R. Significant speedups can be achieved on machines with CUDA-enabled graphics cards, but setting this up can be complicated. This Docker image lets a Windows user install all the dependencies needed to run audio.whisper with CUDA support via Windows Subsystem for Linux (WSL2). It is built on top of the rocker/tidyverse image, so it comes with RStudio Server installed.
Tag | Base Image | Operating System | R Version | CUDA Version |
---|---|---|---|---|
latest | rocker/tidyverse | Ubuntu 24.04 LTS | 4.4.2 | 12.6 |
novad | rocker/tidyverse | Ubuntu 24.04 LTS | 4.4.2 | 12.6 |
vad | rocker/tidyverse | Ubuntu 22.04 LTS | 4.4.1 | 11.8 |
Usage:
- Verify that your machine's graphics card supports CUDA: https://developer.nvidia.com/cuda-gpus
- On Windows, install the latest game-ready driver from NVIDIA: https://www.nvidia.com/Download/index.aspx#
- On Windows, install the latest version of Docker Desktop: https://www.docker.com/products/docker-desktop/
- Open Docker Desktop and click the Terminal button at the bottom of the window
- In the Terminal, type `docker pull jmgirard/wsl-cuda-whisper:vad` or `docker pull jmgirard/wsl-cuda-whisper:novad`
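- In the Terminal, type `docker run --gpus all -it -e PASSWORD=pass -p 8787:8787 jmgirard/wsl-cuda-whisper` (with no tag, Docker uses `latest`; append `:vad` or `:novad` to the image name to run that tag instead)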
- Once the Terminal has a line beginning with "TTY detected.", the container is ready
- In Docker Desktop, click the Containers tab on the left and click the "8787:8787" link
- Your browser should show a login page; enter "rstudio" as the username and "pass" as the password
- You should now see the RStudio interface; in the R console, enter `library(audio.whisper)`
- Now you can download and load Whisper models via, e.g., `model <- whisper("tiny", use_gpu = TRUE)`
- You can now pass the `model` object to the `predict()` function to transcribe audio with GPU acceleration
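
The pull step above names the `vad` or `novad` tag, while the run step names no tag, which Docker resolves to `latest`. As a minimal sketch, assuming a POSIX-style shell (for example, a WSL2 distribution terminal rather than PowerShell), the script below prints matching pull and run commands for one chosen tag. The image name and flags come from the steps above; the variables `TAG`, `RSTUDIO_PASSWORD`, and `HOST_PORT` and the function names `pull_cmd` and `run_cmd` are hypothetical conveniences, not part of the image.

```shell
#!/bin/sh
# Sketch: print the pull/run commands from the Usage steps for a single tag.
# IMAGE is taken from the README; every other name here is hypothetical.

IMAGE="jmgirard/wsl-cuda-whisper"
TAG="${TAG:-latest}"                          # one of: latest, novad, vad
RSTUDIO_PASSWORD="${RSTUDIO_PASSWORD:-pass}"  # password for the "rstudio" login
HOST_PORT="${HOST_PORT:-8787}"                # host port mapped to RStudio Server

# Print the pull command for the selected tag.
pull_cmd() {
  printf 'docker pull %s:%s\n' "$IMAGE" "$TAG"
}

# Print the run command: GPU access, interactive TTY, password, port mapping.
# Note: the password is printed unquoted, so keep it free of shell metacharacters.
run_cmd() {
  printf 'docker run --gpus all -it -e PASSWORD=%s -p %s:8787 %s:%s\n' \
    "$RSTUDIO_PASSWORD" "$HOST_PORT" "$IMAGE" "$TAG"
}

# Show both commands; paste them into the Docker Desktop Terminal to execute.
pull_cmd
run_cmd
```

Saving this under a name of your choosing and running it with `TAG=vad` set in the environment prints both commands with the `vad` tag, so the container that starts is the image that was pulled. If `HOST_PORT` is changed, the RStudio login page is served on that host port instead of 8787.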