| 1 | -# wsl-cuda-whisper
| 2 | -The audio.whisper R package allows users to easily use OpenAI's Whisper model (e.g., for automated transcription of audio files) from R. Significant speedups can be achieved on machines with CUDA-enabled graphics cards, but setting this up can be complicated. This docker image allows a user on Windows to easily install all the dependencies needed to run audio.whisper with CUDA support via Windows Subsystem for Linux (WSL2). It is built on top of the rocker/tidyverse image, which means it comes with RStudio Server installed.
| 3 | -
| 4 | -Versions:
| 5 | -- `jmgirard/wsl-cuda-whisper:vad` is a larger image that contains voice activity detection (VAD) via {audio.vadwebrtc} and {audio.vadsilero}. It also uses CUDA 11.8 as required by these packages.
| 6 | -- `jmgirard/wsl-cuda-whisper:novad` is a more streamlined image that does not contain VAD and uses the newest CUDA 12.6 version.
| 7 | -
| 8 | -Usage:
| 9 | -1. Verify that your machine's graphics card supports CUDA: https://developer.nvidia.com/cuda-gpus
| 10 | -2. On Windows, install the latest game-ready driver from NVIDIA: https://www.nvidia.com/Download/index.aspx#
| 11 | -3. On Windows, install the latest version of Docker Desktop: https://www.docker.com/products/docker-desktop/
| 12 | -4. Open Docker Desktop and click the Terminal button on the bottom of the screen
| 13 | -5. In the Terminal, type `docker pull jmgirard/wsl-cuda-whisper` (hit Enter and wait, it may take a while)
| 14 | -6. In the Terminal, type `docker run --gpus all --rm -it -e PASSWORD=pass -p 8787:8787 jmgirard/wsl-cuda-whisper`
| 15 | -7. If you want access to the Windows filesystem, you can add `-v "C:\Users\jmgirard:/data"` and then access `/data` in R
| 16 | -8. Once the Terminal has a line beginning with "TTY detected.", the container is ready
| 17 | -9. In Docker Desktop, click the Containers tab on the left and click the "8787:8787" link
| 18 | -10. Your browser should show a login page, enter "rstudio" as the username and "pass" for the password
| 19 | -11. You should now be shown the RStudio page, so enter `library(audio.whisper)`
| 20 | -12. Now you can download and load whisper models via, e.g., `model <- whisper("tiny", use_gpu = TRUE)`
| 21 | -13. You can now use the `model` object and the `predict()` function with great speed
| 1 | +# wsl-cuda-whisper |
| 2 | +The audio.whisper R package allows users to easily use OpenAI's Whisper model (e.g., for automated transcription of audio files) from R. Significant speedups can be achieved on machines with CUDA-enabled graphics cards, but setting this up can be complicated. This Docker image allows a user on Windows to easily install all the dependencies needed to run audio.whisper with CUDA support via Windows Subsystem for Linux (WSL2). It is built on top of the rocker/tidyverse image, which means it comes with RStudio Server installed.
| 3 | + |
| 4 | +Versions: |
| 5 | +- `jmgirard/wsl-cuda-whisper:vad` is a larger image that contains voice activity detection (VAD) via {audio.vadwebrtc} and {audio.vadsilero}. It also uses CUDA 11.8 as required by these packages. |
| 6 | +- `jmgirard/wsl-cuda-whisper:novad` is a more streamlined image that does not contain VAD and uses the newer CUDA 12.6.
| 7 | + |
| 8 | +Usage: |
| 9 | +1. Verify that your machine's graphics card supports CUDA: https://developer.nvidia.com/cuda-gpus |
| 10 | +2. On Windows, install the latest game-ready driver from NVIDIA: https://www.nvidia.com/Download/index.aspx# |
| 11 | +3. On Windows, install the latest version of Docker Desktop: https://www.docker.com/products/docker-desktop/ |
| 12 | +4. Open Docker Desktop and click the Terminal button on the bottom of the screen |
| 13 | +5. In the Terminal, type `docker pull jmgirard/wsl-cuda-whisper` (press Enter and wait; the download may take a while)
| 14 | +6. In the Terminal, type `docker run --gpus all --rm -it -e PASSWORD=pass -p 8787:8787 jmgirard/wsl-cuda-whisper` |
| 15 | +7. To access the Windows filesystem from the container, add `-v "C:\Users\jmgirard:/data"` to the `docker run` command before the image name (substituting your own folder), then use `/data` in R
| 16 | +8. Once the Terminal prints a line beginning with "TTY detected.", the container is ready
| 17 | +9. In Docker Desktop, click the Containers tab on the left and click the "8787:8787" link |
| 18 | +10. Your browser should show a login page; enter "rstudio" as the username and "pass" as the password
| 19 | +11. RStudio should now load; in the R console, run `library(audio.whisper)`
| 20 | +12. Now you can download and load whisper models via, e.g., `model <- whisper("tiny", use_gpu = TRUE)` |
| 21 | +13. You can now pass the `model` object to `predict()` to transcribe audio with GPU acceleration
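The `docker run` command in steps 6 and 7 can be assembled in one place. The following is a minimal sketch, assuming a POSIX-style shell such as bash inside WSL2 or Git Bash (the Docker Desktop Terminal may default to PowerShell, where this script does not apply). The function name `build_run_cmd` and its three arguments are hypothetical helpers introduced for illustration and are not part of the image. The script only prints the command for review and does not start a container itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the `docker run` command from steps 6 and 7.
# The function name and its arguments are illustrative, not part of the image.

build_run_cmd() {
  local tag="${1:-novad}"      # image tag: "vad" or "novad", per the Versions list
  local host_dir="${2:-}"      # optional Windows folder to mount at /data
  local password="${3:-pass}"  # RStudio Server password (the username is "rstudio")

  # Base command, copied from step 6.
  local cmd="docker run --gpus all --rm -it -e PASSWORD=${password} -p 8787:8787"

  # Optional bind mount, copied from step 7; it must come before the image name.
  if [ -n "${host_dir}" ]; then
    cmd="${cmd} -v \"${host_dir}:/data\""
  fi

  # printf avoids any backslash interpretation in Windows-style paths.
  printf '%s\n' "${cmd} jmgirard/wsl-cuda-whisper:${tag}"
}

# Print the command for review; paste it into the Terminal to start the container.
build_run_cmd novad 'C:\Users\jmgirard'
```

Calling `build_run_cmd vad` with no further arguments prints the step 6 command for the `vad` tag without a volume mount. Pinning an explicit tag is a design choice made here so the image variant is unambiguous; the original steps use the untagged image name.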