Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions .github/workflows/backend.yml
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,19 @@ jobs:
dockerfile: "./backend/Dockerfile.python"
context: "./"
ubuntu-version: '2404'
- build-type: ''
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'auto'
tag-suffix: '-cpu-moonshine'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:24.04"
skip-drivers: 'true'
backend: "moonshine"
dockerfile: "./backend/Dockerfile.python"
context: "./"
ubuntu-version: '2404'
# CUDA 12 builds
- build-type: 'cublas'
cuda-major-version: "12"
Expand Down Expand Up @@ -222,6 +235,19 @@ jobs:
dockerfile: "./backend/Dockerfile.python"
context: "./"
ubuntu-version: '2404'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "9"
platforms: 'linux/amd64'
tag-latest: 'auto'
tag-suffix: '-gpu-nvidia-cuda-12-moonshine'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:24.04"
skip-drivers: 'false'
backend: "moonshine"
dockerfile: "./backend/Dockerfile.python"
context: "./"
ubuntu-version: '2404'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "9"
Expand Down Expand Up @@ -444,6 +470,19 @@ jobs:
dockerfile: "./backend/Dockerfile.python"
context: "./"
ubuntu-version: '2404'
- build-type: 'cublas'
cuda-major-version: "13"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'auto'
tag-suffix: '-gpu-nvidia-cuda-13-moonshine'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:24.04"
skip-drivers: 'false'
backend: "moonshine"
dockerfile: "./backend/Dockerfile.python"
context: "./"
ubuntu-version: '2404'
- build-type: 'cublas'
cuda-major-version: "13"
cuda-minor-version: "0"
Expand Down
19 changes: 19 additions & 0 deletions .github/workflows/test-extra.yml
Original file line number Diff line number Diff line change
Expand Up @@ -247,3 +247,22 @@ jobs:
run: |
make --jobs=5 --output-sync=target -C backend/python/coqui
make --jobs=5 --output-sync=target -C backend/python/coqui test
tests-moonshine:
runs-on: ubuntu-latest
steps:
- name: Clone
uses: actions/checkout@v6
with:
submodules: true
- name: Dependencies
run: |
sudo apt-get update
sudo apt-get install build-essential ffmpeg
sudo apt-get install -y ca-certificates cmake curl patch python3-pip
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
pip install --user --no-cache-dir grpcio-tools==1.64.1
- name: Test moonshine
run: |
make --jobs=5 --output-sync=target -C backend/python/moonshine
make --jobs=5 --output-sync=target -C backend/python/moonshine test
144 changes: 144 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,150 @@ Let's say the user wants to build a particular backend for a given platform. For
- The user may say they want to build AMD or ROCM instead of hipblas, or Intel instead of SYCL or NVIDIA insted of l4t or cublas. Ask for confirmation if there is ambiguity.
- Sometimes the user may need extra parameters to be added to `docker build` (e.g. `--platform` for cross-platform builds or `--progress` to view the full logs), in which case you can generate the `docker build` command directly.

## Adding a New Backend

When adding a new backend to LocalAI, you need to update several files to ensure the backend is properly built, tested, and registered. Here's a step-by-step guide based on the pattern used for adding backends like `moonshine`:

### 1. Create Backend Directory Structure

Create the backend directory under the appropriate location:
- **Python backends**: `backend/python/<backend-name>/`
- **Go backends**: `backend/go/<backend-name>/`
- **C++ backends**: `backend/cpp/<backend-name>/`

For Python backends, you'll typically need:
- `backend.py` - Main gRPC server implementation
- `Makefile` - Build configuration
- `install.sh` - Installation script for dependencies
- `protogen.sh` - Protocol buffer generation script
- `requirements.txt` - Python dependencies
- `run.sh` - Runtime script
- `test.py` / `test.sh` - Test files

### 2. Add Build Configurations to `.github/workflows/backend.yml`

Add build matrix entries for each platform/GPU type you want to support. Look at similar backends (e.g., `chatterbox`, `faster-whisper`) for reference.

**Placement in file:**
- CPU builds: Add after other CPU builds (e.g., after `cpu-chatterbox`)
- CUDA 12 builds: Add after other CUDA 12 builds (e.g., after `gpu-nvidia-cuda-12-chatterbox`)
- CUDA 13 builds: Add after other CUDA 13 builds (e.g., after `gpu-nvidia-cuda-13-chatterbox`)

**Additional build types you may need:**
- ROCm/HIP: Use `build-type: 'hipblas'` with `base-image: "rocm/dev-ubuntu-24.04:6.4.4"`
- Intel/SYCL: Use `build-type: 'intel'` or `build-type: 'sycl_f16'`/`sycl_f32` with `base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"`
- L4T (ARM): Use `build-type: 'l4t'` with `platforms: 'linux/arm64'` and `runs-on: 'ubuntu-24.04-arm'`

### 3. Add Backend Metadata to `backend/index.yaml`

**Step 3a: Add Meta Definition**

Add a YAML anchor definition in the `## metas` section (around line 2-300). Look for similar backends to use as a template such as `diffusers` or `chatterbox`

**Step 3b: Add Image Entries**

Add image entries at the end of the file, following the pattern of similar backends such as `diffusers` or `chatterbox`. Include both `latest` (production) and `master` (development) tags.

### 4. Update the Makefile

The Makefile needs to be updated in several places to support building and testing the new backend:

**Step 4a: Add to `.NOTPARALLEL`**

Add `backends/<backend-name>` to the `.NOTPARALLEL` line (around line 2) to prevent parallel execution conflicts:

```makefile
.NOTPARALLEL: ... backends/<backend-name>
```

**Step 4b: Add to `prepare-test-extra`**

Add the backend to the `prepare-test-extra` target (around line 312) to prepare it for testing:

```makefile
prepare-test-extra: protogen-python
...
$(MAKE) -C backend/python/<backend-name>
```

**Step 4c: Add to `test-extra`**

Add the backend to the `test-extra` target (around line 319) to run its tests:

```makefile
test-extra: prepare-test-extra
...
$(MAKE) -C backend/python/<backend-name> test
```

**Step 4d: Add Backend Definition**

Add a backend definition variable in the backend definitions section (around line 428-457). The format depends on the backend type:

**For Python backends with root context** (like `faster-whisper`, `bark`):
```makefile
BACKEND_<BACKEND_NAME> = <backend-name>|python|.|false|true
```

**For Python backends with `./backend` context** (like `chatterbox`, `moonshine`):
```makefile
BACKEND_<BACKEND_NAME> = <backend-name>|python|./backend|false|true
```

**For Go backends**:
```makefile
BACKEND_<BACKEND_NAME> = <backend-name>|golang|.|false|true
```

**Step 4e: Generate Docker Build Target**

Add an eval call to generate the docker-build target (around line 480-501):

```makefile
$(eval $(call generate-docker-build-target,$(BACKEND_<BACKEND_NAME>)))
```

**Step 4f: Add to `docker-build-backends`**

Add `docker-build-<backend-name>` to the `docker-build-backends` target (around line 507):

```makefile
docker-build-backends: ... docker-build-<backend-name>
```

**Determining the Context:**

- If the backend is in `backend/python/<backend-name>/` and uses `./backend` as context in the workflow file, use `./backend` context
- If the backend is in `backend/python/<backend-name>/` but uses `.` as context in the workflow file, use `.` context
- Check similar backends to determine the correct context

### 5. Verification Checklist

After adding a new backend, verify:

- [ ] Backend directory structure is complete with all necessary files
- [ ] Build configurations added to `.github/workflows/backend.yml` for all desired platforms
- [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
- [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
- [ ] Tag suffixes match between workflow file and index.yaml
- [ ] Makefile updated with all 6 required changes (`.NOTPARALLEL`, `prepare-test-extra`, `test-extra`, backend definition, docker-build target eval, `docker-build-backends`)
- [ ] No YAML syntax errors (check with linter)
- [ ] No Makefile syntax errors (check with linter)
- [ ] Follows the same pattern as similar backends (e.g., if it's a transcription backend, follow `faster-whisper` pattern)

### 6. Example: Adding a Python Backend

For reference, when `moonshine` was added:
- **Files created**: `backend/python/moonshine/{backend.py, Makefile, install.sh, protogen.sh, requirements.txt, run.sh, test.py, test.sh}`
- **Workflow entries**: 3 build configurations (CPU, CUDA 12, CUDA 13)
- **Index entries**: 1 meta definition + 6 image entries (cpu, cuda12, cuda13 × latest/development)
- **Makefile updates**:
- Added to `.NOTPARALLEL` line
- Added to `prepare-test-extra` and `test-extra` targets
- Added `BACKEND_MOONSHINE = moonshine|python|./backend|false|true`
- Added eval for docker-build target generation
- Added `docker-build-moonshine` to `docker-build-backends`

# Coding style

- The project has the following .editorconfig
Expand Down
8 changes: 6 additions & 2 deletions Makefile
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# Disable parallel execution for backend builds
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/piper backends/stablediffusion-ggml backends/whisper backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/stablediffusion-ggml-darwin backends/vllm
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/piper backends/stablediffusion-ggml backends/whisper backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/stablediffusion-ggml-darwin backends/vllm backends/moonshine

GOCMD=go
GOTEST=$(GOCMD) test
Expand Down Expand Up @@ -315,13 +315,15 @@ prepare-test-extra: protogen-python
$(MAKE) -C backend/python/chatterbox
$(MAKE) -C backend/python/vllm
$(MAKE) -C backend/python/vibevoice
$(MAKE) -C backend/python/moonshine

test-extra: prepare-test-extra
$(MAKE) -C backend/python/transformers test
$(MAKE) -C backend/python/diffusers test
$(MAKE) -C backend/python/chatterbox test
$(MAKE) -C backend/python/vllm test
$(MAKE) -C backend/python/vibevoice test
$(MAKE) -C backend/python/moonshine test

DOCKER_IMAGE?=local-ai
DOCKER_AIO_IMAGE?=local-ai-aio
Expand Down Expand Up @@ -455,6 +457,7 @@ BACKEND_VLLM = vllm|python|./backend|false|true
BACKEND_DIFFUSERS = diffusers|python|./backend|--progress=plain|true
BACKEND_CHATTERBOX = chatterbox|python|./backend|false|true
BACKEND_VIBEVOICE = vibevoice|python|./backend|--progress=plain|true
BACKEND_MOONSHINE = moonshine|python|./backend|false|true

# Helper function to build docker image for a backend
# Usage: $(call docker-build-backend,BACKEND_NAME,DOCKERFILE_TYPE,BUILD_CONTEXT,PROGRESS_FLAG,NEEDS_BACKEND_ARG)
Expand Down Expand Up @@ -499,12 +502,13 @@ $(eval $(call generate-docker-build-target,$(BACKEND_VLLM)))
$(eval $(call generate-docker-build-target,$(BACKEND_DIFFUSERS)))
$(eval $(call generate-docker-build-target,$(BACKEND_CHATTERBOX)))
$(eval $(call generate-docker-build-target,$(BACKEND_VIBEVOICE)))
$(eval $(call generate-docker-build-target,$(BACKEND_MOONSHINE)))

# Pattern rule for docker-save targets
docker-save-%: backend-images
docker save local-ai-backend:$* -o backend-images/$*.tar

docker-build-backends: docker-build-llama-cpp docker-build-rerankers docker-build-vllm docker-build-transformers docker-build-diffusers docker-build-kokoro docker-build-faster-whisper docker-build-coqui docker-build-bark docker-build-chatterbox docker-build-vibevoice docker-build-exllama2
docker-build-backends: docker-build-llama-cpp docker-build-rerankers docker-build-vllm docker-build-transformers docker-build-diffusers docker-build-kokoro docker-build-faster-whisper docker-build-coqui docker-build-bark docker-build-chatterbox docker-build-vibevoice docker-build-exllama2 docker-build-moonshine

########################################################
### END Backends
Expand Down
56 changes: 56 additions & 0 deletions backend/index.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@
capabilities:
nvidia: "cuda12-rfdetr"
intel: "intel-rfdetr"
#amd: "rocm-rfdetr"

Check warning on line 106 in backend/index.yaml

View workflow job for this annotation

GitHub Actions / Yamllint

106:6 [comments] missing starting space in comment
nvidia-l4t: "nvidia-l4t-arm64-rfdetr"
default: "cpu-rfdetr"
nvidia-cuda-13: "cuda13-rfdetr"
Expand Down Expand Up @@ -275,6 +275,24 @@
amd: "rocm-faster-whisper"
nvidia-cuda-13: "cuda13-faster-whisper"
nvidia-cuda-12: "cuda12-faster-whisper"
- &moonshine
description: |
Moonshine is a fast, accurate, and efficient speech-to-text transcription model using ONNX Runtime.
It provides real-time transcription capabilities with support for multiple model sizes and GPU acceleration.
urls:
- https://github.com/moonshine-ai/moonshine
tags:
- speech-to-text
- transcription
- ONNX
license: MIT
name: "moonshine"
alias: "moonshine"
capabilities:
nvidia: "cuda12-moonshine"
default: "cpu-moonshine"
nvidia-cuda-13: "cuda13-moonshine"
nvidia-cuda-12: "cuda12-moonshine"
- &kokoro
icon: https://avatars.githubusercontent.com/u/166769057?v=4
description: |
Expand Down Expand Up @@ -947,7 +965,7 @@
capabilities:
nvidia: "cuda12-rfdetr-development"
intel: "intel-rfdetr-development"
#amd: "rocm-rfdetr-development"

Check warning on line 968 in backend/index.yaml

View workflow job for this annotation

GitHub Actions / Yamllint

968:6 [comments] missing starting space in comment
nvidia-l4t: "nvidia-l4t-arm64-rfdetr-development"
default: "cpu-rfdetr-development"
nvidia-cuda-13: "cuda13-rfdetr-development"
Expand Down Expand Up @@ -1203,7 +1221,7 @@
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-diffusers"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-diffusers
## exllama2

Check warning on line 1224 in backend/index.yaml

View workflow job for this annotation

GitHub Actions / Yamllint

1224:3 [comments-indentation] comment not indented like content
- !!merge <<: *exllama2
name: "exllama2-development"
capabilities:
Expand Down Expand Up @@ -1315,6 +1333,44 @@
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-13-faster-whisper"
mirrors:
- localai/localai-backends:master-gpu-nvidia-cuda-13-faster-whisper
## moonshine
- !!merge <<: *moonshine
name: "moonshine-development"
capabilities:
nvidia: "cuda12-moonshine-development"
default: "cpu-moonshine-development"
nvidia-cuda-13: "cuda13-moonshine-development"
nvidia-cuda-12: "cuda12-moonshine-development"
- !!merge <<: *moonshine
name: "cpu-moonshine"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-moonshine"
mirrors:
- localai/localai-backends:latest-cpu-moonshine
- !!merge <<: *moonshine
name: "cpu-moonshine-development"
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-moonshine"
mirrors:
- localai/localai-backends:master-cpu-moonshine
- !!merge <<: *moonshine
name: "cuda12-moonshine"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-moonshine"
mirrors:
- localai/localai-backends:latest-gpu-nvidia-cuda-12-moonshine
- !!merge <<: *moonshine
name: "cuda12-moonshine-development"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-moonshine"
mirrors:
- localai/localai-backends:master-gpu-nvidia-cuda-12-moonshine
- !!merge <<: *moonshine
name: "cuda13-moonshine"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-13-moonshine"
mirrors:
- localai/localai-backends:latest-gpu-nvidia-cuda-13-moonshine
- !!merge <<: *moonshine
name: "cuda13-moonshine-development"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-13-moonshine"
mirrors:
- localai/localai-backends:master-gpu-nvidia-cuda-13-moonshine
## coqui

- !!merge <<: *coqui
Expand Down
16 changes: 16 additions & 0 deletions backend/python/moonshine/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
.DEFAULT_GOAL := install

.PHONY: install
install:
bash install.sh

.PHONY: protogen-clean
protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py

.PHONY: clean
clean: protogen-clean
rm -rf venv __pycache__

test: install
bash test.sh
Loading
Loading