Commit ac1e39e

(feat) rhoai-lls does not require kubernetes to run
1 parent 24b2f4e commit ac1e39e

File tree: 8 files changed (+384, -32 lines)

README.md

Lines changed: 65 additions & 3 deletions
````diff
@@ -8,14 +8,36 @@ This directory contains the necessary files to build a Red Hat compatible contai
 - `llama` CLI tool installed: `pip install llama-stack`
 - Podman or Docker installed
 
+## Build Modes
+
+The build script supports three modes:
+
+### 1. Full Mode (Default)
+Includes all features, including the TrustyAI providers that require Kubernetes/OpenShift:
+```bash
+./distribution/build.py
+```
+
+### 2. Standalone Mode
+Builds a version without Kubernetes dependencies, using Llama Guard for safety:
+```bash
+./distribution/build.py --standalone
+```
+
+### 3. Unified Mode (Recommended)
+Builds a single container that supports both modes via environment variables:
+```bash
+./distribution/build.py --unified
+```
+
 ## Generating the Containerfile
 
 The Containerfile is auto-generated from a template. To generate it:
 
 1. Make sure you have the `llama` CLI tool installed
-2. Run the build script from root of this git repo:
+2. Run the build script from the root of this git repo with your desired mode:
 ```bash
-./distribution/build.py
+./distribution/build.py [--standalone] [--unified]
 ```
 
 This will:
@@ -35,7 +57,47 @@ Once the Containerfile is generated, you can build the image using either Podman
 ### Using Podman build image for x86_64
 
 ```bash
-podman build --platform linux/amd64 -f distribution/Containerfile -t rh .
+podman build --platform linux/amd64 -f distribution/Containerfile -t llama-stack-rh .
+```
+
+### Using Docker
+
+```bash
+docker build -f distribution/Containerfile -t llama-stack-rh .
+```
+
+## Running the Container
+
+### Running in Standalone Mode (No Kubernetes)
+
+To run the container in standalone mode without Kubernetes dependencies, set the `STANDALONE` environment variable:
+
+```bash
+# Using Docker
+docker run -e STANDALONE=true \
+  -e VLLM_URL=http://host.docker.internal:8000/v1 \
+  -e INFERENCE_MODEL=your-model-name \
+  -p 8321:8321 \
+  llama-stack-rh
+
+# Using Podman
+podman run -e STANDALONE=true \
+  -e VLLM_URL=http://host.docker.internal:8000/v1 \
+  -e INFERENCE_MODEL=your-model-name \
+  -p 8321:8321 \
+  llama-stack-rh
+```
+
+### Running in Full Mode (With Kubernetes)
+
+To run with all features, including the TrustyAI providers (requires Kubernetes/OpenShift):
+
+```bash
+# Using Docker
+docker run -p 8321:8321 llama-stack-rh
+
+# Using Podman
+podman run -p 8321:8321 llama-stack-rh
 ```
 
 ## Notes
````
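
A quick way to confirm either mode is serving after `docker run`/`podman run` is to hit the published port. This is only a sketch: the `/v1/health` path is an assumption about the llama-stack server, not something defined in this commit.

```bash
# Probe the published port once the container reports it has started.
# NOTE: /v1/health is an assumed llama-stack endpoint; adjust for your version.
curl -sf http://localhost:8321/v1/health && echo "llama-stack server is up"
```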

distribution/Containerfile

Lines changed: 12 additions & 3 deletions
```diff
@@ -1,5 +1,5 @@
 # WARNING: This file is auto-generated. Do not modify it manually.
-# Generated by: distribution/build.py
+# Generated by: distribution/build.py --unified
 
 FROM registry.access.redhat.com/ubi9/python-312@sha256:95ec8d3ee9f875da011639213fd254256c29bc58861ac0b11f290a291fa04435
 WORKDIR /opt/app-root
@@ -8,6 +8,7 @@ RUN pip install sqlalchemy # somehow sqlalchemy[asyncio] is not sufficient
 RUN pip install \
     aiosqlite \
     autoevals \
+    blobfile \
     chardet \
     datasets \
     fastapi \
@@ -42,7 +43,15 @@ RUN pip install --index-url https://download.pytorch.org/whl/cpu torch torchvisi
 RUN pip install --no-deps sentence-transformers
 RUN pip install --no-cache llama-stack==0.2.18
 RUN mkdir -p ${HOME}/.llama/providers.d ${HOME}/.cache
-COPY distribution/run.yaml ${APP_ROOT}/run.yaml
+
+# Copy both configurations
+COPY distribution/run.yaml ${APP_ROOT}/run-full.yaml
+COPY distribution/run-standalone.yaml ${APP_ROOT}/run-standalone.yaml
+
+# Copy the entrypoint script
+COPY --chmod=755 distribution/entrypoint.sh ${APP_ROOT}/entrypoint.sh
+
+# Copy providers directory (will be filtered by entrypoint script)
 COPY distribution/providers.d/ ${HOME}/.llama/providers.d/
 
-ENTRYPOINT ["python", "-m", "llama_stack.core.server.server", "/opt/app-root/run.yaml"]
+ENTRYPOINT ["/opt/app-root/entrypoint.sh"]
```

distribution/Containerfile.in

Lines changed: 10 additions & 2 deletions
```diff
@@ -5,7 +5,15 @@ RUN pip install sqlalchemy # somehow sqlalchemy[asyncio] is not sufficient
 {dependencies}
 RUN pip install --no-cache llama-stack==0.2.18
 RUN mkdir -p ${{HOME}}/.llama/providers.d ${{HOME}}/.cache
-COPY distribution/run.yaml ${{APP_ROOT}}/run.yaml
+
+# Copy both configurations
+COPY distribution/run.yaml ${{APP_ROOT}}/run-full.yaml
+COPY distribution/run-standalone.yaml ${{APP_ROOT}}/run-standalone.yaml
+
+# Copy the entrypoint script
+COPY --chmod=755 distribution/entrypoint.sh ${{APP_ROOT}}/entrypoint.sh
+
+# Copy providers directory (will be filtered by entrypoint script)
 COPY distribution/providers.d/ ${{HOME}}/.llama/providers.d/
 
-ENTRYPOINT ["python", "-m", "llama_stack.core.server.server", "/opt/app-root/run.yaml"]
+ENTRYPOINT ["/opt/app-root/entrypoint.sh"]
```

distribution/build-standalone.yaml

Lines changed: 35 additions & 0 deletions
```diff
@@ -0,0 +1,35 @@
+version: "2"
+distribution_spec:
+  description: Red Hat distribution of Llama Stack (Standalone Docker)
+  providers:
+    inference:
+    - "remote::vllm"
+    - "inline::sentence-transformers"
+    vector_io:
+    - "inline::milvus"
+    safety:
+    - "inline::llama-guard"
+    agents:
+    - "inline::meta-reference"
+    # eval: removed trustyai_lmeval provider for standalone Docker
+    datasetio:
+    - "remote::huggingface"
+    - "inline::localfs"
+    scoring:
+    - "inline::basic"
+    - "inline::llm-as-judge"
+    - "inline::braintrust"
+    telemetry:
+    - "inline::meta-reference"
+    tool_runtime:
+    - "remote::brave-search"
+    - "remote::tavily-search"
+    - "inline::rag-runtime"
+    - "remote::model-context-protocol"
+  container_image: registry.redhat.io/ubi9/python-311:9.6-1749631027
+additional_pip_packages:
+- aiosqlite
+- sqlalchemy[asyncio]
+image_type: container
+image_name: llama-stack-rh-standalone
+# external_providers_dir: distribution/providers.d # Disabled for standalone mode
```
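
This config is consumed by `build.py` through the `llama` CLI; the same command it constructs can be run by hand to preview the dependency set the standalone image will pull in:

```bash
# Print the dependencies resolved from the standalone build config, without building anything.
llama stack build --config distribution/build-standalone.yaml --print-deps-only
```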

distribution/build.py

Lines changed: 61 additions & 7 deletions
```diff
@@ -5,11 +5,14 @@
 # This source code is licensed under the terms described in the LICENSE file in
 # the root directory of this source tree.
 
-# Usage: ./distribution/build.py
+# Usage: ./distribution/build.py [--standalone] [--unified]
+# Or set STANDALONE=true or UNIFIED=true environment variables
 
+import os
 import shutil
 import subprocess
 import sys
+import argparse
 from pathlib import Path
 
 BASE_REQUIREMENTS = [
@@ -57,9 +60,10 @@ def check_llama_stack_version():
         print("Continuing without version validation...")
 
 
-def get_dependencies():
+def get_dependencies(standalone=False):
     """Execute the llama stack build command and capture dependencies."""
-    cmd = "llama stack build --config distribution/build.yaml --print-deps-only"
+    config_file = "distribution/build-standalone.yaml" if standalone else "distribution/build.yaml"
+    cmd = f"llama stack build --config {config_file} --print-deps-only"
     try:
         result = subprocess.run(
             cmd, shell=True, capture_output=True, text=True, check=True
@@ -112,7 +116,7 @@ def get_dependencies():
         sys.exit(1)
 
 
-def generate_containerfile(dependencies):
+def generate_containerfile(dependencies, standalone=False, unified=False):
     """Generate Containerfile from template with dependencies."""
     template_path = Path("distribution/Containerfile.in")
     output_path = Path("distribution/Containerfile")
@@ -126,7 +130,13 @@ def generate_containerfile(dependencies):
         template_content = f.read()
 
     # Add warning message at the top
-    warning = "# WARNING: This file is auto-generated. Do not modify it manually.\n# Generated by: distribution/build.py\n\n"
+    if unified:
+        mode = "unified"
+    elif standalone:
+        mode = "standalone"
+    else:
+        mode = "full"
+    warning = f"# WARNING: This file is auto-generated. Do not modify it manually.\n# Generated by: distribution/build.py --{mode}\n\n"
 
     # Process template using string formatting
     containerfile_content = warning + template_content.format(
@@ -141,19 +151,63 @@
 
 
 def main():
+    parser = argparse.ArgumentParser(
+        description="Build Llama Stack distribution",
+        epilog="""
+Examples:
+  %(prog)s # Build full version (default)
+  %(prog)s --standalone # Build standalone version (no Kubernetes deps)
+  %(prog)s --unified # Build unified version (supports both modes)
+  STANDALONE=true %(prog)s # Build standalone via environment variable
+  UNIFIED=true %(prog)s # Build unified via environment variable
+        """,
+        formatter_class=argparse.RawDescriptionHelpFormatter
+    )
+    parser.add_argument("--standalone", action="store_true",
+                        help="Build standalone version without Kubernetes dependencies")
+    parser.add_argument("--unified", action="store_true",
+                        help="Build unified version that supports both modes via environment variables")
+    args = parser.parse_args()
+
+    # Check environment variable as fallback
+    standalone = args.standalone or os.getenv("STANDALONE", "false").lower() in ("true", "1", "yes")
+    unified = args.unified or os.getenv("UNIFIED", "false").lower() in ("true", "1", "yes")
+
+    if unified:
+        mode = "unified"
+        print("Building unified version (supports both full and standalone modes)...")
+    else:
+        mode = "standalone" if standalone else "full"
+        print(f"Building {mode} version...")
+
     print("Checking llama installation...")
     check_llama_installed()
 
     print("Checking llama-stack version...")
     check_llama_stack_version()
 
     print("Getting dependencies...")
-    dependencies = get_dependencies()
+    dependencies = get_dependencies(standalone)
 
     print("Generating Containerfile...")
-    generate_containerfile(dependencies)
+    generate_containerfile(dependencies, standalone, unified)
 
     print("Done!")
+    print(f"\nTo build the Docker image:")
+    if unified:
+        print(" docker build -f distribution/Containerfile -t llama-stack-unified .")
+        print("\nTo run in standalone mode:")
+        print(" docker run -e STANDALONE=true -e VLLM_URL=http://host.docker.internal:8000/v1 -e INFERENCE_MODEL=your-model -p 8321:8321 llama-stack-unified")
+        print("\nTo run in full mode (requires Kubernetes):")
+        print(" docker run -p 8321:8321 llama-stack-unified")
+    elif standalone:
+        print(" docker build -f distribution/Containerfile -t llama-stack-standalone .")
+        print("\nTo run in standalone mode:")
+        print(" docker run -e VLLM_URL=http://host.docker.internal:8000/v1 -e INFERENCE_MODEL=your-model -p 8321:8321 llama-stack-standalone")
+    else:
+        print(" docker build -f distribution/Containerfile -t llama-stack-full .")
+        print("\nTo run with full features (requires Kubernetes):")
+        print(" docker run -p 8321:8321 llama-stack-full")
 
 
 if __name__ == "__main__":
```
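
The environment-variable fallback accepts a few truthy spellings (`true`, `1`, `yes`, case-insensitive), so the following invocations are equivalent to passing the flags:

```bash
# Standalone build: flag or env var, any truthy spelling works
./distribution/build.py --standalone
STANDALONE=true ./distribution/build.py
STANDALONE=1 ./distribution/build.py
STANDALONE=yes ./distribution/build.py

# Unified build via the environment variable
UNIFIED=true ./distribution/build.py
```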

distribution/build.yaml

Lines changed: 17 additions & 17 deletions
```diff
@@ -1,32 +1,32 @@
-version: 2
+version: "2"
 distribution_spec:
   description: Red Hat distribution of Llama Stack
   providers:
     inference:
-    - provider_type: remote::vllm
-    - provider_type: inline::sentence-transformers
+    - "remote::vllm"
+    - "inline::sentence-transformers"
     vector_io:
-    - provider_type: inline::milvus
+    - "inline::milvus"
     safety:
-    - provider_type: remote::trustyai_fms
+    - "remote::trustyai_fms"
     agents:
-    - provider_type: inline::meta-reference
+    - "inline::meta-reference"
     eval:
-    - provider_type: remote::trustyai_lmeval
+    - "remote::trustyai_lmeval"
     datasetio:
-    - provider_type: remote::huggingface
-    - provider_type: inline::localfs
+    - "remote::huggingface"
+    - "inline::localfs"
     scoring:
-    - provider_type: inline::basic
-    - provider_type: inline::llm-as-judge
-    - provider_type: inline::braintrust
+    - "inline::basic"
+    - "inline::llm-as-judge"
+    - "inline::braintrust"
     telemetry:
-    - provider_type: inline::meta-reference
+    - "inline::meta-reference"
     tool_runtime:
-    - provider_type: remote::brave-search
-    - provider_type: remote::tavily-search
-    - provider_type: inline::rag-runtime
-    - provider_type: remote::model-context-protocol
+    - "remote::brave-search"
+    - "remote::tavily-search"
+    - "inline::rag-runtime"
+    - "remote::model-context-protocol"
   container_image: registry.redhat.io/ubi9/python-311:9.6-1749631027
 additional_pip_packages:
 - aiosqlite
```
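
The provider lists switch from `provider_type:` mappings to plain quoted provider IDs (the same form used by the new `build-standalone.yaml`), and the version becomes a quoted string. A quick, optional sanity check that the edited file still parses (assumes PyYAML is installed):

```bash
# Validate the YAML syntax of the updated build config.
python -c "import yaml; yaml.safe_load(open('distribution/build.yaml')); print('ok')"
```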

distribution/entrypoint.sh

Lines changed: 54 additions & 0 deletions
```diff
@@ -0,0 +1,54 @@
+#!/bin/bash
+
+# Unified entrypoint script for Llama Stack distribution
+# Supports both full and standalone modes via STANDALONE environment variable
+
+set -e
+
+echo "=== Llama Stack Distribution Entrypoint ==="
+
+# Check if we should run in standalone mode
+if [ "${STANDALONE:-false}" = "true" ]; then
+    echo "Running in STANDALONE mode (no Kubernetes dependencies)"
+
+    # Use standalone configuration
+    CONFIG_FILE="/opt/app-root/run-standalone.yaml"
+
+    # Filter out TrustyAI providers from providers.d directory
+    echo "Filtering out TrustyAI providers for standalone mode..."
+    mkdir -p ${HOME}/.llama/providers.d
+
+    # Copy only non-TrustyAI providers
+    find /opt/app-root/.llama/providers.d -name "*.yaml" ! -name "*trustyai*" -exec cp {} ${HOME}/.llama/providers.d/ \; 2>/dev/null || true
+
+    # Remove the external_providers_dir from the config to prevent loading TrustyAI providers
+    echo "Disabling external providers directory for standalone mode..."
+    sed -i 's|external_providers_dir:.*|# external_providers_dir: disabled for standalone mode|' "$CONFIG_FILE"
+
+    echo "✓ Standalone configuration ready"
+    echo "✓ TrustyAI providers excluded"
+else
+    echo "Running in FULL mode (with Kubernetes dependencies)"
+
+    # Use full configuration
+    CONFIG_FILE="/opt/app-root/run-full.yaml"
+
+    # Copy all providers
+    echo "Setting up all providers..."
+    mkdir -p ${HOME}/.llama/providers.d
+    cp -r /opt/app-root/.llama/providers.d/* ${HOME}/.llama/providers.d/ 2>/dev/null || true
+
+    echo "✓ Full configuration ready"
+    echo "✓ All providers available"
+fi
+
+echo "Configuration file: $CONFIG_FILE"
+echo "APIs enabled: $(grep -A 20 '^apis:' $CONFIG_FILE | grep '^-' | wc -l) APIs"
+
+# Show which APIs are available
+echo "Available APIs:"
+grep -A 20 '^apis:' $CONFIG_FILE | grep '^-' | sed 's/^- / - /' || echo " (none listed)"
+
+# Start the server
+echo "Starting Llama Stack server..."
+exec python -m llama_stack.core.server.server "$CONFIG_FILE"
```
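
Because the script echoes a distinct marker for each mode, the container logs make it easy to verify which path was taken. A sketch reusing the image name and environment values from the README examples above:

```bash
# Start a container in standalone mode and grep the entrypoint output for its mode markers.
podman run -d --name lls-standalone \
  -e STANDALONE=true \
  -e VLLM_URL=http://host.docker.internal:8000/v1 \
  -e INFERENCE_MODEL=your-model-name \
  -p 8321:8321 llama-stack-rh

podman logs lls-standalone | grep -E "STANDALONE mode|TrustyAI providers excluded"
```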
