111 changes: 98 additions & 13 deletions container/README.md
@@ -167,16 +167,19 @@ The `run.sh` script launches Docker containers with the appropriate configuratio
- **GPU Management**: Automatic GPU detection and allocation
- **Volume Mounting**: Workspace and HuggingFace cache mounting
- **User Management**: Root or user-based container execution
- **Network Configuration**: Host networking for service communication
- **Network Configuration**: Configurable networking modes (host, bridge, none, container sharing)
- **Resource Limits**: Memory, file descriptors, and IPC configuration

**Common Usage Examples:**

```bash
# Basic container launch (inference/production)
# Basic container launch (inference/production, runs as root user)
./run.sh --image dynamo:latest-vllm

# Mount workspace for development (use local-dev image for local user permissions)
# Basic container launch with local host user permissions
./run.sh --image dynamo:latest-vllm-local-dev

# Mount workspace for development (use local-dev image for local host user permissions)
./run.sh --image dynamo:latest-vllm-local-dev --mount-workspace

# Use specific image and framework for development
@@ -188,16 +191,89 @@ The `run.sh` script launches Docker containers with the appropriate configuratio
# Development with custom environment variables
./run.sh --image dynamo:latest-vllm-local-dev -e CUDA_VISIBLE_DEVICES=0,1 --mount-workspace

# Production inference without GPU access
./run.sh --image dynamo:latest-vllm --gpus none

# Dry run to see docker command
./run.sh --dry-run

# Development with custom volume mounts
./run.sh --image dynamo:latest-vllm-local-dev -v /host/path:/container/path --mount-workspace
```

### Network Configuration Options

The `run.sh` script supports different networking modes via the `--network` flag (defaults to `host`):

#### Host Networking (Default)
```bash
# Host networking, explicitly and by default (local-dev image for local host user permissions)
./run.sh --image dynamo:latest-vllm-local-dev --network host
./run.sh --image dynamo:latest-vllm-local-dev
```
**Use cases:**
- High-performance ML inference (default for GPU workloads)
- Services that need direct host port access
- Maximum network performance with minimal overhead
- Sharing services with the host machine (NATS, etcd, etc.)

**⚠️ Port Sharing Limitation:** Host networking shares all ports with the host machine, which means you can only run **one instance** of services like NATS (port 4222) or etcd (port 2379) across all containers and the host.
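
A minimal illustration of the conflict (hypothetical sequence, assuming `nats-server` is available both on the host and inside the image):

```bash
# NATS already listening on host port 4222
nats-server -js &

# A second NATS instance inside a host-networking container cannot bind
# port 4222, because the port namespace is shared with the host
./run.sh --image dynamo:latest-vllm -- nats-server -js
```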

#### Bridge Networking (Isolated)
```bash
# Development/testing with bridge networking and host cache sharing
./run.sh --image dynamo:latest-vllm-local-dev --mount-workspace --network bridge -v $HOME/.cache:/home/ubuntu/.cache
```
**Use cases:**
- Secure isolation from host network
- CI/CD pipelines requiring complete isolation
- When you need full control over which ports are exposed

#### No Networking ⚠️ **LIMITED FUNCTIONALITY**
```bash
# Complete network isolation - no external connectivity
./run.sh --image dynamo:latest-vllm --network none --mount-workspace

# Same with local host user permissions
./run.sh --image dynamo:latest-vllm-local-dev --network none --mount-workspace
```
**⚠️ WARNING: `--network none` severely limits Dynamo functionality:**
- **No model downloads** - HuggingFace models cannot be downloaded
- **No API access** - Cannot reach external APIs or services
- **No distributed inference** - Multi-node setups won't work
- **No monitoring/logging** - External monitoring systems unreachable
- **Limited debugging** - Cannot access external debugging tools

**Very limited use cases:**
- Pre-downloaded models with purely local processing
- Air-gapped security environments (models must be pre-staged)

#### Container Network Sharing
Use `--network container:name` to share the network namespace with another container.

**Use cases:**
- Sidecar patterns (logging, monitoring, caching)
- Service mesh architectures
- Sharing network namespaces between related containers

See Docker documentation for `--network container:name` usage.
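
A minimal sketch of sharing a network namespace (the container name `dynamo-main` is a placeholder for an already-running container):

```bash
# Join the network namespace of an existing container named "dynamo-main";
# both containers then share the same ports and localhost
./run.sh --image dynamo:latest-vllm --network container:dynamo-main
```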

#### Custom Networks
Use custom Docker networks for multi-container applications. Create with `docker network create` and specify with `--network network-name`.

**Use cases:**
- Multi-container applications
- Service discovery by container name

See Docker documentation for custom network creation and management.
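
A minimal sketch (the network name `dynamo-net` and the target container name are placeholders):

```bash
# Create a user-defined bridge network once
docker network create dynamo-net

# Launch the Dynamo container on that network
./run.sh --image dynamo:latest-vllm --network dynamo-net

# Other containers on the same network can resolve it by container name
docker run --rm --network dynamo-net busybox ping -c 1 <container-name>
```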

#### Network Mode Comparison

| Mode | Performance | Security | Use Case | Dynamo Compatibility | Port Sharing |
|------|-------------|----------|----------|---------------------|---------------|
| `host` | Highest | Lower | ML/GPU workloads, high-performance services | ✅ Full | ⚠️ **Shared with host** (one NATS/etcd only) |
| `bridge` | Good | Higher | General web services, controlled port exposure | ✅ Full | ✅ Isolated ports |
| `none` | N/A | Highest | Air-gapped environments only | ⚠️ **Very Limited** | ✅ No network |
| `container:name` | Good | Medium | Sidecar patterns, shared network stacks | ✅ Full | ⚠️ Shared with target container |
| Custom networks | Good | Medium | Multi-container applications | ✅ Full | ✅ Isolated ports |

## Workflow Examples

### Development Workflow
@@ -213,23 +289,32 @@ The `run.sh` script launches Docker containers with the appropriate configuratio
python -m dynamo.frontend &

# Start backend (vLLM example)
python -m dynamo.vllm --model Qwen/Qwen3-0.6B --gpu-memory-utilization 0.50 &
python -m dynamo.vllm --model Qwen/Qwen3-0.6B --gpu-memory-utilization 0.20 &
```

### Production Workflow
```bash
# 1. Build production image
./build.sh --framework vllm --release-build

# 2. Run production container
./run.sh --image dynamo:latest-vllm-local-dev --gpus all
# 2. Run production container (runs as root)
./run.sh --image dynamo:latest-vllm --gpus all
```

### Testing Workflow
### CI/CD Workflow
```bash
# 1. Build with no cache for clean build
# 1. Build image for CI
./build.sh --framework vllm --no-cache

# 2. Test container functionality (--image defaults to dynamo:latest-vllm)
./run.sh --mount-workspace -it -- python -m pytest tests/
# 2. Run tests with network isolation for reproducible results
./run.sh --image dynamo:latest-vllm --mount-workspace --network bridge -v $HOME/.cache:/home/ubuntu/.cache -- python -m pytest tests/
```

**Note:** When using `--network bridge`, services are only accessible inside the container. To run the full Dynamo stack for testing, execute these commands inside the container:

```bash
# Inside the container with bridge networking, start nats-server and etcd before the Dynamo components:
nats-server -js &
etcd --listen-client-urls http://0.0.0.0:2379 --advertise-client-urls http://0.0.0.0:2379 --data-dir /tmp/etcd &
python -m dynamo.frontend &
DYN_SYSTEM_ENABLED=true DYN_SYSTEM_PORT=8081 python -m dynamo.vllm --model Qwen/Qwen3-0.6B --gpu-memory-utilization 0.20 --enforce-eager --no-enable-prefix-caching --max-num-seqs 64 &
```
15 changes: 14 additions & 1 deletion container/run.sh
@@ -43,6 +43,7 @@ INTERACTIVE=
USE_NIXL_GDS=
RUNTIME=nvidia
WORKDIR=/workspace
NETWORK=host

get_options() {
while :; do
@@ -165,6 +166,14 @@ get_options() {
--use-nixl-gds)
USE_NIXL_GDS=TRUE
;;
--network)
if [ "$2" ]; then
NETWORK=$2
shift
else
missing_requirement "$1"
fi
;;
--dry-run)
RUN_PREFIX="echo"
echo ""
@@ -304,6 +313,10 @@ show_help() {
echo " [--hf-cache directory to volume mount as the hf cache, default is NONE unless mounting workspace]"
echo " [--gpus gpus to enable, default is 'all', 'none' disables gpu support]"
echo " [--use-nixl-gds add volume mounts and capabilities needed for NVIDIA GPUDirect Storage]"
echo " [--network network mode for container, default is 'host']"
echo " Options: 'host' (default), 'bridge', 'none', 'container:name'"
echo " Examples: --network bridge (isolated), --network none (no network - WARNING: breaks most functionality)"
echo " --network container:redis (share network with 'redis' container)"
echo " [-v add volume mount]"
echo " [-e add environment variable]"
echo " [--mount-workspace set up for local development]"
@@ -335,7 +348,7 @@ ${RUN_PREFIX} docker run \
${GPU_STRING} \
${INTERACTIVE} \
${RM_STRING} \
--network host \
--network "$NETWORK" \
${RUNTIME:+--runtime "$RUNTIME"} \
--shm-size=10G \
--ulimit memlock=-1 \