56 changes: 34 additions & 22 deletions docs/maintenance/holodeck/update-mcp-awareness.md
@@ -1,53 +1,65 @@
<!-- SPDX-License-Identifier: AGPL-3.0-or-later | Copyright (C) 2026 Chris Means -->
# Update MCP Awareness on Holodeck

Manual deployment steps for updating the mcp-awareness service on the holodeck Proxmox host (CT 201 — `awareness-app`).
The mcp-awareness service runs on two app nodes (CT 210, CT 211) behind an HAProxy load balancer (CT 203). Updates are deployed using the zero-downtime deploy script.

## Prerequisites

- SSH access to holodeck (`192.168.200.70`)
- Root access on CT 201 (`awareness-app`, `192.168.200.101`)
- SSH access to holodeck and all CTs (via `~/.ssh/config` aliases)
- The deploy script at `scripts/holodeck/deploy.sh`

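Before running the deploy script, it may help to confirm that the SSH aliases resolve and accept key-based login. The following is a minimal sketch, assuming the aliases match the container hostnames listed in the Architecture table; the `check_hosts` helper and the `SSH_CMD` override are hypothetical and not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; alias names are assumed to exist in ~/.ssh/config.
SSH_CMD="${SSH_CMD:-ssh}"

check_hosts() {
  local failed=0 host
  for host in "$@"; do
    # BatchMode fails instead of prompting for a password;
    # ConnectTimeout bounds how long an unreachable host can stall the check.
    if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "ok: $host"
    else
      echo "unreachable: $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Example: `check_hosts awareness-lb awareness-app-a awareness-app-b` prints one line per host and returns non-zero if any host could not be reached.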
## Steps
## Deploying Updates

### 1. SSH into the container
### Code-only updates (zero-downtime)

```bash
ssh root@192.168.200.101
scripts/holodeck/deploy.sh hot
```

### 2. Pull latest code
This performs a rolling update, one node at a time: the script drains the node from HAProxy, pulls the latest code, installs the package, restarts the service, waits for the health check to pass, and then re-enables the node. One node is always serving traffic.

```bash
git config --global --add safe.directory /opt/mcp-awareness
cd /opt/mcp-awareness
git pull origin main
```
**Note:** Active MCP sessions on the restarting node will get "Session terminated" errors. Clients need to reconnect. See issues #161–#163 for planned improvements.

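The rolling sequence described above can be sketched as a loop. This illustrates the order of operations only and is not the contents of `scripts/holodeck/deploy.sh`. The HAProxy backend name `awareness`, the admin socket path `/run/haproxy/admin.sock`, the use of `socat` on the load balancer, the assumption that HAProxy server names equal the node hostnames, and the `DRY_RUN` switch are all assumptions made for this example.

```bash
#!/usr/bin/env bash
# Sketch of a rolling update; NOT the actual deploy.sh.
# Assumed: backend name, admin socket path, socat on the LB, DRY_RUN switch.
LB="${LB:-awareness-lb}"
BACKEND="${BACKEND:-awareness}"
SOCK="${SOCK:-/run/haproxy/admin.sock}"
DRY_RUN="${DRY_RUN:-1}"   # default: print commands without executing them

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

haproxy_state() {  # haproxy_state <server> <drain|ready>
  # "set server <backend>/<server> state ..." is an HAProxy Runtime API command.
  run ssh "$LB" "echo 'set server $BACKEND/$1 state $2' | socat stdio $SOCK"
}

wait_healthy() {  # poll the node's local /health endpoint until it answers
  local tries=0
  if [ "$DRY_RUN" != "0" ]; then
    echo "+ wait for health check on $1"
    return 0
  fi
  until ssh "$1" "curl -fs localhost:8420/health" >/dev/null 2>&1; do
    tries=$((tries + 1))
    [ "$tries" -lt 30 ] || return 1   # give up after roughly one minute
    sleep 2
  done
}

rolling_update() {
  local node
  for node in "$@"; do
    haproxy_state "$node" drain
    run ssh "$node" "cd /opt/mcp-awareness && git pull origin main && /opt/mcp-awareness/venv/bin/pip install -e ."
    run ssh "$node" "systemctl restart mcp-awareness"
    wait_healthy "$node" || { echo "health check failed on $node" >&2; return 1; }
    haproxy_state "$node" ready   # re-enable before touching the next node
  done
}
```

Running `rolling_update awareness-app-a awareness-app-b` with the default `DRY_RUN=1` only prints the commands. Because each node is set back to `ready` before the next one is drained, at most one node is out of rotation at a time.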
### 3. Install updated package
### Updates with migrations or config changes

```bash
/opt/mcp-awareness/venv/bin/pip install -e .
scripts/holodeck/deploy.sh maintenance
```

### 4. Add any new environment variables
This drains all nodes, runs Alembic migrations on the first node, then updates and restarts all nodes. There is a brief service interruption during migration.

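For contrast with the rolling mode, the maintenance ordering can be written out as a plan. The sketch below only prints steps and executes nothing. It is not taken from `deploy.sh`: the `maintenance_plan` helper is hypothetical, and it assumes that the new code is pulled on the first node before migrating (so the migration files are present) and that `alembic` is installed in the application's virtualenv.

```bash
#!/usr/bin/env bash
# Prints an assumed order of operations for a maintenance deploy; executes nothing.
APP_DIR="${APP_DIR:-/opt/mcp-awareness}"

maintenance_plan() {  # maintenance_plan <first-node> [<other-node>...]
  local first="$1" node
  # 1. Take every node out of rotation; the service is unavailable from here on.
  for node in "$@"; do echo "drain $node"; done
  # 2. Migrate the shared database once, from the first node only.
  #    Assumption: the new code is fetched before the migration runs.
  echo "ssh $first 'cd $APP_DIR && git pull origin main && venv/bin/pip install -e .'"
  echo "ssh $first 'cd $APP_DIR && venv/bin/alembic upgrade head'"
  # 3. Update the remaining nodes, then restart and re-enable all of them.
  shift
  for node in "$@"; do
    echo "ssh $node 'cd $APP_DIR && git pull origin main && venv/bin/pip install -e .'"
  done
  for node in "$first" "$@"; do
    echo "ssh $node 'systemctl restart mcp-awareness'"
    echo "enable $node"
  done
}
```

The key difference from the rolling mode is that every node is drained before anything changes, and the migration runs exactly once no matter how many nodes are listed.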
### Adding new environment variables

If the release includes new env vars, append them to the env file:
If a release requires new env vars, update the env file on both app nodes before deploying:

```bash
nano /etc/awareness/env
ssh awareness-app-a 'nano /etc/awareness/env'
ssh awareness-app-b 'nano /etc/awareness/env'
```
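After editing, it can be worth confirming that both nodes define the same variable names, since a key missing on one node only shows up once traffic reaches it. The sketch below compares key names without printing any values. The `env_keys` and `compare_env_keys` helpers are hypothetical, and the sketch assumes the env file uses plain `KEY=value` lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list variable names (not values) from KEY=value text on stdin.
# Comment lines, blank lines, and anything not of the form KEY=... are ignored.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' | cut -d= -f1 | sort -u
}

# Compare key names between two nodes without echoing secret values.
compare_env_keys() {  # compare_env_keys <node-a> <node-b>
  local a b
  a="$(ssh "$1" 'cat /etc/awareness/env' | env_keys)"
  b="$(ssh "$2" 'cat /etc/awareness/env' | env_keys)"
  if [ "$a" = "$b" ]; then
    echo "env keys match"
  else
    echo "env keys differ" >&2
    return 1
  fi
}
```

Example: `compare_env_keys awareness-app-a awareness-app-b` returns non-zero when the two files do not define the same set of names.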

### 5. Restart the service
## Verification

After deploy, verify via HAProxy:

```bash
systemctl restart mcp-awareness
curl -s http://192.168.200.103:8420/health | python3 -m json.tool
```

### 6. Verify
Or check both backends directly:

```bash
curl -s localhost:8420/health | python3 -m json.tool
curl -s http://192.168.200.110:8420/health | python3 -m json.tool
curl -s http://192.168.200.111:8420/health | python3 -m json.tool
```
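The three checks can also be scripted so that a failing endpoint produces a non-zero exit code instead of relying on reading the JSON by eye. This is a minimal sketch: the `health_ok` and `check_all` helpers are hypothetical, and the response is assumed to be a JSON object with a top-level `status` field equal to `ok`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the JSON on stdin has "status": "ok".
# The exact shape of the /health response is an assumption.
health_ok() {
  python3 -c 'import json, sys; sys.exit(0 if json.load(sys.stdin).get("status") == "ok" else 1)'
}

check_all() {  # check_all <url>...
  local url rc=0
  for url in "$@"; do
    # An unreachable host yields empty input, which fails JSON parsing.
    if curl -s --max-time 5 "$url" | health_ok 2>/dev/null; then
      echo "ok: $url"
    else
      echo "FAILED: $url"
      rc=1
    fi
  done
  return "$rc"
}
```

Example: `check_all http://192.168.200.103:8420/health http://192.168.200.110:8420/health http://192.168.200.111:8420/health` covers the load balancer and both backends in one call.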

Confirm `status: ok` and expected uptime (should be a few seconds).
## Architecture

See `docs/superpowers/specs/2026-04-02-zero-downtime-deployment-design.md` for the full design spec.

| Component | Host | IP |
|-----------|------|----|
| HAProxy (load balancer) | CT 203 `awareness-lb` | 192.168.200.103 |
| App node A | CT 210 `awareness-app-a` | 192.168.200.110 |
| App node B | CT 211 `awareness-app-b` | 192.168.200.111 |
| Postgres | CT 200 `awareness-pg` | 192.168.200.100 |
| Cloudflare tunnel | CT 202 `awareness-tunnel` | 192.168.200.102 |