diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index ce2fa1be885..7817b43a146 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -24,6 +24,7 @@ This file is the cross-system architecture index. Detailed designs live in domai
- Notification producers emit through `emitNotificationSignal()` to preserve decisioning and audit invariants. Reminder routing metadata (`routingIntent`, `routingHints`) flows through the signal and is enforced post-decision to control multi-channel fanout. The decision engine produces per-channel thread actions (`start_new` / `reuse_existing`) validated against a candidate set; `notification_thread_created` IPC is emitted only on actual creation, not on reuse.
- Memory extraction/recall must enforce actor-role provenance gates for untrusted actors.
- Trusted contact ingress ACL is channel-agnostic; identity binding adapts per channel (chat ID, E.164 phone, external user ID) without channel-specific branching.
+- Feature flags (`featureFlags` in workspace config) control skill availability. When a skill's flag is OFF, it is excluded from all exposure surfaces: client skill lists, system prompt catalog, `skill_load`, runtime tool projection, and included child skills. The gateway owns the `/v1/feature-flags` REST API; the daemon reads flags from config at each enforcement point.
## System Overview
@@ -221,6 +222,7 @@ graph TB
GW_SLACK_DELIVER["Slack Deliver<br/>/deliver/slack<br/>(internal, from runtime)"]
GW_OAUTH["OAuth Callback<br/>/webhooks/oauth/callback"]
GW_PROXY["Runtime Proxy<br/>(optional, bearer auth)"]
+ GW_FEATURE_FLAGS["Feature Flags API<br/>GET /v1/feature-flags<br/>PATCH /v1/feature-flags/:key"]
GW_PROBES["/healthz + /readyz<br/>k8s liveness/readiness"]
end
diff --git a/assistant/ARCHITECTURE.md b/assistant/ARCHITECTURE.md
index 9040d2d7050..40f9fbce301 100644
--- a/assistant/ARCHITECTURE.md
+++ b/assistant/ARCHITECTURE.md
@@ -305,6 +305,34 @@ Release-driven update notification system that surfaces release notes to the ass
---
+### Skill Feature Flags — Enforcement Points
+
+Feature flags allow external clients to disable individual skills at runtime without restarting the daemon. Flags are stored in `~/.vellum/workspace/config.json` under the `featureFlags` key (managed by the gateway's `/v1/feature-flags` API — see [`gateway/ARCHITECTURE.md`](../gateway/ARCHITECTURE.md)). The daemon's config watcher hot-reloads this file, so flag changes take effect on the next tool resolution or session.
+
+**Flag key format:** `skills.<skillId>.enabled`. A missing key defaults to enabled; only an explicit `false` disables a skill.
+
+**Guarantee:** When a skill's feature flag is OFF, the skill is unavailable everywhere — it cannot appear in client UIs, model context, or runtime tool execution. This is enforced at five independent points:
+
+| Enforcement Point | Module | Effect |
+|-------------------|--------|--------|
+| **1. Client skill list** | `resolveSkillStates()` in `config/skill-state.ts` | Skills with flag OFF are excluded from the resolved list returned to IPC clients (macOS skill list, settings UI). The skill never appears in the client. |
+| **2. System prompt skill catalog** | `appendSkillsCatalog()` in `config/system-prompt.ts` | The model-visible `## Skills Catalog` section in the system prompt filters out flagged-off skills. The model cannot see or reference them. |
+| **3. `skill_load` tool** | `executeSkillLoad()` in `tools/skills/load.ts` | If the model attempts to load a flagged-off skill by name, the tool returns an error: `"skill is currently unavailable (disabled by feature flag)"`. |
+| **4. Runtime tool projection** | `projectSkillTools()` in `daemon/session-skill-tools.ts` | Even if a skill was previously active in a session (has `` markers in history), the per-turn projection drops it when the flag is OFF. Already-registered tools are unregistered. |
+| **5. Included child skills** | `executeSkillLoad()` in `tools/skills/load.ts` | When a parent skill includes children via the `includes` directive, each child is independently checked against its feature flag. Flagged-off children are silently excluded from the loaded skill content. |
+
+The shared gate function `isSkillFeatureEnabled(skillId, config)` in `config/skill-state.ts` is used by all five enforcement points for consistency.
+
+**Key source files:**
+
+| File | Purpose |
+|------|---------|
+| `src/config/skill-state.ts` | `isSkillFeatureEnabled()` — shared gate function; `resolveSkillStates()` — enforcement point 1 |
+| `src/config/system-prompt.ts` | `appendSkillsCatalog()` — enforcement point 2 |
+| `src/tools/skills/load.ts` | `executeSkillLoad()` — enforcement points 3 and 5 |
+| `src/daemon/session-skill-tools.ts` | `projectSkillTools()` — enforcement point 4 |
+| `src/config/schema.ts` | `featureFlags` field definition in `AssistantConfig` (Zod schema) |
+| `src/daemon/handlers/skills.ts` | `handleSkillsList()` — uses `resolveSkillStates()` for IPC client responses |
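
The shared gate described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/config/skill-state.ts`: the `FeatureFlagConfig` shape and the `flagKeyFor` and `filterEnabledSkills` helpers are hypothetical, and only the `isSkillFeatureEnabled(skillId, config)` name and the default-enabled rule come from the text. The final helper shows how an enforcement point might apply the gate to a list of skill IDs.

```typescript
// Hypothetical config shape: only the `featureFlags` map matters for this sketch.
interface FeatureFlagConfig {
  featureFlags?: Record<string, boolean>;
}

// Hypothetical helper: builds the flag key for a skill, assuming the
// middle segment of the key is the skill ID.
function flagKeyFor(skillId: string): string {
  return `skills.${skillId}.enabled`;
}

// A missing map or a missing key counts as enabled;
// only an explicit `false` disables the skill.
function isSkillFeatureEnabled(
  skillId: string,
  config: FeatureFlagConfig,
): boolean {
  return config.featureFlags?.[flagKeyFor(skillId)] !== false;
}

// Hypothetical enforcement-point helper: drops flagged-off skills from a list.
function filterEnabledSkills(
  skillIds: string[],
  config: FeatureFlagConfig,
): string[] {
  return skillIds.filter((id) => isSkillFeatureEnabled(id, config));
}
```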
---
@@ -362,10 +390,11 @@ graph LR
subgraph "~/.vellum/ (Root Files)"
SOCK["vellum.sock<br/>Unix domain socket"]
TRUST["protected/trust.json<br/>Tool permission rules"]
+ FF_TOKEN["feature-flag-token<br/>Dedicated auth for PATCH /v1/feature-flags"]
end
subgraph "~/.vellum/workspace/ (Workspace Files)"
- CONFIG["config files<br/>Hot-reloaded by daemon"]
+ CONFIG["config files<br/>Hot-reloaded by daemon<br/>(includes featureFlags)"]
ONBOARD_PLAYBOOKS["onboarding/playbooks/<br/>[channel]_onboarding.md<br/>assistant-updatable checklists"]
ONBOARD_REGISTRY["onboarding/playbooks/registry.json<br/>channel-start index for fast-path + reconciliation"]
APPS_STORE["data/apps/<br/>.json + pages/*.html<br/>prebuilt Home Base seeded here"]
diff --git a/gateway/ARCHITECTURE.md b/gateway/ARCHITECTURE.md
index 60011752797..57ca8b089d9 100644
--- a/gateway/ARCHITECTURE.md
+++ b/gateway/ARCHITECTURE.md
@@ -29,6 +29,40 @@ Internet
+-- /webhooks/* --> BLOCKED (404, never forwarded to runtime)
```
+### Feature Flags API
+
+The gateway exposes a REST API for reading and mutating feature flags. Feature flags control which skills are available to the assistant — when a flag is OFF, the corresponding skill is excluded from every exposure surface in the assistant (see [`assistant/ARCHITECTURE.md`](../assistant/ARCHITECTURE.md) for enforcement points).
+
+**Endpoints:**
+
+| Method | Path | Description |
+|--------|------|-------------|
+| GET | `/v1/feature-flags` | List all feature flags from workspace config |
+| PATCH | `/v1/feature-flags/:key` | Set a single feature flag. Body: `{ "enabled": true\|false }` |
+
+**Storage:** Flags are persisted in `~/.vellum/workspace/config.json` under the `featureFlags` key as a `Record<string, boolean>`. The gateway reads and writes this file directly (atomic temp + rename for writes). The daemon's config watcher hot-reloads changes, so flag mutations take effect on the next session or tool resolution without a restart.
+
+**Flag key format:** Only keys matching `skills.<skillId>.enabled` are accepted for the initial rollout. Other key patterns are rejected with HTTP 400.
+
+**Authentication boundary:**
+
+The feature-flags API uses a dedicated token stored at `~/.vellum/feature-flag-token`, separate from the runtime bearer token (`~/.vellum/http-token`). This separation ensures that a client holding only the feature-flag token cannot call runtime endpoints, and a client holding only the runtime token can read flags but cannot change them.
+
+| Operation | Accepted tokens |
+|-----------|----------------|
+| `GET /v1/feature-flags` | Runtime bearer token OR feature-flag token |
+| `PATCH /v1/feature-flags/:key` | Feature-flag token ONLY (runtime token is explicitly rejected) |
+
+The feature-flag token is auto-generated on first gateway startup if the file does not exist. The gateway watches the token file for changes and hot-reloads the token without a restart.
+
+**Key source files:**
+
+| File | Purpose |
+|------|---------|
+| `gateway/src/http/routes/feature-flags.ts` | GET and PATCH handlers; config read/write logic |
+| `gateway/src/config.ts` | `readOrGenerateFeatureFlagToken()` — token provisioning; `featureFlagToken` config field |
+| `gateway/src/index.ts` | Route registration, auth enforcement (dual-token for GET, flag-token-only for PATCH), token file watcher |
+
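The key-format check and the token matrix above could be expressed roughly as below. This is a sketch under stated assumptions rather than the gateway's actual handler code: `GatewayTokens`, `isAcceptedFlagKey`, and `isAuthorized` are hypothetical names, and the regular expression assumes the middle segment of the key is a non-empty skill ID containing no dots or whitespace.

```typescript
type FlagMethod = "GET" | "PATCH";

// Hypothetical holder for the two tokens described in the text.
interface GatewayTokens {
  runtimeToken: string;
  featureFlagToken: string;
}

// Assumption: the skill ID segment is non-empty and has no dots or whitespace.
const SKILL_FLAG_KEY = /^skills\.[^.\s]+\.enabled$/;

// Keys that fail this check would be rejected with 400.
function isAcceptedFlagKey(key: string): boolean {
  return SKILL_FLAG_KEY.test(key);
}

// GET accepts either token; PATCH accepts the feature-flag token only.
function isAuthorized(
  method: FlagMethod,
  presented: string,
  tokens: GatewayTokens,
): boolean {
  if (presented.length === 0) return false;
  if (presented === tokens.featureFlagToken) return true;
  return method === "GET" && presented === tokens.runtimeToken;
}
```

The sketch compares tokens with `===` for brevity; a production check would normally use a constant-time comparison to avoid leaking token contents through timing.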
### Channel Binding Lifecycle (Lane Separation)
Each channel (desktop, Telegram, etc.) operates in its own **lane**: conversations created by an external channel are never displayed in the desktop thread list, and desktop conversations are never exposed to external channels. The `channelBinding` metadata on a conversation is used solely for routing inbound/outbound messages within that lane and for filtering sessions during desktop session restoration.