**New file:** `docs/plans/2026-03-01-context-1m-auto-escalation-design.md` (98 additions)
# Design: Session-Aware Auto-Escalation for 1M Context

## Problem

The branch adds `context-1m-2025-08-07` to the Anthropic `anthropic-beta` header unconditionally (`provider.ts` line 126). This causes HTTP 400 errors for accounts below Tier 4 (`"The long context beta is not yet available for this subscription."`) and opts every session into 2× input / 1.5× output pricing once a request's input exceeds 200K tokens, even when the conversation never needed the extra context.

## Solution

Session-aware auto-escalation: only send the `context-1m-2025-08-07` beta header when the model supports 1M context AND the session actually needs it.

## Config

Add `context1m` to provider options:

```typescript
context1m: z.union([z.literal("auto"), z.boolean()]).optional()
```

```jsonc
// opencode.json
{ "provider": { "anthropic": { "options": { "context1m": "auto" } } } }
```

- `"auto"` (default): send the header only when the model supports 1M context AND the session's input tokens exceed 150K
- `true`: always send the header for models that support 1M context
- `false`: never send the header

## Decision Logic

Three conditions determine whether the header is sent (in `"auto"` mode, all must be true):

1. **Model supports it**: `model.limit.context > 200_000`
2. **Session needs it**: accumulated input tokens > 150K (75% of 200K threshold)
3. **Config allows it**: `context1m !== false`

For `true` mode: only condition 1 is checked.
For `false` mode: never send.
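The three modes can be sketched as a single predicate; the function and parameter names here are illustrative, not the actual implementation:

```typescript
// Sketch of the mode handling above. `context1m` comes from provider
// options; `contextLimit` and `inputTokens` stand in for the model's
// declared limit and the session's accumulated input tokens.
type Context1mMode = "auto" | boolean

function shouldSend1mHeader(context1m: Context1mMode, contextLimit: number, inputTokens: number): boolean {
  const supports1m = contextLimit > 200_000
  if (context1m === false) return false
  if (context1m === true) return supports1m
  return supports1m && inputTokens > 150_000
}
```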

The model's declared `limit.context` is the capability signal. Users who set `limit.context: 1000000` on a model in their config (e.g., `claude-opus-4-6`) are opting in to 1M support for that model. Models with 200K limits (Haiku, older models) never get the header.
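For example, a user config opting a specific model into 1M support might look like this (the exact nesting is an assumed shape of opencode's per-model config; the model id is illustrative):

```jsonc
// opencode.json — declaring a 1M context limit for one model (assumed shape)
{
  "provider": {
    "anthropic": {
      "models": {
        "claude-opus-4-6": { "limit": { "context": 1000000 } }
      }
    }
  }
}
```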

## Implementation

### Touch Points

1. **`provider.ts` — Anthropic loader** (CUSTOM_LOADERS, line 126): Remove `context-1m-2025-08-07` from the static beta header string. Keep `claude-code-20250219`, `interleaved-thinking-2025-05-14`, `fine-grained-tool-streaming-2025-05-14`, and `adaptive-thinking-2026-01-28`.

2. **`provider.ts` — Module-level state**: Add a boolean flag and setter for the session layer to communicate with the fetch wrapper.

```typescript
let _context1m = false
export function setContext1m(enabled: boolean) {
  _context1m = enabled
}
```

3. **`provider.ts` — Fetch wrapper** (in `getSDK()`, ~line 1073): For Anthropic requests (check `model.providerID === "anthropic"` or `model.api.npm === "@ai-sdk/anthropic"`), if `_context1m` is true, append `,context-1m-2025-08-07` to the `anthropic-beta` request header.

4. **`session/llm.ts`** — Before each LLM call: Read the provider config, check the model's context limit, check accumulated session tokens, and call `Provider.setContext1m()`.

```typescript
const config = provider.options?.context1m ?? "auto"
const supports1m = model.limit.context > 200_000
const needs1m = lastUsage.tokens.input > 150_000
const enabled = config === true ? supports1m : config === false ? false : supports1m && needs1m
Provider.setContext1m(enabled)
```

5. **`config.ts`** — Provider options schema: Add `context1m` to the options object with the union type.

### Console (`packages/console`)

The console's `anthropic.ts` already conditionally applies the header based on model name (`supports1m = reqModel.includes("sonnet") || reqModel.includes("opus-4-6")`). This is a separate package and can be updated independently to also respect a config option if desired.
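A minimal sketch of that check as quoted (the standalone helper name is hypothetical):

```typescript
// Mirrors the console's model-name predicate quoted above.
function supports1m(reqModel: string): boolean {
  return reqModel.includes("sonnet") || reqModel.includes("opus-4-6")
}
```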

## Edge Cases

| Scenario | Behavior |
| ---------------------------------- | ------------------------------------------------------------------------------------------------------- |
| New session, any model | No header — safe for all tiers |
| Opus 4.6 at 180K tokens, auto mode | Header enabled — can grow to 1M |
| Haiku at any token count | Never gets header (200K context limit) |
| Sub-Tier-4, small conversation | No header — works fine |
| Sub-Tier-4, Opus 4.6 at 180K | Header enabled, API returns Tier error. Separate fallback work (see below) handles graceful degradation |
| `context1m: false`, any model | Never sends header, hard 200K limit |
| `context1m: true`, Opus 4.6 at 10K | Header sent. No cost impact — premium pricing only triggers when total input >200K |
| `context1m: true`, Haiku | No header — model doesn't support 1M (context limit ≤200K) |

## Related Work

A separate agent is working on runtime fallback for auth/billing errors (`~/.agent-mail/long-context`). That work makes the error recoverable (fall back to another model). Our work prevents the error from occurring in the first place. Both are complementary.

## Pricing Reference

The `context-1m` header alone doesn't change pricing. Premium rates only apply when total input tokens (including cache) exceed 200K:

- Input: 2× standard rate
- Output: 1.5× standard rate
- Cache read/write: proportional increase

This is why auto-escalation saves money — the header is only present when you'd hit the premium tier anyway.
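As a sketch (multipliers taken from the list above; the threshold check and function name are illustrative):

```typescript
// Rate multipliers that apply to a request, given total input tokens
// including cache. Only crossing the 200K premium tier changes them.
function rateMultipliers(totalInputTokens: number): { input: number; output: number } {
  const premium = totalInputTokens > 200_000
  return { input: premium ? 2 : 1, output: premium ? 1.5 : 1 }
}
```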
**New file:** `docs/plans/2026-03-01-context-1m-auto-escalation.md` (225 additions)
# 1M Context Error-Retry Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Gracefully handle Anthropic's "long context beta not available" error by retrying without the `context-1m` header, then remembering to skip it for the process lifetime. Zero config needed.

**Architecture:** Keep the `context-1m-2025-08-07` beta header in the static Anthropic loader (already present on this branch). In the fetch wrapper inside `getSDK()`, detect the specific Tier error from the response, retry the request without the header, and set a process-level flag to skip it on future requests.

**Tech Stack:** TypeScript, Vercel AI SDK

**Design doc:** `docs/plans/2026-03-01-context-1m-auto-escalation-design.md`

---

### Task 1: Add Error-Retry Logic to the Fetch Wrapper

**Files:**

- Modify: `packages/opencode/src/provider/provider.ts`

**Context:** The fetch wrapper is at line 1073 inside `getSDK()`. It's a closure that captures `model` from the outer scope. The `anthropic-beta` header including `context-1m-2025-08-07` is set statically in `CUSTOM_LOADERS["anthropic"]` at line 126.

**Step 1: Add process-level disabled flag**

At the top of the `Provider` namespace (after the `log` declaration, around line 49), add:

```typescript
let _context1mDisabled = false
```

**Step 2: Add retry logic in the fetch wrapper**

In the fetch wrapper (`options["fetch"] = async (input, init) => {`, line 1073), replace the final return statement. Currently (line 1106-1110):

```typescript
return fetchFn(input, {
  ...opts,
  // @ts-ignore see here: https://github.com/oven-sh/bun/issues/16682
  timeout: false,
})
```

Replace with:

```typescript
const response = await fetchFn(input, {
  ...opts,
  // @ts-ignore see here: https://github.com/oven-sh/bun/issues/16682
  timeout: false,
})

// Detect Anthropic "long context beta not available" error and retry without the header
if (!_context1mDisabled && model.api.npm === "@ai-sdk/anthropic" && response.status === 400) {
  const cloned = response.clone()
  const body = await cloned.json().catch(() => null)
  if (
    body?.error?.type === "invalid_request_error" &&
    typeof body?.error?.message === "string" &&
    body.error.message.toLowerCase().includes("long context")
  ) {
    log.info("context-1m beta not available, retrying without it")
    _context1mDisabled = true
    const headers = new Headers(opts.headers as HeadersInit)
    const beta = headers.get("anthropic-beta") ?? ""
    headers.set(
      "anthropic-beta",
      beta
        .split(",")
        .filter((h) => !h.includes("context-1m"))
        .join(","),
    )
    return fetchFn(input, {
      ...opts,
      headers,
      // @ts-ignore
      timeout: false,
    })
  }
}

return response
```

**Step 3: Strip `context-1m` from future requests when disabled**

At the top of the fetch wrapper (after `const opts = init ?? {}`, line 1076), add:

```typescript
// Skip context-1m header if previously detected as unavailable
if (_context1mDisabled && model.api.npm === "@ai-sdk/anthropic") {
  const headers = new Headers(opts.headers as HeadersInit)
  const beta = headers.get("anthropic-beta") ?? ""
  if (beta.includes("context-1m")) {
    headers.set(
      "anthropic-beta",
      beta
        .split(",")
        .filter((h) => !h.includes("context-1m"))
        .join(","),
    )
    opts.headers = headers
  }
}
```
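Since Steps 2 and 3 filter the header identically, the stripping could optionally be hoisted into a small module-level helper (name hypothetical; the tasks do not require this refactor):

```typescript
// Removes any context-1m-* entry from a comma-separated anthropic-beta value.
function stripContext1m(beta: string): string {
  return beta
    .split(",")
    .filter((h) => !h.includes("context-1m"))
    .join(",")
}
```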

**Step 4: Verify no type errors**

Run: `cd packages/opencode && npx tsc --noEmit`
Expected: No new errors

**Step 5: Describe and advance**

```bash
jj describe -m "feat(provider): auto-retry without context-1m header when account lacks access"
jj new
```

---

### Task 2: Tests

**Files:**

- Create: `packages/opencode/test/provider/context1m.test.ts`

**Step 1: Write tests for the retry behavior**

The retry logic is embedded in the fetch wrapper, which is hard to unit test in isolation. Instead, test the header-stripping logic and the flag behavior:

```typescript
import { describe, test, expect } from "bun:test"

describe("context-1m header stripping", () => {
  function strip(beta: string) {
    return beta
      .split(",")
      .filter((h) => !h.includes("context-1m"))
      .join(",")
  }

  test("strips context-1m from beta header", () => {
    const header =
      "claude-code-20250219,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,adaptive-thinking-2026-01-28,context-1m-2025-08-07"
    expect(strip(header)).toBe(
      "claude-code-20250219,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,adaptive-thinking-2026-01-28",
    )
  })

  test("preserves other headers when context-1m is not present", () => {
    const header = "claude-code-20250219,interleaved-thinking-2025-05-14"
    expect(strip(header)).toBe("claude-code-20250219,interleaved-thinking-2025-05-14")
  })

  test("handles context-1m as only header", () => {
    expect(strip("context-1m-2025-08-07")).toBe("")
  })
})

describe("error detection", () => {
  test("matches the known Anthropic tier error", () => {
    const body = {
      error: {
        type: "invalid_request_error",
        message: "The long context beta is not yet available for this subscription.",
      },
    }
    const matches =
      body.error.type === "invalid_request_error" &&
      typeof body.error.message === "string" &&
      body.error.message.toLowerCase().includes("long context")
    expect(matches).toBe(true)
  })

  test("does not match unrelated errors", () => {
    const body = {
      error: {
        type: "invalid_request_error",
        message: "max_tokens must be less than 8192",
      },
    }
    const matches =
      body.error.type === "invalid_request_error" &&
      typeof body.error.message === "string" &&
      body.error.message.toLowerCase().includes("long context")
    expect(matches).toBe(false)
  })
})
```

**Step 2: Run the tests**

Run: `cd packages/opencode && bun test test/provider/context1m.test.ts`
Expected: All tests pass

**Step 3: Run existing tests for regressions**

Run: `cd packages/opencode && bun test test/session/compaction.test.ts`
Expected: All tests pass

**Step 4: Describe and advance**

```bash
jj describe -m "test(provider): add context-1m retry logic tests"
jj new
```

---

### Task 3: Verify End-to-End

**Step 1: Type check the full package**

Run: `cd packages/opencode && npx tsc --noEmit`
Expected: No errors

**Step 2: Run the full test suite**

Run: `cd packages/opencode && bun test`
Expected: All tests pass

**Step 3: Final describe**

```bash
jj describe -m "feat(provider): graceful context-1m fallback for sub-Tier-4 accounts"
```
**Modified:** `packages/opencode/src/bun/index.ts` (29 additions, 4 deletions)

```diff
@@ -54,11 +54,33 @@ export namespace BunProc {
     }),
   )

+  // For github: dependencies, bun installs under the package's actual name
+  // (from its package.json "name" field), not under the github: specifier.
+  // Resolve the real module path by reading the installed package name from
+  // the cache lockfile.
+  async function resolveModulePath(pkg: string): Promise<string> {
+    const nodeModules = path.join(Global.Path.cache, "node_modules")
+    if (!pkg.startsWith("github:")) return path.join(nodeModules, pkg)
+    const lockPath = path.join(Global.Path.cache, "bun.lock")
+    const lock = await Filesystem.readText(lockPath).catch(() => "")
+    // lockfile maps "actual-name": "github:owner/repo#ref"
+    for (const line of lock.split("\n")) {
+      if (line.includes(pkg)) {
+        const match = line.match(/^\s*"([^"]+)":\s*"/)
+        if (match && match[1] !== pkg) return path.join(nodeModules, match[1])
+      }
+    }
+    // Fallback: strip github: prefix and use repo name
+    const repoName = pkg.replace(/^github:/, "").split("#")[0].split("/").pop()
+    if (repoName) return path.join(nodeModules, repoName)
+    return path.join(nodeModules, pkg)
+  }
+
   export async function install(pkg: string, version = "latest") {
     // Use lock to ensure only one install at a time
     using _ = await Lock.write("bun-install")

-    const mod = path.join(Global.Path.cache, "node_modules", pkg)
+    const mod = await resolveModulePath(pkg)
     const pkgjsonPath = path.join(Global.Path.cache, "package.json")
     const parsed = await Filesystem.readJson<{ dependencies: Record<string, string> }>(pkgjsonPath).catch(async () => {
       const result = { dependencies: {} as Record<string, string> }
@@ -89,7 +111,7 @@ export namespace BunProc {
       ...(proxied() || process.env.CI ? ["--no-cache"] : []),
       "--cwd",
       Global.Path.cache,
-      pkg + "@" + version,
+      pkg.includes("#") ? pkg : pkg + "@" + version,
     ]

     // Let Bun handle registry resolution:
@@ -112,11 +134,14 @@ export namespace BunProc {
       )
     })

+    // Re-resolve after install in case lockfile changed
+    const installedMod = await resolveModulePath(pkg)
+
     // Resolve actual version from installed package when using "latest"
     // This ensures subsequent starts use the cached version until explicitly updated
     let resolvedVersion = version
     if (version === "latest") {
-      const installedPkg = await Filesystem.readJson<{ version?: string }>(path.join(mod, "package.json")).catch(
+      const installedPkg = await Filesystem.readJson<{ version?: string }>(path.join(installedMod, "package.json")).catch(
        () => null,
      )
      if (installedPkg?.version) {
@@ -126,6 +151,6 @@ export namespace BunProc {

     parsed.dependencies[pkg] = resolvedVersion
     await Filesystem.writeJson(pkgjsonPath, parsed)
-    return mod
+    return installedMod
   }
 }
```