Merged
Commits
22 commits
3f00ac1
feat[api]: add managed mode to @qvac/ai-sdk-provider (QVAC-19900)
simon-iribarren Jun 3, 2026
d73595e
fix: resolve @qvac/cli via main entry when its exports block package.…
simon-iribarren Jun 3, 2026
7bc2c12
doc: update ai-sdk provider agent setup after queue (QVAC-19900)
simon-iribarren Jun 5, 2026
af540a9
Merge remote-tracking branch 'upstream/main' into feat/qvac-19900-ai-…
simon-iribarren Jun 5, 2026
16c38b5
QVAC-19900 feat[api]: per-model config for managed mode
simon-iribarren Jun 5, 2026
3eed57e
QVAC-19900 feat[api]: shared idle-reaped managed serve daemon
simon-iribarren Jun 5, 2026
92d7ba8
QVAC-19900 fix: reject duplicate model names in managed mode
simon-iribarren Jun 5, 2026
6763b8c
QVAC-19900 fix[api]: address managed-mode self-review findings
simon-iribarren Jun 8, 2026
acdbbc2
QVAC-19900 fix[api]: address managed-mode lifecycle review (round 2)
simon-iribarren Jun 8, 2026
8fc5df6
docs: use canonical qvac.tether.io URL in ai-sdk-provider README
simon-iribarren Jun 9, 2026
ddc5075
Merge branch 'main' into feat/qvac-19900-ai-sdk-provider-managed-mode
simon-iribarren Jun 9, 2026
03372a3
QVAC-19900 feat[api]: public model catalog + catalog-id aliases in ma…
simon-iribarren Jun 9, 2026
b529810
QVAC-19900 feat[api]: process-group serve teardown + closeOnParentExit
simon-iribarren Jun 9, 2026
60c0fa1
QVAC-19900 fix: keep managed serve lifecycle correct under close() ra…
simon-iribarren Jun 9, 2026
c0e39c6
Merge branch 'main' into feat/qvac-19900-ai-sdk-provider-managed-mode
simon-iribarren Jun 9, 2026
4c9d4c5
QVAC-19900 fix: rename reresolve result to resolved for clarity in ma…
simon-iribarren Jun 9, 2026
e418a55
QVAC-19900 mod: collapse redundant sync/async registry teardown helpers
simon-iribarren Jun 10, 2026
11477ab
QVAC-19900 mod: trim verbose comments in managed registry
simon-iribarren Jun 10, 2026
1191b78
Merge branch 'main' into feat/qvac-19900-ai-sdk-provider-managed-mode
simon-iribarren Jun 10, 2026
d3d0ff1
QVAC-19900 mod: drop unused DEFAULT_SERVE_BIN and ephemeralConfigName
simon-iribarren Jun 10, 2026
7351a25
Merge branch 'main' into feat/qvac-19900-ai-sdk-provider-managed-mode
simon-iribarren Jun 10, 2026
817589b
Merge branch 'main' into feat/qvac-19900-ai-sdk-provider-managed-mode
simon-iribarren Jun 10, 2026
52 changes: 50 additions & 2 deletions docs/website/content/docs/cli/http-server/integration.mdx
@@ -52,6 +52,33 @@ qvac.textEmbeddingModel('embed-gemma') // text embeddings
qvac.imageModel('flux-schnell') // image generation
```

## Managed mode (auto-spawn the server)

<Callout type="info" title="Unreleased: lands in @qvac/ai-sdk-provider 0.2.0.">
The behaviour below is on `main` but not yet in the published `0.1.0`. The default (external) mode shown above is unchanged.
</Callout>

The example above is **external mode**: you run `qvac serve openai` yourself and pass its `baseURL`. **Managed mode** instead lets the provider synthesize an ephemeral config from a model list and bring up `qvac serve` for you. In this mode `createQvac` is asynchronous and returns a `Promise<ManagedQvacProvider>`:

```ts
import { createQvac } from '@qvac/ai-sdk-provider'
import { streamText } from 'ai'

// Spawns (or reuses) a shared `qvac serve` on a free port, then resolves.
await using qvac = await createQvac({
  mode: 'managed',
  models: [{ name: 'QWEN3_8B_INST_Q4_K_M', config: { ctx_size: 32768, reasoning_budget: 0 } }]
})

const { textStream } = streamText({ model: qvac('QWEN3_8B_INST_Q4_K_M'), prompt: 'Hello!' })
for await (const chunk of textStream) process.stdout.write(chunk)
// Leaving the `await using` scope detaches this process from the serve.
```

The spawned serve is **shared** across processes (keyed by model set + per-model config + host + binary + pinned port) and owned by a detached supervisor that idle-reaps it once no consumer process remains for `serveIdleTimeout` (default 5 minutes). `provider.baseURL` / `provider.port` / `provider.pid` expose the live serve coordinates. Requires the optional `@qvac/cli` peer dependency.
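
To make the sharing rule concrete, here is a minimal sketch of how a key over those five inputs could be derived so that two processes asking for the same setup land on the same serve. The `fleetKey` and `canonicalize` helpers and the `FleetKeyInput` shape are hypothetical, as is the choice to treat model order and config key order as insignificant; the provider's actual key derivation may differ.

```ts
// Hypothetical sketch, not the provider's actual implementation.
type ModelSpec = { name: string; config?: Record<string, unknown> }

interface FleetKeyInput {
  models: ModelSpec[]
  host: string
  binPath: string
  pinnedPort?: number // undefined when the port is auto-allocated
}

// Recursively sort object keys so that equal configs serialize identically.
function canonicalize (value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize)
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>
    const sorted: Record<string, unknown> = {}
    for (const key of Object.keys(obj).sort()) sorted[key] = canonicalize(obj[key])
    return sorted
  }
  return value
}

function fleetKey (input: FleetKeyInput): string {
  // Assumption: the model *set* is unordered, so sort by name first.
  const models = [...input.models]
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((m) => ({ name: m.name, config: canonicalize(m.config ?? {}) }))
  return JSON.stringify({
    models,
    host: input.host,
    binPath: input.binPath,
    pinnedPort: input.pinnedPort ?? null
  })
}
```

Under this sketch, changing any per-model config value, the host, the binary path, or the pinned port produces a different key and therefore a separate serve.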

Liveness is tracked by the process that called `createQvac`, not by HTTP traffic. A tool that connects directly to `baseURL` therefore only keeps the serve warm while that resolving process stays alive. See the package [README](https://github.com/tetherto/qvac/tree/main/packages/ai-sdk-provider#using-with-coding-agents) for the wrapper pattern and the full option/lifecycle reference.
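
The liveness rule can be illustrated with a small sketch of the decision a supervisor might make on each tick: the serve stays up while any registered consumer process is alive, and is reaped only after no consumer has been alive for `serveIdleTimeout`. Everything here (`ReapState`, `tick`, the injected `isAlive` probe) is a hypothetical illustration rather than the supervisor's real code. HTTP traffic deliberately plays no part in it.

```ts
// Hypothetical sketch of process-based (not traffic-based) idle reaping.
interface ReapState {
  consumerPids: number[]
  idleSince: number | null // ms timestamp at which the last consumer was found gone
}

type Decision = { reap: boolean; state: ReapState }

function tick (
  state: ReapState,
  now: number,
  idleTimeoutMs: number,
  isAlive: (pid: number) => boolean
): Decision {
  // Drop consumers whose process has exited.
  const alive = state.consumerPids.filter(isAlive)
  if (alive.length > 0) {
    // At least one consumer remains: reset the idle clock.
    return { reap: false, state: { consumerPids: alive, idleSince: null } }
  }
  // No consumers: start (or continue) the idle clock and compare to the timeout.
  const idleSince = state.idleSince ?? now
  return { reap: now - idleSince >= idleTimeoutMs, state: { consumerPids: [], idleSince } }
}
```

In Node.js, a probe like `isAlive` is commonly built on `process.kill(pid, 0)`, which sends no signal and throws `ESRCH` when no such process exists.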

## Using with coding agents

The HTTP server's primary use case is integrating local AI with coding agents (e.g., OpenCode, Cline, Aider, Continue, and Roo). Although the API is OpenAI-compatible, _the following behaviors require explicit configuration for this use case._
@@ -155,19 +182,40 @@ type EndpointCategory =

## API

-### `createQvac(options?: QvacOptions): QvacProvider`
+### `createQvac(options?: QvacOptions): QvacProvider | Promise<ManagedQvacProvider>`

Factory returning a branded Vercel AI SDK provider. Wraps `createOpenAICompatible` with QVAC defaults.

In the default **external** mode it is synchronous and returns a `QvacProvider`:

```ts
-interface QvacOptions {
+interface QvacExternalOptions {
  mode?: 'external'                 // default
  baseURL?: string                  // default: see Default base URL
  apiKey?: string                   // default: 'qvac'
  headers?: Record<string, string>  // default: {}
  fetch?: typeof fetch              // default: globalThis.fetch
}
```

With `mode: 'managed'` it is asynchronous and returns a `Promise<ManagedQvacProvider>` that spawns/reuses a `qvac serve` for you (see [Managed mode](#managed-mode-auto-spawn-the-server)):

```ts
interface QvacManagedOptions {
  mode: 'managed'
  models: (string | { name: string; config?: Record<string, unknown>; preload?: boolean; default?: boolean })[]
  servePort?: number                // default: auto-allocate a free port
  serveHost?: string                // default: '127.0.0.1'
  serveStartTimeout?: number        // ms; default: 180000
  serveBinPath?: string             // default: resolve @qvac/cli
  reuse?: boolean                   // share a matching serve; default: true (false if servePort is pinned)
  serveIdleTimeout?: number         // ms a shared serve lingers after its last consumer; default: 300000
  apiKey?: string
  headers?: Record<string, string>
  fetch?: typeof fetch
}
```
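
The defaults stated in the comments above can be read as the following resolution step. This is a sketch only: `resolveManagedDefaults`, `ManagedOptionsSubset` and `ResolvedManagedOptions` are hypothetical names, only the options whose defaults are documented above are covered, and it assumes an explicitly passed `reuse` value takes precedence over the pinned-port rule.

```ts
// Hypothetical sketch of how the documented defaults could be applied.
interface ManagedOptionsSubset {
  servePort?: number
  serveHost?: string
  serveStartTimeout?: number
  reuse?: boolean
  serveIdleTimeout?: number
}

interface ResolvedManagedOptions {
  servePort: number | undefined // undefined means "auto-allocate a free port"
  serveHost: string
  serveStartTimeout: number // ms
  reuse: boolean
  serveIdleTimeout: number // ms
}

function resolveManagedDefaults (opts: ManagedOptionsSubset): ResolvedManagedOptions {
  return {
    servePort: opts.servePort,
    serveHost: opts.serveHost ?? '127.0.0.1',
    serveStartTimeout: opts.serveStartTimeout ?? 180000,
    // Default is true, but false when the caller pins a port.
    // Assumption: an explicit `reuse` always wins over that rule.
    reuse: opts.reuse ?? (opts.servePort === undefined),
    serveIdleTimeout: opts.serveIdleTimeout ?? 300000
  }
}
```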

### `qvac`

A default `createQvac()` instance with all defaults. Convenient for quick scripts; explicit `createQvac({ baseURL })` is recommended.
8 changes: 8 additions & 0 deletions packages/ai-sdk-provider/CHANGELOG.md
@@ -1,5 +1,13 @@
# Changelog

## [Unreleased]

### Added

- **Managed mode (`mode: 'managed'`).** `createQvac({ mode: 'managed', models, ... })` returns a `Promise<ManagedQvacProvider>` that synthesizes an ephemeral `qvac.config.json` from a model list and brings up `qvac serve openai` for you, with no hand-authored config or separate CLI step. Serves are **shared** across processes via a *fleet key* (model set + per-model config + host + binary + pinned port), owned by a **detached runner** that idle-reaps the serve once no consumer process remains for `serveIdleTimeout` (default 5 min). `close()` / `await using` detaches the calling process; a serve still in use by another consumer keeps running. Includes crash-recovery (`fetch` re-resolves and retries once on `ECONNREFUSED`) and a self-healing registry under `~/.qvac/managed-serves/`. New options: `models`, `servePort`, `serveHost`, `serveStartTimeout`, `serveBinPath`, `reuse`, `serveIdleTimeout`. New exports: `ManagedQvacProvider`, `QvacManagedOptions`, `QvacManagedModel`, `QvacExternalOptions`, and the managed error classes (`QvacManagedModeError` + subclasses) with the `QvacManagedErrorCode` union. Requires the optional `@qvac/cli` peer dependency. **External mode is unchanged**; the managed subsystem is dynamically imported only when `mode: 'managed'` is set.
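
  The crash-recovery behaviour described above (re-resolve, then retry once on `ECONNREFUSED`) can be sketched roughly as follows. The names `sendWithRecovery`, `isConnRefused`, and the `reresolve` callback are hypothetical stand-ins and not the package's internals. The sketch also assumes the refusal surfaces either as `err.code` or, as is typical with Node's built-in `fetch`, as `err.cause.code`.

  ```ts
  // Hypothetical sketch of "re-resolve and retry once on ECONNREFUSED".
  type Send<T> = (baseURL: string) => Promise<T>

  function isConnRefused (err: unknown): boolean {
    // Check both the error itself and its `cause` for the refusal code.
    const e = err as { code?: string; cause?: { code?: string } } | null
    return e?.code === 'ECONNREFUSED' || e?.cause?.code === 'ECONNREFUSED'
  }

  async function sendWithRecovery<T> (
    send: Send<T>,
    currentBaseURL: string,
    reresolve: () => Promise<string>
  ): Promise<T> {
    try {
      return await send(currentBaseURL)
    } catch (err) {
      // Only a refused connection triggers recovery; other errors propagate.
      if (!isConnRefused(err)) throw err
      const freshBaseURL = await reresolve() // e.g. rediscover or respawn the serve
      return await send(freshBaseURL) // a second failure propagates to the caller
    }
  }
  ```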

---

## [0.1.0]

Release Date: 2026-05-27