diff --git a/packages/sdk/CHANGELOG.md b/packages/sdk/CHANGELOG.md
index 8f5c606db3..f8ff23a335 100644
--- a/packages/sdk/CHANGELOG.md
+++ b/packages/sdk/CHANGELOG.md
@@ -1,5 +1,32 @@
# Changelog
+## [0.8.2]
+
+📦 **NPM:** https://www.npmjs.com/package/@qvac/sdk/v/0.8.2
+
+This is a maintenance release that refreshes the SDK README with a streamlined quickstart guide and updated documentation links pointing to the new docs site at docs.qvac.tether.io.
+
+---
+
+## Documentation
+
+### README Rewrite
+
+The SDK README has been rewritten to provide a cleaner onboarding experience. The verbose installation, usage, and feature sections have been replaced with a concise quickstart that gets users running in four steps, and all documentation links now point to the new docs site.
+
+Key changes:
+
+- **Simplified quickstart**: A minimal four-step guide (create workspace, install, write script, run) replaces the previous multi-section setup
+- **Updated links**: Documentation URLs now point to `docs.qvac.tether.io` instead of `qvac.tether.dev`
+- **Support channel**: The support link now points to the Discord channel instead of FeatureBase
+- **Leaner content**: Detailed platform instructions (Expo, Linux), feature lists, and example indexes have been moved to the docs site to keep the README focused
+
+---
+
+## ⚙️ Infrastructure
+
+- SDK dependency installs in CI publish and pod check workflows are now frozen to prevent unexpected version drift during builds.
+
## [0.8.1]
📦 **NPM:** https://www.npmjs.com/package/@qvac/sdk/v/0.8.1
diff --git a/packages/sdk/README.md b/packages/sdk/README.md
index a952aa3bba..c728a4a606 100644
--- a/packages/sdk/README.md
+++ b/packages/sdk/README.md
@@ -1,9 +1,3 @@
-[](https://qvac.tether.dev)
-
----
-
-**QVAC** is an open-source, cross-platform **ecosystem** for local-first, peer-to-peer **AI**. QVAC runs on [**Bare**](https://bare.pears.com) by [Holepunch](https://holepunch.to), a lightweight, cross-platform JavaScript runtime.
-
# QVAC SDK
**QVAC SDK** is the canonical entry point to develop AI applications with QVAC.
@@ -11,140 +5,73 @@
> _Part of **QVAC** ecosystem_
>
>
-> Home •
-> Docs •
-> Support •
+> Home •
+> Docs •
+> Support •
> Discord
->
-
-Written in TypeScript, it provides all QVAC capabilities through a unified interface while also abstracting away the complexity of running your application in a JS environment other than Bare.
-For the comprehensive reference, [see official QVAC documentation](https://qvac.tether.dev/docs).
+**QVAC SDK** is the main entry point for developing applications with QVAC. It is type-safe and exposes all QVAC capabilities through a unified interface. It runs on Node.js, [Bare runtime](https://bare.pears.com), and [Expo](https://expo.dev).
-## Requirements
+See [https://docs.qvac.tether.io/sdk/getting-started](https://docs.qvac.tether.io/sdk/getting-started) for the comprehensive QVAC documentation.
-Supported JS environments: Bare, Node.js, Expo and Bun.
+## Supported environments and installation
-## Installation
-
-```bash
-npm install @qvac/sdk
-```
+See https://docs.qvac.tether.io/sdk/getting-started/installation
-### Linux
+## Quickstart
-OS peer dependency:
+1. Create the examples workspace:
```bash
-apt install vulkan-sdk
+mkdir qvac-examples
+cd qvac-examples
+npm init -y && npm pkg set type=module
```
-### Expo
-
-1. Peer dependencies:
+2. Install the SDK:
```bash
-npm i expo-file-system react-native-bare-kit
-```
-
-2. On Android, bump `minSdkVersion` to 29, by adding `ext { minSdkVersion=29 }` to `android/build.gradle` or using `expo-build-properties`.
-
-3. Add the QVAC Expo plugin to `app.json`:
-
-```js
-export default {
- expo: {
- plugins: ["@qvac/sdk/expo-plugin"],
- },
-};
-```
-
-4. Prebuild your project to generate the native files:
-
-```bash
-npx expo prebuild
-```
-
-5. Build and run it on a **physical device**:
-
-```bash
-npx expo run:ios --device
-# or
-npx expo run:android --device
+npm install @qvac/sdk
```
-> [!IMPORTANT]
-> Due to limitations with `llamacpp`, QVAC currently does not run on emulators. You **must** use a physical device.
-
-## Usage
+3. Create the quickstart script:
```js
-import {
- completion,
- LLAMA_3_2_1B_INST_Q4_0,
- loadModel,
- downloadAsset,
- unloadModel,
- VERBOSITY,
-} from "@qvac/sdk";
+import { loadModel, LLAMA_3_2_1B_INST_Q4_0, completion, unloadModel } from "@qvac/sdk";
try {
- // First just cache the model
- await downloadAsset({
- assetSrc: LLAMA_3_2_1B_INST_Q4_0,
- onProgress: (progress) => {
- console.log(progress);
- },
- });
- // Then load it in memory from cache
- const modelId = await loadModel({
- modelSrc: LLAMA_3_2_1B_INST_Q4_0,
- modelType: "llm",
- modelConfig: {
- device: "gpu",
- ctx_size: 2048,
- verbosity: VERBOSITY.ERROR,
- },
- });
- const history = [
- {
- role: "user",
- content: "Explain quantum computing in one sentence, use lots of emojis",
- },
- ];
- const result = completion({ modelId, history, stream: true });
- for await (const token of result.tokenStream) {
- process.stdout.write(token);
- }
- const stats = await result.stats;
- console.log("\n📊 Performance Stats:", stats);
- // Change `clearStorage: true` to delete cached model files
- await unloadModel({ modelId, clearStorage: false });
-} catch (error) {
- console.error("❌ Error:", error);
- process.exit(1);
+ // Load a model into memory
+ const modelId = await loadModel({
+ modelSrc: LLAMA_3_2_1B_INST_Q4_0,
+ modelType: "llm",
+ onProgress: (progress) => {
+ console.log(progress);
+ },
+ });
+ // You can use the loaded model multiple times
+ const history = [
+ {
+ role: "user",
+ content: "Explain quantum computing in one sentence",
+ },
+ ];
+ const result = completion({ modelId, history, stream: true });
+ for await (const token of result.tokenStream) {
+ process.stdout.write(token);
+ }
+ // Unload model to free up system resources
+ await unloadModel({ modelId });
+}
+catch (error) {
+ console.error("❌ Error:", error);
+ process.exit(1);
}
```
-## Functionalities
-
-### AI tasks
+4. Run the quickstart script:
-- Completion: LLM inference via [`llama.cpp`](https://github.com/ggml-org/llama.cpp).
-- Transcription: speech-to-text (ASR) via [`whisper.cpp`](https://github.com/ggml-org/whisper.cpp).
-- Text embeddings: via `llama.cpp`, for RAG.
-- Translation: between different languages.
-- Text-to-Speech: TTS via ONNX.
-- Multimodal: via [`llama.cpp`], i.e., process and understand multiple types of media within the same conversation context.
-- RAG: retrieval-augmented generation with progress streaming, cancellation, and workspace management.
-- Delegated inference: perform peer-to-peer edge inference via Holepunch stack.
-
-### Utilities
-
-- Configuration: customize SDK behavior via config files (`qvac.config.json`, `.js`, or `.ts`).
-- Logging: visibility into what's happening inside your models during loading, inference, and other operations.
-- Download Lifecycle: pause and resume model downloads.
-- Blind Relays: establish peer connections through NAT/firewalls by routing traffic through relay nodes.
-- Sharded models: download a model that is sharded into multiple parts.
+```bash
+node quickstart.js
+```
## Examples
@@ -164,161 +91,27 @@ node dist/examples/path/to/example.js
bun run examples/path/to/example.ts
```
-### Completion
-
-- `llama.cpp` with local files: [`examples/llamacpp-filesystem.ts`](examples/llamacpp-filesystem.ts)
-- `llama.cpp` with P2P registry: [`examples/llamacpp-p2p.ts`](examples/llamacpp-p2p.ts)
-- `llama.cpp` with HTTP: [`examples/llamacpp-http.ts`](examples/llamacpp-http.ts)
-- `llama.cpp` with tools/function calls: [`examples/llamacpp-native-tools.ts`](examples/llamacpp-native-tools.ts)
-- `llama.cpp` with multimodal inference: [`examples/llamacpp-multimodal.ts`](examples/llamacpp-multimodal.ts)
-- `llama.cpp` with KV cache: [`examples/kv-cache-example.ts`](examples/kv-cache-example.ts)
-
-### Transcription
-
-- `whisper.cpp` transcription: [`examples/whispercpp-filesystem.ts`](examples/whispercpp-filesystem.ts)
-- Microphone recording: [`examples/whispercpp-microphone-record.ts`](examples/whispercpp-microphone-record.ts)
-
-### Embeddings
-
-- Single and batch embeddings: [`examples/embed-p2p.ts`](examples/embed-p2p.ts)
-
-**RAG with HyperDB** (cross-platform):
-
-- Ingest (full pipeline): [`examples/rag/rag-hyperdb/ingest.ts`](examples/rag/rag-hyperdb/ingest.ts)
-- Segregated pipeline: [`examples/rag/rag-hyperdb/pipeline.ts`](examples/rag/rag-hyperdb/pipeline.ts) _(Segregated flow: chunk → embed → save)_
-- Workspaces: [`examples/rag/rag-hyperdb/workspaces.ts`](examples/rag/rag-hyperdb/workspaces.ts) _(Workspace lifecycle: list, close, delete)_
-- Cancellation: [`examples/rag/rag-hyperdb/cancellation.ts`](examples/rag/rag-hyperdb/cancellation.ts) _(progress + cancel)_
-
-**RAG with other backends** (desktop only):
-
-- LanceDB: [`examples/rag/rag-lancedb.ts`](examples/rag/rag-lancedb.ts)
-- ChromaDB: [`examples/rag/rag-chromadb.ts`](examples/rag/rag-chromadb.ts) _(requires ChromaDB server)_
-- SQLite-Vector: [`examples/rag/rag-sqlite.ts`](examples/rag/rag-sqlite.ts) _(SQLite-Vector WASM)_
-
-### Translation
-
-- Marian OPUS translation: [`examples/translation/translation-opus.ts`](examples/translation/translation-opus.ts)
-- Indic language translation: [`examples/translation/translation-indic.ts`](examples/translation/translation-indic.ts)
-- LLM-based translation: [`examples/translation/translation-llm.ts`](examples/translation/translation-llm.ts)
-
-### Text-to-Speech
-
-- TTS (Chatterbox): [`examples/tts/chatterbox.ts`](examples/tts/chatterbox.ts) _(voice cloning with reference audio)_
-- TTS (Supertonic): [`examples/tts/supertonic.ts`](examples/tts/supertonic.ts) _(general-purpose, no voice cloning)_
-
-### Multimodel
-
-- Load multiple models simultaneously: [`examples/multi-model-demo.ts`](examples/multi-model-demo.ts)
-
-### Delegated inference
-
-- Provider: [`examples/delegated-inference/provider.ts`](examples/delegated-inference/provider.ts)
-- Consumer: [`examples/delegated-inference/consumer.ts`](examples/delegated-inference/consumer.ts)
-
-> [!TIP]
-> Set `QVAC_HYPERSWARM_SEED` env var to ensure that the provider uses the same keypair (i.e., public key doesn't change on every run).
-
-> [!NOTE]
-> Consumer does not handle reconnection yet.
-
-### Logging
-
-Stream real-time logs from the SDK server and native addons:
-
-```ts
-import { loggingStream, SDK_LOG_ID } from "@qvac/sdk";
+## Build
-// SDK server logs (general operations)
-for await (const log of loggingStream({ id: SDK_LOG_ID })) {
- console.log(`[${log.level}] ${log.namespace}: ${log.message}`);
-}
+Use the [Bun](https://bun.sh/) package manager:
-// Addon logs per model (llamacpp, whispercpp, etc.)
-for await (const log of loggingStream({ id: modelId })) {
- console.log(`[${log.level}] ${log.namespace}: ${log.message}`);
-}
+```bash
+bun i
```
-- Log streaming: [`examples/logging-streaming.ts`](examples/logging-streaming.ts)
-- Log with custom file transport: [`examples/logging-file-transport.ts`](examples/logging-file-transport.ts)
-
-### Configuration
-
-Customize SDK behavior using a config file. The SDK auto-discovers `qvac.config.{json,js,ts}` in your project root, or you can specify a path via `QVAC_CONFIG_PATH` environment variable.
-
-**Supported formats:**
-
-- `qvac.config.json` - JSON format
-- `qvac.config.js` - JavaScript with `export default`
-- `qvac.config.ts` - TypeScript with `export default`
-
-**Available options:**
-
-| Option | Type | Default | Description |
-| ------------------------- | ---------- | ---------------- | --------------------------------------------------- |
-| `cacheDirectory` | `string` | `~/.qvac/models` | Where models and assets are stored |
-| `swarmRelays` | `string[]` | `[]` | Hyperswarm relay public keys for P2P |
-| `loggerLevel` | `string` | `"info"` | Log level: `"error"`, `"warn"`, `"info"`, `"debug"` |
-| `loggerConsoleOutput` | `boolean` | `true` | Enable/disable console output |
-| `httpDownloadConcurrency` | `number` | `3` | Max concurrent HTTP downloads for sharded models |
-| `httpConnectionTimeoutMs` | `number` | `10000` | HTTP connection timeout in milliseconds |
-
-- Config usage example: [`examples/default-config-usage.ts`](examples/default-config-usage.ts)
-
-### Download Lifecycle
-
-- Pause and resume download: [`examples/download-with-cancel.ts`](examples/download-with-cancel.ts)
-
-### Blind Relays
-
-Blind relays help establish peer connections through NAT/firewalls by routing traffic through relay nodes.
-
-- Model downloads via Hyperdrive: [`examples/download-with-blind-relays.ts`](./examples/download-with-blind-relays.ts)
-- Delegated inference: You can reuse the same pattern for delegated inference by adding `swarmRelays` to your config file before starting your provider/consumer.
-
-> [!NOTE]
-> The examples use mock relay keys. For real deployments, you **must** use your own relay servers or trusted public relays.
-
-### Sharded Models
-
-Sharded models are split into multiple files following the pattern: `-00001-of-0000X.`. The SDK automatically downloads and loads all parts with detailed progress tracking.
-
-**Supported formats:**
-
-- Archives (`.tar`, `.tar.gz`, `.tgz`): HTTP or local with automatic extraction
-- HTTP sharded URL: pass the download URL of any shard and the SDK will fetch the remaining shards
-- Hyperdrive: use any sharded Hyperdrive model source
-- Local shards: pass the path to any shard file. _(Note: All shards must be in the same directory)_
-
-See: [`examples/llamacpp-sharded.ts`](examples/llamacpp-sharded.ts)
-
-## Basic flow
-
-```mermaid
-sequenceDiagram
- participant User as User
- participant SDK as QVAC SDK
- participant RPC as RPC Client (singleton)
- participant Worker as Bare Worker (singleton)
-
- User->>SDK: Call loadModel("llama-3", options)
- SDK->>RPC: Create runtime-specific RPC client
- RPC->>Worker: Spawn worker (if needed) and send loadModel request
- Worker->>Worker: Download and load model into memory
- Worker->>RPC: Return success response
- RPC->>SDK: Model loaded successfully
- SDK->>User: loadModel() resolves with modelId
+```bash
+bun run build # or `watch` for hot reload
```
-**Note**: The example uses mock relay keys. In real deployments, you **must** use your own relay servers or trusted public relays.
-
-### Hot Config Reload
-
-Hot config reload allows you to update model configurations on-the-fly without unloading the model. Pass `modelId` (instead of `modelSrc`) to `loadModel` with the `modelType` and new `modelConfig` to apply changes instantly.
+```bash
+bun run build:pack
+```
-- Config reload using whisper: [`examples/config-reload.ts`](examples/config-reload.ts)
+This outputs a tarball under `dist/sdk-{version}.tgz` that you can install in your project, e.g.:
-**Note**: Config reload is currently supported for Whisper models. All config parameters except `contextParams` (GPU settings, flash attention) can be hot reloaded. `contextParams` are load-time only and require full model reload. More model types coming soon.
+```bash
+npm i path/to/sdk-0.3.0.tgz
+```
## Contributing
@@ -459,32 +252,3 @@ This will:
4. Generate `changelog//CHANGELOG.md`
5. Generate `changelog//breaking.md` for BC changes (with code examples)
6. Generate `changelog//api.md` for API changes (with code examples)
-
-**Note:** Requires a GitHub token (`GITHUB_TOKEN` or `GH_TOKEN` environment variable) to fetch PR metadata.
-
-## Build
-
-Use the [Bun](https://bun.sh/) package manager:
-
-```bash
-bun i
-```
-
-```bash
-bun run build # or `watch` for hotreload
-```
-
-```bash
-bun run build:pack
-```
-
-This outputs a tarball under `dist/sdk-{version}.tgz` that you can install in your project, e.g.:
-
-```bash
-npm i path/to/sdk-0.3.0.tgz
-```
-
-## More Resources
-
-- [Comprehensive documentation of this SDK](https://qvac.tether.dev/docs/sdk)
-- [Package at NPM](https://www.npmjs.com/package/@qvac/sdk)
diff --git a/packages/sdk/changelog/0.8.2/CHANGELOG.md b/packages/sdk/changelog/0.8.2/CHANGELOG.md
new file mode 100644
index 0000000000..e058add43b
--- /dev/null
+++ b/packages/sdk/changelog/0.8.2/CHANGELOG.md
@@ -0,0 +1,11 @@
+# Changelog v0.8.2
+
+Release Date: 2026-04-09
+
+## Docs
+
+- Rewrite SDK README with streamlined quickstart and updated documentation links.
+
+## ⚙️ Infrastructure
+
+- Freeze SDK dependency installs in publish and SDK pod CI checks.
diff --git a/packages/sdk/changelog/0.8.2/CHANGELOG_LLM.md b/packages/sdk/changelog/0.8.2/CHANGELOG_LLM.md
new file mode 100644
index 0000000000..946d4a582f
--- /dev/null
+++ b/packages/sdk/changelog/0.8.2/CHANGELOG_LLM.md
@@ -0,0 +1,26 @@
+# QVAC SDK v0.8.2 Release Notes
+
+📦 **NPM:** https://www.npmjs.com/package/@qvac/sdk/v/0.8.2
+
+This is a maintenance release that refreshes the SDK README with a streamlined quickstart guide and updated documentation links pointing to the new docs site at docs.qvac.tether.io.
+
+---
+
+## Documentation
+
+### README Rewrite
+
+The SDK README has been rewritten to provide a cleaner onboarding experience. The verbose installation, usage, and feature sections have been replaced with a concise quickstart that gets users running in four steps, and all documentation links now point to the new docs site.
+
+Key changes:
+
+- **Simplified quickstart**: A minimal four-step guide (create workspace, install, write script, run) replaces the previous multi-section setup
+- **Updated links**: Documentation URLs now point to `docs.qvac.tether.io` instead of `qvac.tether.dev`
+- **Support channel**: The support link now points to the Discord channel instead of FeatureBase
+- **Leaner content**: Detailed platform instructions (Expo, Linux), feature lists, and example indexes have been moved to the docs site to keep the README focused
+
+---
+
+## ⚙️ Infrastructure
+
+- SDK dependency installs in CI publish and pod check workflows are now frozen to prevent unexpected version drift during builds.
diff --git a/packages/sdk/package.json b/packages/sdk/package.json
index a7f1b9e5a1..3af9ddd202 100644
--- a/packages/sdk/package.json
+++ b/packages/sdk/package.json
@@ -1,6 +1,6 @@
{
"name": "@qvac/sdk",
- "version": "0.8.1",
+ "version": "0.8.2",
"license": "Apache-2.0",
"repository": {
"type": "git",