Merged
6 changes: 1 addition & 5 deletions docs/website/content/docs/(latest)/sdk/api/close.mdx
Original file line number Diff line number Diff line change
@@ -10,13 +10,9 @@ function close(): Promise<void>;

Safe to call multiple times — subsequent calls are a no-op if already closed.

-## Bare direct runtime
-
-When the SDK runs in-process on Bare (not under Node.js with a separate worker process), calling `close()` runs the same teardown as a signal-driven shutdown and then **ends the process with exit code 0**. The returned promise may not settle in that case because the process exits immediately afterward. This matches the behavior users expect after the last model is unloaded (the SDK calls `close()` automatically when nothing remains loaded).
-
## Returns

-`Promise<void>` — Resolves when the connection is closed (Node.js and Expo). On Bare direct mode, the process usually exits before the promise resolves.
+`Promise<void>` — Resolves when the connection is closed.

## Example

@@ -140,6 +140,7 @@ Optional sampling and generation parameters (strict — no extra keys allowed):
| timeToFirstToken | `number` | Time to first token in milliseconds |
| tokensPerSecond | `number` | Tokens generated per second |
| cacheTokens | `number` | Number of cached tokens |
+| backendDevice | `"cpu" \| "gpu" \| undefined` | Compute backend used for inference |

### `ToolCallEvent`

@@ -0,0 +1,31 @@
---
title: "defineDuplexHandler( )"
titleStyle: code
description: Helper function to define a duplex (bidirectional streaming) handler with full type inference.
---

```ts
function defineDuplexHandler<TRequest extends ZodType, TResponse extends ZodType>(
  definition: DuplexPluginHandlerDefinition<TRequest, TResponse>
): PluginHandlerDefinition<TRequest, TResponse>;
```

## Parameters

| Name | Type | Required? | Description |
| --- | --- | :---: | --- |
| definition | [`DuplexPluginHandlerDefinition`](#duplexpluginhandlerdefinition) | ✓ | The duplex handler definition with schemas and handler function |

### `DuplexPluginHandlerDefinition`

| Field | Type | Required? | Description |
| --- | --- | :---: | --- |
| requestSchema | `ZodType` | ✓ | Zod schema for validating incoming requests |
| responseSchema | `ZodType` | ✓ | Zod schema for validating outgoing responses |
| streaming | `true` | ✓ | Must be `true` — duplex handlers are always streaming |
| duplex | `true` | ✓ | Must be `true` — marks this handler as bidirectional |
| handler | `(request, inputStream: AsyncIterable<Buffer>) => AsyncGenerator<response>` | ✓ | The handler function — receives a validated request and an input stream, yields validated response chunks |

## Returns

`PluginHandlerDefinition<TRequest, TResponse>` — The same definition object, with full type inference applied. This is an identity function used for type checking.
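
A minimal sketch of the handler shape described above. This diff does not show the SDK import path or the full SDK types, so the block defines its own hypothetical stand-ins: `SchemaLike` replaces a Zod schema, `DuplexDefinition` approximates `DuplexPluginHandlerDefinition`, `defineDuplexHandlerSketch` mimics the documented identity behavior, and `Uint8Array` stands in for `Buffer`. The names `byteCounter`, `demoInput`, and `runDemo` are illustrative only.

```typescript
// Hypothetical stand-ins: the real SDK types and Zod schemas are not shown in this diff.
interface SchemaLike<T> {
  parse(value: unknown): T;
}

interface DuplexDefinition<TReq, TRes> {
  requestSchema: SchemaLike<TReq>;
  responseSchema: SchemaLike<TRes>;
  streaming: true;
  duplex: true;
  // Uint8Array is used here in place of Buffer to keep the sketch runtime-neutral.
  handler: (request: TReq, inputStream: AsyncIterable<Uint8Array>) => AsyncGenerator<TRes>;
}

// Identity function: returns the same object, so its only job is type checking.
function defineDuplexHandlerSketch<TReq, TRes>(
  definition: DuplexDefinition<TReq, TRes>
): DuplexDefinition<TReq, TRes> {
  return definition;
}

type CountRequest = { label: string };
type CountResponse = { label: string; bytesSeen: number };

// A handler that reads the input stream and yields a running byte count per chunk.
const byteCounter = defineDuplexHandlerSketch<CountRequest, CountResponse>({
  requestSchema: { parse: (value) => value as CountRequest },
  responseSchema: { parse: (value) => value as CountResponse },
  streaming: true,
  duplex: true,
  async *handler(request, inputStream) {
    let bytesSeen = 0;
    for await (const chunk of inputStream) {
      bytesSeen += chunk.length;
      yield { label: request.label, bytesSeen };
    }
  },
});

// Illustrative input stream with two chunks (2 bytes, then 3 bytes).
async function* demoInput(): AsyncGenerator<Uint8Array> {
  yield new Uint8Array([1, 2]);
  yield new Uint8Array([3, 4, 5]);
}

// Drives the handler and collects every response chunk it yields.
async function runDemo(): Promise<CountResponse[]> {
  const chunks: CountResponse[] = [];
  for await (const chunk of byteCounter.handler({ label: "demo" }, demoInput())) {
    chunks.push(chunk);
  }
  return chunks;
}
```

In this sketch the handler yields one response per input chunk, but nothing in the documented signature requires a one-to-one pairing; a real handler could buffer input and yield at its own pace.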
107 changes: 107 additions & 0 deletions docs/website/content/docs/(latest)/sdk/api/diffusion.mdx
@@ -0,0 +1,107 @@
---
title: "diffusion( )"
titleStyle: code
description: Generates images using a loaded diffusion model.
---

```ts
function diffusion(params: DiffusionClientParams): {
  progressStream: AsyncGenerator<DiffusionProgressTick>;
  outputs: Promise<Uint8Array[]>;
  stats: Promise<DiffusionStats | undefined>;
};
```

## Parameters

| Name | Type | Required? | Description |
| --- | --- | :---: | --- |
| params | [`DiffusionClientParams`](#diffusionclientparams) | ✓ | The diffusion parameters |

### `DiffusionClientParams`

| Field | Type | Required? | Description |
| --- | --- | :---: | --- |
| modelId | `string` | ✓ | The identifier of the loaded diffusion model |
| prompt | `string` | ✓ | Text prompt describing the image to generate |
| negative_prompt | `string` | ✗ | Text describing what to avoid in the generated image |
| width | `number` | ✗ | Image width in pixels (must be a multiple of 8) |
| height | `number` | ✗ | Image height in pixels (must be a multiple of 8) |
| steps | `number` | ✗ | Number of diffusion steps |
| cfg_scale | `number` | ✗ | Classifier-free guidance scale for SD 1.x / 2.x / XL / SD3 models (typical range 1–20, default 7) |
| guidance | `number` | ✗ | Distilled guidance for FLUX models (typical range 1–10, default 3.5) |
| sampling_method | [`SamplingMethod`](#samplingmethod) | ✗ | Sampling algorithm |
| scheduler | [`Scheduler`](#scheduler) | ✗ | Noise scheduler |
| seed | `number` | ✗ | Random seed for reproducibility |
| batch_count | `number` | ✗ | Number of images to generate |
| vae_tiling | `boolean` | ✗ | Enable VAE tiling for large images on limited VRAM |
| cache_preset | `string` | ✗ | Cache preset identifier |

#### `SamplingMethod`

`"euler" | "euler_a" | "heun" | "dpm2" | "dpm++2m" | "dpm++2mv2" | "dpm++2s_a" | "lcm" | "ipndm" | "ipndm_v" | "ddim_trailing" | "tcd" | "res_multistep" | "res_2s"`

#### `Scheduler`

`"discrete" | "karras" | "exponential" | "ays" | "gits" | "sgm_uniform" | "simple" | "lcm" | "smoothstep" | "kl_optimal" | "bong_tangent"`

## Returns

`object` — Object with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| progressStream | `AsyncGenerator<`[`DiffusionProgressTick`](#diffusionprogresstick)`>` | Stream of generation progress ticks |
| outputs | `Promise<Uint8Array[]>` | Generated image buffers (resolves when generation completes) |
| stats | `Promise<`[`DiffusionStats`](#diffusionstats) `\| undefined>` | Performance statistics |

### `DiffusionProgressTick`

| Field | Type | Description |
| --- | --- | --- |
| step | `number` | Current diffusion step |
| totalSteps | `number` | Total number of steps |
| elapsedMs | `number` | Elapsed time in milliseconds |

### `DiffusionStats`

| Field | Type | Description |
| --- | --- | --- |
| modelLoadMs | `number \| undefined` | Model loading time in milliseconds |
| generationMs | `number \| undefined` | Single generation time in milliseconds |
| totalGenerationMs | `number \| undefined` | Total generation time in milliseconds |
| totalWallMs | `number \| undefined` | Total wall-clock time in milliseconds |
| totalSteps | `number \| undefined` | Total diffusion steps performed |
| totalGenerations | `number \| undefined` | Number of generations completed |
| totalImages | `number \| undefined` | Number of images produced |
| totalPixels | `number \| undefined` | Total pixels generated |
| width | `number \| undefined` | Output image width |
| height | `number \| undefined` | Output image height |
| seed | `number \| undefined` | Seed used for generation |

## Example

```typescript
import fs from "fs";

// Basic usage
const { outputs, stats } = diffusion({ modelId, prompt: "a cat" });
const buffers = await outputs;
fs.writeFileSync("output.png", buffers[0]);

// With progress tracking
const { progressStream, outputs: images } = diffusion({
  modelId,
  prompt: "a cat sitting on a windowsill",
  width: 512,
  height: 512,
  steps: 20,
  cfg_scale: 7,
});

for await (const { step, totalSteps } of progressStream) {
console.log(`${step}/${totalSteps}`);
}

const imageBuffers = await images;
```
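
Because `width` and `height` must be multiples of 8, it can help to normalize user-supplied dimensions before building the params object. The helper below is a hypothetical sketch, not part of the SDK: `snapToMultipleOf8` is an assumed name, and rounding down (rather than rejecting the value) is one possible policy among several.

```typescript
// Hypothetical helper, not part of the SDK: rounds a pixel dimension down
// to the nearest multiple of 8 so it satisfies the documented constraint.
function snapToMultipleOf8(pixels: number): number {
  if (!Number.isFinite(pixels) || pixels < 8) {
    throw new RangeError(`dimension must be a finite number >= 8, got ${pixels}`);
  }
  // Rounding down keeps the result within the requested size.
  return Math.floor(pixels / 8) * 8;
}

// Example: a requested 500x300 canvas becomes 496x296.
const requested = { width: 500, height: 300 };
const dimensions = {
  width: snapToMultipleOf8(requested.width),
  height: snapToMultipleOf8(requested.height),
};
```

The resulting `dimensions` object could then be spread into the params passed to `diffusion()` alongside `modelId` and `prompt`.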
35 changes: 26 additions & 9 deletions docs/website/content/docs/(latest)/sdk/api/embed.mdx
@@ -5,16 +5,16 @@ description: Generates embeddings for a single text using a specified model.
---

```ts
-function embed(params: { modelId: string; text: string }, options?: RPCOptions): Promise<number[]>;
-function embed(params: { modelId: string; text: string[] }, options?: RPCOptions): Promise<number[][]>;
+function embed(params: { modelId: string; text: string }, options?: RPCOptions): Promise<{ embedding: number[]; stats?: EmbedStats }>;
+function embed(params: { modelId: string; text: string[] }, options?: RPCOptions): Promise<{ embedding: number[][]; stats?: EmbedStats }>;
```

## Parameters

| Name | Type | Required? | Description |
| --- | --- | :---: | --- |
| params | [`EmbedParams`](#embedparams) | ✓ | The embedding parameters |
-| options | [`RPCOptions`](../shared-types/#rpcoptions) | ✗ | Optional RPC transport options |
+| options | [`RPCOptions`](./shared-types#rpcoptions) | ✗ | Optional RPC transport options |

### `EmbedParams`

@@ -25,8 +25,21 @@ function embed(params: { modelId: string; text: string[] }, options?: RPCOptions

## Returns

-- `Promise<number[]>` — When `text` is a single string, returns the embedding vector.
-- `Promise<number[][]>` — When `text` is an array, returns an array of embedding vectors.
+`Promise<object>` — Resolves to an object with the following fields:
+
+| Field | Type | Description |
+| --- | --- | --- |
+| embedding | `number[] \| number[][]` | The embedding vector(s). Single `number[]` when `text` is a string; `number[][]` when `text` is an array. |
+| stats | [`EmbedStats`](#embedstats) ` \| undefined` | Performance statistics |
+
+### `EmbedStats`
+
+| Field | Type | Description |
+| --- | --- | --- |
+| totalTime | `number \| undefined` | Total embedding time in milliseconds |
+| tokensPerSecond | `number \| undefined` | Tokens processed per second |
+| totalTokens | `number \| undefined` | Total tokens processed |
+| backendDevice | `"cpu" \| "gpu" \| undefined` | Compute backend used for inference |

## Throws

@@ -38,13 +51,17 @@ function embed(params: { modelId: string; text: string[] }, options?: RPCOptions

```typescript
// Single text
-const vector = await embed({ modelId: "embedding-model", text: "Hello world" });
-console.log(vector.length); // e.g. 384
+const { embedding, stats } = await embed({
+  modelId: "embedding-model",
+  text: "Hello world",
+});
+console.log(embedding.length); // e.g. 384
+console.log(stats?.tokensPerSecond);

// Multiple texts (batch)
-const vectors = await embed({
+const { embedding: vectors } = await embed({
  modelId: "embedding-model",
-  text: ["Hello world", "How are you?"]
+  text: ["Hello world", "How are you?"],
});
console.log(vectors.length); // 2
```
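
Since `embedding` is `number[]` for a single string and `number[][]` for an array, code that accepts either form may want to normalize the result first. The helper below is a hypothetical sketch, not part of the SDK: `toVectorList` and `EmbeddingResult` are assumed names, and treating an empty array as "no vectors" is an assumption, because the two shapes cannot be told apart when empty.

```typescript
// Hypothetical alias for the two documented shapes of the `embedding` field.
type EmbeddingResult = number[] | number[][];

// Hypothetical helper, not part of the SDK: always returns a list of vectors,
// wrapping a single vector in an array when needed.
function toVectorList(embedding: EmbeddingResult): number[][] {
  if (embedding.length === 0) {
    // Assumption: an empty result is treated as an empty list of vectors.
    return [];
  }
  return Array.isArray(embedding[0])
    ? (embedding as number[][])
    : [embedding as number[]];
}

// Single-text shape: one vector becomes a list holding that vector.
const single = toVectorList([0.1, 0.2, 0.3]);

// Batch shape: already a list of vectors, returned unchanged.
const batch = toVectorList([
  [0.1, 0.2],
  [0.3, 0.4],
]);
```

With the overloads shown in the signature, TypeScript already narrows the type at each call site; a helper like this is mainly useful where the result has been widened to the union type.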
6 changes: 4 additions & 2 deletions docs/website/content/docs/(latest)/sdk/api/errors.mdx
@@ -23,9 +23,9 @@ Thrown on the client side (response validation, RPC, provider). Access via `SDK_

| Error | Code | Summary | Thrown by |
| --- | --- | --- | --- |
-| `INVALID_RESPONSE_TYPE` | 50001 | Invalid response type received. | `cancel()`, `downloadAsset()`, `embed()`, `getModelInfo()`, `loadModel()`, `loggingStream()`, `ping()`, `ragDeleteEmbeddings()`, `ragSaveEmbeddings()`, `ragSearch()`, `startQVACProvider()`, `stopQVACProvider()`, `unloadModel()` |
+| `INVALID_RESPONSE_TYPE` | 50001 | Invalid response type received. | `cancel()`, `downloadAsset()`, `embed()`, `finetune()`, `getModelInfo()`, `heartbeat()`, `loadModel()`, `loggingStream()`, `ragDeleteEmbeddings()`, `ragSaveEmbeddings()`, `ragSearch()`, `resume()`, `startQVACProvider()`, `stopQVACProvider()`, `suspend()`, `unloadModel()` |
| `INVALID_OPERATION_IN_RESPONSE` | 50002 | Response operation didn't match the expected RAG operation. | `ragDeleteEmbeddings()`, `ragSaveEmbeddings()`, `ragSearch()` |
-| `STREAM_ENDED_WITHOUT_RESPONSE` | 50003 | Streaming RPC ended without a final response. | `downloadAsset()`, `loadModel()`, `ragDeleteEmbeddings()`, `ragSaveEmbeddings()`, `ragSearch()` |
+| `STREAM_ENDED_WITHOUT_RESPONSE` | 50003 | Streaming RPC ended without a final response. | `downloadAsset()`, `finetune()`, `loadModel()`, `ragDeleteEmbeddings()`, `ragSaveEmbeddings()`, `ragSearch()` |
| `INVALID_AUDIO_CHUNK_TYPE` | 50004 | Invalid audio chunk input type provided. | `transcribe()`, `transcribeStream()` |
| `INVALID_TOOLS_ARRAY` | 50005 | Invalid tools array provided. | `completion()` |
| `INVALID_TOOL_SCHEMA` | 50006 | Invalid tool schema provided. | `completion()` |
@@ -126,6 +126,8 @@ Thrown by the server (model operations, downloads, cache, RAG). Access via `SDK_
| `DELEGATE_PROVIDER_ERROR` | 53702 | Delegated provider returned an error. | `completion()`, `loadModel()` |
| `RPC_NO_DATA_RECEIVED` | 53703 | No data received from request. | Internal server RPC |
| `RPC_UNKNOWN_REQUEST_TYPE` | 53704 | Unknown request type received. | Internal server RPC |
+| `LIFECYCLE_SUSPEND_FAILED` | 53600 | Failed to suspend one or more resources. | `suspend()` |
+| `LIFECYCLE_RESUME_FAILED` | 53601 | Failed to resume one or more resources. | `resume()` |
| `PLUGIN_NOT_FOUND` | 53850 | Plugin not found for the specified model type. | `invokePlugin()`, `invokePluginStream()`, `loadModel()` |
| `PLUGIN_HANDLER_NOT_FOUND` | 53851 | Handler not found in plugin. | `invokePlugin()`, `invokePluginStream()` |
| `PLUGIN_REQUEST_VALIDATION_FAILED` | 53852 | Plugin request validation failed. | `invokePlugin()`, `invokePluginStream()` |