Merged
1 change: 1 addition & 0 deletions core/schemas/bifrost.go
@@ -223,6 +223,7 @@ const (
BifrostContextKeyHTTPRequestType BifrostContextKey = "bifrost-http-request-type" // RequestType (set by bifrost - DO NOT SET THIS MANUALLY))
BifrostContextKeyPassthroughExtraParams BifrostContextKey = "bifrost-passthrough-extra-params" // bool
BifrostContextKeyRoutingEnginesUsed BifrostContextKey = "bifrost-routing-engines-used" // []string (set by bifrost - DO NOT SET THIS MANUALLY) - list of routing engines used ("routing-rule", "governance", "loadbalancing", etc.)
BifrostContextKeyPromptStreamRequest BifrostContextKey = "bifrost-prompt-stream-request" // bool (set by prompts HTTP plugin when prompt version model_params.stream is true and body omitted stream)
BifrostContextKeyRoutingEngineLogs BifrostContextKey = "bifrost-routing-engine-logs" // []RoutingEngineLogEntry (set by bifrost - DO NOT SET THIS MANUALLY) - list of routing engine log entries
BifrostContextKeyTransportPluginLogs BifrostContextKey = "bifrost-transport-plugin-logs" // []PluginLogEntry (transport-layer plugin logs accumulated during HTTP transport hooks)
BifrostContextKeyTransportPostHookCompleter BifrostContextKey = "bifrost-transport-posthook-completer" // func() (callback to run HTTPTransportPostHook after streaming - set by transport interceptor middleware)
3 changes: 2 additions & 1 deletion docs/docs.json
@@ -215,7 +215,8 @@
"group": "Prompt Repository",
"icon": "folder",
"pages": [
"features/prompt-repository/playground"
"features/prompt-repository/playground",
"features/prompt-repository/prompts-plugin"
]
},
{
6 changes: 6 additions & 0 deletions docs/features/prompt-repository/playground.mdx
@@ -229,3 +229,9 @@ With sessions you can:
- Switch between past experiments

![Sessions](../../media/prompt-repo-sessions.png)

---

## Using prompts in production

To attach committed versions to **Chat Completions** or **Responses** requests through the gateway (HTTP headers, merging, and caching behavior), see the [Prompts plugin](/features/prompt-repository/prompts-plugin).
134 changes: 134 additions & 0 deletions docs/features/prompt-repository/prompts-plugin.mdx
@@ -0,0 +1,134 @@
---
title: "Prompts plugin"
description: "Use committed prompt templates from the Prompt Repository on inference requests via HTTP headers or custom resolvers."
icon: "puzzle-piece"
---

## Overview

The **Prompts** plugin connects the [Prompt Repository](/features/prompt-repository/playground) to inference. It loads committed prompt versions from the config store and **prepends** their messages to **Chat Completions** and **Responses** requests. It also **merges model parameters** from the stored version with the incoming request (request values take precedence).

**What it does:**

- Resolves which prompt and version to apply per request (default: HTTP headers).
- Injects the version’s message history **before** the client’s messages.
- Applies the version’s `model` parameters as defaults, then overrides with whatever the client sent for the same parameters.

---

## Prerequisites

- **Config store** with Prompt Repository tables (typically **PostgreSQL**). File-backed config alone does not store prompts.
- Prompts authored and **committed as versions** in the UI or via the `/api/prompt-repo/...` HTTP API (see `docs/openapi/openapi.yaml` in the repository).
- A **prompt ID** (UUID) for each prompt you reference at runtime. You can read it from the repository API or the playground.

---

## How it works

```mermaid
flowchart TB
Client([Client]) --> Gateway[Bifrost HTTP]
Gateway --> PreHook["HTTP transport pre-hook:<br/>copy bf-prompt-id / bf-prompt-version to context"]
PreHook --> PreLLM["PreLLM hook:<br/>resolve version, merge params,<br/>prepend template messages"]
PreLLM --> Provider[Provider]
```

1. **Transport (HTTP):** Incoming headers `bf-prompt-id` and `bf-prompt-version` are copied onto the Bifrost context (header name matching is case-insensitive).
2. **Resolve:** The plugin looks up the prompt and the requested version. If **`bf-prompt-version` is omitted**, the prompt’s **latest committed version** is used.
3. **Parameters:** Version `model` parameters are merged into the request; any field already set on the request wins.
4. **Messages:** Messages from the committed version are **prepended** to `messages` (chat) or `input` (responses). Your request body adds the user turn(s) after the template.

If the prompt ID is missing, the plugin does nothing and the request passes through unchanged.
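
The merge and prepend steps above can be sketched in a few lines of Go. This is a minimal illustration, not the plugin's implementation: the `Message` type and the `mergeParams` and `prependMessages` helpers are hypothetical names, and parameters are modelled as a plain map for brevity.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for a chat message; the plugin's real types differ.
type Message struct {
	Role    string
	Content string
}

// mergeParams overlays the request's parameters on the version's defaults.
// Any key present on the request wins over the stored value.
func mergeParams(version, request map[string]any) map[string]any {
	merged := make(map[string]any, len(version)+len(request))
	for k, v := range version {
		merged[k] = v
	}
	for k, v := range request {
		merged[k] = v
	}
	return merged
}

// prependMessages places the template messages before the client's messages
// without mutating either input slice.
func prependMessages(template, client []Message) []Message {
	out := make([]Message, 0, len(template)+len(client))
	out = append(out, template...)
	return append(out, client...)
}

func main() {
	versionParams := map[string]any{"temperature": 0.2, "stream": true}
	requestParams := map[string]any{"temperature": 0.9}
	fmt.Println(mergeParams(versionParams, requestParams))

	template := []Message{{Role: "system", Content: "You are a helpful assistant."}}
	client := []Message{{Role: "user", Content: "Tell me about Bifrost Gateway."}}
	for _, m := range prependMessages(template, client) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

In this sketch the request's `temperature` replaces the stored one, while `stream` survives from the version because the request did not set it.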

---

## HTTP headers (gateway)

| Header | Required | Description |
|--------|----------|-------------|
| `bf-prompt-id` | Yes, to enable injection | UUID of the prompt in the repository. |
| `bf-prompt-version` | No | **Integer version number** (e.g. `3` for v3). If omitted, the **latest** committed version for that prompt is used. |

Invalid or unknown IDs / versions are logged as warnings; the request is **not** failed by the plugin (it proceeds without template injection).
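
The header rules in the table can be illustrated with the sketch below. It assumes the standard library's `net/http` header type purely for demonstration (the gateway's transport may use a different HTTP stack), and `promptSelection` and `selectPrompt` are hypothetical names. A missing ID or a non-integer version yields "skip injection" rather than an error, mirroring the behavior described above.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// promptSelection is a hypothetical result type for this sketch.
type promptSelection struct {
	ID         string
	Version    int  // meaningful only when HasVersion is true
	HasVersion bool // false means "use the latest committed version"
}

// selectPrompt reads bf-prompt-id and bf-prompt-version from the headers.
// The boolean is false when template injection should be skipped.
func selectPrompt(h http.Header) (promptSelection, bool) {
	// http.Header.Get canonicalizes the key, so lookups are case-insensitive.
	id := strings.TrimSpace(h.Get("bf-prompt-id"))
	if id == "" {
		return promptSelection{}, false
	}
	sel := promptSelection{ID: id}

	raw := strings.TrimSpace(h.Get("bf-prompt-version"))
	if raw == "" {
		return sel, true // no version header: fall back to the latest version
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		// A real plugin would log a warning here; the request still proceeds.
		return promptSelection{}, false
	}
	sel.Version, sel.HasVersion = n, true
	return sel, true
}

func main() {
	h := http.Header{}
	h.Set("BF-Prompt-ID", "YOUR-PROMPT-UUID")
	h.Set("bf-prompt-version", "3")
	sel, ok := selectPrompt(h)
	fmt.Println(sel, ok)
}
```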

---

## Example: Chat Completions

Use the same JSON body as a normal chat request. Only the headers select the template.

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "bf-prompt-id: YOUR-PROMPT-UUID" \
-H "x-bf-vk: sk-bf-your-virtual-key" \
-d '{
"model": "openai/gpt-5.4",
"messages": [
{
"role": "user",
"content": "Tell me about Bifrost Gateway."
}
]
}'
```

![Commit Version with Stream enabled in the playground](../../media/prompt-plugin-version-commit.png)

When you commit a version from the playground, the **Stream** setting is saved in that version’s model parameters. The example `curl` above does not set `"stream": true` in the JSON body. If the committed version was saved with streaming enabled (as in the screenshot), the merged parameters still include `stream: true`, so the request is handled as **streaming**.

![LLM log for the same request showing Type: Chat Stream](../../media/prompt-plugin-llm-log.png)

In **Logs**, that run shows **Type: Chat Stream** and the full conversation: the committed **system** template, your **user** message from the request body, and the assistant reply.

Before the request reaches the provider, the plugin checks whether it is streaming or non-streaming, merges the version’s model parameters with those sent on the request (request values win), and prepends the stored messages. The provider therefore receives the messages from the prompt version followed by your user message.

---

## Example: Responses API

```bash
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-H "bf-prompt-id: YOUR-PROMPT-UUID" \
-H "bf-prompt-version: 4" \
-H "x-bf-vk: sk-bf-your-virtual-key" \
-d '{
"model": "openai/gpt-5-nano-2025-08-07",
"input": "What is Pale Blue Dot?"
}'
```

---

## Streaming

If the committed version’s **model parameters** include `"stream": true` and the request body omits `stream`, the plugin marks the request as streaming on the HTTP transport so that behavior matches the saved version. A `stream` value sent by the client is merged like any other parameter: the request value takes precedence.
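
One way to picture the streaming decision is the sketch below. It is an illustration under assumptions, not the plugin's code: `effectiveStream` is a hypothetical helper, and it assumes that an explicit `stream` value in the body wins over the stored one, in line with the general "request values take precedence" rule. A pointer field distinguishes an omitted `stream` from an explicit `false`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// effectiveStream reports whether a request should be handled as streaming.
// body is the raw JSON request; versionStream is the stream flag stored in
// the committed version's model parameters.
func effectiveStream(body []byte, versionStream bool) (bool, error) {
	var req struct {
		// A nil pointer means the client did not send "stream" at all.
		Stream *bool `json:"stream"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return false, err
	}
	if req.Stream != nil {
		return *req.Stream, nil // explicit client value wins (assumed)
	}
	return versionStream, nil // fall back to the committed version
}

func main() {
	bodies := []string{
		`{"model":"m"}`,
		`{"model":"m","stream":false}`,
	}
	for _, body := range bodies {
		stream, err := effectiveStream([]byte(body), true)
		fmt.Println(stream, err)
	}
}
```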

---

## Cache and updates

The plugin keeps an in-memory cache of prompts and versions (loaded with a small number of store queries at startup). When you create, update, or delete prompts or versions through the **gateway APIs**, the server **reloads** that cache so new commits are visible without a full process restart.
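
The reload-on-write pattern described here could look roughly like the following. This is a sketch under assumptions: `Version` and `Cache` are hypothetical types, version numbers are assumed to be positive, and passing `0` to `Get` is this sketch's own convention for "latest". The point is that a reload builds fresh indexes from a full snapshot and swaps them in under a write lock, so readers never see a half-built cache.

```go
package main

import (
	"fmt"
	"sync"
)

// Version is a hypothetical, trimmed-down prompt version record.
type Version struct {
	PromptID string
	Number   int
}

// Cache indexes versions by prompt ID and version number.
type Cache struct {
	mu       sync.RWMutex
	byPrompt map[string]map[int]Version
	latest   map[string]int // highest version number seen per prompt
}

// Reload rebuilds both indexes from a full snapshot, then swaps them in.
func (c *Cache) Reload(all []Version) {
	byPrompt := make(map[string]map[int]Version)
	latest := make(map[string]int)
	for _, v := range all {
		if byPrompt[v.PromptID] == nil {
			byPrompt[v.PromptID] = make(map[int]Version)
		}
		byPrompt[v.PromptID][v.Number] = v
		if v.Number > latest[v.PromptID] {
			latest[v.PromptID] = v.Number
		}
	}
	c.mu.Lock()
	c.byPrompt, c.latest = byPrompt, latest
	c.mu.Unlock()
}

// Get returns the requested version, or the latest one when number is 0.
func (c *Cache) Get(promptID string, number int) (Version, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if number == 0 {
		number = c.latest[promptID]
	}
	v, ok := c.byPrompt[promptID][number]
	return v, ok
}

func main() {
	c := &Cache{}
	c.Reload([]Version{{PromptID: "p1", Number: 1}, {PromptID: "p1", Number: 2}})
	fmt.Println(c.Get("p1", 0))
}
```

Calling `Reload` again after a create, update, or delete makes the new state visible to subsequent `Get` calls without restarting the process.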

---

## Go SDK and custom resolution

For embedded Bifrost (Go SDK), register the plugin with `prompts.Init` and a **config store** that implements the prompt tables API. The default resolver reads the same logical keys from `BifrostContext`:

- `prompts.PromptIDKey` (`bf-prompt-id`)
- `prompts.PromptVersionKey` (`bf-prompt-version`)

Set them on the context you pass to `ChatCompletion` / `Responses` if you are not going through the HTTP transport hooks.

For advanced routing (for example, choosing a prompt from governance metadata), implement the `prompts.PromptResolver` interface (defined in `plugins/prompts/main.go`) and register the plugin with **`prompts.InitWithResolver`**.

---

## Related

- [Playground](/features/prompt-repository/playground) — create folders, prompts, sessions, and committed versions.
- [Writing Go plugins](/plugins/writing-go-plugin) — plugin interfaces and lifecycle.
- Built-in plugin name in code: `prompts` (`github.com/maximhq/bifrost/plugins/prompts`).
Binary file added docs/media/prompt-plugin-llm-log.png
Binary file added docs/media/prompt-plugin-version-commit.png
24 changes: 12 additions & 12 deletions framework/configstore/prompts.go
@@ -30,9 +30,6 @@ func (s *RDBConfigStore) GetFolders(ctx context.Context) ([]tables.TableFolder,
if err := s.db.WithContext(ctx).
Order("created_at DESC").
Find(&folders).Error; err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
return []tables.TableFolder{}, nil
}
return nil, err
}

@@ -147,9 +144,6 @@ func (s *RDBConfigStore) GetPrompts(ctx context.Context, folderID *string) ([]ta
}

if err := query.Find(&prompts).Error; err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
return []tables.TablePrompt{}, nil
}
return nil, err
}

@@ -261,6 +255,18 @@ func (s *RDBConfigStore) DeletePrompt(ctx context.Context, id string) error {
// Prompt Repository - Versions
// ============================================================================

// GetAllPromptVersions returns every version across all prompts in a single query.
func (s *RDBConfigStore) GetAllPromptVersions(ctx context.Context) ([]tables.TablePromptVersion, error) {
	var versions []tables.TablePromptVersion
	if err := s.db.WithContext(ctx).
		Preload("Messages", func(db *gorm.DB) *gorm.DB { return db.Order("order_index ASC") }).
		Order("prompt_id ASC, version_number DESC").
		Find(&versions).Error; err != nil {
		return nil, err
	}
	return versions, nil
}

// GetPromptVersions gets all versions for a prompt
func (s *RDBConfigStore) GetPromptVersions(ctx context.Context, promptID string) ([]tables.TablePromptVersion, error) {
var versions []tables.TablePromptVersion
@@ -269,9 +275,6 @@ func (s *RDBConfigStore) GetPromptVersions(ctx context.Context, promptID string)
Where("prompt_id = ?", promptID).
Order("version_number DESC").
Find(&versions).Error; err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
return []tables.TablePromptVersion{}, nil
}
return nil, err
}
return versions, nil
@@ -416,9 +419,6 @@ func (s *RDBConfigStore) GetPromptSessions(ctx context.Context, promptID string)
Where("prompt_id = ?", promptID).
Order("created_at DESC").
Find(&sessions).Error; err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
return []tables.TablePromptSession{}, nil
}
return nil, err
}
return sessions, nil
1 change: 1 addition & 0 deletions framework/configstore/store.go
@@ -349,6 +349,7 @@ type ConfigStore interface {
DeletePrompt(ctx context.Context, id string) error

// Prompt Repository - Versions
GetAllPromptVersions(ctx context.Context) ([]tables.TablePromptVersion, error)
GetPromptVersions(ctx context.Context, promptID string) ([]tables.TablePromptVersion, error)
GetPromptVersionByID(ctx context.Context, id uint) (*tables.TablePromptVersion, error)
GetLatestPromptVersion(ctx context.Context, promptID string) (*tables.TablePromptVersion, error)
79 changes: 79 additions & 0 deletions plugins/prompts/go.mod
@@ -0,0 +1,79 @@
module github.com/maximhq/bifrost/plugins/prompts

go 1.26.1

require (
github.com/maximhq/bifrost/core v1.4.13
github.com/maximhq/bifrost/framework v1.2.32
github.com/stretchr/testify v1.11.1
)

require (
cloud.google.com/go v0.123.0 // indirect
cloud.google.com/go/compute/metadata v0.9.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0 // indirect
github.com/andybalholm/brotli v1.2.0 // indirect
github.com/aws/aws-sdk-go-v2 v1.41.3 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.6 // indirect
github.com/aws/aws-sdk-go-v2/config v1.32.11 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.19.11 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.5 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.7 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.19 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.16 // indirect
github.com/aws/aws-sdk-go-v2/service/s3 v1.94.0 // indirect
github.com/aws/aws-sdk-go-v2/service/signin v1.0.7 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.30.12 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.16 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.41.8 // indirect
github.com/aws/smithy-go v1.24.2 // indirect
github.com/bahlo/generic-list-go v0.2.0 // indirect
github.com/buger/jsonparser v1.1.2 // indirect
github.com/bytedance/gopkg v0.1.3 // indirect
github.com/bytedance/sonic v1.15.0 // indirect
github.com/bytedance/sonic/loader v0.5.0 // indirect
github.com/cloudwego/base64x v0.1.6 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/golang-jwt/jwt/v5 v5.3.0 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/invopop/jsonschema v0.13.0 // indirect
github.com/jinzhu/inflection v1.0.0 // indirect
github.com/jinzhu/now v1.1.5 // indirect
github.com/klauspost/compress v1.18.2 // indirect
github.com/klauspost/cpuid/v2 v2.3.0 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/mailru/easyjson v0.9.1 // indirect
github.com/mark3labs/mcp-go v0.43.2 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/rs/zerolog v1.34.0 // indirect
github.com/spf13/cast v1.10.0 // indirect
github.com/tidwall/gjson v1.18.0 // indirect
github.com/tidwall/match v1.1.1 // indirect
github.com/tidwall/pretty v1.2.0 // indirect
github.com/tidwall/sjson v1.2.5 // indirect
github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
github.com/valyala/bytebufferpool v1.0.0 // indirect
github.com/valyala/fasthttp v1.68.0 // indirect
github.com/wk8/go-ordered-map/v2 v2.1.8 // indirect
github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
go.starlark.net v0.0.0-20260102030733-3fee463870c9 // indirect
golang.org/x/arch v0.23.0 // indirect
golang.org/x/crypto v0.49.0 // indirect
golang.org/x/net v0.52.0 // indirect
golang.org/x/oauth2 v0.35.0 // indirect
golang.org/x/sys v0.42.0 // indirect
golang.org/x/text v0.35.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
gorm.io/gorm v1.31.1 // indirect
)