
logging in plugins#2215

Merged
akshaydeo merged 1 commit into v1.5.0 from 03-23-logging_in_plugins
Mar 24, 2026

Conversation

@akshaydeo
Contributor

@akshaydeo akshaydeo commented Mar 23, 2026

Summary

Reorders middleware initialization in the Bifrost HTTP server to ensure tracing middleware is added before transport interceptor middleware in the inference pipeline.

Changes

  • Moved tracing middleware initialization and setup earlier in the bootstrap process
  • Reordered middleware registration so tracing middleware is prepended before transport interceptor middleware
  • Updated comments to clarify the middleware ordering logic and rationale

The change ensures that tracing context and trace IDs are properly established before other middleware components process requests.
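The ordering requirement can be illustrated with a minimal sketch of nested handler wrappers. The names here (`Handler`, `tracingMiddleware`, `transportInterceptor`) are illustrative stand-ins, not Bifrost's actual types; the point is only that the middleware registered first ends up outermost and therefore runs first:

```go
package main

import "fmt"

// Handler is a minimal stand-in for the server's request handler type.
type Handler func(req string)

// Middleware wraps a Handler; the outermost wrapper runs first.
type Middleware func(Handler) Handler

// tracingMiddleware records that tracing ran, standing in for
// establishing the trace context before inner handlers execute.
func tracingMiddleware(order *[]string) Middleware {
	return func(next Handler) Handler {
		return func(req string) {
			*order = append(*order, "tracing")
			next(req)
		}
	}
}

// transportInterceptor runs inside tracing, so by the time it executes
// the trace ID is already available on the request context.
func transportInterceptor(order *[]string) Middleware {
	return func(next Handler) Handler {
		return func(req string) {
			*order = append(*order, "interceptor")
			next(req)
		}
	}
}

func main() {
	var order []string
	base := Handler(func(req string) { order = append(order, "handler") })
	// Prepend tracing so it is outermost: tracing -> interceptor -> handler.
	h := tracingMiddleware(&order)(transportInterceptor(&order)(base))
	h("req-1")
	fmt.Println(order) // [tracing interceptor handler]
}
```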

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Verify that tracing middleware executes before transport interceptor middleware by checking trace logs and middleware execution order.

# Core/Transports
go version
go test ./...

Test with tracing enabled to ensure trace IDs are properly set in context before subsequent middleware processing.

Screenshots/Recordings

N/A

Breaking changes

  • Yes
  • No

Related issues

N/A

Security considerations

No security implications: this is a middleware-ordering change that affects only observability components.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 23, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Plugin execution logs are captured per request and shown in the request details UI via a new plugin-logs view.
    • Incoming x-request-id header handling to propagate request IDs into traces.
  • Improvements

    • Plugin logs are attached to traces and persisted for later inspection.
    • More reliable delivery and retry behavior for log storage and stream completion, including improved handling of streaming post-hooks and trace completion.
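The `x-request-id` handling above amounts to "honor the client's ID if present, otherwise generate one." A minimal sketch, assuming a plain header map and an injected ID generator (the real code reads fasthttp headers and passes the ID into `CreateTrace`):

```go
package main

import "fmt"

// extractRequestID returns the incoming x-request-id header value when
// present and non-empty, otherwise falls back to a freshly generated ID.
// newID is an illustrative stand-in for a UUID generator.
func extractRequestID(headers map[string]string, newID func() string) string {
	if id, ok := headers["x-request-id"]; ok && id != "" {
		return id
	}
	return newID()
}

func main() {
	gen := func() string { return "generated-id" }
	// Client supplied an ID: propagate it into the trace.
	fmt.Println(extractRequestID(map[string]string{"x-request-id": "abc-123"}, gen)) // abc-123
	// No header: mint a new one.
	fmt.Println(extractRequestID(map[string]string{}, gen)) // generated-id
}
```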

Walkthrough

Adds pooled plugin-scoped Bifrost contexts and per-plugin logging; drains and attaches plugin logs to traces across transport pre/post hooks and streaming chunks; persists grouped plugin logs to DB; surfaces plugin logs in the UI via a new PluginLogsView.
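The scope/log/drain cycle described above can be sketched with a simplified stand-in context. The method names (`WithPluginScope`, `Log`, `DrainPluginLogs`) come from the walkthrough, but this struct is a toy, not the pooled `BifrostContext` in `core/schemas/context.go`:

```go
package main

import (
	"fmt"
	"sync"
)

// PluginLogEntry mirrors the shape described in the PR (fields assumed).
type PluginLogEntry struct {
	Plugin  string
	Message string
}

// logCtx is a minimal stand-in for a BifrostContext with a shared log store.
type logCtx struct {
	mu    sync.Mutex
	scope string // current plugin scope
	logs  []PluginLogEntry
}

// WithPluginScope sets the plugin name attached to subsequent Log calls.
func (c *logCtx) WithPluginScope(name string) { c.scope = name }

// Log records a message under the current plugin scope.
func (c *logCtx) Log(msg string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.logs = append(c.logs, PluginLogEntry{Plugin: c.scope, Message: msg})
}

// DrainPluginLogs returns the accumulated logs and clears the store, so a
// pooled context never leaks entries into the next request.
func (c *logCtx) DrainPluginLogs() []PluginLogEntry {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := c.logs
	c.logs = nil
	return out
}

func main() {
	ctx := &logCtx{}
	ctx.WithPluginScope("hello-world")
	ctx.Log("pre-hook ran")
	drained := ctx.DrainPluginLogs() // attach these to the trace
	fmt.Println(len(drained), drained[0].Plugin)
	fmt.Println(len(ctx.DrainPluginLogs())) // empty after drain
}
```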

Changes

Cohort / File(s) Summary
Core: Plugin scoping & context
core/schemas/context.go, core/schemas/context_test.go
Add pooled per-plugin scoped BifrostContext (WithPluginScope/ReleasePluginScope), shared plugin log store, Log/GetPluginLogs/DrainPluginLogs, delegation to root for deadline/value/err operations, and tests for scoping/drain/pool reuse.
Core: Types & trace
core/schemas/bifrost.go, core/schemas/trace.go, core/schemas/tracer.go
Add PluginLogEntry and transport context keys; add GroupPluginLogsByName; extend Trace with RequestID and PluginLogs plus thread-safe accessors; expand Tracer interface: CreateTrace(...requestID) and AttachPluginLogs.
Core: Plugin pipeline & runtime
core/bifrost.go
Cache per-plugin scoped contexts for streaming, run pre/post LLM and MCP hooks with plugin-scoped contexts (cached in streaming, per-call non-streaming), add drainAndAttachPluginLogs, drain logs at short-circuits/final stream chunk/post-hook completion, and adjust pipeline release/finalization for streaming paths.
Transport: HTTP middleware & chunking
transports/bifrost-http/handlers/middlewares.go, transports/bifrost-http/lib/config.go, transports/bifrost-http/server/server.go
Propagate x-request-id into CreateTrace; run transport pre/post/stream hooks inside plugin-scoped contexts; capture/drain transport plugin logs onto request ctx; add runTransportPostHooks and deferred completer for streaming; reorder middleware to nest TransportInterceptor inside Tracing.
Logstore: schema, migrations, store
framework/logstore/tables.go, framework/logstore/migrations.go, framework/logstore/rdb.go, framework/logstore/store.go
Add plugin_logs DB column via migration; add PluginLogs string to persisted Log model; add IsLogEntryPresent to RDBLogStore and LogStore interface.
Tracing: store & impl
framework/tracing/store.go, framework/tracing/tracer.go
Make TraceStore.CreateTrace accept optional requestID ...string, add TraceStore.SetRequestID, and implement Tracer.AttachPluginLogs to append plugin logs to a trace.
Plugins: logging plugin changes
plugins/logging/main.go, plugins/logging/writer.go, plugins/logging/operations_test.go
Support multiple pending entries per trace, add Inject(ctx, trace) to assign grouped trace.PluginLogs to pending entries and enqueue them, add pendingInjectEntries wrapper, adjust cleanup and retry/update guard logic, and add tests.
Examples: hello-world plugin
examples/plugins/hello-world/main.go
Replace inline context-key literals with package constants and emit ctx.Log entries from transport/LLM hooks for example instrumentation.
UI: plugin logs rendering & types
ui/app/workspace/logs/views/pluginLogsView.tsx, ui/app/workspace/logs/sheets/logDetailsSheet.tsx, ui/lib/types/logs.ts
Add PluginLogsView React component to parse/render grouped plugin logs JSON; wire into LogDetail sheet; add PluginLogEntry TS interface and optional plugin_logs on LogEntry.
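The `GroupPluginLogsByName` helper listed in the core changes can be sketched roughly as follows. Field names are assumptions based on the summary (the real schema lives in `core/schemas/bifrost.go`); note this sketch returns a non-nil map even for empty input, which also sidesteps the JSON-`null` concern raised later in the review:

```go
package main

import "fmt"

// PluginLogEntry is an assumed shape for a single plugin log line.
type PluginLogEntry struct {
	Plugin  string
	Message string
}

// groupPluginLogsByName buckets log entries by plugin name, preserving
// per-plugin order. It returns an empty (non-nil) map for empty input so
// the JSON encoding is {} rather than null.
func groupPluginLogsByName(logs []PluginLogEntry) map[string][]PluginLogEntry {
	grouped := make(map[string][]PluginLogEntry, len(logs))
	for _, l := range logs {
		grouped[l.Plugin] = append(grouped[l.Plugin], l)
	}
	return grouped
}

func main() {
	logs := []PluginLogEntry{
		{Plugin: "logging", Message: "enqueued entry"},
		{Plugin: "hello-world", Message: "pre-hook ran"},
		{Plugin: "logging", Message: "flushed entry"},
	}
	g := groupPluginLogsByName(logs)
	fmt.Println(len(g["logging"]), len(g["hello-world"])) // 2 1
}
```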

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant HTTP as HTTPMiddleware
    participant Pipeline as PluginPipeline
    participant BCtx as BifrostCtx
    participant Tracer as Tracer
    participant DB as LogStore
    participant UI as UI

    Client->>HTTP: HTTP request (may include x-request-id)
    HTTP->>BCtx: create root BifrostContext (attach requestID)
    HTTP->>Pipeline: run transport pre-hooks
    loop per plugin pre-hook
        Pipeline->>BCtx: WithPluginScope(pluginName)
        Pipeline->>Pipeline: execute transport pre-hook (ctx.Log)
        Pipeline->>BCtx: ReleasePluginScope
        BCtx->>HTTP: DrainPluginLogs -> stash transport logs
    end
    HTTP->>Pipeline: handle request / stream chunks
    alt streaming
        loop per chunk
            Pipeline->>BCtx: reuse or WithPluginScope(pluginName)
            Pipeline->>Pipeline: execute chunk hook (ctx.Log)
            alt final chunk
                Pipeline->>Pipeline: FinalizeStreamingPostHookSpans
                BCtx->>Tracer: DrainPluginLogs -> AttachPluginLogs(traceID, logs)
            end
        end
    end
    HTTP->>Pipeline: run post-hooks (or deferred completer)
    Pipeline->>BCtx: DrainPluginLogs on post-hook completion
    HTTP->>Tracer: AttachPluginLogs(traceID, transportLogs)
    Tracer->>DB: persist trace + plugin logs
    UI->>UI: parse plugin_logs JSON and render PluginLogsView

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I scoped my hops for each plugin name,

I logged each nibble, never twice the same.
Drained and attached those tiny trails of light,
then tucked them safe into the trace at night.
A rabbit smiles — pipelines purr just right.

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

  • Description check — ⚠️ Warning: The PR description covers only the middleware reordering (a minor change touching 8 lines in server.go) and omits the substantial plugin-logging infrastructure added across 16+ files (~600+ lines), including context pooling, tracer modifications, database migrations, and UI components. Resolution: rewrite the description to cover the plugin logging system, context pooling, tracer interface updates, database schema changes, and UI enhancements, and de-emphasize the minor middleware reordering.
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 37.93%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Title check — ❓ Inconclusive: The title "logging in plugins" is vague and fails to capture the actual scope of the changes (plugin logging infrastructure, middleware reordering, tracer updates, and schema extensions). Resolution: use a more specific title, such as "Add plugin-scoped logging with context pooling and tracer integration" or "Implement plugin execution logging infrastructure with middleware ordering fixes".

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor Author

akshaydeo commented Mar 23, 2026

This stack of pull requests is managed by Graphite.

@akshaydeo akshaydeo marked this pull request as ready for review March 23, 2026 00:48
@github-actions
Contributor

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #2215

@akshaydeo akshaydeo force-pushed the 03-23-logging_in_plugins branch 3 times, most recently from 3d08dfe to 6f32327, March 23, 2026 01:10
Contributor Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai Bot commented Mar 23, 2026

✅ Actions performed

Full review triggered.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/bifrost.go (2)

5816-5831: 🛠️ Refactor suggestion | 🟠 Major

Reset the remaining pooled references before returning this object.

streamScopedCtxs is now cleaned up, but llmPlugins, mcpPlugins, logger, and tracer are still left populated when the pipeline goes back into pluginPipelinePool. That keeps old plugin/tracer instances pinned in pooled entries.

♻️ Suggested fix
 func (p *PluginPipeline) resetPluginPipeline() {
 	p.executedPreHooks = 0
 	p.preHookErrors = p.preHookErrors[:0]
 	p.postHookErrors = p.postHookErrors[:0]
 	// Reset streaming timing accumulation
 	p.chunkCount = 0
 	if p.postHookTimings != nil {
 		clear(p.postHookTimings)
 	}
 	p.postHookPluginOrder = p.postHookPluginOrder[:0]
 	// Release cached scoped contexts for streaming
 	for _, scopedCtx := range p.streamScopedCtxs {
 		scopedCtx.ReleasePluginScope()
 	}
 	p.streamScopedCtxs = nil
+	p.llmPlugins = nil
+	p.mcpPlugins = nil
+	p.logger = nil
+	p.tracer = nil
 }

As per coding guidelines, **/*.go: Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5816 - 5831, The resetPluginPipeline method
leaves several fields populated, causing pooled PluginPipeline instances to
retain references; update PluginPipeline.resetPluginPipeline to clear all
remaining pooled references by setting llmPlugins, mcpPlugins, logger, and
tracer to their zero values (nil) in addition to the existing cleanup of
streamScopedCtxs, pre/post hook slices, timings, and counts so the object
contains no stale plugin/tracer references before it is returned to the pool.

4509-4510: ⚠️ Potential issue | 🔴 Critical

Keep the pipeline alive until the short-circuit stream is finished.

This goroutine keeps calling pipeline.RunPostLLMHooks, but tryStreamRequest returns immediately and triggers the deferred releasePluginPipeline(pipeline). The same pooled PluginPipeline can then be reset/reused by another request while this stream is still running, which can corrupt post-hook execution and cross-wire plugin logs/traces between requests.

🐛 Suggested fix
-	pipeline := bifrost.getPluginPipeline()
-	defer bifrost.releasePluginPipeline(pipeline)
+	pipeline := bifrost.getPluginPipeline()
+	releasePipeline := true
+	defer func() {
+		if releasePipeline {
+			bifrost.releasePluginPipeline(pipeline)
+		}
+	}()
 		if shortCircuit.Stream != nil {
 			outputStream := make(chan *schemas.BifrostStreamChunk)
+			releasePipeline = false

 			// Create a post hook runner cause pipeline object is put back in the pool on defer
 			pipelinePostHookRunner := func(ctx *schemas.BifrostContext, result *schemas.BifrostResponse, err *schemas.BifrostError) (*schemas.BifrostResponse, *schemas.BifrostError) {
 				resp, bifrostErr := pipeline.RunPostLLMHooks(ctx, result, err, preCount)
 				if IsFinalChunk(ctx) {
 					drainAndAttachPluginLogs(ctx)
 				}
 				return resp, bifrostErr
 			}

 			go func() {
 				defer close(outputStream)
+				defer func() {
+					pipeline.FinalizeStreamingPostHookSpans(ctx)
+					bifrost.releasePluginPipeline(pipeline)
+				}()

 				for streamMsg := range shortCircuit.Stream {

Also applies to: 4527-4534

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4509 - 4510, The pooled PluginPipeline is
released immediately after tryStreamRequest returns, which lets the same
pipeline be reused while the goroutine running pipeline.RunPostLLMHooks is still
active; change the lifecycle so the pipeline is only released after the
post-hooks finish — e.g., move the call to
bifrost.releasePluginPipeline(pipeline) into the goroutine that calls
pipeline.RunPostLLMHooks (or use a small sync.WaitGroup/channel to wait for
RunPostLLMHooks to complete) so that getPluginPipeline()/releasePluginPipeline()
bracket the entire post-hook execution triggered by tryStreamRequest; apply the
same change to the other occurrence around lines where pipeline.RunPostLLMHooks
is spawned (the block mentioned as also applying to 4527-4534).
🧹 Nitpick comments (3)
core/schemas/bifrost.go (1)

282-285: Prefer returning an empty map to avoid "null" payloads downstream.

Line 283 returns nil for empty input; if this result is marshaled, it becomes JSON null (not {}), which is harder for consumers to handle safely.

♻️ Proposed tweak
func GroupPluginLogsByName(logs []PluginLogEntry) map[string][]PluginLogEntry {
	if len(logs) == 0 {
-		return nil
+		return map[string][]PluginLogEntry{}
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/bifrost.go` around lines 282 - 285, GroupPluginLogsByName
currently returns nil when given an empty logs slice which marshals to JSON
null; change it to return an empty map instead. Update the function
(GroupPluginLogsByName and its usage of PluginLogEntry) to initialize and return
an empty map[string][]PluginLogEntry when len(logs) == 0, and ensure the rest of
the grouping logic writes into that map rather than assuming a nil return.
ui/app/workspace/logs/views/pluginLogsView.tsx (2)

19-25: Consider memoizing the JSON parse to avoid re-parsing on every render.

The JSON.parse call runs on every render. If the parent re-renders frequently, this could be optimized.

♻️ Optional: Memoize parsing
+import { useMemo } from "react";
+
 export default function PluginLogsView({ pluginLogs }: PluginLogsViewProps) {
-	let parsed: Record<string, PluginLogEntry[]>;
-	try {
-		parsed = JSON.parse(pluginLogs);
-	} catch {
-		return null;
-	}
+	const parsed = useMemo(() => {
+		try {
+			return JSON.parse(pluginLogs) as Record<string, PluginLogEntry[]>;
+		} catch {
+			return null;
+		}
+	}, [pluginLogs]);
+
+	if (!parsed) return null;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx` around lines 19 - 25, The
component currently calls JSON.parse(pluginLogs) on every render inside
PluginLogsView; wrap the parse in a React useMemo to avoid re-parsing unless
pluginLogs changes (e.g., const parsed = useMemo(() => { try { return
JSON.parse(pluginLogs) as Record<string, PluginLogEntry[]> } catch { return null
} }, [pluginLogs])); keep the existing fallback behavior (return null when parse
fails) but use the memoized parsed variable throughout the component and ensure
useMemo is imported from React.

48-52: Add data-testid to the collapsible button for testability.

Per coding guidelines, interactive UI elements should have data-testid attributes following the pattern <entity>-<element>-<qualifier>.

♻️ Suggested fix
-			<button onClick={() => setIsOpen(!isOpen)} className="hover:bg-muted/50 flex w-full items-center gap-2 px-4 py-2 text-left text-sm">
+			<button
+				data-testid={`plugin-logs-${name}-toggle`}
+				onClick={() => setIsOpen(!isOpen)}
+				className="hover:bg-muted/50 flex w-full items-center gap-2 px-4 py-2 text-left text-sm"
+			>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx` around lines 48 - 52, The
collapsible button lacks a data-testid for tests; update the <button> that
toggles expansion (the one using onClick={() => setIsOpen(!isOpen)} and
rendering isOpen, name, and entries.length) to include a data-testid attribute
following the pattern "<entity>-<element>-<qualifier>" (for example
"plugin-logs-toggle-button" or similar consistent with the codebase) so tests
can reliably target this interactive element.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 5836-5845: The helper drainAndAttachPluginLogs currently calls
ctx.DrainPluginLogs() before checking for a tracer/traceID, which can discard
logs when GetTracerFromContext() returns no tracer/ID; change the order so you
first call GetTracerFromContext(ctx) and return early if tracer==nil or
traceID=="" or err!=nil, then call ctx.DrainPluginLogs(), check len(logs)>0, and
only then call tracer.AttachPluginLogs(traceID, logs); keep the same function
name and preserve use of GetTracerFromContext, DrainPluginLogs, and
AttachPluginLogs when locating and updating the code.

In `@core/schemas/context.go`:
- Around line 401-413: The pooled BifrostContext returned by WithPluginScope
currently only resets scoping pointers and can leak prior state (userValues,
hasDeadline, err, doneOnce, blockRestrictedWrites); update the
allocation/cleanup pattern around pluginScopePool.Get()/Put() so that any
BifrostContext taken from pluginScopePool is fully zeroed before reuse and fully
reset (clear userValues, hasDeadline, deadline, err, doneOnce,
blockRestrictedWrites, and any other per-request fields) before Put() is called;
ensure WithPluginScope rehydrates only the fields it needs (parent, done,
pluginScope, pluginLogs, valueDelegate) and that methods like GetUserValues and
Cancel delegate to valueDelegate or the root context rather than reading stale
local fields; apply the same full-reset before Put() for the other pool usage at
lines 419-428.

In `@core/schemas/trace.go`:
- Around line 49-50: The pooled Trace object isn't fully cleared: before
truncating t.PluginLogs you must zero each entry to release retained strings,
e.g. iterate over t.PluginLogs and set each element to its zero value (for
example: for i := range t.PluginLogs { t.PluginLogs[i] = PluginLog{} }) and then
truncate (t.PluginLogs = t.PluginLogs[:0]) or set to nil; apply this change in
the Trace reset/cleanup code where t.PluginLogs is modified so no stale plugin
log messages remain when the Trace is returned to the pool.

In `@examples/plugins/hello-world/main.go`:
- Around line 26-27: The code uses inline context key casts like
schemas.BifrostContextKey("hello-world-plugin-transport-pre-hook") in
ctx.SetValue/Value; define package-level typed key identifiers (e.g., type
contextKey string and const transportPreHookKey contextKey =
"hello-world-plugin-transport-pre-hook", plus other keys used at the other
locations) and replace all occurrences of schemas.BifrostContextKey("...") with
those constants in ctx.SetValue and ctx.Value so the key type is consistent and
reusable across the plugin (look for usages of ctx.SetValue, ctx.Value and
schemas.BifrostContextKey in this file).

In `@plugins/logging/main.go`:
- Around line 833-836: The current buffering uses pendingLogs.Store(traceID,
entry) which allows only one *logstore.Log per trace and causes later
PostLLMHook calls to overwrite earlier entries; update storeOrEnqueueEntry() and
any other places writing to pendingLogs (and Inject()) to buffer multiple
entries per trace by storing either a slice/map of *logstore.Log keyed by
traceID (e.g., pendingLogs.Store(traceID, []*logstore.Log{...})) or by using a
compound key of traceID+entry.ID so each entry is preserved; adjust Inject() to
iterate and flush all buffered entries for a trace (or keys matching the trace)
and remove them from pendingLogs after successful flush.
- Around line 833-839: The deferred-usage goroutine can race ahead of the insert
because traced requests store the entry in p.pendingLogs (traceID) and then
Inject() later enqueues it, but deferred usage updates still try to update DB by
requestID and get dropped; fix by ensuring deferred usage is merged into the
buffered entry or deferred until after enqueue: when storing an entry into
p.pendingLogs (p.pendingLogs.Store(traceID, entry) in the trace-path) also
initialize a mergeable usage field or container on that entry and have the
deferred-usage goroutine check p.pendingLogs for traceID and merge/update the
in-memory entry (not perform DB update) before Inject() calls p.enqueueLogEntry;
alternatively, change Inject() to drain pending usage attached to the stored
entry and only then call p.enqueueLogEntry(entry, callback); update code paths
in Inject(), the deferred-usage goroutine, and wherever requestID-based updates
occur so they consult p.pendingLogs and merge into the stored entry rather than
issuing immediate DB updates.

In `@transports/bifrost-http/handlers/middlewares.go`:
- Around line 322-325: The transport middleware currently only drains plugin
logs on the success path (using bifrostCtx.DrainPluginLogs() and
ctx.SetUserValue(schemas.BifrostContextKeyTransportPluginLogs, ...)), so logs
are lost on early returns from pre-hook or post-hook errors; update the
middleware to flush/merge plugin logs before every early return by calling
DrainPluginLogs() and setting the user value (or extract into a small helper
like drainAndAttachPluginLogs(ctx, bifrostCtx) and call it immediately before
each return point in the pre-hook and post-hook error/short-circuit branches) so
scoped logs are always attached to the request trace.
- Around line 334-339: The current completer set under
schemas.BifrostContextKeyTransportPostHookCompleter calls runTransportPostHooks
(which triggers HTTPTransportPostHook) after a streamed response ends—remove
that call so post-hooks that mutate resp are not executed for streams; instead
the completer should only perform non-mutating end-of-stream tasks (e.g., drain
scoped/log buffers) and explicitly avoid invoking runTransportPostHooks so
streaming responses rely on HTTPTransportStreamChunkHook for chunk handling;
update the lambda assigned to BifrostContextKeyTransportPostHookCompleter to
drain/flush logs (or call your existing log-drain helper) and do not call
runTransportPostHooks or other post-hook invocations.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 344cba51-f0b8-478a-a817-ee3e371a7937

📥 Commits

Reviewing files that changed from the base of the PR and between 31a1b3f and 6f32327.

📒 Files selected for processing (17)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/tables.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts

Comment thread core/bifrost.go
Comment thread core/schemas/context.go
Comment thread core/schemas/trace.go
Comment thread examples/plugins/hello-world/main.go Outdated
Comment thread plugins/logging/main.go Outdated
Comment thread plugins/logging/main.go
Comment thread transports/bifrost-http/handlers/middlewares.go
Comment thread transports/bifrost-http/handlers/middlewares.go
@coderabbitai
Contributor

coderabbitai Bot commented Mar 23, 2026

✅ Actions performed

Full review triggered.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/bifrost.go (2)

5443-5450: ⚠️ Potential issue | 🟠 Major

Put the tracer into ctx before the new MCP log-attachment paths run.

drainAndAttachPluginLogs(ctx) below uses GetTracerFromContext, but handleMCPToolExecution() never seeds BifrostContextKeyTracer the way tryRequest() and tryStreamRequest() do. Requests that already carry a trace ID on the parent context still lose their MCP plugin logs here because the tracer lookup fails.

🐛 Suggested fix
 	// Ensure request ID exists for hooks/tracing consistency
 	if _, ok := ctx.Value(schemas.BifrostContextKeyRequestID).(string); !ok {
 		ctx.SetValue(schemas.BifrostContextKeyRequestID, uuid.New().String())
 	}
+
+	tracer := bifrost.getTracer()
+	if tracer != nil {
+		ctx.SetValue(schemas.BifrostContextKeyTracer, tracer)
+	}

 	// Get plugin pipeline for MCP hooks
 	pipeline := bifrost.getPluginPipeline()

Also applies to: 5459-5470, 5528-5530

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5443 - 5450, The MCP tool execution path seeds
request IDs but fails to put the tracer into ctx before MCP log-attachment runs,
causing drainAndAttachPluginLogs (which calls GetTracerFromContext) to not find
a tracer; update handleMCPToolExecution to seed BifrostContextKeyTracer into ctx
the same way tryRequest()/tryStreamRequest() do (retrieve tracer from parent
context or create/attach one and call
ctx.SetValue(schemas.BifrostContextKeyTracer, tracer)) before calling
drainAndAttachPluginLogs and the subsequent MCP log-attachment paths; apply the
same fix at the other mentioned locations (around the blocks at lines ~5459-5470
and ~5528-5530) so the tracer lookup succeeds for plugin log draining.

4509-4579: ⚠️ Potential issue | 🔴 Critical

Don't return the pooled PluginPipeline while the short-circuit stream is still using it.

This goroutine still closes over pipeline, but tryStreamRequest() returns that pooled object before shortCircuit.Stream is drained. Another request can reuse/reset streamScopedCtxs while this goroutine is still running, and the aggregated streaming post-hook spans never get finalized on completion.

🐛 Suggested fix
 	pipeline := bifrost.getPluginPipeline()
-	defer bifrost.releasePluginPipeline(pipeline)
+	releasePipeline := true
+	defer func() {
+		if releasePipeline {
+			bifrost.releasePluginPipeline(pipeline)
+		}
+	}()

 	preReq, shortCircuit, preCount := pipeline.RunLLMPreHooks(ctx, req)
 	if shortCircuit != nil {
@@
 		if shortCircuit.Stream != nil {
 			outputStream := make(chan *schemas.BifrostStreamChunk)
+			releasePipeline = false

 			// Create a post hook runner cause pipeline object is put back in the pool on defer
 			pipelinePostHookRunner := func(ctx *schemas.BifrostContext, result *schemas.BifrostResponse, err *schemas.BifrostError) (*schemas.BifrostResponse, *schemas.BifrostError) {
 				resp, bifrostErr := pipeline.RunPostLLMHooks(ctx, result, err, preCount)
@@
 			go func() {
 				defer close(outputStream)
+				defer func() {
+					pipeline.FinalizeStreamingPostHookSpans(ctx)
+					bifrost.releasePluginPipeline(pipeline)
+				}()

 				for streamMsg := range shortCircuit.Stream {

Based on learnings, "do not release them back to their pools while asynchronous readers may still access them."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4509 - 4579, The code currently defers
bifrost.releasePluginPipeline(pipeline) immediately, but the short-circuit
stream goroutine captures and calls pipeline.RunPostLLMHooks (via
pipelinePostHookRunner) after tryStreamRequest returns, allowing the pooled
pipeline to be reused while still in use. Fix by ensuring the pipeline isn't
returned to the pool until the stream goroutine finishes: either move the
release call so it's executed after the stream branch completes, or acquire a
fresh pipeline for post-hook work inside the goroutine and release that pipeline
from within the goroutine; specifically adjust the handling around the pipeline
variable, getPluginPipeline, releasePluginPipeline, and the
pipelinePostHookRunner that calls pipeline.RunPostLLMHooks so no pooled pipeline
is released while the goroutine can still access it.
♻️ Duplicate comments (3)
core/schemas/trace.go (1)

50-50: ⚠️ Potential issue | 🟠 Major

Clear PluginLogs elements before truncating in Reset().

Line 50 truncates the slice but keeps previous string references alive in the backing array, which can leak stale plugin log data across pooled trace reuse.

🔧 Proposed fix
 func (t *Trace) Reset() {
 	t.TraceID = ""
 	t.ParentID = ""
 	t.RootSpan = nil
 	t.Spans = t.Spans[:0]
 	t.StartTime = time.Time{}
 	t.EndTime = time.Time{}
 	t.Attributes = nil
-	t.PluginLogs = t.PluginLogs[:0]
+	for i := range t.PluginLogs {
+		t.PluginLogs[i] = PluginLogEntry{}
+	}
+	t.PluginLogs = t.PluginLogs[:0]
 }
As per coding guidelines, "Always reset ALL fields of pooled objects before calling `pool.Put()` to prevent stale data from leaking to subsequent users."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/trace.go` at line 50, In Reset(), before truncating the slice,
explicitly clear each element of t.PluginLogs to remove references (e.g., loop
over t.PluginLogs and set each entry to the zero value PluginLogEntry{}), then set t.PluginLogs =
t.PluginLogs[:0]; update the Reset() method in core/schemas/trace.go (the code
that currently does t.PluginLogs = t.PluginLogs[:0]) to perform the
element-clearing step to avoid retaining stale plugin log strings when pooled
Trace instances are reused.
plugins/logging/main.go (1)

830-869: ⚠️ Potential issue | 🟠 Major

Don’t buffer traced writes as a single entry per trace.

pendingLogs.Store(traceID, entry) gives each trace one slot, so any fallback/retry rows sharing that trace overwrite each other before Inject() drains the map. Because the actual insert is now delayed until Inject(), scheduleDeferredUsageUpdate() can still run Update(requestID, ...) before that row exists and can lose deferred usage. Buffer all entries for a trace (or key by traceID + request ID), and merge deferred usage into that buffer until flush.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 830 - 869, The pendingLogs usage in
storeOrEnqueueEntry/Inject currently stores a single *logstore.Log per traceID
(pendingLogs.Store(traceID, entry)), causing overwrites; change pendingLogs to
buffer multiple entries per traceID (e.g., store a slice or map keyed by
traceID->[]*logstore.Log or traceID+requestID) so storeOrEnqueueEntry appends to
the buffer instead of replacing, ensure any deferred usage merged into that
buffered entry collection (or keyed by requestID) until Inject flushes; update
Inject to LoadAndDelete the buffered collection, iterate over all entries and
call enqueueLogEntry for each (using makePostWriteCallback as before), and keep
scheduleDeferredUsageUpdate logic referring to the buffered entries so updates
aren't lost.
core/schemas/context.go (1)

407-428: ⚠️ Potential issue | 🟠 Major

Scoped wrappers still behave like partially independent contexts.

done is borrowed from the root, but Cancel() still uses wrapper-local doneOnce/err, so canceling a scoped context can close the shared Done() channel without setting the root error. Also, because the wrapper points at bc.parent, GetUserValues() / GetParentCtxWithUserValues() skip values stored on the root, and BlockRestrictedWrites() only toggles the pooled wrapper's flag while writes go through valueDelegate. On top of that, ReleasePluginScope() only clears the scoping pointers before pooling. Fully zero the wrapper before pluginScopePool.Put(), and delegate the remaining stateful methods to valueDelegate as well.

♻️ Minimal reset fix
func (bc *BifrostContext) ReleasePluginScope() {
	if bc.valueDelegate == nil {
		return // not a scoped context
	}
-	bc.parent = nil
-	bc.done = nil
-	bc.pluginScope = nil
-	bc.pluginLogs = nil
-	bc.valueDelegate = nil
+	*bc = BifrostContext{}
	pluginScopePool.Put(bc)
}

As per coding guidelines, "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users." and "Writes to reserved BifrostContext keys (governance IDs, retry counts, etc.) are silently dropped when BlockRestrictedWrites() is active."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/context.go` around lines 407 - 428, ReleasePluginScope currently
only clears a subset of fields and returns the wrapper to pluginScopePool, which
can leak state; also several methods
(Cancel/CancelWithErr/Done/Err/GetUserValues/GetParentCtxWithUserValues/BlockRestrictedWrites)
operate on wrapper-local fields instead of delegating to valueDelegate, causing
shared-root invariants to break. Fix by fully zeroing every field on the
BifrostContext wrapper (parent, done, doneOnce, err, pluginScope, pluginLogs,
valueDelegate and any other struct fields) before calling
pluginScopePool.Put(bc), and update the stateful methods named above to check if
valueDelegate != nil and forward calls to valueDelegate (delegation) instead of
using wrapper-local state; keep wrapper-local behavior only when valueDelegate
is nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/context.go`:
- Around line 401-405: The root BifrostContext.pluginLogs pointer is accessed
concurrently (see WithPluginScope and DrainPluginLogs) and must be synchronized:
add a mutex (or use atomic.Pointer) on BifrostContext to protect reads/writes to
pluginLogs, ensure allocation via pluginLogStorePool.Get() is done while holding
the lock and capture the resulting store reference into a local variable under
that guard before assigning to bc.pluginLogs and before handing that reference
out to scoped contexts in WithPluginScope; apply the same guarded access pattern
to the other methods mentioned (the blocks around the other pluginLogs
reads/writes) so races cannot produce multiple distinct stores or lost drains.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx`:
- Around line 20-23: The parsed result from JSON.parse(pluginLogs) must be
validated before use because a parseable payload may not match Record<string,
PluginLogEntry[]>; update the parsing logic around the parsed variable (and
other parse sites handling pluginLogs) to: check that parsed is a plain object,
iterate its keys and ensure each value is an Array (Array.isArray) and that each
array item has the expected PluginLogEntry shape (presence/types of required
properties used later), and if validation fails replace parsed with an empty
object or safe default so the later usage that spreads [...entries] cannot
throw; reference the parsed variable, the pluginLogs source, and the
PluginLogEntry shape when adding these guards.
- Around line 48-52: The toggle button that calls setIsOpen and renders
isOpen/name/entries needs a data-testid attribute per convention; add a
data-testid to that button following the pattern entity-element-qualifier (for
example: plugin-logs-toggle-<plugin-name> where <plugin-name> is the kebab-cased
name) so test selectors can reliably target this interactive element.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 5443-5450: The MCP tool execution path seeds request IDs but fails
to put the tracer into ctx before MCP log-attachment runs, causing
drainAndAttachPluginLogs (which calls GetTracerFromContext) to not find a
tracer; update handleMCPToolExecution to seed BifrostContextKeyTracer into ctx
the same way tryRequest()/tryStreamRequest() do (retrieve tracer from parent
context or create/attach one and call
ctx.SetValue(schemas.BifrostContextKeyTracer, tracer)) before calling
drainAndAttachPluginLogs and the subsequent MCP log-attachment paths; apply the
same fix at the other mentioned locations (around the blocks at lines ~5459-5470
and ~5528-5530) so the tracer lookup succeeds for plugin log draining.
- Around line 4509-4579: The code currently defers
bifrost.releasePluginPipeline(pipeline) immediately, but the short-circuit
stream goroutine captures and calls pipeline.RunPostLLMHooks (via
pipelinePostHookRunner) after tryStreamRequest returns, allowing the pooled
pipeline to be reused while still in use. Fix by ensuring the pipeline isn't
returned to the pool until the stream goroutine finishes: either move the
release call so it's executed after the stream branch completes, or acquire a
fresh pipeline for post-hook work inside the goroutine and release that pipeline
from within the goroutine; specifically adjust the handling around the pipeline
variable, getPluginPipeline, releasePluginPipeline, and the
pipelinePostHookRunner that calls pipeline.RunPostLLMHooks so no pooled pipeline
is released while the goroutine can still access it.

---

Duplicate comments:
In `@core/schemas/context.go`:
- Around line 407-428: ReleasePluginScope currently only clears a subset of
fields and returns the wrapper to pluginScopePool, which can leak state; also
several methods
(Cancel/CancelWithErr/Done/Err/GetUserValues/GetParentCtxWithUserValues/BlockRestrictedWrites)
operate on wrapper-local fields instead of delegating to valueDelegate, causing
shared-root invariants to break. Fix by fully zeroing every field on the
BifrostContext wrapper (parent, done, doneOnce, err, pluginScope, pluginLogs,
valueDelegate and any other struct fields) before calling
pluginScopePool.Put(bc), and update the stateful methods named above to check if
valueDelegate != nil and forward calls to valueDelegate (delegation) instead of
using wrapper-local state; keep wrapper-local behavior only when valueDelegate
is nil.

In `@core/schemas/trace.go`:
- Line 50: In Reset(), before truncating the slice, explicitly clear each
element of t.PluginLogs to remove references (e.g., loop over t.PluginLogs and
set each entry to the zero value PluginLogEntry{}), then set t.PluginLogs = t.PluginLogs[:0]; update the
Reset() method in core/schemas/trace.go (the code that currently does
t.PluginLogs = t.PluginLogs[:0]) to perform the element-clearing step to avoid
retaining stale plugin log strings when pooled Trace instances are reused.

In `@plugins/logging/main.go`:
- Around line 830-869: The pendingLogs usage in storeOrEnqueueEntry/Inject
currently stores a single *logstore.Log per traceID (pendingLogs.Store(traceID,
entry)), causing overwrites; change pendingLogs to buffer multiple entries per
traceID (e.g., store a slice or map keyed by traceID->[]*logstore.Log or
traceID+requestID) so storeOrEnqueueEntry appends to the buffer instead of
replacing, ensure any deferred usage merged into that buffered entry collection
(or keyed by requestID) until Inject flushes; update Inject to LoadAndDelete the
buffered collection, iterate over all entries and call enqueueLogEntry for each
(using makePostWriteCallback as before), and keep scheduleDeferredUsageUpdate
logic referring to the buffered entries so updates aren't lost.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 53cf8d9d-bdc4-4dfa-a8fd-315d6f647879

📥 Commits

Reviewing files that changed from the base of the PR and between 31a1b3f and 6f32327.

📒 Files selected for processing (17)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/tables.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts

Comment thread core/schemas/context.go
Comment thread ui/app/workspace/logs/views/pluginLogsView.tsx
Comment thread ui/app/workspace/logs/views/pluginLogsView.tsx Outdated
@coderabbitai coderabbitai Bot requested a review from danpiths March 23, 2026 01:53
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

5817-5831: 🛠️ Refactor suggestion | 🟠 Major

Fully scrub PluginPipeline before putting it back in the pool.

This reset releases streamScopedCtxs, but the pooled instance still retains llmPlugins, mcpPlugins, logger, tracer, and the backing storage for the error/order collections. Nil out the remaining fields before pool.Put() so stale plugin/tracer state is not kept alive across requests.

♻️ Suggested cleanup
 func (p *PluginPipeline) resetPluginPipeline() {
 	p.executedPreHooks = 0
-	p.preHookErrors = p.preHookErrors[:0]
-	p.postHookErrors = p.postHookErrors[:0]
+	p.preHookErrors = nil
+	p.postHookErrors = nil
 	// Reset streaming timing accumulation
 	p.chunkCount = 0
-	if p.postHookTimings != nil {
-		clear(p.postHookTimings)
-	}
-	p.postHookPluginOrder = p.postHookPluginOrder[:0]
+	p.postHookTimings = nil
+	p.postHookPluginOrder = nil
 	// Release cached scoped contexts for streaming
 	for _, scopedCtx := range p.streamScopedCtxs {
 		scopedCtx.ReleasePluginScope()
 	}
 	p.streamScopedCtxs = nil
+	p.llmPlugins = nil
+	p.mcpPlugins = nil
+	p.logger = nil
+	p.tracer = nil
 }
As per coding guidelines, "Always reset ALL fields of pooled objects before calling `pool.Put()` to prevent stale data from leaking to subsequent users."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5817 - 5831, The resetPluginPipeline method
currently only clears some fields; fully scrub the PluginPipeline before
returning it to the pool by nil-ing or zeroing all remaining state: set
llmPlugins and mcpPlugins to nil, set logger and tracer to nil, set
preHookErrors, postHookErrors, postHookPluginOrder to nil (not just [:0]) to
drop backing arrays, clear and nil postHookTimings (or reinitialize to nil) to
release map backing storage, and nil any other retained references on the
PluginPipeline struct so no stale plugin/tracer state is kept alive across
requests; make these changes in resetPluginPipeline (the method on
PluginPipeline) before pool.Put().
♻️ Duplicate comments (9)
core/schemas/trace.go (1)

49-50: ⚠️ Potential issue | 🟠 Major

Clear PluginLogs entries before truncating the pooled slice.

t.PluginLogs = t.PluginLogs[:0] keeps prior log strings reachable through the backing array, so reused Trace instances can retain stale plugin log payloads.

Suggested fix
 	t.Attributes = nil
-	t.PluginLogs = t.PluginLogs[:0]
+	for i := range t.PluginLogs {
+		t.PluginLogs[i] = PluginLogEntry{}
+	}
+	t.PluginLogs = t.PluginLogs[:0]
 }

As per coding guidelines, "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/trace.go` around lines 49 - 50, The Trace reset leaves
PluginLogs' backing array reachable by slicing to zero length; replace the
truncate with clearing the slice backing storage (e.g., set t.PluginLogs = nil)
so previous plugin log strings are not retained, and ensure this niling occurs
alongside resetting t.Attributes and before putting the Trace back into the pool
(look for the Trace type/reset function and pool.Put usage to update).
ui/app/workspace/logs/views/pluginLogsView.tsx (2)

48-52: ⚠️ Potential issue | 🟡 Minor

Add the required data-testid to the new toggle button.

This introduces a new interactive control without the selector hook the UI guidelines require. Use a stable 3-part id such as plugin-logs-toggle-<slug>.

As per coding guidelines, "Add new interactive UI elements with data-testid attributes following the pattern: data-testid=\"<entity>-<element>-<qualifier>\"."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx` around lines 48 - 52, The
toggle button for the plugin logs list is missing the required data-testid; add
a stable 3-part test id to the <button> (the element using setIsOpen/isOpen and
rendering name and entries) by setting
data-testid={`plugin-logs-toggle-${slug}`}; if a slug prop already exists, use
that slug, otherwise compute a stable slug (e.g., lowercased, hyphenated from
name) into a local variable and use it in the data-testid on the button.

20-35: ⚠️ Potential issue | 🟠 Major

Validate the parsed plugin-log payload before indexing into it.

JSON.parse succeeding does not guarantee Record<string, PluginLogEntry[]>. If one persisted row contains a parseable object with non-array values, parsed[name] becomes invalid and the spread at Line 44 will throw, blanking the details sheet.

Suggested fix
 export default function PluginLogsView({ pluginLogs }: PluginLogsViewProps) {
 	let parsed: Record<string, PluginLogEntry[]>;
 	try {
-		parsed = JSON.parse(pluginLogs);
+		const raw = JSON.parse(pluginLogs) as unknown;
+		if (!raw || typeof raw !== "object" || Array.isArray(raw)) {
+			return null;
+		}
+		parsed = Object.fromEntries(
+			Object.entries(raw).filter(
+				([, value]): value is PluginLogEntry[] =>
+					Array.isArray(value) &&
+					value.every(
+						(entry) =>
+							entry !== null &&
+							typeof entry === "object" &&
+							typeof entry.plugin_name === "string" &&
+							typeof entry.level === "string" &&
+							typeof entry.message === "string" &&
+							typeof entry.timestamp === "number",
+					),
+			),
+		) as Record<string, PluginLogEntry[]>;
 	} catch {
 		return null;
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx` around lines 20 - 35, The
parsed JSON may not be a Record<string, PluginLogEntry[]> so validate and
sanitize before indexing: after JSON.parse(pluginLogs) check that parsed is an
object and then build pluginNames by filtering Object.keys(parsed) to only
include names where Array.isArray(parsed[name]) (optionally further validate
array items' shape) and map invalid or non-array entries to an empty array or
skip them; then pass sanitized entries to <PluginSection name={name}
entries={parsed[name]}> to avoid the spread/lookup throwing in PluginSection.
plugins/logging/main.go (2)

833-836: ⚠️ Potential issue | 🟠 Major

Don't reuse pendingLogs as a single traceID slot.

pendingLogs already holds request-scoped *PendingLogData earlier in the flow. Storing one *logstore.Log by traceID here means fallback/retry attempts on the same trace overwrite each other, and Inject() only persists the last one. Please buffer a collection per trace, or key by both traceID and entry.ID.

Also applies to: 850-868

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 833 - 836, The current storing of a
single *logstore.Log into pendingLogs under traceID (in the block that reads
ctx.Value(schemas.BifrostContextKeyTraceID) and calls
p.pendingLogs.Store(traceID, entry)) overwrites concurrent/fallback log attempts
for the same trace; change pendingLogs to buffer multiple entries per trace
(e.g., store a slice or map of entries keyed by entry.ID under each traceID) or
use a composite key (traceID + entry.ID) when calling p.pendingLogs.Store so
each log is unique, and update the matching retrieval logic used by Inject() to
iterate/persist all buffered plugin_logs for that trace instead of only the last
stored value (adjust any types around PendingLogData, pendingLogs store/Load,
and Inject() accordingly).

832-840: ⚠️ Potential issue | 🟠 Major

Deferred usage updates race the deferred insert.

For traced requests storeOrEnqueueEntry() no longer inserts immediately, but scheduleDeferredUsageUpdate() still updates by requestID as soon as phase-B usage arrives. If that happens before Inject() enqueues the row, token usage and cost data are dropped. Merge deferred usage into the buffered entry, or delay the DB update until after the insert is queued.

Also applies to: 845-868

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 832 - 840, The deferred usage update
races the deferred insert because storeOrEnqueueEntry stores the log in
p.pendingLogs for a traceID but scheduleDeferredUsageUpdate updates DB by
requestID immediately; change scheduleDeferredUsageUpdate and
storeOrEnqueueEntry/Inject logic so deferred usage is merged into the buffered
entry instead of writing separately or delayed until after enqueue:
specifically, when scheduleDeferredUsageUpdate(requestID...) runs, check
p.pendingLogs for the traceID key and merge token usage/cost into that
*logstore.Log entry before any DB write, or if not present, only then perform the DB
update; alternatively make Inject (or enqueueLogEntry) signal insertion
completion (e.g., run the existing callback) and have
scheduleDeferredUsageUpdate wait for that signal before updating. Ensure merging
logic updates the same fields used by Inject/enqueueLogEntry and that
scheduleDeferredUsageUpdate references the same traceID/requestID mapping used
in storeOrEnqueueEntry and Inject.
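The buffering both prompts converge on — append a per-trace collection instead of overwriting one slot, then take-and-delete the whole collection at flush time — can be sketched as below. This uses a plain mutex-guarded map rather than the actual sync.Map, and `logEntry` is a hypothetical stand-in for *logstore.Log:

```go
package main

import (
	"fmt"
	"sync"
)

// logEntry stands in for *logstore.Log.
type logEntry struct {
	ID      string
	Message string
}

// pendingBuffer keeps every entry for a trace until flush, so fallback and
// retry rows sharing a traceID no longer overwrite each other.
type pendingBuffer struct {
	mu      sync.Mutex
	pending map[string][]*logEntry
}

func newPendingBuffer() *pendingBuffer {
	return &pendingBuffer{pending: make(map[string][]*logEntry)}
}

// store appends rather than replacing (the storeOrEnqueueEntry side).
func (b *pendingBuffer) store(traceID string, e *logEntry) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending[traceID] = append(b.pending[traceID], e)
}

// flush is the Inject()-side drain: take-and-delete the whole collection,
// so the caller can enqueue every buffered row exactly once.
func (b *pendingBuffer) flush(traceID string) []*logEntry {
	b.mu.Lock()
	defer b.mu.Unlock()
	entries := b.pending[traceID]
	delete(b.pending, traceID)
	return entries
}

func main() {
	buf := newPendingBuffer()
	buf.store("trace-1", &logEntry{ID: "req-1"})
	buf.store("trace-1", &logEntry{ID: "req-1-retry"})
	fmt.Println(len(buf.flush("trace-1"))) // both rows survive until flush
}
```

Deferred usage can then be merged into the buffered entries under the same lock before flush, which also closes the race with scheduleDeferredUsageUpdate().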
transports/bifrost-http/handlers/middlewares.go (2)

305-325: ⚠️ Potential issue | 🟠 Major

Drain transport plugin logs on every short-circuit path.

Only the happy path reaches DrainPluginLogs() right now. If a pre-hook returns an error/response, or a post-hook returns an error, the scoped logs emitted up to that point never make it into BifrostContextKeyTransportPluginLogs, so traces miss the exact failure-path data this change is trying to preserve. Please extract the merge/drain into a helper and call it immediately before each early return.

Also applies to: 366-379

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/middlewares.go` around lines 305 - 325, The
pre/post-hook short-circuit paths are not draining plugin logs before returning,
so extract a helper (e.g., drainAndAttachPluginLogs(bifrostCtx, ctx)) that calls
bifrostCtx.DrainPluginLogs(), and if non-empty sets
ctx.SetUserValue(schemas.BifrostContextKeyTransportPluginLogs, logs); then
invoke this helper immediately before every early return in the HTTP pre-hook
and post-hook handling (locations around
HTTPTransportPreHook/HTTPTransportPostHook usage, before the error short-circuit
that sets ctx.SetStatusCode/SetBodyString and before the resp short-circuit that
calls applyHTTPResponseToCtx), ensuring pluginCtx.ReleasePluginScope() remains
called where appropriate.
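A minimal shape for the extracted helper, with `drainer` and `ctxSetter` as hypothetical interfaces standing in for BifrostContext and fasthttp's RequestCtx (the real signatures differ):

```go
package main

import "fmt"

// ctxSetter abstracts the fasthttp RequestCtx user-value setter.
type ctxSetter interface{ SetUserValue(key string, v any) }

type fakeCtx struct{ values map[string]any }

func (f *fakeCtx) SetUserValue(k string, v any) { f.values[k] = v }

// drainer abstracts BifrostContext.DrainPluginLogs.
type drainer interface{ DrainPluginLogs() []string }

type fakeBifrostCtx struct{ logs []string }

func (f *fakeBifrostCtx) DrainPluginLogs() []string {
	out := f.logs
	f.logs = nil
	return out
}

const keyTransportPluginLogs = "transport-plugin-logs"

// drainAndAttach is the extracted helper: call it immediately before every
// early return (pre-hook error, pre-hook response, post-hook error) so
// failure-path logs still reach the trace, not just the happy path.
func drainAndAttach(bc drainer, ctx ctxSetter) {
	if logs := bc.DrainPluginLogs(); len(logs) > 0 {
		ctx.SetUserValue(keyTransportPluginLogs, logs)
	}
}

func main() {
	bc := &fakeBifrostCtx{logs: []string{"pre-hook failed"}}
	ctx := &fakeCtx{values: map[string]any{}}
	drainAndAttach(bc, ctx) // runs before the error short-circuit returns
	fmt.Println(ctx.values[keyTransportPluginLogs])
}
```

Because the drain is destructive, calling the helper on the short-circuit path and then again on the happy path is safe: the second drain sees an empty store and leaves the attached value untouched.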

334-339: ⚠️ Potential issue | 🟠 Major

Don't run HTTPTransportPostHook from the streaming completer.

core/schemas/plugin.go explicitly says streaming responses should use HTTPTransportStreamChunkHook instead. By the time this callback runs, the stream has already finished, so any HTTPTransportPostHook mutation is both contract-breaking and too late to affect the client. The deferred callback should flush logs / non-mutating cleanup only.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/middlewares.go` around lines 334 - 339, The
deferred completer stored under
schemas.BifrostContextKeyTransportPostHookCompleter currently calls
runTransportPostHooks (which runs HTTPTransportPostHook mutations after a stream
ends); change it so the deferred callback does not invoke runTransportPostHooks
or any mutating post-hook (HTTPTransportPostHook) — instead implement a
non-mutating flush/cleanup callback (e.g., flush logs, release resources) that
is safe to run after streaming completion; keep streaming mutations
responsibility with HTTPTransportStreamChunkHook in the streaming handler and
ensure the new callback only performs non-mutating cleanup.
core/schemas/context.go (2)

401-428: ⚠️ Potential issue | 🟠 Major

Fully zero scoped contexts before returning them to pluginScopePool.

ReleasePluginScope only clears a few pointers, so the next borrower can inherit stale userValues, deadline/error state, doneOnce, and blockRestrictedWrites. That is observable here because methods like GetUserValues() and Cancel() still read receiver-local state.

♻️ Minimal fix
func (bc *BifrostContext) WithPluginScope(name *string) *BifrostContext {
	if bc.pluginLogs == nil {
		bc.pluginLogs = pluginLogStorePool.Get().(*pluginLogStore)
	}

	scoped := pluginScopePool.Get().(*BifrostContext)
-	scoped.parent = bc.parent
-	scoped.done = bc.done
-	scoped.pluginScope = name
-	scoped.pluginLogs = bc.pluginLogs
-	scoped.valueDelegate = bc
+	*scoped = BifrostContext{
+		parent:        bc.parent,
+		done:          bc.done,
+		pluginScope:   name,
+		pluginLogs:    bc.pluginLogs,
+		valueDelegate: bc,
+	}
	return scoped
}

func (bc *BifrostContext) ReleasePluginScope() {
	if bc.valueDelegate == nil {
		return
	}
-	bc.parent = nil
-	bc.done = nil
-	bc.pluginScope = nil
-	bc.pluginLogs = nil
-	bc.valueDelegate = nil
+	*bc = BifrostContext{}
	pluginScopePool.Put(bc)
}

As per coding guidelines, "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/context.go` around lines 401 - 428, ReleasePluginScope currently
clears only a subset of BifrostContext fields and can leak state; update
ReleasePluginScope (the method on BifrostContext) to clear all fields before
putting the object back into pluginScopePool: explicitly set userValues = nil,
deadline = zero value, err = nil, reset doneOnce (call its Reset method if
available), set blockRestrictedWrites = false, and clear any other
receiver-local state in addition to
parent/done/pluginScope/pluginLogs/valueDelegate, then call
pluginScopePool.Put(bc).

401-405: ⚠️ Potential issue | 🟠 Major

Make pluginLogs ownership root-only and synchronized.

WithPluginScope and DrainPluginLogs both mutate bc.pluginLogs directly. That allows concurrent first-use/drain races, and it also lets a scoped context return the shared store to pluginLogStorePool while the root still points at it. Please resolve the store on the root context under a guard and have scoped callers delegate ownership-changing operations like DrainPluginLogs() to valueDelegate.

Also applies to: 467-481

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/context.go` around lines 401 - 405, The BifrostContext methods
WithPluginScope and DrainPluginLogs currently mutate bc.pluginLogs unsafely;
make pluginLogs root-only and thread-safe by adding a guard (e.g., a mutex on
BifrostContext) and ensure only the root context initializes or returns
pluginLogs to pluginLogStorePool under that lock. Change WithPluginScope so it
does not mutate a scoped context's pluginLogs; instead resolve/initialize the
store on the root under the mutex and have scoped contexts hold a reference but
not own it. Move ownership-changing operations (like DrainPluginLogs) onto the
root via valueDelegate (call valueDelegate.DrainPluginLogs() from scoped
contexts) so only the root can return the store to pluginLogStorePool; update
DrainPluginLogs and WithPluginScope to use the new mutex and delegate ownership
changes to valueDelegate.
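One way to realize the root-only, synchronized ownership is lock-free lazy initialization via sync/atomic's Pointer, so concurrent first-use can never produce two distinct stores; the type names below are illustrative, not Bifrost's:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type pluginLogStore struct{ entries []string }

// rootCtx owns the pluginLogs store; scoped contexts only borrow a
// reference and delegate ownership changes back to the root.
type rootCtx struct {
	pluginLogs atomic.Pointer[pluginLogStore]
}

// getOrInitStore resolves the store exactly once, even under concurrent
// first use, via CompareAndSwap — closing the race the review describes.
func (r *rootCtx) getOrInitStore() *pluginLogStore {
	if s := r.pluginLogs.Load(); s != nil {
		return s
	}
	fresh := &pluginLogStore{}
	if r.pluginLogs.CompareAndSwap(nil, fresh) {
		return fresh
	}
	return r.pluginLogs.Load() // another goroutine won the race; use its store
}

func main() {
	r := &rootCtx{}
	a, b := r.getOrInitStore(), r.getOrInitStore()
	fmt.Println(a == b) // all callers share one store
}
```

A drain that swaps the pointer back to nil (e.g., `r.pluginLogs.Swap(nil)`) then atomically transfers ownership, so only the root can return the store to the pool.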
🧹 Nitpick comments (2)
framework/tracing/tracer.go (1)

334-344: Move the Stop doc block back onto Stop.

Lines 332-333 still describe Stop, but they now sit immediately above AttachPluginLogs. Godoc will attach that text to AttachPluginLogs, and Stop ends up undocumented.

♻️ Proposed cleanup
-// Stop stops the tracer and releases its resources.
-// This stops the internal TraceStore's cleanup goroutine.
 // AttachPluginLogs appends plugin log entries to the trace identified by traceID.
 func (t *Tracer) AttachPluginLogs(traceID string, logs []schemas.PluginLogEntry) {
 	if len(logs) == 0 || traceID == "" {
 		return
@@
 	trace.AppendPluginLogs(logs)
 }
 
+// Stop stops the tracer and releases its resources.
+// This stops the internal TraceStore's cleanup goroutine.
 func (t *Tracer) Stop() {
 	if t.store != nil {
 		t.store.Stop()
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/tracing/tracer.go` around lines 334 - 344, The doc comment
currently intended for Stop was left above AttachPluginLogs and is being
associated with AttachPluginLogs; move the comment block that documents Stop so
it sits immediately above the Stop method declaration (not above
AttachPluginLogs), ensuring the Stop function (e.g., func (t *Tracer) Stop(...)
) has its original doc comment and AttachPluginLogs retains only its own comment
(or none) so godoc attaches the correct documentation to each function.
core/schemas/tracer.go (1)

114-116: Consider extracting AttachPluginLogs to an optional interface for forward compatibility.

While the method is correctly implemented in both NoOpTracer and the production tracer, expanding an exported interface is technically a breaking change for any downstream custom implementations. Since all implementations are internal and updated, this works today, but for a shipped product like Bifrost, an optional interface pattern (with a type assertion at call sites) would better protect against accidental breaks if custom tracers are added in the future.

For example:

type PluginLogAttacher interface {
    AttachPluginLogs(traceID string, logs []PluginLogEntry)
}

Then use a type assertion before calling: if pla, ok := tracer.(PluginLogAttacher); ok { pla.AttachPluginLogs(...) }.

This is a low-priority refactor—the current implementation is correct and functional. Consider for future extensibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/tracer.go` around lines 114 - 116, Extract AttachPluginLogs into
an optional interface and change call sites to use a type assertion: add a new
exported interface PluginLogAttacher with AttachPluginLogs(traceID string, logs
[]PluginLogEntry), keep existing implementations (NoOpTracer and the production
tracer) unchanged, and update callers that currently call AttachPluginLogs
directly to first check if tracer implements PluginLogAttacher (e.g., if pla, ok
:= tracer.(PluginLogAttacher); ok { pla.AttachPluginLogs(...) }) so the core
Tracer interface need not be expanded and external/custom tracers remain
backward-compatible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 4529-4533: The pipeline is being released too early while
pipelinePostHookRunner closes over pipeline; modify tryStreamRequest’s
short-circuit handling so the finalize/release
(bifrost.releasePluginPipeline(pipeline) and any finalize of aggregated
post-hook spans) is moved inside the goroutine that drains shortCircuit.Stream,
ensuring pipeline remains alive until that goroutine finishes; ensure
pipelinePostHookRunner calls (pipeline.RunPostLLMHooks) and
drainAndAttachPluginLogs/IsFinalChunk logic run before releasing the pipeline
and that any deferred release at the parent scope is removed to avoid
double-release.

In `@core/schemas/context_test.go`:
- Around line 305-307: Replace the raw string context key with a dedicated named
key type and use that key when calling scoped.SetValue and ctx.Value;
specifically, add a small test-only type (e.g. type testContextKey struct{} or
type contextKey string) and a package-scoped variable/const (e.g. testKey) and
change the calls from SetValue("custom-key", ...) and ctx.Value("custom-key") to
SetValue(testKey, ...) and ctx.Value(testKey) so the test follows the guideline
that context keys must be a dedicated named type; keep references to the
existing symbols SetValue, Value, scoped and ctx when making the change.

In `@examples/plugins/hello-world/main.go`:
- Line 28: The code uses inline context key casts like
schemas.BifrostContextKey("string-literal") (e.g., in calls to ctx.SetValue and
ctx.Value) which violates the no raw context keys rule; define typed constants
(e.g., helloWorldPluginTransportPreHookKey,
helloWorldPluginTransportPostHookKey, helloWorldPluginPreHookKey of type
schemas.BifrostContextKey) and replace all inline
schemas.BifrostContextKey("...") occurrences in functions such as the transport
pre/post hooks and pre-hook handlers with these constants so ctx.SetValue(...)
and ctx.Value(...) use the named constants instead of raw string casts.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 5817-5831: The resetPluginPipeline method currently only clears
some fields; fully scrub the PluginPipeline before returning it to the pool by
nil-ing or zeroing all remaining state: set llmPlugins and mcpPlugins to nil,
set logger and tracer to nil, set preHookErrors, postHookErrors,
postHookPluginOrder to nil (not just [:0]) to drop backing arrays, clear and nil
postHookTimings (or reinitialize to nil) to release map backing storage, and nil
any other retained references on the PluginPipeline struct so no stale
plugin/tracer state is kept alive across requests; make these changes in
resetPluginPipeline (the method on PluginPipeline) before pool.Put().

---

Duplicate comments:
In `@core/schemas/context.go`:
- Around line 401-428: ReleasePluginScope currently clears only a subset of
BifrostContext fields and can leak state; update ReleasePluginScope (the method
on BifrostContext) to clear all fields before putting the object back into
pluginScopePool: explicitly set userValues = nil, deadline = zero value, err =
nil, reset doneOnce (call its Reset method if available), set
blockRestrictedWrites = false, and clear any other receiver-local state in
addition to parent/done/pluginScope/pluginLogs/valueDelegate, then call
pluginScopePool.Put(bc).
- Around line 401-405: The BifrostContext methods WithPluginScope and
DrainPluginLogs currently mutate bc.pluginLogs unsafely; make pluginLogs
root-only and thread-safe by adding a guard (e.g., a mutex on BifrostContext)
and ensure only the root context initializes or returns pluginLogs to
pluginLogStorePool under that lock. Change WithPluginScope so it does not mutate
a scoped context's pluginLogs; instead resolve/initialize the store on the root
under the mutex and have scoped contexts hold a reference but not own it. Move
ownership-changing operations (like DrainPluginLogs) onto the root via
valueDelegate (call valueDelegate.DrainPluginLogs() from scoped contexts) so
only the root can return the store to pluginLogStorePool; update DrainPluginLogs
and WithPluginScope to use the new mutex and delegate ownership changes to
valueDelegate.

In `@core/schemas/trace.go`:
- Around line 49-50: The Trace reset leaves PluginLogs' backing array reachable
by slicing to zero length; replace the truncate with clearing the slice backing
storage (e.g., set t.PluginLogs = nil) so previous plugin log strings are not
retained, and ensure this niling occurs alongside resetting t.Attributes and
before putting the Trace back into the pool (look for the Trace type/reset
function and pool.Put usage to update).

In `@plugins/logging/main.go`:
- Around line 833-836: The current storing of a single *logstore.Log into
pendingLogs under traceID (in the block that reads
ctx.Value(schemas.BifrostContextKeyTraceID) and calls
p.pendingLogs.Store(traceID, entry)) overwrites concurrent/fallback log attempts
for the same trace; change pendingLogs to buffer multiple entries per trace
(e.g., store a slice or map of entries keyed by entry.ID under each traceID) or
use a composite key (traceID + entry.ID) when calling p.pendingLogs.Store so
each log is unique, and update the matching retrieval logic used by Inject() to
iterate/persist all buffered plugin_logs for that trace instead of only the last
stored value (adjust any types around PendingLogData, pendingLogs store/Load,
and Inject() accordingly).
- Around line 832-840: The deferred usage update races the deferred insert
because storeOrEnqueueEntry stores the log in p.pendingLogs for a traceID but
scheduleDeferredUsageUpdate updates DB by requestID immediately; change
scheduleDeferredUsageUpdate and storeOrEnqueueEntry/Inject logic so deferred
usage is merged into the buffered entry instead of writing separately or delayed
until after enqueue: specifically, when
scheduleDeferredUsageUpdate(requestID...) runs, check p.pendingLogs for the
traceID key and merge token usage/cost into that *logstore.Log* before any DB
write, or if not present, only then perform the DB update; alternatively make
Inject (or enqueueLogEntry) signal insertion completion (e.g., run the existing
callback) and have scheduleDeferredUsageUpdate wait for that signal before
updating. Ensure merging logic updates the same fields used by
Inject/enqueueLogEntry and that scheduleDeferredUsageUpdate references the same
traceID/requestID mapping used in storeOrEnqueueEntry and Inject.

In `@transports/bifrost-http/handlers/middlewares.go`:
- Around line 305-325: The pre/post-hook short-circuit paths are not draining
plugin logs before returning, so extract a helper (e.g.,
drainAndAttachPluginLogs(bifrostCtx, ctx)) that calls
bifrostCtx.DrainPluginLogs(), and if non-empty sets
ctx.SetUserValue(schemas.BifrostContextKeyTransportPluginLogs, logs); then
invoke this helper immediately before every early return in the HTTP pre-hook
and post-hook handling (locations around
HTTPTransportPreHook/HTTPTransportPostHook usage, before the error short-circuit
that sets ctx.SetStatusCode/SetBodyString and before the resp short-circuit that
calls applyHTTPResponseToCtx), ensuring pluginCtx.ReleasePluginScope() remains
called where appropriate.
- Around line 334-339: The deferred completer stored under
schemas.BifrostContextKeyTransportPostHookCompleter currently calls
runTransportPostHooks (which runs HTTPTransportPostHook mutations after a stream
ends); change it so the deferred callback does not invoke runTransportPostHooks
or any mutating post-hook (HTTPTransportPostHook) — instead implement a
non-mutating flush/cleanup callback (e.g., flush logs, release resources) that
is safe to run after streaming completion; keep streaming mutations
responsibility with HTTPTransportStreamChunkHook in the streaming handler and
ensure the new callback only performs non-mutating cleanup.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx`:
- Around line 48-52: The toggle button for the plugin logs list is missing the
required data-testid; add a stable 3-part test id to the <button> (the element
using setIsOpen/isOpen and rendering name and entries) by setting
data-testid={`plugin-logs-toggle-${slug}`}; if a slug prop already exists, use
that slug, otherwise compute a stable slug (e.g., lowercased, hyphenated from
name) into a local variable and use it in the data-testid on the button.
- Around line 20-35: The parsed JSON may not be a Record<string,
PluginLogEntry[]> so validate and sanitize before indexing: after
JSON.parse(pluginLogs) check that parsed is an object and then build pluginNames
by filtering Object.keys(parsed) to only include names where
Array.isArray(parsed[name]) (optionally further validate array items' shape) and
map invalid or non-array entries to an empty array or skip them; then pass
sanitized entries to <PluginSection name={name} entries={parsed[name]}> to avoid
the spread/lookup throwing in PluginSection.

---

Nitpick comments:
In `@core/schemas/tracer.go`:
- Around line 114-116: Extract AttachPluginLogs into an optional interface and
change call sites to use a type assertion: add a new exported interface
PluginLogAttacher with AttachPluginLogs(traceID string, logs []PluginLogEntry),
keep existing implementations (NoOpTracer and the production tracer) unchanged,
and update callers that currently call AttachPluginLogs directly to first check
if tracer implements PluginLogAttacher (e.g., if pla, ok :=
tracer.(PluginLogAttacher); ok { pla.AttachPluginLogs(...) }) so the core Tracer
interface need not be expanded and external/custom tracers remain
backward-compatible.

In `@framework/tracing/tracer.go`:
- Around line 334-344: The doc comment currently intended for Stop was left
above AttachPluginLogs and is being associated with AttachPluginLogs; move the
comment block that documents Stop so it sits immediately above the Stop method
declaration (not above AttachPluginLogs), ensuring the Stop function (e.g., func
(t *Tracer) Stop(...) ) has its original doc comment and AttachPluginLogs
retains only its own comment (or none) so godoc attaches the correct
documentation to each function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d753376-8a28-4f3c-921f-54fbb25365cd

📥 Commits

Reviewing files that changed from the base of the PR and between 31a1b3f and 6f32327.

📒 Files selected for processing (17)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/tables.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts

Comment thread core/bifrost.go
Comment thread core/schemas/context_test.go Outdated
Comment thread examples/plugins/hello-world/main.go
@akshaydeo force-pushed the 03-23-logging_in_plugins branch from 6f32327 to 8a53717 on March 23, 2026 at 02:36

@coderabbitai (bot) left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
framework/tracing/store.go (1)

75-89: ⚠️ Potential issue | 🟠 Major

Initialize RequestID on every pooled trace acquisition.

tracePool reuses schemas.Trace, but CreateTrace only writes RequestID when the variadic arg is present. A trace created without requestID can inherit the previous request's ID from the pool. Clear it before the optional assignment.

🛠️ Suggested fix
 	// Parent-child relationships are between spans, not traces.
 	// The root span's ParentID is set in StartSpan from context.
 	trace.ParentID = ""
+	trace.RequestID = ""
 	if len(requestID) > 0 {
 		trace.RequestID = requestID[0]
 	}
As per coding guidelines, "Always reset ALL fields of pooled objects before calling `pool.Put()` to prevent stale data from leaking to subsequent users."
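The reset-before-use rule can be illustrated with a simplified pool (the Trace struct and createTrace below are stand-ins for the real framework/tracing types, not the actual code):

```go
package main

import (
	"fmt"
	"sync"
)

type Trace struct {
	RequestID string
	ParentID  string
}

var tracePool = sync.Pool{New: func() any { return &Trace{} }}

// createTrace mirrors the suggested fix: every field is reset
// unconditionally before the optional assignment, so a pooled object
// cannot leak the previous request's ID.
func createTrace(requestID ...string) *Trace {
	t := tracePool.Get().(*Trace)
	t.ParentID = ""
	t.RequestID = "" // the pool may hand back a previously used Trace
	if len(requestID) > 0 {
		t.RequestID = requestID[0]
	}
	return t
}

func release(t *Trace) { tracePool.Put(t) }

func main() {
	a := createTrace("req-1")
	release(a)
	b := createTrace() // no request ID supplied
	fmt.Printf("%q\n", b.RequestID) // "" — never "req-1", pooled or not
}
```

Note that sync.Pool gives no reuse guarantee, which is exactly why the reset must be unconditional: the code has to be correct whether Get returns a recycled object or a fresh one.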
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/tracing/store.go` around lines 75 - 89, CreateTrace is reusing
pooled schemas.Trace objects but only sets RequestID when an argument is passed,
allowing stale RequestID to leak; inside TraceStore.CreateTrace (and when
getting from tracePool.Get()), explicitly reset trace.RequestID = "" before the
optional assignment and ensure any other pooled fields are cleared similarly so
no previous trace data is retained.
core/bifrost.go (2)

5826-5846: 🛠️ Refactor suggestion | 🟠 Major

Fully reset PluginPipeline before returning it to pluginPipelinePool.

resetPluginPipeline() releases the scoped contexts, but it still leaves llmPlugins, mcpPlugins, logger, tracer, and the backing storage for the error/name collections attached to the pooled object. That retains stale plugin state across reloads and breaks the pool-reset contract.

Suggested fix
 func (p *PluginPipeline) resetPluginPipeline() {
-	p.executedPreHooks = 0
-	p.preHookErrors = p.preHookErrors[:0]
-	p.postHookErrors = p.postHookErrors[:0]
-	// Reset streaming timing accumulation
-	p.chunkCount = 0
-	if p.postHookTimings != nil {
-		clear(p.postHookTimings)
-	}
-	p.postHookPluginOrder = p.postHookPluginOrder[:0]
 	// Release cached scoped contexts for streaming
 	for _, scopedCtx := range p.streamScopedCtxs {
 		scopedCtx.ReleasePluginScope()
 	}
-	p.streamScopedCtxs = nil
+	*p = PluginPipeline{}
 }

As per coding guidelines, "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users."

Also applies to: 5954-5956

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5826 - 5846, The resetPluginPipeline method
leaves several fields populated and can leak stale state when the PluginPipeline
is returned to pluginPipelinePool; update resetPluginPipeline on type
PluginPipeline to clear all remaining references: set llmPlugins and mcpPlugins
to nil (or zero-length slices), set logger and tracer to nil, reset/clear
preHookErrors and postHookErrors backing storage (not just slice headers) and
any name/error backing maps or slices used for error/name collections, clear
postHookPluginOrder and postHookTimings (or set to nil), and ensure any other
pointers or cached resources used by PluginPipeline are nil'd out before
returning to the pool so no stale plugin state survives across uses.

3788-3791: ⚠️ Potential issue | 🟠 Major

Drain pre-hook logs on every exit that bypasses post-hooks.

These paths return after pre-hooks have already run, but before any drainAndAttachPluginLogs() call. That means logs emitted during pre-hooks disappear on preReq == nil, queue-shutdown, queue-full, and cancellation failures—the exact cases you usually need for debugging. Add a best-effort drain before each terminal return that does not hand off to later post-hook or stream-finalization logic.

Also applies to: 4311-4403, 4601-4693, 5494-5504

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 3788 - 3791, The pre-hook exit paths (after
pipeline.RunLLMPreHooks returns) return without draining plugin logs; before any
terminal return where preReq is nil or when
short-circuiting/queue-shutdown/queue-full/cancellation errors occur, call the
existing drainAndAttachPluginLogs (best-effort) so pre-hook logs are not lost;
specifically, insert a call to drainAndAttachPluginLogs(...) (using the same
ctx/req/request context variables available in bifrost.go) immediately before
each cleanup()/return in the pre-hook early-return branches (including the
blocks around preReq == nil, queue shutdown/full, and cancellation failure) and
mirror this change in the other noted ranges (4311-4403, 4601-4693, 5494-5504).
Ensure the drain call is best-effort (ignore errors) and does not alter the
return error flow.
♻️ Duplicate comments (1)
core/schemas/context.go (1)

411-417: ⚠️ Potential issue | 🟠 Major

Fully zero scoped contexts before returning them to pluginScopePool.

Only the scoping fields are cleared here. A reused BifrostContext can still carry stale userValues, doneOnce, err, deadline, and blockRestrictedWrites state into the next plugin scope.

Suggested fix
 func (bc *BifrostContext) WithPluginScope(name *string) *BifrostContext {
 	// Lazily initialize the plugin log store on the root context (CAS to avoid race)
 	if bc.pluginLogs.Load() == nil {
 		newStore := pluginLogStorePool.Get().(*pluginLogStore)
 		if !bc.pluginLogs.CompareAndSwap(nil, newStore) {
 			// Another goroutine initialized first — return unused store to pool
 			pluginLogStorePool.Put(newStore)
 		}
 	}
 
 	scoped := pluginScopePool.Get().(*BifrostContext)
-	scoped.parent = bc.parent
-	scoped.done = bc.done
-	scoped.pluginScope = name
-	scoped.pluginLogs.Store(bc.pluginLogs.Load())
-	scoped.valueDelegate = bc
+	*scoped = BifrostContext{
+		parent:        bc.parent,
+		done:          bc.done,
+		pluginScope:   name,
+		valueDelegate: bc,
+	}
+	scoped.pluginLogs.Store(bc.pluginLogs.Load())
 	return scoped
 }
@@
 func (bc *BifrostContext) ReleasePluginScope() {
 	if bc.valueDelegate == nil {
 		return // not a scoped context
 	}
-	bc.parent = nil
-	bc.done = nil
-	bc.pluginScope = nil
-	bc.pluginLogs.Store(nil)
-	bc.valueDelegate = nil
+	*bc = BifrostContext{}
 	pluginScopePool.Put(bc)
 }

As per coding guidelines, "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users."

Also applies to: 423-432

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/context.go` around lines 411 - 417, The pooled BifrostContext
returned from pluginScopePool is only partially reset (fields like pluginScope,
pluginLogs, valueDelegate are set) but other mutable state (userValues,
doneOnce, err, deadline, blockRestrictedWrites, etc.) can leak; update the code
paths that obtain and return BifrostContext (e.g., where pluginScopePool.Get()
is used and where instances are Put back) to explicitly zero or reinitialize all
fields on the BifrostContext struct (including userValues map, doneOnce, err,
deadline, blockRestrictedWrites, and any other non-scope fields) before handing
it out and also before calling pluginScopePool.Put(), ensuring the object is
fully reset for the next user while preserving only intended parent/done
references when creating a scoped instance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/context.go`:
- Around line 415-416: WithPluginScope() currently assigns scoped.pluginLogs to
point directly at bc.pluginLogs, allowing a plugin to call DrainPluginLogs() and
return the shared store to pluginLogStorePool while other scopes still reference
it; instead, create a fresh plugin log store for scoped.pluginLogs and copy or
clone the entries from bc.pluginLogs (do not share the pointer) so
DrainPluginLogs() on the scoped context only drains that private copy; update
the two assignment sites (the scoped.pluginLogs.Store(bc.pluginLogs.Load())
usage around scoped.valueDelegate = bc and the similar block in lines ~473-492)
to allocate and store a cloned/copied store instance rather than the original
shared store and ensure DrainPluginLogs() returns only stores that were actually
allocated for that scope.

In `@core/schemas/trace.go`:
- Around line 52-65: In Trace.Reset, t.Spans is only truncated which leaves
references in the backing array; update Reset (function Trace.Reset) to clear
each element before shrinking (e.g. for i := range t.Spans { t.Spans[i] = Span{}
}) and then set t.Spans = nil (or t.Spans = t.Spans[:0] after zeroing) so no
stale pointers remain — mirror the existing PluginLogs clearing pattern to fully
reset pooled traces.
- Around line 42-50: GetRequestID and SetRequestID access Trace fields without
locking and Reset() mutates fields without holding the mutex, creating race
conditions; update Trace.GetRequestID and Trace.SetRequestID to acquire t.mu
(Lock/Unlock or Lock/defer Unlock) around their body and update Trace.Reset to
perform all field assignments while holding t.mu to consistently protect mutable
state (use the existing t.mu field used by AddSpan/AppendPluginLogs).

In `@core/schemas/tracer.go`:
- Around line 38-42: The change breaks the exported Tracer API by modifying
Tracer.CreateTrace and adding AttachPluginLogs, which will break any
caller-implemented tracer injected via BifrostConfig; revert the direct changes
to the Tracer interface and instead add a separate optional extension interface
(e.g., PluginLoggableTracer with AttachPluginLogs) and/or keep CreateTrace
signature unchanged, then update internal code to type-assert to the new
optional interface where needed (reference: Tracer, CreateTrace,
AttachPluginLogs, BifrostConfig).

In `@plugins/logging/writer.go`:
- Around line 39-43: The pendingInjectEntries struct currently only holds
entries []*logstore.Log and eviction uses entries[0].Timestamp (log timestamp)
which is wrong; add a createdAt time.Time field to pendingInjectEntries and
change eviction logic that checks pendingLogTTL to compare against createdAt
(not entries[0].Timestamp) so batches are expired by enqueue time; update all
places that construct pendingInjectEntries (including where pendingInjectEntries
is instantiated around lines referenced) to set createdAt = time.Now() when
creating the struct and adjust any cleanup code that referenced
entries[0].Timestamp to use the new createdAt field.

In `@transports/bifrost-http/lib/config.go`:
- Around line 2336-2341: The plugin scope created with pluginName via
ctx.WithPluginScope is not guaranteed to be released if
i.plugins[j].HTTPTransportStreamChunkHook panics; update the code around
pluginCtx := ctx.WithPluginScope(&pluginName) to call
pluginCtx.ReleasePluginScope() via defer immediately after creation so the
pooled context is always returned (even on panic) while keeping the existing
error handling that returns modified and wraps err from
HTTPTransportStreamChunkHook.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 5826-5846: The resetPluginPipeline method leaves several fields
populated and can leak stale state when the PluginPipeline is returned to
pluginPipelinePool; update resetPluginPipeline on type PluginPipeline to clear
all remaining references: set llmPlugins and mcpPlugins to nil (or zero-length
slices), set logger and tracer to nil, reset/clear preHookErrors and
postHookErrors backing storage (not just slice headers) and any name/error
backing maps or slices used for error/name collections, clear
postHookPluginOrder and postHookTimings (or set to nil), and ensure any other
pointers or cached resources used by PluginPipeline are nil'd out before
returning to the pool so no stale plugin state survives across uses.
- Around line 3788-3791: The pre-hook exit paths (after pipeline.RunLLMPreHooks
returns) return without draining plugin logs; before any terminal return where
preReq is nil or when short-circuiting/queue-shutdown/queue-full/cancellation
errors occur, call the existing drainAndAttachPluginLogs (best-effort) so
pre-hook logs are not lost; specifically, insert a call to
drainAndAttachPluginLogs(...) (using the same ctx/req/request context variables
available in bifrost.go) immediately before each cleanup()/return in the
pre-hook early-return branches (including the blocks around preReq == nil, queue
shutdown/full, and cancellation failure) and mirror this change in the other
noted ranges (4311-4403, 4601-4693, 5494-5504). Ensure the drain call is
best-effort (ignore errors) and does not alter the return error flow.

In `@framework/tracing/store.go`:
- Around line 75-89: CreateTrace is reusing pooled schemas.Trace objects but
only sets RequestID when an argument is passed, allowing stale RequestID to
leak; inside TraceStore.CreateTrace (and when getting from tracePool.Get()),
explicitly reset trace.RequestID = "" before the optional assignment and ensure
any other pooled fields are cleared similarly so no previous trace data is
retained.

---

Duplicate comments:
In `@core/schemas/context.go`:
- Around line 411-417: The pooled BifrostContext returned from pluginScopePool
is only partially reset (fields like pluginScope, pluginLogs, valueDelegate are
set) but other mutable state (userValues, doneOnce, err, deadline,
blockRestrictedWrites, etc.) can leak; update the code paths that obtain and
return BifrostContext (e.g., where pluginScopePool.Get() is used and where
instances are Put back) to explicitly zero or reinitialize all fields on the
BifrostContext struct (including userValues map, doneOnce, err, deadline,
blockRestrictedWrites, and any other non-scope fields) before handing it out and
also before calling pluginScopePool.Put(), ensuring the object is fully reset
for the next user while preserving only intended parent/done references when
creating a scoped instance.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5094a83-5fa9-4bc9-9ba1-71f4b1c33a71

📥 Commits

Reviewing files that changed from the base of the PR and between 6f32327 and 8a53717.

📒 Files selected for processing (22)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/tracing/store.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • plugins/logging/operations_test.go
  • plugins/logging/writer.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts
✅ Files skipped from review due to trivial changes (1)
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
🚧 Files skipped from review as they are similar to previous changes (7)
  • ui/lib/types/logs.ts
  • examples/plugins/hello-world/main.go
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • core/schemas/bifrost.go
  • core/schemas/context_test.go
  • plugins/logging/main.go
  • transports/bifrost-http/handlers/middlewares.go

Comment thread core/schemas/context.go
Comment thread core/schemas/trace.go
Comment thread core/schemas/trace.go
Comment thread core/schemas/tracer.go
Comment thread plugins/logging/writer.go Outdated
Comment thread transports/bifrost-http/lib/config.go Outdated
@akshaydeo force-pushed the 03-23-logging_in_plugins branch from 8a53717 to c8fd729 on March 23, 2026 at 08:42

@coderabbitai (bot) left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/bifrost.go (3)

5831-5845: 🛠️ Refactor suggestion | 🟠 Major

Reset the remaining PluginPipeline references before pooling it.

This now clears the new streaming-scope state, but llmPlugins, mcpPlugins, logger, and tracer still survive across sync.Pool reuse. That keeps stale references alive longer than necessary and breaks the repo rule for pooled objects.

♻️ Suggested fix
 func (p *PluginPipeline) resetPluginPipeline() {
+	p.llmPlugins = nil
+	p.mcpPlugins = nil
+	p.logger = nil
+	p.tracer = nil
 	p.executedPreHooks = 0
 	p.preHookErrors = p.preHookErrors[:0]
 	p.postHookErrors = p.postHookErrors[:0]

As per coding guidelines (`**/*.go`): "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5831 - 5845, The resetPluginPipeline method
currently clears streaming state but leaves other pooled references alive;
update PluginPipeline.resetPluginPipeline to also clear all remaining plugin and
observability references by setting p.llmPlugins = nil, p.mcpPlugins = nil and
clearing any other plugin slices, and set p.logger = nil and p.tracer = nil (or
their zero values) so no stale plugin or observability objects survive sync.Pool
reuse; keep the existing slice clears and scoped-context releases and ensure any
other plugin-related fields are similarly reset.

4294-4305: ⚠️ Potential issue | 🟠 Major

Seed a trace ID and drain once for the whole non-stream request.

These new attach sites still miss the default SDK path: drainAndAttachPluginLogs requires BifrostContextKeyTraceID, but tryRequest never initializes one, so calls that start from bifrost.ctx silently drop plugin logs. The helper is also only reached on a subset of exits here, which means pre-hook logs are still lost on earlier returns like preReq == nil, queue shutdown/full, or ctx.Done() before the worker replies.

🛠️ Suggested fix
 	tracer := bifrost.getTracer()
 	if tracer == nil {
 		return nil, newBifrostErrorFromMsg("tracer not found in context")
 	}
 
 	// Store tracer in context BEFORE calling requestHandler, so streaming goroutines
 	// have access to it for completing deferred spans when the stream ends.
 	// The streaming goroutine captures the context when it starts, so these values
 	// must be set before requestHandler() is called.
 	ctx.SetValue(schemas.BifrostContextKeyTracer, tracer)
+	if _, ok := ctx.Value(schemas.BifrostContextKeyTraceID).(string); !ok {
+		if traceID := tracer.CreateTrace(""); traceID != "" {
+			ctx.SetValue(schemas.BifrostContextKeyTraceID, traceID)
+		}
+	}
 
 	pipeline := bifrost.getPluginPipeline()
 	defer bifrost.releasePluginPipeline(pipeline)
+	defer drainAndAttachPluginLogs(ctx)

Also applies to: 4411-4430

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4294 - 4305, tryRequest currently never seeds
BifrostContextKeyTraceID and only calls drainAndAttachPluginLogs on some return
paths, causing plugin logs to be dropped; update tryRequest to generate/attach a
trace ID to the bifrost context at the start (set BifrostContextKeyTraceID in
the ctx used throughout), and restructure the non-stream exit paths so
drainAndAttachPluginLogs is invoked exactly once before every return (including
early returns like preReq == nil, queue shutdown/full, context cancellation, and
the short-circuit branches that call RunPostLLMHooks), ensuring the same ctx
with the seeded trace ID is passed to RunPostLLMHooks and any other callers.

5470-5480: ⚠️ Potential issue | 🟠 Major

Direct MCP tool calls still have no trace context for these new attaches.

ExecuteChatMCPTool and ExecuteResponsesMCPTool can reach handleMCPToolExecution with bifrost.ctx, but this function never seeds BifrostContextKeyTracer or BifrostContextKeyTraceID. The new drainAndAttachPluginLogs calls therefore no-op for the public manual MCP entrypoints, and the preReq == nil branch still bypasses attachment entirely.

🛠️ Suggested fix
 func (bifrost *Bifrost) handleMCPToolExecution(ctx *schemas.BifrostContext, mcpRequest *schemas.BifrostMCPRequest, requestType schemas.RequestType) (*schemas.BifrostMCPResponse, *schemas.BifrostError) {
 	if bifrost.MCPManager == nil {
 		return nil, &schemas.BifrostError{
@@
 	if _, ok := ctx.Value(schemas.BifrostContextKeyRequestID).(string); !ok {
 		ctx.SetValue(schemas.BifrostContextKeyRequestID, uuid.New().String())
 	}
+
+	tracer := bifrost.getTracer()
+	ctx.SetValue(schemas.BifrostContextKeyTracer, tracer)
+	if _, ok := ctx.Value(schemas.BifrostContextKeyTraceID).(string); !ok {
+		if traceID := tracer.CreateTrace(""); traceID != "" {
+			ctx.SetValue(schemas.BifrostContextKeyTraceID, traceID)
+		}
+	}
+	defer drainAndAttachPluginLogs(ctx)
 
 	// Get plugin pipeline for MCP hooks
 	pipeline := bifrost.getPluginPipeline()
 	defer bifrost.releasePluginPipeline(pipeline)

Also applies to: 5539-5540

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5470 - 5480, Direct MCP tool entrypoints
(ExecuteChatMCPTool, ExecuteResponsesMCPTool) call handleMCPToolExecution with
bifrost.ctx but the function never seeds BifrostContextKeyTracer or
BifrostContextKeyTraceID, so drainAndAttachPluginLogs and the preReq == nil
branch no-op for manual calls; fix by ensuring handleMCPToolExecution (at its
start) checks for missing BifrostContextKeyTracer/BifrostContextKeyTraceID and
seeds them into the context (e.g., create a tracer/trace ID or derive from
existing tracing provider and store via context.WithValue) before any calls to
drainAndAttachPluginLogs or before the preReq nil-check path so attachments run
correctly; apply the same seeding logic to the other occurrence noted around the
5539-5540 area.
♻️ Duplicate comments (2)
ui/app/workspace/logs/views/pluginLogsView.tsx (1)

19-30: ⚠️ Potential issue | 🟠 Major

Validate each parsed array item before rendering it.

Array.isArray(value) only proves the group is an array. A parseable payload like [null] or [{ timestamp: "x" }] still reaches the sort/render path and can blow up when PluginSection reads timestamp, level, or message. Please type-guard each item before casting it to PluginLogEntry[].

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx` around lines 19 - 30,
PluginLogsView currently accepts any array values as PluginLogEntry[] after
JSON.parse, which lets invalid items like null or malformed objects through;
update the parsing branch in PluginLogsView to validate each array element
before casting by iterating over each entry in the filtered arrays (the variable
parsed) and keeping only items that match the PluginLogEntry shape (e.g., object
with timestamp as a string or number, level as expected enum/string, and message
as string), discarding or logging invalid entries so downstream consumers such
as PluginSection only receive properly typed PluginLogEntry[].
plugins/logging/main.go (1)

103-163: ⚠️ Potential issue | 🟠 Major

Don't make deferred usage depend on a timed DB poll.

Line 147 checks for a row that traced requests intentionally will not have until Inject() drains pendingLogsToInject. That still makes deferred usage a best-effort 3-retry window, so slow trace flushing or write backlog can silently drop usage for large responses. Please merge deferred usage into the buffered entry while the trace path is active, and only fall back to store.Update after the insert is actually enqueued.

Also applies to: 876-905

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 103 - 163, scheduleDeferredUsageUpdate
currently polls the DB (via IsLogEntryPresent and Update) and may drop deferred
usage when Inject() hasn't yet enqueued the pending log; instead, merge the
deferred usage into the in-memory buffered log during the trace path so it is
written as part of the original insert and only use store.Update as a fallback.
Concretely: when deferredChan yields deferredUsage in
scheduleDeferredUsageUpdate, first try to attach it to the pending log that
Inject() is about to enqueue (use the same pendingLogsToInject structure/queue
or add an accessor on the LoggerPlugin so scheduleDeferredUsageUpdate can call a
method to merge TokenUsageParsed into the pending log entry); only if that merge
cannot be performed (not enqueued after a short wait or missing) then perform
the DB existence/backoff logic and call store.Update. Update Inject() to
accept/merge deferred usage (or expose a thread-safe merge helper) so deferred
usage is never lost by racing the insert.
🧹 Nitpick comments (2)
transports/bifrost-http/server/server.go (1)

1318-1322: Add a regression test for this tracing/transport nesting.

This ordering looks correct, but it depends on a subtle contract with TransportInterceptorMiddleware’s deferred post-hook completion path. A focused test here would help prevent future reorderings from dropping transport plugin logs from traces, especially on streaming requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/server/server.go` around lines 1318 - 1322, Add a
regression test that asserts the current middleware nesting
(TransportInterceptorMiddleware runs inside TracingMiddleware so tracing defer
executes after transport post-hooks) by exercising a request (including a
streaming case) through the server and verifying that transport plugin logs
produced in TransportInterceptorMiddleware's post-hook appear in the trace
emitted by TracingMiddleware; target the middleware stack created where
inferenceMiddlewares are composed (reference
TransportInterceptorMiddleware/handlers.TransportInterceptorMiddleware and
s.TracingMiddleware.Middleware()) and simulate a transport plugin that logs in
its deferred/post path to ensure the tracing defer runs after the transport
post-hook.
plugins/logging/main.go (1)

852-874: Consider defensive slice copying in CAS loop.

While the current code is safe (slice literals with cap=1 force new array allocation on first append), the suggested fix defensively ensures no shared backing array between CAS attempts—preventing potential regressions if initialization changes or if similar patterns appear elsewhere.

Suggested fix
-			pending := existing.(*pendingInjectEntries)
-			updated := &pendingInjectEntries{entries: append(pending.entries, entry), createdAt: pending.createdAt}
+			pending := existing.(*pendingInjectEntries)
+			nextEntries := make([]*logstore.Log, len(pending.entries)+1)
+			copy(nextEntries, pending.entries)
+			nextEntries[len(pending.entries)] = entry
+			updated := &pendingInjectEntries{entries: nextEntries, createdAt: pending.createdAt}
 			if p.pendingLogsToInject.CompareAndSwap(traceID, existing, updated) {
 				return
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 852 - 874, The CAS loop in
storeOrEnqueueEntry may append to pending.entries and risk sharing the
underlying slice across retries; modify the update path in the for-loop inside
storeOrEnqueueEntry so you defensively copy the existing slice before appending
(e.g., newEntries := append([]*logstore.Log(nil), pending.entries...);
newEntries = append(newEntries, entry)) and then create updated :=
&pendingInjectEntries{entries: newEntries, createdAt: pending.createdAt} before
calling p.pendingLogsToInject.CompareAndSwap(traceID, existing, updated); keep
the rest of the loop and behavior for pendingLogsToInject and
pendingInjectEntries unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 4535-4546: The current code only calls drainAndAttachPluginLogs
when IsFinalChunk(ctx) is true, which misses cases where plugin-owned
shortCircuit.Stream closes without emitting a terminal chunk; update the
goroutine cleanup so drainAndAttachPluginLogs(ctx) is invoked unconditionally
when the stream goroutine exits (e.g., add drainAndAttachPluginLogs(ctx) into
the goroutine's defer before pipeline.FinalizeStreamingPostHookSpans(ctx) and
bifrost.releasePluginPipeline(pipeline)), ensuring logs are always drained and
attached even if Stream never emits StreamEndIndicator.

In `@framework/logstore/tables.go`:
- Around line 131-132: PluginLogs is currently exposed as a raw JSON blob;
change it to follow the same pattern as Metadata by hiding the raw storage blob
(make PluginLogs a *string with json:"-") and add a structured field (e.g.,
PluginLogsMap or PluginLogsParsed) that holds the deserialized map/struct to be
returned by the API. Update SerializeFields and DeserializeFields to marshal the
structured field into PluginLogs (when saving) and unmarshal PluginLogs into the
structured field (when loading), mirroring the Metadata
serialization/deserialization logic and using the PluginLogs/PluginLogsMap
identifiers to locate the changes.

In `@framework/tracing/tracer.go`:
- Around line 332-346: The documentation comment for Stop was accidentally left
above AttachPluginLogs; move the comment block "Stop stops the tracer and
releases its resources. This stops the internal TraceStore's cleanup goroutine."
so it sits immediately above the Stop method declaration (not above
AttachPluginLogs), leaving AttachPluginLogs with its own comment
("AttachPluginLogs appends plugin log entries...") or no orphaned doc; ensure
the comment remains a proper Go doc comment for the Tracer.Stop method and that
AttachPluginLogs and Stop definitions are contiguous with their respective
comments.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 5831-5845: The resetPluginPipeline method currently clears
streaming state but leaves other pooled references alive; update
PluginPipeline.resetPluginPipeline to also clear all remaining plugin and
observability references by setting p.llmPlugins = nil, p.mcpPlugins = nil and
clearing any other plugin slices, and set p.logger = nil and p.tracer = nil (or
their zero values) so no stale plugin or observability objects survive sync.Pool
reuse; keep the existing slice clears and scoped-context releases and ensure any
other plugin-related fields are similarly reset.
- Around line 4294-4305: tryRequest currently never seeds
BifrostContextKeyTraceID and only calls drainAndAttachPluginLogs on some return
paths, causing plugin logs to be dropped; update tryRequest to generate/attach a
trace ID to the bifrost context at the start (set BifrostContextKeyTraceID in
the ctx used throughout), and restructure the non-stream exit paths so
drainAndAttachPluginLogs is invoked exactly once before every return (including
early returns like preReq == nil, queue shutdown/full, context cancellation, and
the short-circuit branches that call RunPostLLMHooks), ensuring the same ctx
with the seeded trace ID is passed to RunPostLLMHooks and any other callers.
- Around line 5470-5480: Direct MCP tool entrypoints (ExecuteChatMCPTool,
ExecuteResponsesMCPTool) call handleMCPToolExecution with bifrost.ctx but the
function never seeds BifrostContextKeyTracer or BifrostContextKeyTraceID, so
drainAndAttachPluginLogs and the preReq == nil branch no-op for manual calls;
fix by ensuring handleMCPToolExecution (at its start) checks for missing
BifrostContextKeyTracer/BifrostContextKeyTraceID and seeds them into the context
(e.g., create a tracer/trace ID or derive from existing tracing provider and
store via context.WithValue) before any calls to drainAndAttachPluginLogs or
before the preReq nil-check path so attachments run correctly; apply the same
seeding logic to the other occurrence noted around the 5539-5540 area.

---

Duplicate comments:
In `@plugins/logging/main.go`:
- Around line 103-163: scheduleDeferredUsageUpdate currently polls the DB (via
IsLogEntryPresent and Update) and may drop deferred usage when Inject() hasn't
yet enqueued the pending log; instead, merge the deferred usage into the
in-memory buffered log during the trace path so it is written as part of the
original insert and only use store.Update as a fallback. Concretely: when
deferredChan yields deferredUsage in scheduleDeferredUsageUpdate, first try to
attach it to the pending log that Inject() is about to enqueue (use the same
pendingLogsToInject structure/queue or add an accessor on the LoggerPlugin so
scheduleDeferredUsageUpdate can call a method to merge TokenUsageParsed into the
pending log entry); only if that merge cannot be performed (not enqueued after a
short wait or missing) then perform the DB existence/backoff logic and call
store.Update. Update Inject() to accept/merge deferred usage (or expose a
thread-safe merge helper) so deferred usage is never lost by racing the insert.

In `@ui/app/workspace/logs/views/pluginLogsView.tsx`:
- Around line 19-30: PluginLogsView currently accepts any array values as
PluginLogEntry[] after JSON.parse, which lets invalid items like null or
malformed objects through; update the parsing branch in PluginLogsView to
validate each array element before casting by iterating over each entry in the
filtered arrays (the variable parsed) and keeping only items that match the
PluginLogEntry shape (e.g., object with timestamp as a string or number, level
as expected enum/string, and message as string), discarding or logging invalid
entries so downstream consumers such as PluginSection only receive properly
typed PluginLogEntry[].

---

Nitpick comments:
In `@plugins/logging/main.go`:
- Around line 852-874: The CAS loop in storeOrEnqueueEntry may append to
pending.entries and risk sharing the underlying slice across retries; modify the
update path in the for-loop inside storeOrEnqueueEntry so you defensively copy
the existing slice before appending (e.g., newEntries :=
append([]*logstore.Log(nil), pending.entries...); newEntries =
append(newEntries, entry)) and then create updated :=
&pendingInjectEntries{entries: newEntries, createdAt: pending.createdAt} before
calling p.pendingLogsToInject.CompareAndSwap(traceID, existing, updated); keep
the rest of the loop and behavior for pendingLogsToInject and
pendingInjectEntries unchanged.

In `@transports/bifrost-http/server/server.go`:
- Around line 1318-1322: Add a regression test that asserts the current
middleware nesting (TransportInterceptorMiddleware runs inside TracingMiddleware
so tracing defer executes after transport post-hooks) by exercising a request
(including a streaming case) through the server and verifying that transport
plugin logs produced in TransportInterceptorMiddleware's post-hook appear in the
trace emitted by TracingMiddleware; target the middleware stack created where
inferenceMiddlewares are composed (reference
TransportInterceptorMiddleware/handlers.TransportInterceptorMiddleware and
s.TracingMiddleware.Middleware()) and simulate a transport plugin that logs in
its deferred/post path to ensure the tracing defer runs after the transport
post-hook.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9b5b4ebd-d769-4ee5-99af-570d6e727ac1

📥 Commits

Reviewing files that changed from the base of the PR and between 8a53717 and c8fd729.

📒 Files selected for processing (22)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/tracing/store.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • plugins/logging/operations_test.go
  • plugins/logging/writer.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts
✅ Files skipped from review due to trivial changes (4)
  • ui/lib/types/logs.ts
  • core/schemas/context.go
  • core/schemas/context_test.go
  • transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • framework/logstore/store.go
  • plugins/logging/writer.go
  • framework/tracing/store.go
  • plugins/logging/operations_test.go
  • core/schemas/tracer.go
  • core/schemas/bifrost.go
  • core/schemas/trace.go
  • examples/plugins/hello-world/main.go

@akshaydeo akshaydeo force-pushed the 03-23-logging_in_plugins branch from c8fd729 to 7150013 Compare March 23, 2026 13:28

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
plugins/logging/main.go (1)

141-161: ⚠️ Potential issue | 🟠 Major

Retrying IsLogEntryPresent still leaves deferred usage lossy.

Traced requests keep the row in pendingLogsToInject, and later in the async write queue, so the store can legitimately report “missing” longer than this fixed retry window. After that, the deferred token-usage update is dropped permanently. Merge the usage into the buffered entry or apply it only after the insert has been acknowledged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 141 - 161, The current retry loop on
p.store.IsLogEntryPresent can permanently drop deferred usage; instead, when
IsLogEntryPresent returns false, check the in-memory buffer
(pendingLogsToInject) for requestID and merge usageUpdates into that buffered
entry (or attach the usage update to the async write queue entry) so the token
usage is applied when the pending log is flushed; only fall back to the
store.Update call (p.store.Update) if no buffered entry exists, and ensure any
enqueue logic ties the merged usage to the same requestID so the insert
acknowledgement applies the update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/plugins/hello-world/main.go`:
- Around line 34-35: The example uses a non-existent method ctx.Log which breaks
compilation; update the example hooks (e.g., HTTPTransportPreHook and the other
occurrences) to use the actual logging surface instead of ctx.Log — for example
call the context's logger methods (e.g., ctx.Logger().Info/Debug/Error or
whatever logger accessor exists on schemas.BifrostContext) with the same message
and appropriate level, or alternatively add a Log(level schemas.LogLevel, msg
string) method to *schemas.BifrostContext that delegates to the real logger;
apply the same change to all occurrences mentioned so the example compiles.

In `@plugins/logging/main.go`:
- Around line 858-866: The CAS loop in storeOrEnqueueEntry mutates a shared
slice backing array (pendingInjectEntries.entries) via append, risking
lost/corrupted entries under concurrent calls; fix by making a copy of
pending.entries before appending (e.g., newEntries :=
append([]*logstore.Log(nil), pending.entries...); newEntries =
append(newEntries, entry)) and use that copied slice when constructing the
updated *pendingInjectEntries used in p.pendingLogsToInject.CompareAndSwap,
leaving the original pending.entries unmodified.

---

Duplicate comments:
In `@plugins/logging/main.go`:
- Around line 141-161: The current retry loop on p.store.IsLogEntryPresent can
permanently drop deferred usage; instead, when IsLogEntryPresent returns false,
check the in-memory buffer (pendingLogsToInject) for requestID and merge
usageUpdates into that buffered entry (or attach the usage update to the async
write queue entry) so the token usage is applied when the pending log is
flushed; only fall back to the store.Update call (p.store.Update) if no buffered
entry exists, and ensure any enqueue logic ties the merged usage to the same
requestID so the insert acknowledgement applies the update.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 23fd57c4-5b98-4f09-af9d-31b3d9235af3

📥 Commits

Reviewing files that changed from the base of the PR and between c8fd729 and 01f1f31.

📒 Files selected for processing (22)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/tracing/store.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • plugins/logging/operations_test.go
  • plugins/logging/writer.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts
✅ Files skipped from review due to trivial changes (4)
  • core/schemas/context_test.go
  • framework/logstore/rdb.go
  • core/schemas/bifrost.go
  • core/bifrost.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • framework/tracing/store.go
  • framework/tracing/tracer.go
  • core/schemas/context.go
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • plugins/logging/operations_test.go

@akshaydeo akshaydeo changed the base branch from main to graphite-base/2215 March 23, 2026 14:11
@akshaydeo akshaydeo force-pushed the graphite-base/2215 branch from a3b439d to d41b51a Compare March 23, 2026 14:11
@akshaydeo akshaydeo force-pushed the 03-23-logging_in_plugins branch from 01f1f31 to 3657b8d Compare March 23, 2026 14:11
@akshaydeo akshaydeo changed the base branch from graphite-base/2215 to v1.5.0 March 23, 2026 14:11

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
core/schemas/context.go (1)

401-432: ⚠️ Potential issue | 🟠 Major

Fully reset and finish delegating scoped contexts before pooling them.

WithPluginScope only rehydrates a few fields and ReleasePluginScope only clears a few. The wrapper still has independent doneOnce, userValues, hasDeadline, err, and blockRestrictedWrites state, while methods like Cancel(), GetUserValues(), and GetParentCtxWithUserValues() can still observe that local state. That lets stale state leak on reuse, and Cancel() on a scoped context can double-close the shared done channel, because scoped.done = bc.done while doneOnce is not shared. Zero the whole struct before pluginScopePool.Put() and either delegate the remaining helpers or parent the scope to the root context. As per coding guidelines, "Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/context.go` around lines 401 - 432, The scoped BifrostContext
returned by WithPluginScope and returned to pluginScopePool in
ReleasePluginScope can leak stale local state (doneOnce, userValues,
hasDeadline, err, blockRestrictedWrites, etc.) because only some fields are
cleared; update ReleasePluginScope to fully reset the scoped wrapper before
pluginScopePool.Put(): ensure you clear or zero all fields (including doneOnce,
userValues map/slice, hasDeadline, err, blockRestrictedWrites and any other
local fields), preserve shared references only via
valueDelegate/parent/done/pluginLogs as intended (or explicitly delegate helper
methods to the root context), and then put the fully-zeroed struct back into
pluginScopePool to prevent double-close and state leakage when reusing the
pooled object (references: WithPluginScope, ReleasePluginScope, pluginScopePool,
pluginLogStorePool, and the fields
doneOnce/userValues/hasDeadline/err/blockRestrictedWrites/done/pluginLogs/valueDelegate).
plugins/logging/main.go (2)

860-867: ⚠️ Potential issue | 🔴 Critical

Copy-on-write is required in the CAS loop to avoid dropped/corrupted pending entries.

append(pending.entries, entry) can mutate a shared backing array before CompareAndSwap succeeds. Concurrent writers on the same traceID can clobber each other.

💡 Proposed fix
-			pending := existing.(*pendingInjectEntries)
-			updated := &pendingInjectEntries{entries: append(pending.entries, entry), createdAt: pending.createdAt}
+			pending := existing.(*pendingInjectEntries)
+			nextEntries := make([]*logstore.Log, len(pending.entries)+1)
+			copy(nextEntries, pending.entries)
+			nextEntries[len(pending.entries)] = entry
+			updated := &pendingInjectEntries{
+				entries:   nextEntries,
+				createdAt: pending.createdAt,
+			}
 			if p.pendingLogsToInject.CompareAndSwap(traceID, existing, updated) {
 				return
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 860 - 867, The CAS loop in
pendingLogsToInject risks corrupting shared slice backing arrays because
append(pending.entries, entry) may mutate pending.entries before CompareAndSwap
succeeds; modify the loop that uses p.pendingLogsToInject.LoadOrStore /
CompareAndSwap to perform true copy-on-write by allocating a new slice (make
with len+1), copying pending.entries into it, appending the new entry into that
fresh slice, and then use CompareAndSwap with the updated pendingInjectEntries;
ensure this change is applied where pendingInjectEntries is constructed/updated
so concurrent writers on the same traceID cannot clobber each other.

141-160: ⚠️ Potential issue | 🟠 Major

Deferred usage can still be lost when DB visibility lags behind queued inserts.

A fixed 3-retry presence check can fail under queue backlog or delayed Inject(), causing token/cost backfill to be dropped even though the entry is eventually persisted.

Please merge deferred usage into the in-memory pending trace entry (when available) and let Inject() enqueue the merged record, instead of relying on short DB-presence retries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/main.go` around lines 141 - 160, The current 3-attempt DB
presence check using p.store.IsLogEntryPresent(...) can drop deferred usage if
DB visibility lags; instead, locate the retry block (the loop referencing
p.store.IsLogEntryPresent, p.logger.Warn and requestID) and when the entry is
not found, retrieve the in-memory pending trace entry for requestID (the
pending/queue structure used by Inject()), merge the deferred usage/cost into
that pending trace object, and then call p.Inject(...) to enqueue the merged
record for persistence; remove the early return so deferred usage is not lost
and ensure errors from IsLogEntryPresent still log via p.logger.Warn.
🧹 Nitpick comments (1)
transports/bifrost-http/handlers/middlewares.go (1)

341-346: Align the plugin contract with the new streaming post-hook behavior.

runTransportPostHooks(...) is now intentionally deferred until streamed responses finish, but core/schemas/plugin.go still says HTTPTransportPostHook is not called for streaming responses. Please update that interface comment/tests in this stack so plugin authors do not implement against stale semantics. Based on learnings, running HTTPTransportPostHook after stream completion here is intentional for logging, metrics, and cleanup; the contract should say that explicitly.

Also applies to: 355-365

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/middlewares.go` around lines 341 - 346, The
plugin contract comment and related tests are stale: streaming responses now
defer calling runTransportPostHooks until after the stream completes (see usage
of schemas.BifrostContextKeyDeferTraceCompletion and the callback stored under
schemas.BifrostContextKeyTransportPostHookCompleter), so update
core/schemas/plugin.go to state that HTTPTransportPostHook will be invoked after
streamed responses finish (not skipped), and adjust any tests that assert the
old behavior to expect post-hook execution after stream completion; ensure
references to HTTPTransportPostHook, runTransportPostHooks, and the
BifrostContextKey* symbols are mentioned so plugin authors understand the new
post-stream invocation contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@transports/bifrost-http/handlers/middlewares.go`:
- Around line 311-317: The code currently writes err.Error() to the response
body exposing sensitive plugin/provider details; instead, log the full error
(and plugin logs drained via bifrostCtx.DrainPluginLogs()) to your internal
logger (e.g., processLogger or bifrostCtx.Logger) with context, then return a
stable non-sensitive message to the client—replace
ctx.SetBodyString(err.Error()) with something like ctx.SetBodyString("transport
plugin error") (keep ctx.SetStatusCode(fasthttp.StatusInternalServerError) and
ctx.SetUserValue(schemas.BifrostContextKeyTransportPluginLogs, logs) as-is).

---

Duplicate comments:
In `@core/schemas/context.go`:
- Around line 401-432: The scoped BifrostContext returned by WithPluginScope and
returned to pluginScopePool in ReleasePluginScope can leak stale local state
(doneOnce, userValues, hasDeadline, err, blockRestrictedWrites, etc.) because
only some fields are cleared; update ReleasePluginScope to fully reset the
scoped wrapper before pluginScopePool.Put(): ensure you clear or zero all fields
(including doneOnce, userValues map/slice, hasDeadline, err,
blockRestrictedWrites and any other local fields), preserve shared references
only via valueDelegate/parent/done/pluginLogs as intended (or explicitly
delegate helper methods to the root context), and then put the fully-zeroed
struct back into pluginScopePool to prevent double-close and state leakage when
reusing the pooled object (references: WithPluginScope, ReleasePluginScope,
pluginScopePool, pluginLogStorePool, and the fields
doneOnce/userValues/hasDeadline/err/blockRestrictedWrites/done/pluginLogs/valueDelegate).
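The full-reset pattern this comment asks for can be sketched as follows. The field names here are illustrative stand-ins, not the actual Bifrost symbols; the point is that the wrapper is zeroed wholesale before `Put()` so no field is forgotten:

```go
package main

import "sync"

// scopedCtx stands in for the pooled scoped-context wrapper;
// these field names are illustrative, not the real Bifrost fields.
type scopedCtx struct {
	parent      any
	userValues  map[any]any
	doneOnce    *sync.Once
	err         error
	hasDeadline bool
}

var scopePool = sync.Pool{
	New: func() any { return &scopedCtx{} },
}

// releaseScope zeroes every local field before returning the wrapper
// to the pool, so a reused object cannot leak state or double-close.
func releaseScope(s *scopedCtx) {
	*s = scopedCtx{} // full reset: clears doneOnce, userValues, err, flags
	scopePool.Put(s)
}
```

Assigning a fresh zero value is harder to get wrong than clearing fields one by one, since adding a new field later cannot silently leak.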

In `@plugins/logging/main.go`:
- Around line 860-867: The CAS loop in pendingLogsToInject risks corrupting
shared slice backing arrays because append(pending.entries, entry) may mutate
pending.entries before CompareAndSwap succeeds; modify the loop that uses
p.pendingLogsToInject.LoadOrStore / CompareAndSwap to perform true copy-on-write
by allocating a new slice (make with len+1), copying pending.entries into it,
appending the new entry into that fresh slice, and then use CompareAndSwap with
the updated pendingInjectEntries; ensure this change is applied where
pendingInjectEntries is constructed/updated so concurrent writers on the same
traceID cannot clobber each other.
- Around line 141-160: The current 3-attempt DB presence check using
p.store.IsLogEntryPresent(...) can drop deferred usage if DB visibility lags;
instead, locate the retry block (the loop referencing p.store.IsLogEntryPresent,
p.logger.Warn and requestID) and when the entry is not found, retrieve the
in-memory pending trace entry for requestID (the pending/queue structure used by
Inject()), merge the deferred usage/cost into that pending trace object, and
then call p.Inject(...) to enqueue the merged record for persistence; remove the
early return so deferred usage is not lost and ensure errors from
IsLogEntryPresent still log via p.logger.Warn.
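The copy-on-write contract from the first comment above can be sketched like this. The `pendingLogs` shape is a hypothetical stand-in for the plugin's per-trace entry list, and `sync.Map.CompareAndSwap` requires Go 1.20+:

```go
package main

import "sync"

// pendingLogs stands in for the per-trace pending entry list; it is
// stored by pointer so sync.Map.CompareAndSwap can compare identities.
type pendingLogs struct {
	entries []string
}

var pendingByTrace sync.Map // traceID -> *pendingLogs

// appendPending does true copy-on-write: it never appends to the
// shared slice, so a failed CompareAndSwap cannot have mutated the
// backing array that another reader may still hold.
func appendPending(traceID, entry string) {
	for {
		v, _ := pendingByTrace.LoadOrStore(traceID, &pendingLogs{})
		old := v.(*pendingLogs)
		fresh := make([]string, len(old.entries)+1)
		copy(fresh, old.entries)
		fresh[len(old.entries)] = entry
		if pendingByTrace.CompareAndSwap(traceID, old, &pendingLogs{entries: fresh}) {
			return
		}
		// lost the race; reload the current value and retry
	}
}
```

The key difference from the flagged code is that `append` on the loaded slice is replaced by `make` + `copy` into a fresh slice, so nothing shared is written before the CAS succeeds.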

---

Nitpick comments:
In `@transports/bifrost-http/handlers/middlewares.go`:
- Around line 341-346: The plugin contract comment and related tests are stale:
streaming responses now defer calling runTransportPostHooks until after the
stream completes (see usage of schemas.BifrostContextKeyDeferTraceCompletion and
the callback stored under schemas.BifrostContextKeyTransportPostHookCompleter),
so update core/schemas/plugin.go to state that HTTPTransportPostHook will be
invoked after streamed responses finish (not skipped), and adjust any tests that
assert the old behavior to expect post-hook execution after stream completion;
ensure references to HTTPTransportPostHook, runTransportPostHooks, and the
BifrostContextKey* symbols are mentioned so plugin authors understand the new
post-stream invocation contract.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6816d994-80a7-442f-9140-ef0f984643d1

📥 Commits

Reviewing files that changed from the base of the PR and between 01f1f31 and 3657b8d.

📒 Files selected for processing (22)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/tracing/store.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • plugins/logging/operations_test.go
  • plugins/logging/writer.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts
✅ Files skipped from review due to trivial changes (6)
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • plugins/logging/operations_test.go
  • ui/lib/types/logs.ts
  • transports/bifrost-http/lib/config.go
  • plugins/logging/writer.go
  • core/schemas/bifrost.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/logstore/migrations.go
  • transports/bifrost-http/server/server.go
  • core/schemas/trace.go
  • framework/tracing/store.go
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • core/bifrost.go

Comment thread transports/bifrost-http/handlers/middlewares.go
@akshaydeo akshaydeo force-pushed the 03-23-logging_in_plugins branch from 3657b8d to 7939f85 on March 23, 2026 at 17:41

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/schemas/context.go (1)

321-339: ⚠️ Potential issue | 🟡 Minor

GetUserValues may not return complete values for scoped contexts.

Scoped contexts delegate Value/SetValue to valueDelegate (the root), but GetUserValues reads directly from bc.userValues and merges with bc.parent. Since scoped.parent is set to the root's parent (not the root itself), and scoped.userValues is empty (all writes delegate), GetUserValues() on a scoped context will miss values stored on the root.

If plugins call GetUserValues() on their scoped context, they won't see values set on the root BifrostContext.

🔧 Suggested fix to delegate GetUserValues for scoped contexts
 func (bc *BifrostContext) GetUserValues() map[any]any {
+	if bc.valueDelegate != nil {
+		return bc.valueDelegate.GetUserValues()
+	}
 	result := make(map[any]any)
 
 	// First, get parent's user values if parent is a PluginContext
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/context.go` around lines 321 - 339, GetUserValues currently
reads bc.userValues and bc.parent, which misses values when this BifrostContext
is a scoped context delegating to valueDelegate; modify GetUserValues in
BifrostContext to detect when bc.valueDelegate != nil and then delegate to the
root delegate's GetUserValues (or merge delegate values with bc.userValues so
scoped overrides still take precedence). Update logic using the existing
symbols: BifrostContext.GetUserValues, bc.valueDelegate, bc.userValues,
bc.parent, and bc.valuesMu so that scoped contexts return the complete merged
set including values from the root delegate.
♻️ Duplicate comments (1)
transports/bifrost-http/handlers/middlewares.go (1)

315-317: ⚠️ Potential issue | 🟠 Major

Avoid exposing raw plugin error to clients.

err.Error() from HTTPTransportPreHook may contain internal plugin/provider details. Per repository conventions, log the full error internally and return a stable, non-sensitive message to clients.

🔒 Suggested fix
 			if err != nil {
+				logger.Error("HTTPTransportPreHook error for plugin %s: %v", pluginName, err)
 				// Short-circuit with error — drain plugin logs before returning
 				if logs := bifrostCtx.DrainPluginLogs(); len(logs) > 0 {
 					ctx.SetUserValue(schemas.BifrostContextKeyTransportPluginLogs, logs)
 				}
 				ctx.SetStatusCode(fasthttp.StatusInternalServerError)
-				ctx.SetBodyString(err.Error())
+				ctx.SetBodyString("transport plugin error")
 				return
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/middlewares.go` around lines 315 - 317, In
HTTPTransportPreHook where you currently call ctx.SetBodyString(err.Error()) and
ctx.SetStatusCode(fasthttp.StatusInternalServerError), avoid returning the raw
err to clients: log the full error internally (use the existing logger used in
this package/handler, e.g., transport logger or ctx logger) including context
and err, then replace the response body with a stable, non-sensitive message
like "internal server error" or "plugin error" while keeping the 500 status;
ensure the variable err is still logged but not exposed to the client.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@core/schemas/context.go`:
- Around line 321-339: GetUserValues currently reads bc.userValues and
bc.parent, which misses values when this BifrostContext is a scoped context
delegating to valueDelegate; modify GetUserValues in BifrostContext to detect
when bc.valueDelegate != nil and then delegate to the root delegate's
GetUserValues (or merge delegate values with bc.userValues so scoped overrides
still take precedence). Update logic using the existing symbols:
BifrostContext.GetUserValues, bc.valueDelegate, bc.userValues, bc.parent, and
bc.valuesMu so that scoped contexts return the complete merged set including
values from the root delegate.

---

Duplicate comments:
In `@transports/bifrost-http/handlers/middlewares.go`:
- Around line 315-317: In HTTPTransportPreHook where you currently call
ctx.SetBodyString(err.Error()) and
ctx.SetStatusCode(fasthttp.StatusInternalServerError), avoid returning the raw
err to clients: log the full error internally (use the existing logger used in
this package/handler, e.g., transport logger or ctx logger) including context
and err, then replace the response body with a stable, non-sensitive message
like "internal server error" or "plugin error" while keeping the 500 status;
ensure the variable err is still logged but not exposed to the client.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4e01682e-b0ec-4dbd-8589-c05408d2bf14

📥 Commits

Reviewing files that changed from the base of the PR and between 3657b8d and 7939f85.

📒 Files selected for processing (22)
  • core/bifrost.go
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • core/schemas/trace.go
  • core/schemas/tracer.go
  • examples/plugins/hello-world/main.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/tracing/store.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • plugins/logging/operations_test.go
  • plugins/logging/writer.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • ui/lib/types/logs.ts
✅ Files skipped from review due to trivial changes (3)
  • plugins/logging/operations_test.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • plugins/logging/writer.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • transports/bifrost-http/server/server.go
  • framework/tracing/store.go
  • framework/logstore/rdb.go
  • transports/bifrost-http/lib/config.go
  • core/schemas/trace.go
  • plugins/logging/main.go
  • ui/app/workspace/logs/views/pluginLogsView.tsx
  • core/bifrost.go


akshaydeo commented Mar 24, 2026

Merge activity

  • Mar 24, 11:58 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 24, 11:59 AM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo merged commit 0e06156 into v1.5.0 Mar 24, 2026
4 of 5 checks passed
@akshaydeo akshaydeo deleted the 03-23-logging_in_plugins branch March 24, 2026 11:59
@coderabbitai coderabbitai Bot mentioned this pull request Mar 24, 2026
18 tasks
akshaydeo added a commit that referenced this pull request Apr 18, 2026
* refactor: standardize empty array conventions for VK Provider & MCP Configs, and makes Provider Config weight optional for routing (#1932)

## Summary

Changes Virtual Key provider and MCP configurations from "allow-all by default" to "deny-by-default" security model. Virtual Keys now require explicit provider and MCP client configurations to allow access, improving security posture.

## Changes

- **Provider Configs**: Empty `provider_configs` now blocks all providers instead of allowing all
- **MCP Configs**: Empty `mcp_configs` now blocks all MCP tools instead of allowing all  
- **Weight Field**: Changed provider `weight` from required `float64` to optional `*float64` - null weight excludes provider from weighted routing
- **Migration**: Added automatic backfill migration to preserve existing Virtual Key behavior by adding all available providers/MCP clients to VKs with empty configs
- **Documentation**: Updated all references to reflect new deny-by-default behavior
- **UI Updates**: Modified Virtual Key creation/editing interface to reflect new behavior and weight handling

## Type of change

- [x] Feature
- [x] Refactor
- [x] Documentation

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Plugins
- [x] UI (Next.js)
- [x] Docs

## How to test

Test Virtual Key creation and provider/MCP access:

```sh
# Core/Transports
go version
go test ./...

# Test Virtual Key with no provider configs blocks requests
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-bf-empty-vk" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "test"}]}'
# Should return error about no providers configured

# Test Virtual Key with provider configs allows requests  
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-bf-configured-vk" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "test"}]}'
# Should work normally

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Breaking changes

- [x] Yes

**Impact**: Existing Virtual Keys with empty `provider_configs` or `mcp_configs` would be blocked after this change.

**Migration**: Automatic migration `migrationBackfillEmptyVirtualKeyConfigs` runs on startup to backfill existing Virtual Keys with all available providers/MCP clients, preserving current behavior. New Virtual Keys created after this change will use deny-by-default.

## Security considerations

This change significantly improves security posture by requiring explicit configuration of allowed providers and MCP tools for Virtual Keys. The automatic migration ensures no disruption to existing deployments while new Virtual Keys benefit from the more secure default behavior.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: add MCP auto tool injection toggle (#1933)

## Summary

Adds a new configuration option `DisableAutoToolInject` to the MCP (Model Context Protocol) system that allows disabling automatic tool injection into requests. When enabled, MCP tools are only included when explicitly requested via context headers or filters, providing more granular control over tool availability.

## Changes

- Added `DisableAutoToolInject` field to `MCPToolManagerConfig` schema with runtime update support
- Implemented atomic boolean storage in `ToolsManager` to safely handle concurrent access
- Added logic in `ParseAndAddToolsToRequest` to respect the disable flag and only inject tools when explicit context filters are present
- Extended configuration management with database migration, UI controls, and API endpoints
- Added hot-reload capability through `UpdateMCPDisableAutoToolInject` methods across the stack
- Updated UI with a toggle switch and clear documentation about the feature's behavior
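The atomic-boolean storage and hot-reload path described above can be sketched as follows (a minimal illustration; the real `ToolsManager` carries much more state):

```go
package main

import "sync/atomic"

// toolsManager sketches the atomic toggle described above.
type toolsManager struct {
	disableAutoToolInject atomic.Bool
}

// UpdateMCPDisableAutoToolInject supports hot-reload: a writer flips
// the flag without locking, and readers observe it on the next request.
func (m *toolsManager) UpdateMCPDisableAutoToolInject(disable bool) {
	m.disableAutoToolInject.Store(disable)
}

// shouldInjectTools: when auto-inject is disabled, tools are only
// added if the caller passed explicit context filters (e.g. headers).
func (m *toolsManager) shouldInjectTools(hasExplicitFilters bool) bool {
	if m.disableAutoToolInject.Load() {
		return hasExplicitFilters
	}
	return true
}
```

Using `atomic.Bool` keeps the default (`false`, auto-injection enabled) as the zero value, which matches the backward-compatible behavior noted below.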

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

Validate the new MCP auto tool injection toggle:

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

Test the feature:
1. Configure MCP clients and tools
2. Enable "Disable Auto Tool Injection" in the MCP configuration UI
3. Make requests without explicit tool headers - tools should not be injected
4. Make requests with `x-bf-mcp-include-tools` header - tools should be injected
5. Verify hot-reload works by toggling the setting without server restart

## Screenshots/Recordings

UI changes include a new toggle switch in the MCP configuration view with descriptive text explaining when tools are injected based on explicit headers.

## Breaking changes

- [ ] Yes
- [x] No

This is a backward-compatible addition with a default value of `false` (auto injection enabled).

## Related issues

This addresses the need for more granular control over MCP tool injection behavior in request processing.

## Security considerations

The feature provides better control over tool exposure by allowing administrators to require explicit opt-in for tool injection, potentially reducing unintended tool access.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: VK MCP config now works as an AllowList (#1940)

## Summary

This PR implements MCP tool governance by enforcing virtual key MCP configurations as an execution-time allow-list. When virtual keys have empty MCPConfigs, all MCP tools are denied. When non-empty, each tool is validated against the configured allow-list at both inference time and MCP tool execution.

## Changes

- **Context parameter updates**: Changed MCP-related functions to use `*schemas.BifrostContext` instead of `context.Context` to enable tool tracking
- **Tool tracking**: Added `BifrostContextKeyMCPAddedTools` context key to track which MCP tools are added to requests
- **Governance enforcement**: Virtual key MCP configurations now act as execution-time allow-lists with validation in both `PreMCPHook` and `evaluateGovernanceRequest`
- **Auto-injection control**: Added `DisableAutoToolInject` configuration option that respects the toggle and skips auto-injection when headers are already set by callers
- **Decision type**: Added `DecisionMCPToolBlocked` for MCP tool governance violations
- **UI improvements**: Updated MCP view description and sidebar item naming for better clarity

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

Test MCP tool governance with virtual keys:

```sh
# Core/Transports
go version
go test ./...

# Test with virtual key having empty MCPConfigs (should deny all MCP tools)
curl -X POST /v1/chat/completions \
  -H "x-bf-virtual-key: test-vk-empty-mcp" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "test"}]}'

# Test with virtual key having specific MCP tool allowlist
curl -X POST /v1/chat/completions \
  -H "x-bf-virtual-key: test-vk-with-mcp" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "test"}]}'

# Test disable auto tool inject configuration
curl -X PUT /v1/config/mcp/disable-auto-tool-inject \
  -d '{"disable": true}'

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

New configuration options:
- `disable_auto_tool_inject`: Boolean flag to disable automatic MCP tool injection
- Virtual key `MCPConfigs`: Array of MCP client configurations that act as allow-lists

## Screenshots/Recordings

UI changes include updated MCP configuration view with clearer descriptions for the disable auto tool injection toggle and improved sidebar navigation labels.

## Breaking changes

- [x] Yes
- [ ] No

**Impact**: MCP-related function signatures now require `*schemas.BifrostContext` instead of `context.Context`. Virtual keys with empty MCPConfigs will now deny all MCP tools by default.

**Migration**: Update any custom MCP integrations to use the new context parameter type. Configure MCPConfigs on virtual keys that need MCP tool access.

## Related issues

Implements MCP tool governance and execution-time validation for virtual key configurations.

## Security considerations

- **Access control**: Virtual key MCP configurations now enforce strict allow-lists for tool execution
- **Context isolation**: Tool tracking is isolated per request context to prevent cross-request leakage
- **Validation**: Both pre-execution and execution-time validation prevent unauthorized tool access

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* refactor: standardize empty array conventions for VK Provider Config Allowed Keys (#2006)

## Summary

Migrates VK provider config allowed keys from implicit allow-all semantics to explicit deny-by-default behavior. Adds `AllowAllKeys` boolean field to enable granular key access control while maintaining backward compatibility.

## Changes

- Added `AllowAllKeys` boolean field to `TableVirtualKeyProviderConfig` with database migration
- Backfilled existing configs with `allow_all_keys=true` to preserve current behavior
- Updated key resolution logic: empty keys now denies all access, `["*"]` wildcard allows all keys
- Modified governance resolver to set empty `includeOnlyKeys` slice when no keys are configured
- Enhanced HTTP handlers to recognize `["*"]` wildcard and set `AllowAllKeys` flag appropriately
- Updated UI to display "Allow All Keys" option and show deny-by-default messaging
- Added JSON unmarshaling support for `["*"]` wildcard in config files

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

Validate the migration and new key access control behavior:

```sh
# Core/Transports
go version
go test ./...

# Test migration runs successfully
go run main.go migrate

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

Test scenarios:
1. Create VK with empty `key_ids` - should deny all keys
2. Create VK with `key_ids: ["*"]` - should allow all keys  
3. Create VK with specific key IDs - should allow only those keys
4. Verify existing VKs maintain their current behavior after migration

## Screenshots/Recordings

UI now shows:
- "Allow All Keys" option in key selection dropdown
- "No keys allowed" vs "All keys allowed" status indicators
- "No providers configured (deny-by-default)" messaging

## Breaking changes

- [ ] Yes
- [x] No

The migration preserves existing behavior by setting `allow_all_keys=true` for configs that previously had no keys specified.

## Related issues

Part of VK access control enhancement initiative.

## Security considerations

Improves security posture by implementing deny-by-default semantics for key access. Existing deployments maintain current access patterns through automatic backfill migration.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* refactor: standardize empty array conventions for allowed models (#2113)

## Summary

Standardizes empty array conventions across Bifrost to implement deny-by-default security semantics. Previously, empty arrays for `allowed_models` and `Models` fields meant "allow all", creating potential security gaps. Now `["*"]` explicitly means "allow all" while empty arrays mean "deny all".

## Changes

- **Core Logic**: Updated model filtering in `bifrost.go` and `selectKeyFromProviderForModel` to treat empty `Models` arrays as deny-all and `["*"]` as allow-all
- **Database Migration**: Added `migrationBackfillAllowedModelsWildcard` to convert existing empty arrays to `["*"]` preserving current behavior for existing records
- **Model Catalog**: Updated `IsModelAllowedForProvider` to use wildcard semantics with deny-by-default fallback
- **Schema Defaults**: Changed default `Models` value from `[]` to `["*"]` in table definitions and form schemas
- **UI Components**: Enhanced `ModelMultiselect` with `allowAllOption` prop and updated virtual key forms to handle wildcard selection
- **Documentation**: Updated JSON schemas, comments, and tooltips to reflect new conventions
- **Governance**: Updated provider config filtering logic to use new wildcard semantics
- **Server Bootstrap**: Added wildcard filtering when loading models to prevent literal "*" from appearing as a model name

## Type of change

- [x] Refactor
- [ ] Bug fix
- [ ] Feature
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [x] Docs

## How to test

Validate the migration and new semantics:

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

Test scenarios:
1. Create new virtual keys - should default to `["*"]` for allowed models
2. Create new provider keys - should default to `["*"]` for models
3. Verify existing keys with empty arrays are migrated to `["*"]`
4. Test that empty arrays now deny all models/keys as expected
5. Verify UI shows "All models allowed" for wildcard and "No models (deny all)" for empty arrays

## Screenshots/Recordings

UI changes include:
- Model multiselect now shows "Allow All Models" option
- Virtual key details display "All Models" badge for wildcard vs "No models (deny all)" for empty
- Form placeholders updated to reflect new semantics

## Breaking changes

- [x] Yes
- [ ] No

**Migration Impact**: The database migration automatically converts existing empty `allowed_models` and `models_json` arrays to `["*"]`, preserving current behavior. However, any new configurations with empty arrays will now deny access instead of allowing all. Applications relying on "empty = allow all" semantics must be updated to use `["*"]` explicitly.

## Related issues

Part of security hardening initiative to implement explicit allow-lists and deny-by-default semantics across Bifrost configuration.

## Security considerations

This change significantly improves security posture by:
- Eliminating ambiguous "empty means allow all" semantics
- Implementing explicit deny-by-default for new configurations
- Requiring intentional wildcard usage via `["*"]` for broad access
- Maintaining backward compatibility through automatic migration

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* refactor: replace string slices with WhiteList for allowlist fields (#2125)

## Summary

Introduces a new `WhiteList` type to standardize whitelist behavior across the codebase, replacing manual slice operations and string comparisons with semantic methods for handling allow/deny lists.

## Changes

- Added `WhiteList` type with methods `IsAllowed()`, `IsUnrestricted()`, `IsEmpty()`, `Contains()`, and `Validate()`
- Replaced `[]string` fields with `WhiteList` for model restrictions, tool filtering, and key access controls
- Updated all whitelist logic to use semantic methods instead of manual `slices.Contains()` checks
- Added validation to ensure wildcards ("*") aren't mixed with specific values and prevent duplicates
- Improved case-insensitive matching for whitelist comparisons
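The semantics above (empty = deny all, `["*"]` = allow all, case-insensitive matching, validation against mixed wildcards and duplicates) can be sketched like this. This is a hedged illustration of the behavior described, not the actual Bifrost `WhiteList` implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// WhiteList sketches the semantics described above.
type WhiteList []string

func (w WhiteList) IsEmpty() bool        { return len(w) == 0 }
func (w WhiteList) IsUnrestricted() bool { return len(w) == 1 && w[0] == "*" }

// Contains does a case-insensitive membership check.
func (w WhiteList) Contains(v string) bool {
	for _, item := range w {
		if strings.EqualFold(item, v) {
			return true
		}
	}
	return false
}

// IsAllowed: empty list denies all, ["*"] allows all,
// otherwise only listed items pass.
func (w WhiteList) IsAllowed(v string) bool {
	if w.IsEmpty() {
		return false
	}
	if w.IsUnrestricted() {
		return true
	}
	return w.Contains(v)
}

// Validate rejects wildcards mixed with specific values and duplicates.
func (w WhiteList) Validate() error {
	seen := make(map[string]struct{}, len(w))
	for _, item := range w {
		key := strings.ToLower(item)
		if _, dup := seen[key]; dup {
			return fmt.Errorf("duplicate entry %q", item)
		}
		seen[key] = struct{}{}
		if item == "*" && len(w) > 1 {
			return fmt.Errorf("wildcard cannot be mixed with specific values")
		}
	}
	return nil
}
```

Because `WhiteList` is defined over `[]string`, it serializes to the same JSON shape, which is what keeps existing configurations compatible.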

## Type of change

- [x] Refactor

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Plugins

## How to test

Verify that whitelist behavior remains consistent across all affected components:

```sh
# Core/Transports
go version
go test ./...

# Test specific whitelist scenarios:
# - Empty lists deny all access
# - ["*"] allows all access  
# - Specific lists only allow listed items
# - Mixed wildcards and specific items are rejected
# - Duplicate entries are rejected
```

Test key model filtering, MCP tool execution, and virtual key configurations to ensure whitelist logic works correctly.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

The `WhiteList` type maintains the same JSON serialization format as `[]string`, so existing configurations remain compatible.

## Related issues

N/A

## Security considerations

Improves security by standardizing deny-by-default behavior and adding validation to prevent misconfigured whitelists that could inadvertently grant excessive permissions.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: add request-level extra headers support for MCP tool execution (#2126)

## Summary

This PR adds support for request-level extra headers in MCP tool execution, allowing callers to forward specific headers to MCP servers at runtime based on a per-client allowlist configuration.

## Changes

- Added `AllowedExtraHeaders` field to MCP client configuration with allowlist semantics (empty array = deny all, `["*"]` = allow all)
- Introduced `BifrostContextKeyMCPExtraHeaders` context key to track headers forwarded to MCP tools
- Created `core/mcp/utils` package with `GetHeadersForToolExecution` function to merge static and dynamic headers
- Updated MCP tool execution in both regular tool manager and Starlark code mode to use the new header forwarding system
- Added database migration for `allowed_extra_headers_json` column in MCP client table
- Updated UI to include allowed extra headers configuration in MCP client management
- Enhanced auth demo server example to demonstrate tool-execution level authentication patterns
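The merge of static connection headers with allowlisted request-level extras can be sketched as follows. Names and details are illustrative assumptions, not the actual `core/mcp/utils.GetHeadersForToolExecution` signature:

```go
package main

import "strings"

// deniedOverrides is a security denylist: extra headers may never
// override auth-sensitive connection headers.
var deniedOverrides = map[string]bool{"authorization": true}

// getHeadersForToolExecution merges static connection headers with
// request-level extras, forwarding an extra only when the client's
// allowlist permits it (empty = deny all, ["*"] = allow all).
func getHeadersForToolExecution(static, extra map[string]string, allowed []string) map[string]string {
	merged := make(map[string]string, len(static)+len(extra))
	for k, v := range static {
		merged[k] = v
	}
	allowAll := len(allowed) == 1 && allowed[0] == "*"
	for k, v := range extra {
		if deniedOverrides[strings.ToLower(k)] {
			continue // never let a caller override auth headers
		}
		if allowAll || containsFold(allowed, k) {
			merged[k] = v
		}
	}
	return merged
}

func containsFold(list []string, v string) bool {
	for _, item := range list {
		if strings.EqualFold(item, v) {
			return true
		}
	}
	return false
}
```

In the two-tier pattern from the demo server, the static map would carry the connection-level `X-API-Key` and the extras would carry the per-request `X-Tool-Token`.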

## Type of change

- [x] Feature

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] UI (Next.js)

## How to test

1. Configure an MCP client with allowed extra headers:
```json
{
  "name": "test-client",
  "connection_string": "http://localhost:3002/",
  "auth_type": "headers",
  "headers": {
    "X-API-Key": "connection-secret"
  },
  "allowed_extra_headers": ["X-Tool-Token"],
  "tools_to_execute": ["*"]
}
```

2. Make requests with extra headers that should be forwarded:
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer your-key" \
  -H "X-Tool-Token: tool-execution-secret" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Use the secret_data tool"}],
    "tools": [{"type": "function", "function": {"name": "secret_data"}}]
  }'
```

3. Test the auth demo server:
```bash
cd examples/mcps/auth-demo-server
go run main.go
# Server demonstrates two-tier auth: connection-level (X-API-Key) and tool-level (X-Tool-Token)
```

4. Run tests:
```sh
go test ./core/mcp/...
go test ./transports/bifrost-http/...

cd ui
pnpm test
pnpm build
```

## Breaking changes

- [ ] Yes
- [x] No

This is a backward-compatible addition. Existing MCP clients will have empty `allowed_extra_headers` (deny all extra headers) which maintains current behavior.

## Security considerations

- Extra headers are filtered through a strict allowlist per MCP client
- Security denylist prevents auth header overrides via extra headers
- Two-tier authentication pattern demonstrated: connection-level + tool-execution level
- Headers are only forwarded to MCP servers that explicitly allow them

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* fix: apply MCP tool filtering headers to tools/list response when using bifrost as MCP gateway (#2127)

## Summary

Adds support for `x-bf-mcp-include-clients` and `x-bf-mcp-include-tools` request headers to filter MCP tools/list response when using Bifrost as an MCP gateway. This ensures that tool filtering is respected at the MCP protocol level, not just during inference.

## Changes

- Implemented dynamic tool filtering in MCP server handlers that respects per-request include headers
- Added `makeIncludeClientsFilter()` function that filters tools based on request context values
- Registered the tool filter on both global and virtual key MCP servers during initialization
- Updated documentation to clarify that `mcp-include-tools` requires `clientName-toolName` format
- Enhanced examples in documentation to show proper tool naming format
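
The filter behavior can be sketched as below. This is an illustrative stand-in for `makeIncludeClientsFilter()`, assuming tool names in the `clientName-toolName` format and treating an empty filter as "no restriction":

```go
package main

import (
	"fmt"
	"strings"
)

// filterTools keeps only tools whose client appears in includeClients and
// whose full name appears in includeTools; an empty slice means no filter.
func filterTools(tools, includeClients, includeTools []string) []string {
	clientSet := map[string]bool{}
	for _, c := range includeClients {
		clientSet[c] = true
	}
	toolSet := map[string]bool{}
	for _, t := range includeTools {
		toolSet[t] = true
	}
	var out []string
	for _, name := range tools {
		client, _, _ := strings.Cut(name, "-") // clientName-toolName format
		if len(clientSet) > 0 && !clientSet[client] {
			continue
		}
		if len(toolSet) > 0 && !toolSet[name] {
			continue
		}
		out = append(out, name)
	}
	return out
}

func main() {
	tools := []string{"gmail-send_email", "gmail-list_threads", "filesystem-read_file"}
	fmt.Println(filterTools(tools, nil, []string{"gmail-send_email"})) // [gmail-send_email]
	fmt.Println(filterTools(tools, []string{"filesystem"}, nil))       // [filesystem-read_file]
}
```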

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [x] Docs

## How to test

Test MCP gateway functionality with tool filtering:

```sh
# Test tools/list filtering with include-tools header
curl --location 'http://localhost:8080/mcp/tools/list' \
--header 'x-bf-mcp-include-tools: gmail-send_email,filesystem-read_file' \
--header 'Authorization: Bearer your-vk-here'

# Test tools/list filtering with include-clients header  
curl --location 'http://localhost:8080/mcp/tools/list' \
--header 'x-bf-mcp-include-clients: gmail,filesystem' \
--header 'Authorization: Bearer your-vk-here'

# Verify chat completions still respect the same headers
curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'x-bf-mcp-include-tools: gmail-send_email' \
--header 'Content-Type: application/json' \
--data '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "What tools are available?"}]
}'
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

The tool filtering mechanism ensures that virtual key restrictions are properly enforced at the MCP protocol level, preventing unauthorized access to tools that should be filtered out based on request headers.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* refactor: parallelize model listing for providers to speed up startup time (#2151)

## Summary

Parallelizes model listing operations for providers during server startup and provider reloading to significantly reduce initialization time. Previously, model listing was performed sequentially for each provider, causing slower startup times, especially when multiple providers were configured.

## Changes

- Added concurrent execution using goroutines and sync.WaitGroup for model listing operations in three key functions: `ReloadProvider`, `ForceReloadPricing`, and `Bootstrap`
- In `ReloadProvider`, both filtered and unfiltered model listing requests now run concurrently for the same provider
- In `ForceReloadPricing` and `Bootstrap`, model listing for different providers now runs in parallel instead of sequentially
- Moved provider key retrieval earlier in `ReloadProvider` to ensure it happens before concurrent model listing
- Added proper context cancellation with defer statements for bifrost contexts
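
The fan-out pattern used here can be sketched as follows; `listModels` stands in for the real per-provider `ListModelsRequest` call, and the names are illustrative:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// listAllModels launches one goroutine per provider, joins them with a
// WaitGroup, and guards the shared result map with a mutex.
func listAllModels(providers []string, listModels func(string) []string) map[string][]string {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		results = make(map[string][]string)
	)
	for _, p := range providers {
		wg.Add(1)
		go func(provider string) {
			defer wg.Done()
			models := listModels(provider) // slow network call, now concurrent
			mu.Lock()
			results[provider] = models
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return results
}

func main() {
	fake := func(p string) []string { return []string{p + "-model-a"} }
	res := listAllModels([]string{"openai", "anthropic"}, fake)
	keys := make([]string, 0, len(res))
	for k := range res {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println(keys) // [anthropic openai]
}
```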

## Type of change

- [x] Refactor

## Affected areas

- [x] Transports (HTTP)

## How to test

Test server startup time with multiple providers configured to verify the performance improvement:

```sh
# Core/Transports
go version
go test ./...

# Test with multiple providers configured
# Measure startup time before and after the change
time go run main.go
```

Configure multiple providers in your bifrost configuration and observe faster startup times, especially noticeable when providers have high latency or many models.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications. The change maintains the same authentication and authorization patterns while improving performance through parallelization.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* fix: reorder migrations and set AllowAllKeys to true for virtual key provider configs (#2158)

## Summary

Fixes database migration ordering issue and ensures virtual key configurations are properly initialized with the AllowAllKeys field set to true.

## Changes

- Reordered database migrations to execute `migrationAddAllowAllKeysToProviderConfig` before `migrationBackfillEmptyVirtualKeyConfigs` to ensure the AllowAllKeys column exists before backfilling
- Added `AllowAllKeys: true` to provider configurations created during virtual key backfill migration to enable unrestricted key access by default

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that database migrations run successfully and virtual key configurations are created with proper defaults:

```sh
# Core/Transports
go version
go test ./...
```

Test migration ordering by running against a fresh database to ensure no column reference errors occur.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

This change enables unrestricted key access by default for virtual key configurations, which may have security implications depending on the intended access control model.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* feat: implement scoped pricing override

* refactor: custom pricing refactor

* fix: resolve merge conflicts in config loading and governance functions (#2230)

## Summary

Resolves Git merge conflicts in the bifrost-http configuration loading code by cleaning up duplicate function definitions and consolidating the configuration initialization flow.

## Changes

- Removed Git merge conflict markers and duplicate code blocks from `LoadConfig` function
- Consolidated governance configuration loading by keeping both `loadGovernanceConfigFromFile` and `loadGovernanceConfig` functions with distinct purposes
- Removed duplicate `convertSchemasMCPClientConfigToTable` function definition
- Moved pricing overrides initialization logic to `initFrameworkConfig` function for better organization
- Cleaned up extensive duplicate default configuration loading code that was causing merge conflicts
- Changed error handling for pricing overrides from returning error to logging warning

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that configuration loading works correctly without merge conflicts:

```sh
# Core/Transports
go version
go test ./...
go build ./transports/bifrost-http/...
```

Test configuration loading with various scenarios:
- Config file present
- Config file absent (default loading)
- Store-based configuration
- Governance and MCP configuration loading

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications - this is a merge conflict resolution that maintains existing functionality.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: add Stability AI model support for Bedrock image generation (#2180)

## Summary

Adds support for Stability AI image generation models (stability.stable-image-*) to the Bedrock provider, enabling text-to-image generation with models like stability.stable-image-core-v1:1 and stability.stable-image-ultra-v1:1.

## Changes

- Added `isStabilityAIModel()` function to detect Stability AI models by "stability." prefix
- Created `ToStabilityAIImageGenerationRequest()` to convert Bifrost requests to Stability AI's flat request format
- Implemented `StabilityAIImageGenerationRequest` type with support for prompt, mode, aspect_ratio, output_format, seed, and negative_prompt parameters
- Added conditional routing in `ImageGeneration()` to use Stability AI request format when appropriate
- Extended known fields for image generation parameters to include "aspect_ratio" and "input_images"
- Updated documentation comment to reflect Stability AI model support

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test Stability AI image generation through the Bedrock provider:

```sh
# Core/Transports
go version
go test ./...

# Test with a Stability AI model
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-key" \
  -d '{
    "model": "stability.stable-image-core-v1:1",
    "prompt": "A beautiful sunset over mountains",
    "aspect_ratio": "16:9",
    "output_format": "PNG"
  }'
```

Ensure AWS credentials are configured for Bedrock access and the Stability AI models are available in your region.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No additional security implications beyond existing Bedrock provider authentication and AWS credential handling.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: add Stability AI image edit models support to Bedrock provider (#2225)

## Summary

Adds support for Stability AI image editing models in the Bedrock provider, expanding image editing capabilities beyond the existing Titan and Nova Canvas models.

## Changes

- Added `getStabilityAIEditTaskType()` function to infer edit task types from Stability AI model names (inpaint, outpaint, recolor, search-replace, erase-object, remove-bg, control-sketch, control-structure, style-guide, style-transfer, upscale-creative, upscale-conservative, upscale-fast)
- Created `ToStabilityAIImageEditRequest()` function to convert Bifrost requests to Stability AI's flat JSON format, with task-specific field validation
- Added `StabilityAIImageEditRequest` struct with comprehensive field support for all Stability AI edit operations
- Enhanced `BedrockImageGenerationResponse` with Seeds and FinishReasons fields for Stability AI compatibility
- Modified `ImageEdit()` method to route requests to appropriate conversion function based on model type
- Updated documentation to reflect expanded model support
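
The task inference can be sketched as a substring scan over known task markers; the marker list mirrors the tasks named above, but the matching logic and model IDs here are illustrative, not the exact `getStabilityAIEditTaskType()` implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// getStabilityAIEditTaskType returns the first known task marker found in
// the model ID, or "" when the model carries no recognizable task.
func getStabilityAIEditTaskType(model string) string {
	tasks := []string{
		"inpaint", "outpaint", "recolor", "search-replace", "erase-object",
		"remove-bg", "control-sketch", "control-structure", "style-guide",
		"style-transfer", "upscale-creative", "upscale-conservative", "upscale-fast",
	}
	for _, task := range tasks {
		if strings.Contains(model, task) {
			return task
		}
	}
	return ""
}

func main() {
	fmt.Println(getStabilityAIEditTaskType("stability.stable-image-inpaint-v1:0"))   // inpaint
	fmt.Println(getStabilityAIEditTaskType("stability.stable-image-remove-bg-v1:0")) // remove-bg
}
```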

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test with various Stability AI edit models through the Bedrock provider:

```sh
# Core/Transports
go version
go test ./...

# Test image editing with Stability AI models
# Example: stable-image-inpaint, stable-outpaint, stable-creative-upscale, etc.
```

Verify that task-specific parameters are correctly mapped and invalid fields are filtered out based on the detected task type.

## Screenshots/Recordings

N/A - Backend functionality only

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

Image data is handled as base64-encoded strings. Mask and image parameters are properly validated before processing.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* fix: send back accumulated usage in MCP agent mode (#2246)

## Summary

This PR fixes token usage tracking in MCP agent mode by accumulating usage across all LLM calls in the agent loop and returning the total usage in the final response.

## Changes

- Added usage accumulation logic in the MCP agent execution loop to track token consumption across multiple LLM calls
- Implemented `mergeUsage` function to combine token counts and costs from multiple `BifrostLLMUsage` values, handling all detail sub-fields including prompt tokens, completion tokens, and cost breakdowns
- Extended agent API adapters with `extractUsage` and `applyUsage` methods to handle usage extraction and application for both Chat API and Responses API
- Applied accumulated usage to the final response before returning it to the client
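
The accumulation pattern can be sketched as below; `llmUsage` is an illustrative stand-in for `BifrostLLMUsage` (the real type carries more detail sub-fields), and nil handling matters so a step with no LLM usage does not wipe the running total:

```go
package main

import "fmt"

type llmUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
	Cost             float64
}

// mergeUsage adds next into acc, tolerating nil on either side.
func mergeUsage(acc, next *llmUsage) *llmUsage {
	if next == nil {
		return acc
	}
	if acc == nil {
		cp := *next
		return &cp
	}
	acc.PromptTokens += next.PromptTokens
	acc.CompletionTokens += next.CompletionTokens
	acc.TotalTokens += next.TotalTokens
	acc.Cost += next.Cost
	return acc
}

func main() {
	var total *llmUsage
	// Agent loop: accumulate usage across every LLM call.
	for _, u := range []*llmUsage{
		{PromptTokens: 100, CompletionTokens: 20, TotalTokens: 120, Cost: 0.001},
		nil, // e.g. a tool-execution step with no LLM usage
		{PromptTokens: 150, CompletionTokens: 30, TotalTokens: 180, Cost: 0.002},
	} {
		total = mergeUsage(total, u)
	}
	fmt.Println(total.TotalTokens, total.PromptTokens, total.CompletionTokens) // 300 250 50
}
```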

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test MCP agent mode with multiple tool calls to verify usage accumulation:

```sh
# Core/Transports
go version
go test ./...

# Test MCP agent mode with multiple LLM calls
# Verify that the returned usage reflects the sum of all calls in the agent loop
# Check that both token counts and cost details are properly accumulated
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications - this change only affects usage tracking and reporting.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* [codemode]: fix string escape corruption, enable top-level control flow in Starlark, and refine the prompt of the executeToolCode tool (#2206)

## Changes

- **Enhanced Starlark dialect configuration**: Enabled top-level control flow statements (if/for/while), while loops, set() builtin, global variable reassignment, and recursive functions for a more Python-like experience
- **Improved string escape handling**: Removed automatic `\n` to newline conversion, allowing Starlark's native string escape processing to handle `\n`, `\t`, and other escape sequences correctly
- **Updated tool description**: Streamlined the executeToolCode tool description with clearer syntax notes, explicit documentation of Starlark differences from Python (no try/except, no classes, no imports, no f-strings), and emphasis on fresh isolated scope per execution
- **Enhanced error hints**: Added specific error messages for unsupported Python features like try/except/finally/raise, with guidance on alternative approaches and scope persistence warnings
- **Comprehensive test coverage**: Added tests for dialect options, string escape preservation, unsupported feature detection, and end-to-end JSON deserialization scenarios

## Type of change

- [ ] Feature
- [ ] Bug fix
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go) - Starlark CodeMode improvements
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the enhanced Starlark features with MCP CodeMode:

```sh
# Test dialect options (top-level control flow, while loops, etc.)
make test-mcp TESTCASE=TestStarlarkDialectOptions

# Test string escape handling
make test-mcp PATTERN=TestStarlarkStringEscape

# Test unsupported feature detection
make test-mcp PATTERN=TestStarlarkUnsupportedFeatures
```

## Breaking changes

- [ ] Yes
- [x] No

The Starlark changes are additive and maintain backward compatibility while enabling more Python-like syntax.

## Security considerations

Starlark CodeMode maintains its existing sandboxing with no additional network or filesystem access. The dialect enhancements only affect language features within the existing security boundary.

* logging in plugins (#2215)

## Summary

Reorders middleware initialization in the Bifrost HTTP server to ensure tracing middleware is added before transport interceptor middleware in the inference pipeline.

## Changes

- Moved tracing middleware initialization and setup earlier in the bootstrap process
- Reordered middleware registration so tracing middleware is prepended before transport interceptor middleware
- Updated comments to clarify the middleware ordering logic and rationale

The change ensures that tracing context and trace IDs are properly established before other middleware components process requests.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that tracing middleware executes before transport interceptor middleware by checking trace logs and middleware execution order.

```sh
# Core/Transports
go version
go test ./...
```

Test with tracing enabled to ensure trace IDs are properly set in context before subsequent middleware processing.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications - this is a middleware ordering change that affects observability components.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* fix: handling text, vtt, srt response format for transcriptions (#2102)

* feat: add virtual key access management for MCP clients (#2255)

## Summary

Adds virtual key access management to MCP client configuration, allowing administrators to control which virtual keys can access specific MCP servers and which tools they can execute on a per-VK basis.

## Changes

- Added `vk_configs` field to MCP client update API that accepts an array of virtual key configurations
- Each VK config specifies a virtual key ID and the tools it's allowed to execute on that MCP server
- When `vk_configs` is provided, it atomically replaces all existing VK assignments for the MCP client
- Added database method `GetVirtualKeyMCPConfigsByMCPClientID` to retrieve VK configs by MCP client
- Updated OpenAPI documentation to describe the new VK configuration functionality
- Enhanced UI with virtual key access management section in the MCP client sheet
- Added Go SDK context keys for MCP tool filtering: `MCPContextKeyIncludeClients`, `MCPContextKeyIncludeTools`, and `BifrostContextKeyMCPExtraHeaders`
- Updated context keys documentation with comprehensive MCP configuration examples

## Type of change

- [x] Feature

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] UI (Next.js)
- [x] Docs

## How to test

1. Create an MCP client with tools available
2. Create virtual keys in the system
3. Update the MCP client with VK configurations:

```sh
curl -X PUT /api/mcp/client/{id} \
  -H "Content-Type: application/json" \
  -d '{
    "name": "test-client",
    "vk_configs": [
      {
        "virtual_key_id": "vk-123",
        "tools_to_execute": ["*"]
      },
      {
        "virtual_key_id": "vk-456", 
        "tools_to_execute": ["read_file", "write_file"]
      }
    ]
  }'
```

4. Verify VK assignments are created/updated in the database
5. Test the UI by opening an MCP client sheet and managing virtual key access

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Screenshots/Recordings

The UI now includes a "Virtual Key Access" section in the MCP client configuration sheet where administrators can:
- Add virtual keys to grant access to the MCP server
- Configure which specific tools each virtual key can execute
- Remove virtual key access entirely

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

This enables fine-grained access control for MCP servers at the virtual key level, complementing the existing governance and budgeting features.

## Security considerations

- VK access controls are enforced through the governance plugin during MCP tool execution
- The atomic replacement of VK assignments prevents partial updates that could leave the system in an inconsistent state
- Tool-level restrictions allow principle of least privilege by limiting which MCP tools each virtual key can access

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: adds option to allow MCP clients to run on all virtual keys (#2258)

## Summary

Adds a new `AllowOnAllVirtualKeys` configuration option for MCP clients that enables them to be accessible to all virtual keys without requiring explicit per-key assignment. When enabled, all tools from the MCP client are available to every virtual key.

## Changes

- Added `AllowOnAllVirtualKeys` boolean field to `MCPClientConfig` schema and database table
- Updated MCP client manager to handle the new field during client updates
- Modified governance plugin to check for clients with `AllowOnAllVirtualKeys` enabled and automatically include their tools for all virtual keys
- Added database migration to add the new column to `TableMCPClient`
- Updated UI to include a toggle for the new setting with tooltip explanation
- Added OpenAPI documentation for the new field
- Updated configuration store methods to persist and retrieve the new field
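
The governance check reduces to a simple disjunction, sketched below with illustrative types (the real plugin works against the config store, not an in-memory struct):

```go
package main

import "fmt"

type mcpClient struct {
	Name                  string
	AllowOnAllVirtualKeys bool
	AssignedVKs           map[string]bool // explicit per-VK assignments
}

// clientVisibleToVK: a client's tools are available to a virtual key via
// an explicit assignment, or to every key when the flag is enabled.
func clientVisibleToVK(c mcpClient, vkID string) bool {
	return c.AllowOnAllVirtualKeys || c.AssignedVKs[vkID]
}

func main() {
	open := mcpClient{Name: "filesystem", AllowOnAllVirtualKeys: true}
	scoped := mcpClient{Name: "gmail", AssignedVKs: map[string]bool{"vk-123": true}}
	fmt.Println(clientVisibleToVK(open, "vk-999"))   // true
	fmt.Println(clientVisibleToVK(scoped, "vk-999")) // false
	fmt.Println(clientVisibleToVK(scoped, "vk-123")) // true
}
```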

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [x] Docs

## How to test

1. Create or update an MCP client with `allow_on_all_virtual_keys: true`
2. Verify that the client's tools are available to all virtual keys without explicit assignment
3. Test that the governance plugin correctly allows tools from such clients
4. Verify the UI toggle works correctly in the MCP client edit sheet

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

The new configuration field `allow_on_all_virtual_keys` defaults to `false` to maintain backward compatibility.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

This is a backward-compatible addition with the new field defaulting to `false`.

## Related issues

N/A

## Security considerations

This feature reduces access control granularity by allowing MCP clients to bypass virtual key restrictions when enabled. Administrators should carefully consider which MCP clients should have this permission as it grants broad access across all virtual keys.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* feat: add provider keys CRUD to configstore and in-memory store (#2159)

## Summary

Adds dedicated CRUD operations for individual provider keys at the data layer
(configstore interface + RDB implementation) and in-memory store. This enables
key-level operations without replacing the entire provider key set, which is
required for the new `/api/providers/{provider}/keys/*` endpoints.

## Changes

- Added `GetProviderKeys`, `GetProviderKey`, `CreateProviderKey`,
  `UpdateProviderKey`, `DeleteProviderKey` to `ConfigStore` interface
- Implemented all five methods in `RDBConfigStore` with proper GORM queries,
  error handling, and `ErrNotFound` propagation
- Extracted `schemaKeyFromTableKey` and `tableKeyFromSchemaKey` helpers to
  deduplicate key conversion logic (previously inlined in `GetProvidersConfig`
  and `GetProviderConfig`)
- Added `AddProviderKey`, `UpdateProviderKey`, `RemoveProviderKey` to in-memory
  `Config` with mutex locking, DB persistence, and rollback on client update
  failure
- Added `GetProviderKeysRaw`, `GetProviderKeysRedacted`, `GetProviderKeyRaw`,
  `GetProviderKeyRedacted` read methods
- Added `TestProviderKeyCRUD` and `TestProviderKeyCRUD_ProviderMustExist`
  integration tests
- Updated `MockConfigStore` with all five new interface methods
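
The in-memory side can be sketched as a mutex-guarded map with copy-on-read; the types and method names below are illustrative, not the actual `ConfigStore` API:

```go
package main

import (
	"fmt"
	"sync"
)

type providerKey struct {
	ID    string
	Value string
}

// memStore guards its key map with a mutex and returns copies on read so
// callers cannot mutate internal state.
type memStore struct {
	mu   sync.Mutex
	keys map[string]providerKey
}

func newMemStore() *memStore { return &memStore{keys: map[string]providerKey{}} }

func (s *memStore) AddProviderKey(k providerKey) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.keys[k.ID] = k
}

func (s *memStore) GetProviderKeysRaw() []providerKey {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]providerKey, 0, len(s.keys))
	for _, k := range s.keys {
		out = append(out, k)
	}
	return out
}

func (s *memStore) RemoveProviderKey(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.keys[id]; !ok {
		return fmt.Errorf("key %s: not found", id)
	}
	delete(s.keys, id)
	return nil
}

func main() {
	s := newMemStore()
	s.AddProviderKey(providerKey{ID: "k1", Value: "sk-test"})
	fmt.Println(len(s.GetProviderKeysRaw()))        // 1
	fmt.Println(s.RemoveProviderKey("missing") != nil) // true: ErrNotFound-style error
}
```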

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

```sh
# Run configstore tests
go test ./framework/configstore/... -v -run TestProviderKeyCRUD

# Run config tests (mock store)
go test ./transports/bifrost-http/lib/... -v
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

Key values are handled through existing redaction infrastructure. No new secret
exposure paths introduced.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* feat: add provider keys HTTP handlers and refactor optional keys (#2160)

## Summary

Adds HTTP handlers for the dedicated provider keys CRUD endpoints and removes
`keys` from provider API responses and payloads. Keys are now exclusively
managed via `/api/providers/{provider}/keys/*`. Also fixes a context timeout bug
in `ReloadProvider` where model discovery could exhaust the shared context
budget, causing subsequent DB calls to fail.

## Changes

### Provider keys handlers (`provider_keys.go`)
- New file with five handlers: `listProviderKeys`, `getProviderKey`,
  `createProviderKey`, `updateProviderKey`, `deleteProviderKey`
- Includes `mergeUpdatedKey` (redacted value preservation logic used by
  `updateProviderKey`)
- Key handlers enforce keyless provider validation and trigger model discovery
  after mutations
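
The redacted-value preservation in `mergeUpdatedKey` can be sketched as below; the redaction marker (`***`) is an assumption for illustration, not the exact placeholder format:

```go
package main

import (
	"fmt"
	"strings"
)

// mergeUpdatedKey keeps the stored secret when the update payload merely
// echoes back the redacted value shown by the list endpoint.
func mergeUpdatedKey(stored, incoming string) string {
	if incoming == "" || strings.Contains(incoming, "***") {
		return stored // client sent the redacted placeholder back
	}
	return incoming // client supplied a genuinely new value
}

func main() {
	fmt.Println(mergeUpdatedKey("sk-real-secret", "sk-***1234")) // sk-real-secret
	fmt.Println(mergeUpdatedKey("sk-real-secret", "sk-new-key")) // sk-new-key
}
```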

### Provider handlers cleanup (`providers.go`)
- Registered new key routes: `GET/POST /api/providers/{provider}/keys`,
  `GET/PUT/DELETE /api/providers/{provider}/keys/{key_id}`
- Extracted inline anonymous structs into named `providerCreatePayload` and
  `providerUpdatePayload` types (without `Keys` field)
- Removed `Keys` field from `ProviderResponse`
- Switched `addProvider` from `json.Unmarshal` to `sonic.Unmarshal`
- Removed `oldConfigRedacted` fetch and the entire key merge block
  (`mergeKeys`, `hasKeys`, `slices` usage) from `updateProvider`
- Removed `Keys` from `getProviderResponseFromConfig` response builder
- Removed unused `encoding/json` import

### Context timeout fix (`server.go`)
- Split shared `bfCtx` in `ReloadProvider` into separate contexts:
  `filteredBfCtx` (15s) for filtered `ListModelsRequest` and `unfilteredBfCtx`
  (fresh 15s) for unfiltered `ListModelsRequest`, each cancelled after use
- Changed `GetKeysByProvider` to use `context.Background()` since it's a local
  DB call that shouldn't be gated by model discovery timeouts
- Added `hasNoKeys` check to emit warn-level logs instead of errors when model
  discovery fails because no keys are configured
- Read in-memory key count via `GetProviderKeysRaw` for the `hasNoKeys` check

### Tests (`providers_test.go`)
- Cleared file (contained only tests for removed inline struct decoding)

## Type of change

- [x] Feature
- [x] Bug fix
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

```sh
# Build
go build ./transports/bifrost-http/...

# Manual: start Bifrost, then test key CRUD
curl -X POST localhost:8080/api/providers/openai/keys -d '{"name":"test-key","value":"sk-test"}'
curl localhost:8080/api/providers/openai/keys
curl -X PUT localhost:8080/api/providers/openai/keys/{key_id} -d '{"name":"updated","value":"sk-new"}'
curl -X DELETE localhost:8080/api/providers/openai/keys/{key_id}

# Verify provider endpoints no longer return keys
curl localhost:8080/api/providers/openai | jq 'has("keys")'  # should be false
```

## Screenshots/Recordings

N/A

## Breaking changes

- [x] Yes
- [ ] No

Provider API responses no longer include `keys` field. Provider create/update
payloads no longer accept `keys`. Clients must use the dedicated
`/api/providers/{provider}/keys/*` endpoints for key management.

## Related issues

N/A

## Security considerations

- Key handlers use existing redaction infrastructure (`GetProviderKeyRedacted`)
  before returning responses
- Keyless provider validation prevents key creation on providers that don't
  support keys

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* feat: migrate frontend to dedicated provider keys API (#2161)

## Summary

Migrates the frontend from reading provider keys via `provider.keys` (removed
from provider API response in PR #2160) to the dedicated `getProviderKeys`
query and `/api/keys` endpoint. Removes `keys` from all provider TypeScript
types. Key mutations patch caches from authoritative server responses; provider
updates invalidate the `ProviderKeys` tag to refresh key statuses after model
discovery. Also adds a read-only routing rule info sheet.

## Changes

### Types (`config.ts`, `schemas.ts`)
- Removed `keys` field from `ModelProviderConfig`, `AddProviderRequest`, and
  `UpdateProviderRequest`
- Added `CreateProviderKeyRequest`, `UpdateProviderKeyRequest`,
  `ListProviderKeysResponse` types

### Store (`providersApi.ts`, `baseApi.ts`)
- Added `ProviderKeys` tag type to `baseApi`
- Changed `getProviderKeys`/`getProviderKey` from `Providers` tag to
  `ProviderKeys` tag (avoids invalidating provider cache on key changes)
- Added `invalidatesTags: [ProviderKeys, DBKeys]` on `updateProvider` mutation
  (refreshes key statuses after model discovery)
- Removed `getProvider`/`getProviders` cache patches from `createProviderKey`,
  `updateProviderKey`, `deleteProviderKey` (providers no longer carry keys)
- Added duplicate-check guards on `createProviderKey` cache patches to prevent
  ghost keys
- Each key mutation patches `getProviderKeys` and `getAllKeys` caches from
  authoritative server response

### Components
- **`modelProviderKeysTableView.tsx`**: Already uses `useGetProviderKeysQuery`;
  formatting/indentation fixes
- **`page.tsx`**: Removed `keys: []` from fallback provider object and
  `createProvider` call; simplified `KeyDiscoveryFailedBadge` to only check
  provider-level status (removed per-key status check since keys are no longer
  on provider)
- **`routingRuleSheet.tsx`**: `TargetRow` now receives `allKeys` prop (from
  `useGetAllKeysQuery`) instead of `providersData` with `.keys`; filters keys
  by target provider
- **`routingRuleInfoSheet.tsx`**: New read-only sheet component that displays
  routing rule details (conditions, targets with provider icons and weight bars,
  fallback chain, scope, priority, timestamps)
- **`settingsPanel.tsx`**: Uses `useGetAllKeysQuery` to determine configured
  providers (replaces `p.keys.length > 0` check) and derive
  `providerKeyConfigs` per provider

### Other frontend changes (from prior commit, unchanged)
- Added `getProviderKeys`, `getProviderKey` RTK Query endpoints
- Added `createProviderKey`, `updateProviderKey`, `deleteProviderKey` mutations
- Added `buildProviderUpdatePayload` utility for key-free provider updates
- Migrated `providerKeyForm.tsx` to separate create/update mutations
- Updated `addNewKeySheet.tsx` props from `keyIndex` to `keyId`
- Updated all 6 provider form fragments to use `buildProviderUpdatePayload`
- Removed dead `selectedProvider.keys` sync matchers from `providerSlice.ts`

## Type of change

- [x] Feature
- [x] Refactor
- [ ] Bug fix
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

```sh
cd ui
npm run build
npm run lint
```

Manual testing:

1. Navigate to Providers page, select a provider with keys
2. Verify keys table loads correctly from dedicated API
3. Create a new key — verify it appears immediately (no ghost/duplicate)
4. Toggle enable/disable — verify switch updates immediately
5. Edit a key — verify form pre-populates, save works
6. Delete a key — verify it disappears immediately
7. Update provider settings — verify key statuses refresh after save
8. Check sidebar badge shows provider-level discovery failures
9. Open Playground settings — verify provider/key dropdowns work
10. Open Routing Rules — verify target key selector works
11. Click a routing rule row — verify info sheet opens with correct details
    (conditions, targets, fallbacks, scope, priority)

## Screenshots/Recordings

N/A — no visual changes to existing features; routing rule info sheet is new.

## Breaking changes

- [ ] Yes
- [x] No

Frontend-only changes consuming the new API shape from PR #2160.

## Related issues

N/A

## Security considerations

No new security considerations. Key values continue to be handled through
existing redaction on the backend.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* refactor: replace string slice with WhiteList type for model restrictions (#2282)

## Summary

Refactored model access control logic by replacing a plain string slice with a dedicated `WhiteList` type for the `Models` field in `TableKey`. This introduces a more structured approach to handling wildcard permissions and improves code readability.

## Changes

- Changed `Models` field type from `[]string` to `schemas.WhiteList` in `TableKey` struct
- Replaced manual wildcard checking (`model == "*"`) with `IsUnrestricted()` method calls across multiple functions
- Added missing mock method `GetVirtualKeyMCPConfigsByMCPClientIDs` to test configuration store
- Applied the refactoring consistently in `ReloadProvider`, `ForceReloadPricing`, and `Bootstrap` methods
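
The wildcard convention above can be sketched in Go. `WhiteList` and `IsUnrestricted` follow the names in this PR, but the `Allows` helper and the exact method bodies here are assumptions, not the actual `schemas` package API:

```go
package main

// WhiteList is a list of allowed model names; the single entry "*" means
// unrestricted access (sketch of the type described in this PR).
type WhiteList []string

// IsUnrestricted reports whether the list contains the wildcard entry,
// replacing scattered `model == "*"` checks.
func (w WhiteList) IsUnrestricted() bool {
	for _, m := range w {
		if m == "*" {
			return true
		}
	}
	return false
}

// Allows reports whether a specific model is permitted. An empty list denies
// everything, matching the deny-by-default behavior in the test scenarios.
// This helper is illustrative, not necessarily part of the real type.
func (w WhiteList) Allows(model string) bool {
	if w.IsUnrestricted() {
		return true
	}
	for _, m := range w {
		if m == model {
			return true
		}
	}
	return false
}

func main() {
	all := WhiteList{"*"}
	some := WhiteList{"gpt-4o"}
	none := WhiteList{}
	if !all.Allows("claude-3") || !some.Allows("gpt-4o") || some.Allows("gpt-4") || none.Allows("gpt-4o") {
		panic("unexpected whitelist behavior")
	}
}
```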

## Type of change

- [x] Refactor
- [ ] Bug fix
- [ ] Feature
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that model access control continues to work correctly with both wildcard and specific model permissions:

```sh
# Core/Transports
go version
go test ./...

# Test specific areas affected by the changes
go test ./framework/configstore/tables/...
go test ./transports/bifrost-http/...
```

Test scenarios should include:
- Keys with wildcard permissions (`["*"]`)
- Keys with specific model restrictions
- Keys with empty model lists (deny-by-default behavior)

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

This refactoring maintains the existing security model for API key permissions. The deny-by-default behavior and wildcard functionality remain unchanged, just implemented through a more structured type system.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* feat: add Plus icon and responsive text to pricing override create button (#2285)

## Summary

Improves the visual design and mobile responsiveness of the pricing overrides section by adding a Plus icon to the create button and optimizing the button text for different screen sizes.

## Changes

- Added Plus icon import from lucide-react
- Enhanced the "Create Override" button with a Plus icon and responsive text that shows "New Override" on larger screens and hides text on mobile
- Adjusted container spacing by removing top margin and changing flex alignment from `items-start` to `items-center` for better visual balance

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

Navigate to the custom pricing overrides page and verify:
1. The "New Override" button displays with a Plus icon
2. On mobile screens, only the Plus icon is visible
3. On larger screens (sm and above), both icon and "New Override" text are visible
4. The button functionality remains unchanged when clicked

```sh
# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Screenshots/Recordings

Before/after screenshots showing the button design changes and responsive behavior would be helpful.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications - this is a purely visual enhancement.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* refactor: blacklist models on new convention (#2305)

## Summary

Implements comprehensive blacklist support for model filtering across all providers. This adds the ability to explicitly deny access to specific models at the key level, with blacklist rules taking precedence over allowlist rules.

## Changes

- Added `BlackList` type with semantic validation (supports wildcard "*" for block-all)
- Updated key selection logic to check both allowlist and blacklist constraints
- Modified all provider model listing functions to filter out blacklisted models
- Enhanced UI to support blacklist configuration with improved UX for wildcard selection
- Added blacklist filtering to model catalog and provider handlers
- Updated test cases to verify blacklist functionality

Key design decisions:
- Blacklist always wins over allowlist when conflicts occur
- Wildcard "*" in blacklist blocks all models for that key
- Empty blacklist blocks nothing (permissive default)
- Consistent filtering logic across all providers (Anthropic, Azure, Bedrock, Cohere, etc.)
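
The precedence rules above can be sketched as a small Go function. The helper names (`keyAllowsModel` and friends) are illustrative, not the actual key selection code:

```go
package main

// isUnrestricted reports whether a list contains the wildcard entry "*".
func isUnrestricted(list []string) bool {
	for _, m := range list {
		if m == "*" {
			return true
		}
	}
	return false
}

// contains reports whether model appears literally in the list.
func contains(list []string, model string) bool {
	for _, m := range list {
		if m == model {
			return true
		}
	}
	return false
}

// keyAllowsModel applies the documented precedence:
//  1. a wildcard blacklist blocks everything,
//  2. an explicit blacklist entry blocks the model even if allowlisted,
//  3. otherwise the allowlist decides (wildcard or exact match);
//     an empty blacklist blocks nothing.
func keyAllowsModel(allow, deny []string, model string) bool {
	if isUnrestricted(deny) || contains(deny, model) {
		return false
	}
	return isUnrestricted(allow) || contains(allow, model)
}

func main() {
	// Mirrors the example config below: models ["*"], blacklisted ["gpt-4", "claude-3"].
	allow := []string{"*"}
	deny := []string{"gpt-4", "claude-3"}
	if keyAllowsModel(allow, deny, "gpt-4") {
		panic("blacklist should win over allowlist")
	}
	if !keyAllowsModel(allow, deny, "gpt-4o-mini") {
		panic("non-blacklisted model should be allowed")
	}
}
```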

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [x] UI (Next.js)
- [x] Docs

## How to test

Test blacklist functionality with provider keys:

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

Example configuration to test:
```json
{
  "keys": [{
    "id": "test-key",
    "models": ["*"],
    "blacklisted_models": ["gpt-4", "claude-3"]
  }]
}
```

Verify that blacklisted models are excluded from model listings and key selection.

## Screenshots/Recordings

UI now shows "Blocked Models" field with improved tooltips and wildcard handling for denying access to specific models.

## Breaking changes

- [ ] Yes
- [x] No

The `blacklisted_models` field was already present in the schema but not fully implemented. This change makes it functional without breaking existing configurations.

## Related issues

Enhances model access control capabilities for fine-grained permission management.

## Security considerations

Improves security by allowing explicit denial of access to sensitive or expensive models at the key level. Blacklist rules cannot be bypassed by allowlist configurations.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

* minor fix: add blacklisted model field in tableKeyFromSchemaKey (#2324)

## Summary

This PR adds support for the `BlacklistedModels` field when converting schema keys to table keys in the configuration store's RDB implementation.

## Changes

- Added `BlacklistedModels: key.BlacklistedModels` field mapping in the `tableKeyFromSchemaKey` function
- Ensures that blacklisted model information is properly preserved when converting between schema and table representations
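
The mapping can be illustrated with stand-in structs (the real configstore types carry more fields than shown here):

```go
package main

// SchemaKey and TableKey are minimal stand-ins for the real configstore
// structs, kept only to show the field mapping this PR adds.
type SchemaKey struct {
	Models            []string
	BlacklistedModels []string
}

type TableKey struct {
	Models            []string
	BlacklistedModels []string
}

// tableKeyFromSchemaKey copies the schema representation into the table row,
// including the BlacklistedModels field that was previously dropped.
func tableKeyFromSchemaKey(key SchemaKey) TableKey {
	return TableKey{
		Models:            key.Models,
		BlacklistedModels: key.BlacklistedModels, // the mapping added by this PR
	}
}

func main() {
	tk := tableKeyFromSchemaKey(SchemaKey{
		Models:            []string{"*"},
		BlacklistedModels: []string{"gpt-4"},
	})
	if len(tk.BlacklistedModels) != 1 || tk.BlacklistedModels[0] != "gpt-4" {
		panic("BlacklistedModels not preserved")
	}
}
```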

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that configuration keys with blacklisted models are properly stored and retrieved from the RDB configstore.

```sh
# Core/Transports
go version
go test ./...
```

Test creating configuration entries with `BlacklistedModels` specified and ensure they persist correctly through the RDB layer.

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

None - this change only adds field mapping for existing blacklisted models functionality.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

* feat: add image edit input view on logs (#2321)

## Summary

Adds support for logging image edit and image variation requests by introducing new database columns and UI components to track and display these image manipulation operations alongside existing image generation functionality.

## Changes

- Added `image_edit_input` and `image_variation_input` columns to the logs table with corresponding database migrations
- Extended the Log struct with new fields for storing and parsing image edit/variation input data
- Updated logging plugin to capture image edit and variation request data with large payload threshold handling
- Enhanced UI to display input images and prompts for image edit operations and input images for variation operations
- Added image MIME type detection for proper display of base64-encoded images in the UI

## Type of change

- [x] Feature
- [ ] Bug fix
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Do…