Skip to content

refactor: semantic cache plugin#3210

Open
Pratham-Mishra04 wants to merge 1 commit intomainfrom
04-30-fix_semantic_cache_fixes
Open

refactor: semantic cache plugin#3210
Pratham-Mishra04 wants to merge 1 commit intomainfrom
04-30-fix_semantic_cache_fixes

Conversation

@Pratham-Mishra04
Copy link
Copy Markdown
Collaborator

Summary

This PR refactors the semantic cache plugin to simplify its internal state management, improves cache lookup correctness, and adds a new cache_hit_types filter to the logs API and UI. The direct cache lookup path is now a single deterministic point-fetch by a UUIDv5 directCacheID (replacing the previous dual-path of chunk lookup + legacy metadata scan), and several context keys are consolidated. The UI gains a "Local Caching" filter sidebar section and cache hit type badges in the log detail view.

Changes

  • Semantic cache plugin refactor:

    • Replaced the dual direct-search path (performDirectChunkLookup + performLegacyDirectSearch) with a single performDirectSearch that does an O(1) GetChunk by deterministic directCacheID (UUIDv5 derived from provider, model, cacheKey, requestHash, paramsHash).
    • generateDirectCacheID now returns an error instead of silently falling back to a string concatenation, making failures explicit.
    • request_hash is no longer stored as a top-level metadata field; it is encoded into the directCacheID instead.
    • Reduced context keys from ~10 to 4 (directCacheIDKey, paramsHashKey, embeddingsKey, embeddingsInputTokensKey), removing stale keys like requestIDKey, requestHashKey, isCacheHitKey, and cacheHitTypeKey.
    • shouldSkipCaching is extracted into its own method; cache-hit detection now reads CacheDebug.CacheHit from the response rather than a context flag.
    • buildUnifiedMetadata no longer accepts requestHash as a parameter.
    • addSingleResponse renamed to addNonStreamingResponse.
    • StreamAccumulator fields HasError, FinalTimestamp, and FinishReason on StreamChunk are removed; error streams are handled by early return in PostLLMHook.
    • Streaming replay goroutine now guards every send with ctx.Done() to prevent goroutine leaks on dropped consumers.
    • A background runStreamCleanupLoop goroutine (started by Init, stopped by Cleanup via stopCh) replaces the one-shot cleanup call, periodically reaping stale stream accumulators.
    • buildResponseFromResult now accepts threshold, similarity, and inputTokens as pointers, and attachCacheDebug is extracted as a shared helper for both streaming and non-streaming paths.
    • isExpiredEntry is extracted as a standalone function.
    • chunkSortKey replaces the large inline sort comparator in processAccumulatedStream.
    • Tools, stop sequences, modalities, include lists, and other order-insensitive set fields are now hashed with hashSortedSet / sortedStringSet to prevent MCP's randomized map iteration from perturbing the request hash.
    • extractAttachmentsForCaching is extracted so attachment URLs are included in the cache key metadata rather than the embedding text.
    • extractTextForEmbedding no longer returns a paramsHash; callers compute it once via buildRequestMetadataForCaching + hashMap.
    • generateEmbedding moved from utils.go to search.go.
    • generateRequestHash now accepts prebuilt metadata to avoid recomputing it.
    • removeField no longer mutates the input slice's backing array.
    • Added PronunciationDictionaryLocators, TimestampGranularities, Include, AdditionalFormats, and InputImages to their respective parameter metadata extractors.
    • Public context key names changed from semantic_cache_* to semantic_cache-* (underscore → hyphen separator after the plugin prefix).
    • SelectFields no longer includes request_hash.
    • VectorStoreProperties no longer includes a request_hash entry.
    • CacheByModel and CacheByProvider default-value log messages added.
  • Log filtering — cache_hit_types:

    • Added CacheHitTypes []string to SearchFilters in framework/logstore/tables.go.
    • applyFilters in rdb.go applies a JSON path filter on cache_debug for both SQLite (json_extract) and PostgreSQL (substring regex) dialects, restricted to the allowlist ["direct", "semantic"].
    • canUseMatViewFilters excludes queries with CacheHitTypes set from the materialized-view fast path.
    • HTTP handlers (getLogs, getLogsStats, parseHistogramFilters) parse a cache_hit_types comma-separated query parameter.
  • UI:

    • Added a "Local Caching" filter section to LogsFilterSidebar with checkboxes for "Direct cache" and "Semantic cache".
    • cache_hit_types is added to URL state, filter state, and the buildFilterParams API helper.
    • Log detail view shows "Direct Cache" (indigo) and "Semantic Cache" (rose) badges based on cache_debug.hit_type.
    • Plugins form now filters the provider dropdown to embedding-capable providers only (EmbeddingSupportedProviders for built-ins; custom_provider_config.allowed_requests.embedding for custom providers), shows an error message when no embedding provider is configured, and disables the toggle accordingly.
    • Embedding model input replaced with ModelMultiselect (single-select mode) scoped to the selected provider.
    • Provider dropdown clears the embedding model when the provider changes.
    • Provider icons rendered in the provider dropdown.
    • EmbeddingSupportedProviders constant added to ui/lib/constants/logs.ts.
  • Misc:

    • HTTP request logging in CorsMiddleware and an auth debug log are commented out.
    • transports/bifrost-http/v1.5.x added to .gitignore.
    • Minor formatting fixes in core/schemas/bifrost.go and framework/modelcatalog/sync.go.
    • Missing newline at end of sync.go added.

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (React)
  • Docs

How to test

# Core/Transports
go test ./plugins/semanticcache/...
go test ./framework/logstore/...
go test ./transports/bifrost-http/...

# UI
cd ui
pnpm i
pnpm build
  • Configure the semantic cache plugin with a direct and/or semantic cache type and verify that cache hits are recorded with the correct hit_type in cache_debug.
  • Query /logs?cache_hit_types=direct and /logs?cache_hit_types=semantic and confirm only matching entries are returned.
  • In the UI, open the logs filter sidebar and verify the "Local Caching" section appears with "Direct cache" and "Semantic cache" checkboxes that correctly filter the log list.
  • Open a log detail for a cache hit and confirm the appropriate badge ("Direct Cache" or "Semantic Cache") is displayed.
  • In the plugins form, verify that only embedding-capable providers appear in the provider dropdown and that the embedding model field uses the model multiselect.

Breaking changes

  • Yes

The public semantic cache context key names have changed from semantic_cache_* to semantic_cache-*. Any caller setting CacheKey, CacheTTLKey, CacheThresholdKey, CacheTypeKey, or CacheNoStoreKey via the old string values will no longer be recognized by the plugin. Update all call sites to use the exported constants from the plugin package rather than raw string literals.

request_hash is no longer stored as a top-level metadata field in the vector store. Existing cache entries written by prior versions will not be found by the new direct-search path (they will be treated as misses and re-populated).

ClearCacheForRequestID is documented as currently broken for entries written by the new direct-search path; callers should not rely on it until the TODO is resolved.

Related issues

N/A

Security considerations

The CacheHitTypes filter allowlists values to "direct" and "semantic" before interpolating them into SQL, preventing arbitrary input from reaching the JSON path expression.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Copy Markdown
Collaborator Author

Pratham-Mishra04 commented May 4, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #3210

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Log filtering by cache hit type (direct/semantic) with URL state; Direct/Semantic badges and separate request/cache ID rows; prefer cache-hit latency in logs.
    • New "Local Caching" filter UI, storage & cache-key settings, and TTL (ttl) in cache config/UI.
    • Cache-clear endpoint now accepts cache ID for targeted deletion.
  • Bug Fixes

    • Semantic-cache UI restricts provider/model selection to embedding-capable providers.
  • Documentation

    • Docs and migration guide updated for cache-ID invalidation and cache-debug on misses.

Walkthrough

Adds cache-hit-type filtering end-to-end, converts cache-clear APIs to use storage cache IDs, and implements a large refactor of the semantic-cache plugin: deterministic direct IDs, per-request state with cleanup, rewritten streaming accumulation/replay, deterministic hashing/metadata, unified async writes, TTL parsing, and broad test and UI/schema updates.

Changes

End-to-end Cache-Hit Filtering & Logs

Layer / File(s) Summary
Data Shape
framework/logstore/tables.go
SearchFilters adds CacheHitTypes []string (json:"cache_hit_types,omitempty").
Query Logic
framework/logstore/rdb.go
applyFilters implements CacheHitTypes filtering (allowlist direct,semantic) with dialect-specific cache_debug.hit_type predicates; listSelectColumns adds cache_debug.
Mat-View Eligibility
framework/logstore/matviews.go
canUseMatViewFilters disallows mat-view path when len(f.CacheHitTypes) > 0.
HTTP Parsing
transports/bifrost-http/handlers/logging.go
Parses cache_hit_types query param into filters.CacheHitTypes for getLogs, getLogsStats, and histogram parsing.
API & Frontend Types
ui/lib/types/logs.ts, ui/lib/store/apis/logsApi.ts
LogFilters gets cache_hit_types?: string[]; buildFilterParams emits cache_hit_types as comma-joined param.
URL & Page Wiring
ui/app/workspace/logs/page.tsx
Adds cache_hit_types to URL state and includes it in derived filters persisted to URL.
Filter UI & Detail View
ui/components/filters/logsFilterSidebar.tsx, ui/app/workspace/logs/sheets/logDetailView.tsx
Adds Local Caching filter (Direct/Semantic checkboxes) and renders cache-type badges and cache-id/rows in log detail.
Constants
ui/lib/constants/logs.ts
Adds EmbeddingSupportedProviders used for embedding-capable provider gating.

Semantic Cache Plugin Refactor

Layer / File(s) Summary
State & Lifecycle
plugins/semanticcache/state.go, plugins/semanticcache/main.go
Introduces internal cacheState per-request, create/get/clear helpers, background cleanup loop (runCacheStateCleanupLoop), stopCh, writersWg, cleanupWg, and graceful shutdown handling.
Request Prep, Context API
plugins/semanticcache/main.go, plugins/semanticcache/utils.go, core/schemas/context.go
PreLLMHook anchors per-request state on BifrostContextKeyRequestID, resolves cache key/params hash, centralizes embedding/input handling; BifrostContext.Root() added for async-safe root context capture.
Deterministic Hashing & IDs
plugins/semanticcache/utils.go, plugins/semanticcache/search.go
Adds deterministic hashing helpers (hashSortedSet, hashMap), stable input normalization, generateRequestHash, and generateDirectCacheID using a fixed namespace.
Direct Lookup Flow
plugins/semanticcache/search.go
performDirectSearch now computes deterministic directCacheID, stores it on state, and performs single-ID GetChunk lookups (O(1)).
Semantic Lookup Flow
plugins/semanticcache/search.go
performSemanticSearch accepts paramsHash, generates embedding, stores embedding + inputTokens on state, builds strict vector filters (cache_key, params_hash, plugin marker, optional provider/model), and queries GetNearest(..., topK=1).
Response Construction & TTL
plugins/semanticcache/search.go
buildResponseFromResult validates properties, handles expiry via isExpiredEntry (async deletion on expiry), computes similarity, and builds streaming or non-streaming short-circuited responses; stampCacheDebugForHit centralizes hit telemetry including hit latency.
Streaming Accumulation & Replay
plugins/semanticcache/stream.go, plugins/semanticcache/utils.go
Adds chunkSortKey, creates/initializes accumulators inline with LastSeenAt, addStreamChunk only appends and refreshes LastSeenAt, processAccumulatedStream sorts and marshals non-nil chunks, parseStreamChunks accepts multiple backend encodings; reaper uses LastSeenAt and runStreamCleanupLoop.
Writes & Unified Metadata
plugins/semanticcache/utils.go, plugins/semanticcache/main.go
Unifies metadata via buildUnifiedMetadata (removes request_hash), adds addNonStreamingResponse/stream write paths, stamps miss telemetry (stampCacheDebugForMiss), and performs async writes with per-entry TTL resolved from config or CacheTTLKey.
Config JSON & Validation
plugins/semanticcache/main.go
Custom Config.UnmarshalJSON accepts ttl as either duration string or numeric seconds and rejects negative TTLs.
API & Handler Changes
transports/bifrost-http/handlers/cache.go, docs/*, docs/openapi/*
Cache-clear route changed to /api/cache/clear/{cacheId}; handler now uses cacheClearer interface and calls ClearCacheForCacheID; OpenAPI/docs/migration guide updated.
VectorStore Validation
framework/vectorstore/weaviate.go
Adds validateClassName to ensure Weaviate class names start with uppercase ASCII when non-empty.
Telemetry & Log Integration
plugins/logging/main.go, plugins/logging/operations.go
Logging prefers CacheHitLatency for latency when present; streaming stop-reason kept applied before content logging.
Tests & Test Helpers
plugins/semanticcache/*_test.go, plugins/semanticcache/test_utils.go, transports/bifrost-http/handlers/cache_test.go
Extensive test additions/rewrites: TestMain cleanup, many new unit/integration tests (clear-by-cache-id/key, miss/hit stamping, cleanup/reaping, replay cancellation, embedding-format conversions, concurrency tests), observable-store test double, test helpers now accept testing.TB and generate per-test cache keys and shared SharedTestNamespace; many tests switched to t.Parallel() and use t.Skipf on upstream failures.
Public API Changes
plugins/semanticcache/main.go, transports/bifrost-http/handlers/cache.go
Removed ClearCacheForRequestID; added ClearCacheForCacheID; cache handler plugin field typed to cacheClearer interface; context key constants renamed to hyphenated strings.
UI Config Shape
ui/lib/types/config.ts, ui/lib/types/schemas.ts, ui/app/workspace/config/views/pluginsForm.tsx
Renames ttl_secondsttl, adds vector_store_namespace and default_cache_key; UI gates semantic cache by embedding-capable providers, uses ModelMultiselect, exposes storage & cache-key fields, and migrates legacy TTL.
Miscellaneous
multiple files
Whitespace/formatting edits and small middleware logging removals; .gitignore adds transports/bifrost-http/v1.5.x.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Plugin as SemanticCache Plugin
    participant Embedding as Embedding Executor
    participant Store as Vector/Chunk Store

    Client->>Plugin: Request (PreLLMHook)
    Plugin->>Plugin: resolve cache_key, paramsHash, create cacheState
    alt direct lookup
        Plugin->>Store: GetChunk(directCacheID)
        Store-->>Plugin: chunk / miss
    else semantic lookup
        Plugin->>Embedding: generate embedding
        Embedding-->>Plugin: embedding + inputTokens
        Plugin->>Store: GetNearest(filters: cache_key, params_hash, plugin_marker, maybe provider/model)
        Store-->>Plugin: nearest result / miss
    end
    Plugin-->>Client: short-circuit response if hit
    Client->>Plugin: Upstream provider result (PostLLMHook)
    Plugin->>Plugin: shouldSkipCaching? resolve storageID & embedding-write eligibility
    Plugin->>Store: async Add (stream or non-stream) with unified metadata
    Store-->>Plugin: ack
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🐰 "I hashed and hopped through streams and keys,
I sorted the chunks beneath the trees,
Direct or semantic, now they sing,
Cache IDs dance — a little spring,
A rabbit's cheer for smaller pleas!" 🥕✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch 04-30-fix_semantic_cache_fixes

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 4, 2026

Confidence Score: 4/5

Safe to merge with awareness that cacheState for cache-hit requests is not cleared until the 60-minute reaper fires, and that cacheState uses CreatedAt rather than LastSeenAt as its reaper key.

The refactoring is thorough and the new cacheState struct is a clear improvement over stringly-typed context keys. The cacheState reaper uses CreatedAt rather than LastSeenAt, meaning a very long-running streaming request (>60 min) could have its state evicted mid-flight, silently skipping the final cache write. Neither issue corrupts data or poses a security risk; both result in cache misses at worst.

plugins/semanticcache/state.go (cacheState CreatedAt reaper key), plugins/semanticcache/main.go (cacheState not cleared on cache-hit path)

Important Files Changed

Filename Overview
plugins/semanticcache/main.go Major refactor: replaces dual-path direct lookup with single O(1) GetChunk, moves per-request state from context keys to cacheState struct, introduces background cleanup loops. cacheState is never cleared on cache-hit paths (existing issue from prior review), the provider/semantic-search mismatch on fallback routing also persists.
plugins/semanticcache/state.go New file encapsulating per-request cacheState with background cleanup loop. Uses CreatedAt for reaper cutoff (vs LastSeenAt used by StreamAccumulator), which could prematurely evict state for very long-running streams.
plugins/semanticcache/stream.go Simplifies chunk sorting via chunkSortKey helper, updates accumulator reaper to use LastSeenAt (correct), adds ctx.Done() guard in replay goroutine to prevent goroutine leaks. Logic is sound.
plugins/semanticcache/search.go Consolidates dual direct-search path into single performDirectSearch; adds stamping helpers for cache debug. generateDirectCacheID now errors instead of silently falling back. Logic is clean.
plugins/semanticcache/utils.go Large refactor of hash utilities, metadata builders, embedding converters. hashSortedSet / sortedStringSet fix MCP random iteration. buildUnifiedMetadata drops requestHash. Overall cleaner.
framework/logstore/rdb.go Adds CacheHitTypes SQL filter with proper allowlisting (only "direct"/"semantic") for both SQLite and PostgreSQL dialects; adds cache_debug to listSelectColumns. Safe implementation.
transports/bifrost-http/handlers/cache.go Route renamed from /requestId to /cacheId; ClearCacheForRequestID replaced with ClearCacheForCacheID; interface extracted for testability. Clean change.

Reviews (4): Last reviewed commit: "fix: semantic cache fixes" | Re-trigger Greptile

Comment thread plugins/semanticcache/plugin_cache_type_test.go Outdated
Comment thread transports/bifrost-http/handlers/middlewares.go
Comment thread plugins/semanticcache/main.go Outdated
Comment thread plugins/semanticcache/search.go
Comment thread plugins/semanticcache/utils.go Fixed
Comment thread plugins/semanticcache/utils.go Fixed
Comment thread plugins/semanticcache/utils.go Fixed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ui/app/workspace/config/views/pluginsForm.tsx (1)

249-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Allow disabling an already-enabled plugin even when no embedding providers exist.

Line 249 disables the toggle when embeddingProviders.length === 0, which also blocks turning the plugin OFF if it is currently enabled.

Proposed fix
- disabled={!isVectorStoreEnabled || providersLoading || embeddingProviders.length === 0}
+ disabled={!isVectorStoreEnabled || providersLoading || (embeddingProviders.length === 0 && !originalCacheEnabled)}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/config/views/pluginsForm.tsx` around lines 249 - 253, The
toggle is currently disabled whenever embeddingProviders.length === 0, which
prevents turning an already-enabled plugin off; change the disabled prop to only
block enabling (e.g., disabled={providersLoading || (!isVectorStoreEnabled &&
embeddingProviders.length === 0)}) and add a guard in onCheckedChange to ignore
attempts to enable when embeddingProviders.length === 0 (e.g., if (checked &&
embeddingProviders.length === 0) return;) before calling
handleSemanticCacheToggle(checked), leaving isVectorStoreEnabled,
providersLoading, embeddingProviders, onCheckedChange, and
handleSemanticCacheToggle as the referenced symbols to locate and update.
🧹 Nitpick comments (2)
transports/bifrost-http/handlers/middlewares.go (1)

52-79: ⚡ Quick win

Remove the now-dead skip/goto branch in CorsMiddleware.

After disabling request-completion logging, this block is a no-op: the skip-path check and goto corsFlow no longer change behavior. Please delete the dead control-flow and commented logger block to keep this middleware readable.

♻️ Suggested cleanup
-			// startTime := time.Now()
-			// skip logging if it's a /health check request
-			if slices.IndexFunc(loggingSkipPaths, func(path string) bool {
-				return strings.HasPrefix(string(ctx.RequestURI()), path)
-			}) != -1 {
-				goto corsFlow
-			}
-			// defer func() {
-			// 	...
-			// }()
-		corsFlow:
+			// Request-completion logging intentionally disabled.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/middlewares.go` around lines 52 - 79, In
CorsMiddleware, remove the dead skip-path check and goto along with the entirely
commented-out deferred request-completion logging block: delete the
slices.IndexFunc(...) check referencing loggingSkipPaths and the goto corsFlow
label, remove the commented defer block that builds the request log (including
references to startTime, logger.LogHTTPRequest and schemas.* fields), and remove
the remaining corsFlow: label so the middleware has no dead control-flow or
commented legacy code.
ui/app/workspace/config/views/pluginsForm.tsx (1)

324-332: ⚡ Quick win

Add a test selector to the new embedding model control.

The new ModelMultiselect is interactive but has no data-testid, which makes E2E targeting inconsistent.

Proposed fix
 	<ModelMultiselect
 		inputId="embedding_model"
+		data-testid="semantic-cache-embedding-model-multiselect"
 		isSingleSelect
 		provider={cacheConfig.provider || undefined}
 		value={cacheConfig.embedding_model ?? ""}

As per coding guidelines: "Add data-testid to all new interactive elements in React components for E2E test compatibility" and use data-testid="<entity>-<element>-<qualifier>".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/config/views/pluginsForm.tsx` around lines 324 - 332, The
ModelMultiselect instance rendering the embedding model picker lacks a
data-testid; add a data-testid prop to the component (ModelMultiselect) using
the project's naming convention "<entity>-<element>-<qualifier>" so E2E tests
can target it reliably (e.g., use something like
data-testid="cacheconfig-embedding_model-input" or similar consistent name),
leaving the other props (inputId, provider, value, onChange, placeholder,
disabled) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@framework/logstore/rdb.go`:
- Around line 195-217: The code currently skips adding any predicate when
filters.CacheHitTypes is non-empty but none pass the allowlist, causing
unfiltered results; change the branch so that after building valid you check if
len(filters.CacheHitTypes) > 0 && len(valid) == 0 and in that case add a safe
always-false predicate to baseQuery (e.g. baseQuery = baseQuery.Where("1=0") or
dialect-appropriate FALSE) so queries like cache_hit_types=foo return no rows;
update the block around filters.CacheHitTypes, baseQuery, and the Dialector
check (and keep the existing regex/json extraction logic unchanged when valid is
non-empty).

In `@plugins/semanticcache/main.go`:
- Around line 710-724: ClearCacheForRequestID currently deletes using the raw
requestID but new entries use deterministic directCacheID via
resolveStorageIDAndEmbedding, so callers passing
schemas.BifrostContextKeyRequestID miss entries; update
Plugin.ClearCacheForRequestID to accept either a directCacheID or compute the
correct storage ID by calling resolveStorageIDAndEmbedding (or the helper that
produces directCacheID) for the given requestID and then call
plugin.store.Delete(ctx, plugin.config.VectorStoreNamespace, computedStorageID);
ensure logging uses the actual storage key deleted and preserve the
timeout/cancel pattern already present.
- Around line 285-286: WaitForPendingOperations() is blocked by the long-lived
runStreamCleanupLoop because Init increments plugin.waitGroup for that
goroutine; remove the forever-loop from the same wait group used to flush
pending writes. Concretely, stop calling plugin.waitGroup.Add(1) for
runStreamCleanupLoop in Init (or move that Add/Done into a separate
longLivedWaitGroup variable or into Cleanup()), so runStreamCleanupLoop is not
waited on by WaitForPendingOperations(); keep WaitForPendingOperations() waiting
only for short-lived write/flush goroutines so tests can call it to flush
pending cache writes.

In `@plugins/semanticcache/plugin_cache_type_test.go`:
- Around line 617-619: The tests still call the removed 3-argument
performDirectSearch; update the remaining call sites in
TestDefaultDirectSearchSetsStorageIDForDeterministicWrites and
TestPerformDirectSearchDisablesScanFallbackForLegacyLookup (also the calls
around lines 858-860) to match the new performDirectSearch signature in
plugins/semanticcache/search.go by adding the new parameter(s) and adapting to
any changed return values; ensure you pass the appropriate
context/req/plugin/storageID or options values that the new signature requires
and update the assertions to reflect the new outputs.

In `@plugins/semanticcache/stream.go`:
- Around line 62-70: The reaper currently uses the first chunk timestamp
(Chunks[0].Timestamp) to determine staleness, causing active streams to be
removed; update the StreamAccumulator struct to include a lastSeen time field,
set/update that field inside addStreamChunk (and when creating the accumulator),
and change the reaping logic (the code referenced around lines 141-150) to
compare streamAccumulatorMaxAge against accumulator.lastSeen instead of
Chunks[0].Timestamp so last activity, not the first chunk, controls expiry.

In `@ui/app/workspace/config/views/pluginsForm.tsx`:
- Around line 126-134: The effect that runs when embeddingProviders changes
currently forces provider to embeddingProviders[0].name and can overwrite a
user's in-progress selection; change the setCacheConfig call inside the effect
so it only sets the default provider when there is no existing provider (i.e.,
when prev.provider is null/undefined/empty) rather than unconditionally—inside
the setCacheConfig updater for the effect referencing embeddingProviders and
semanticCachePlugin?.config, short-circuit and return prev if prev.provider
already exists, otherwise populate provider with embeddingProviders[0].name and
fill embedding_model/dimension defaults.

---

Outside diff comments:
In `@ui/app/workspace/config/views/pluginsForm.tsx`:
- Around line 249-253: The toggle is currently disabled whenever
embeddingProviders.length === 0, which prevents turning an already-enabled
plugin off; change the disabled prop to only block enabling (e.g.,
disabled={providersLoading || (!isVectorStoreEnabled &&
embeddingProviders.length === 0)}) and add a guard in onCheckedChange to ignore
attempts to enable when embeddingProviders.length === 0 (e.g., if (checked &&
embeddingProviders.length === 0) return;) before calling
handleSemanticCacheToggle(checked), leaving isVectorStoreEnabled,
providersLoading, embeddingProviders, onCheckedChange, and
handleSemanticCacheToggle as the referenced symbols to locate and update.

---

Nitpick comments:
In `@transports/bifrost-http/handlers/middlewares.go`:
- Around line 52-79: In CorsMiddleware, remove the dead skip-path check and goto
along with the entirely commented-out deferred request-completion logging block:
delete the slices.IndexFunc(...) check referencing loggingSkipPaths and the goto
corsFlow label, remove the commented defer block that builds the request log
(including references to startTime, logger.LogHTTPRequest and schemas.* fields),
and remove the remaining corsFlow: label so the middleware has no dead
control-flow or commented legacy code.

In `@ui/app/workspace/config/views/pluginsForm.tsx`:
- Around line 324-332: The ModelMultiselect instance rendering the embedding
model picker lacks a data-testid; add a data-testid prop to the component
(ModelMultiselect) using the project's naming convention
"<entity>-<element>-<qualifier>" so E2E tests can target it reliably (e.g., use
something like data-testid="cacheconfig-embedding_model-input" or similar
consistent name), leaving the other props (inputId, provider, value, onChange,
placeholder, disabled) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dacf00c4-f836-45fb-a591-e2b29e8069ed

📥 Commits

Reviewing files that changed from the base of the PR and between 636c89d and 76e9d94.

📒 Files selected for processing (20)
  • .gitignore
  • core/schemas/bifrost.go
  • framework/logstore/matviews.go
  • framework/logstore/rdb.go
  • framework/logstore/tables.go
  • framework/modelcatalog/sync.go
  • plugins/semanticcache/main.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/search.go
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/utils.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/middlewares.go
  • ui/app/workspace/config/views/pluginsForm.tsx
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • ui/components/filters/logsFilterSidebar.tsx
  • ui/lib/constants/logs.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/logs.ts

Comment thread framework/logstore/rdb.go
Comment thread plugins/semanticcache/main.go Outdated
Comment thread plugins/semanticcache/main.go Outdated
Comment thread plugins/semanticcache/plugin_cache_type_test.go Outdated
Comment thread plugins/semanticcache/stream.go
Comment thread ui/app/workspace/config/views/pluginsForm.tsx
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-30-fix_semantic_cache_fixes branch from 76e9d94 to 2ee69fd Compare May 5, 2026 12:20
@Pratham-Mishra04 Pratham-Mishra04 requested a review from a team as a code owner May 5, 2026 12:20
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/semanticcache/plugin_edge_cases_test.go (1)

350-355: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a unique cache key per subtest here.

All content-variation cases reuse content-variations-test, and this loop never clears the store between subtests. That lets earlier semantic entries bleed into later cases, so a case may stop validating its own first write path.

Suggested fix
-			ctx := CreateContextWithCacheKey(t, "content-variations-test")
+			ctx := CreateContextWithCacheKey(t, "content-variations-test-"+tt.name)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/plugin_edge_cases_test.go` around lines 350 - 355, The
subtests reuse the same cache key ("content-variations-test") causing state
leakage between cases; update the context creation in the test loop so each
subtest gets a unique cache key (e.g., derive from tt.name or t.Name(), or
append a UUID/timestamp) when calling CreateContextWithCacheKey so each run uses
an isolated store and prevents earlier semantic entries from affecting later
cases.
🧹 Nitpick comments (5)
plugins/semanticcache/utils.go (1)

5-16: ⚡ Quick win

Switch the cache serialization path to sonic.

These marshal/unmarshal calls are on the cache read/write path, so they should use the repo-standard hot-path JSON library instead of encoding/json.

Suggested fix
-	"encoding/json"
+	sonic "github.com/bytedance/sonic"
-	responseData, err := json.Marshal(res)
+	responseData, err := sonic.Marshal(res)
-		if err := json.Unmarshal([]byte(v), &stringArray); err != nil {
+		if err := sonic.Unmarshal([]byte(v), &stringArray); err != nil {

As per coding guidelines **/*.go: Use github.com/bytedance/sonic for JSON marshaling in hot paths; use encoding/json for custom marshaling in schemas.

Also applies to: 433-445, 495-520

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/utils.go` around lines 5 - 16, Replace the hot-path
JSON usage of encoding/json in plugins/semanticcache/utils.go with the
repo-standard fast library: remove the "encoding/json" import and add
"github.com/bytedance/sonic" to the imports, then change all
json.Marshal/json.Unmarshal calls in this file (including the cache read/write
sites around the ranges mentioned: roughly lines 433-445 and 495-520) to use
sonic.Marshal/sonic.Unmarshal instead; leave any custom schema marshaling that
intentionally uses encoding/json (schemas package) unchanged.
plugins/semanticcache/plugin_nil_content_test.go (1)

91-92: ⚡ Quick win

Assert the nil-content path stays error-free.

This test currently passes even if extractTextForEmbedding starts returning an error. Since the regression you care about is “nil content should be handled safely,” please fail the test when err != nil.

💡 Tighten the assertion
 			text, err := plugin.extractTextForEmbedding(nil, tt.request)
-			t.Logf("text=%q, err=%v", text, err)
+			if err != nil {
+				t.Fatalf("extractTextForEmbedding returned error: %v", err)
+			}
+			t.Logf("text=%q", text)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/plugin_nil_content_test.go` around lines 91 - 92, The
test must fail if extractTextForEmbedding returns an error for the nil-content
case: after calling plugin.extractTextForEmbedding(nil, tt.request) (symbols:
extractTextForEmbedding, plugin, tt.request) assert that err is nil and fail the
test when it's not (e.g. use t.Fatalf or require.NoError) before proceeding to
inspect text so the nil-content regression is caught.
transports/bifrost-http/handlers/cache.go (1)

55-69: 💤 Low value

Minor validation inconsistency with clearCache handler.

The clearCacheByKey handler validates the type assertion (!ok) but doesn't check for empty string like clearCache does on line 41. Consider adding an empty string check for consistency, though an empty cache key is unlikely to match anything meaningful.

♻️ Optional: Add empty string validation for consistency
 func (h *CacheHandler) clearCacheByKey(ctx *fasthttp.RequestCtx) {
 	cacheKey, ok := ctx.UserValue("cacheKey").(string)
-	if !ok {
+	if !ok || cacheKey == "" {
 		SendError(ctx, fasthttp.StatusBadRequest, "Invalid cache key")
 		return
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@transports/bifrost-http/handlers/cache.go` around lines 55 - 69, The
clearCacheByKey handler currently only type-asserts cacheKey but doesn't reject
empty strings; update clearCacheByKey to mirror clearCache by checking cacheKey
== "" after the type assertion and call SendError(ctx,
fasthttp.StatusBadRequest, "Invalid cache key") if empty, before invoking
h.plugin.ClearCacheForKey(cacheKey), leaving SendError and SendJSON calls
unchanged.
transports/bifrost-http/handlers/cache_test.go (2)

78-89: ⚡ Quick win

Assert no plugin call when cacheId user value is missing.

This test checks status code only; add a call-count assertion to prevent regressions where the handler still invokes clearing logic on invalid input.

Proposed test hardening
 func TestClearCache_MissingUserValue(t *testing.T) {
 	clearer := &fakeCacheClearer{}
 	h := &CacheHandler{plugin: clearer}
@@
 	if got := ctx.Response.StatusCode(); got != fasthttp.StatusBadRequest {
 		t.Fatalf("expected 400 when cacheId user value missing, got %d", got)
 	}
+	if len(clearer.idCalls) != 0 {
+		t.Fatalf("expected no Clear calls when cacheId is missing, got %v", clearer.idCalls)
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@transports/bifrost-http/handlers/cache_test.go` around lines 78 - 89, Add an
assertion that the fake cache clearer was not invoked when the user value
"cacheId" is missing: after calling CacheHandler.clearCache (in
TestClearCache_MissingUserValue) assert that the fakeCacheClearer call-count
field (e.g., callCount / clearedCalls on the fakeCacheClearer) is zero so the
test fails if CacheHandler.clearCache still calls clearer.Clear...; if the fake
doesn't already expose a counter, add one to fakeCacheClearer and increment it
in its Clear method, then assert it remains 0.

112-139: ⚡ Quick win

Add negative-path coverage for clearCacheByKey (empty/missing key).

clearCache already guards invalid path values, but clearCacheByKey tests currently miss equivalent bad-input assertions. Adding them improves contract symmetry and catches accidental plugin calls.

Proposed additional tests
+func TestClearCacheByKey_RejectsEmptyKey(t *testing.T) {
+	clearer := &fakeCacheClearer{}
+	h := &CacheHandler{plugin: clearer}
+
+	ctx := newCacheCtx("cacheKey", "")
+	h.clearCacheByKey(ctx)
+
+	if got := ctx.Response.StatusCode(); got != fasthttp.StatusBadRequest {
+		t.Fatalf("expected 400 for empty key, got %d", got)
+	}
+	if len(clearer.keyCalls) != 0 {
+		t.Fatalf("expected no Clear calls on bad key, got %v", clearer.keyCalls)
+	}
+}
+
+func TestClearCacheByKey_MissingUserValue(t *testing.T) {
+	clearer := &fakeCacheClearer{}
+	h := &CacheHandler{plugin: clearer}
+
+	ctx := &fasthttp.RequestCtx{}
+	h.clearCacheByKey(ctx)
+
+	if got := ctx.Response.StatusCode(); got != fasthttp.StatusBadRequest {
+		t.Fatalf("expected 400 when cacheKey user value missing, got %d", got)
+	}
+	if len(clearer.keyCalls) != 0 {
+		t.Fatalf("expected no Clear calls when cacheKey is missing, got %v", clearer.keyCalls)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@transports/bifrost-http/handlers/cache_test.go` around lines 112 - 139, Add a
new test that covers the negative path for CacheHandler.clearCacheByKey: create
a fakeCacheClearer and CacheHandler as in TestClearCacheByKey_OK, call
h.clearCacheByKey with a context made via newCacheCtx using an empty string (or
missing) key, assert that ctx.Response.StatusCode() is fasthttp.StatusBadRequest
(or the handler's validation status) and assert that the fake clearer's keyCalls
slice remains empty (i.e., ClearCacheForKey was not invoked); place this
alongside TestClearCacheByKey_OK and TestClearCacheByKey_PluginErrorReturns500
to ensure symmetry of input validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/schemas/context.go`:
- Around line 153-158: The Root method currently only returns a single level of
valueDelegate which leaves intermediate pooled scopes in the chain; modify
BifrostContext.Root to traverse/unwarp the delegation chain by following
valueDelegate repeatedly until nil (e.g., loop while bc.valueDelegate != nil
updating bc = bc.valueDelegate) and then return the ultimate root; update the
method on type BifrostContext (Root) so it always resolves to the real request
root even when multiple scoped contexts were derived.

In `@plugins/semanticcache/plugin_api_test.go`:
- Around line 352-358: The test currently does a blocking receive <-sc.Stream
before calling ctx.Cancel(), which can hang if the first chunk never arrives;
change that receive to a guarded receive using a timeout (e.g. select between
sc.Stream and time.After or a context with timeout) so the test fails fast if no
first chunk is produced, then call ctx.Cancel() and proceed to the existing
drain/timeout logic; reference the sc.Stream receive and ctx.Cancel() locations
when making the change.

In `@plugins/semanticcache/plugin_core_test.go`:
- Around line 14-16: The test TestSemanticCacheBasicFunctionality is
timing-sensitive and should not run in parallel; remove the t.Parallel() call
(the parallelization invoked in TestSemanticCacheBasicFunctionality) so the test
runs serially and avoids amplified CPU/network contention that makes the ratio
assertion flaky in CI.

In `@plugins/semanticcache/plugin_embedding_test.go`:
- Around line 84-86: The test is supposed to exercise the "no cache key" path
but calling CreateContextWithCacheKey(t, "") still sets CacheKey (derived from
t.Name()); replace that call so the context truly has no CacheKey — e.g. use
context.Background() (or a newly added helper like CreateContextWithoutCacheKey)
in the test instead of CreateContextWithCacheKey(t, "") so the CacheKey field
remains unset and the uncached path is exercised.

In `@plugins/semanticcache/plugin_responses_test.go`:
- Around line 307-308: Replace the inconsistent context creation in the "no
cache key" test by using newBaseTestContext() instead of
CreateContextWithCacheKey(t, ""); locate the test that currently calls
CreateContextWithCacheKey(t, "") and change it to newBaseTestContext() so the
test uses an explicit base (no-key) context and avoids accidentally supplying a
cache key.

In `@plugins/semanticcache/search.go`:
- Around line 342-365: The loop currently unmarshals each chunk (json.Unmarshal
into cachedResponse) and skips malformed ones, which can truncate the stream and
skip stampCacheDebugForHit; instead, pre-validate the entire streamArray before
replay: iterate over all chunkStr entries, attempt json.Unmarshal into a
temporary struct (and ensure ExtraFields/RequestType are present), and if any
unmarshal/validation fails treat it as a cache miss (log the error and abort
replay) so that plugin.stampCacheDebugForHit
(plugin.stampCacheDebugForHit(state, cachedResponse.GetExtraFields(), result.ID,
...)) and downstream final-chunk handling are guaranteed to run only for
fully-valid cached streams.
- Around line 167-179: The switch on embedding currently handles EmbeddingStr,
EmbeddingArray and Embedding2DArray but omits EmbeddingInt8Array and
EmbeddingInt32Array; add cases for embedding.EmbeddingInt8Array and
embedding.EmbeddingInt32Array that convert those numeric arrays to []float32
using helper functions (e.g. toFloat32EmbeddingFromInt8 and
toFloat32EmbeddingFromInt32) and return the converted slice along with
inputTokens and nil error so those formats no longer fall through to the final
error branch.

---

Outside diff comments:
In `@plugins/semanticcache/plugin_edge_cases_test.go`:
- Around line 350-355: The subtests reuse the same cache key
("content-variations-test") causing state leakage between cases; update the
context creation in the test loop so each subtest gets a unique cache key (e.g.,
derive from tt.name or t.Name(), or append a UUID/timestamp) when calling
CreateContextWithCacheKey so each run uses an isolated store and prevents
earlier semantic entries from affecting later cases.

---

Nitpick comments:
In `@plugins/semanticcache/plugin_nil_content_test.go`:
- Around line 91-92: The test must fail if extractTextForEmbedding returns an
error for the nil-content case: after calling
plugin.extractTextForEmbedding(nil, tt.request) (symbols:
extractTextForEmbedding, plugin, tt.request) assert that err is nil and fail the
test when it's not (e.g. use t.Fatalf or require.NoError) before proceeding to
inspect text so the nil-content regression is caught.

In `@plugins/semanticcache/utils.go`:
- Around line 5-16: Replace the hot-path JSON usage of encoding/json in
plugins/semanticcache/utils.go with the repo-standard fast library: remove the
"encoding/json" import and add "github.com/bytedance/sonic" to the imports, then
change all json.Marshal/json.Unmarshal calls in this file (including the cache
read/write sites around the ranges mentioned: roughly lines 433-445 and 495-520)
to use sonic.Marshal/sonic.Unmarshal instead; leave any custom schema marshaling
that intentionally uses encoding/json (schemas package) unchanged.

In `@transports/bifrost-http/handlers/cache_test.go`:
- Around line 78-89: Add an assertion that the fake cache clearer was not
invoked when the user value "cacheId" is missing: after calling
CacheHandler.clearCache (in TestClearCache_MissingUserValue) assert that the
fakeCacheClearer call-count field (e.g., callCount / clearedCalls on the
fakeCacheClearer) is zero so the test fails if CacheHandler.clearCache still
calls clearer.Clear...; if the fake doesn't already expose a counter, add one to
fakeCacheClearer and increment it in its Clear method, then assert it remains 0.
- Around line 112-139: Add a new test that covers the negative path for
CacheHandler.clearCacheByKey: create a fakeCacheClearer and CacheHandler as in
TestClearCacheByKey_OK, call h.clearCacheByKey with a context made via
newCacheCtx using an empty string (or missing) key, assert that
ctx.Response.StatusCode() is fasthttp.StatusBadRequest (or the handler's
validation status) and assert that the fake clearer's keyCalls slice remains
empty (i.e., ClearCacheForKey was not invoked); place this alongside
TestClearCacheByKey_OK and TestClearCacheByKey_PluginErrorReturns500 to ensure
symmetry of input validation.

In `@transports/bifrost-http/handlers/cache.go`:
- Around line 55-69: The clearCacheByKey handler currently only type-asserts
cacheKey but doesn't reject empty strings; update clearCacheByKey to mirror
clearCache by checking cacheKey == "" after the type assertion and call
SendError(ctx, fasthttp.StatusBadRequest, "Invalid cache key") if empty, before
invoking h.plugin.ClearCacheForKey(cacheKey), leaving SendError and SendJSON
calls unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9762ab5e-8d60-4d60-b04f-6edf04c58a4a

📥 Commits

Reviewing files that changed from the base of the PR and between 76e9d94 and 2ee69fd.

📒 Files selected for processing (53)
  • .claude/skills/docs-writer/SKILL.md
  • .gitignore
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • docs/features/semantic-caching.mdx
  • docs/migration-guides/v1.5.0.mdx
  • docs/openapi/openapi.json
  • docs/openapi/openapi.yaml
  • docs/openapi/paths/management/cache.yaml
  • framework/logstore/matviews.go
  • framework/logstore/rdb.go
  • framework/logstore/tables.go
  • framework/modelcatalog/sync.go
  • framework/vectorstore/weaviate.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/semanticcache/main.go
  • plugins/semanticcache/main_test.go
  • plugins/semanticcache/plugin_api_test.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_conversation_config_test.go
  • plugins/semanticcache/plugin_core_test.go
  • plugins/semanticcache/plugin_cross_cache_test.go
  • plugins/semanticcache/plugin_default_cache_key_test.go
  • plugins/semanticcache/plugin_edge_cases_test.go
  • plugins/semanticcache/plugin_embedding_test.go
  • plugins/semanticcache/plugin_image_generation_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_nil_content_test.go
  • plugins/semanticcache/plugin_no_store_test.go
  • plugins/semanticcache/plugin_normalization_test.go
  • plugins/semanticcache/plugin_paths_test.go
  • plugins/semanticcache/plugin_responses_test.go
  • plugins/semanticcache/plugin_streaming_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/semanticcache/search.go
  • plugins/semanticcache/state.go
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/utils.go
  • transports/bifrost-http/handlers/cache.go
  • transports/bifrost-http/handlers/cache_test.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/middlewares.go
  • ui/app/workspace/config/views/pluginsForm.tsx
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • ui/components/filters/logsFilterSidebar.tsx
  • ui/lib/constants/logs.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
💤 Files with no reviewable changes (1)
  • plugins/logging/operations.go
✅ Files skipped from review due to trivial changes (6)
  • .gitignore
  • ui/lib/types/logs.ts
  • transports/bifrost-http/handlers/logging.go
  • core/schemas/bifrost.go
  • .claude/skills/docs-writer/SKILL.md
  • transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • ui/lib/constants/logs.ts
  • ui/app/workspace/logs/page.tsx
  • framework/logstore/matviews.go
  • ui/components/filters/logsFilterSidebar.tsx
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • framework/modelcatalog/sync.go
  • ui/lib/store/apis/logsApi.ts
  • framework/logstore/tables.go
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/main.go

Comment thread core/schemas/context.go
Comment thread plugins/semanticcache/plugin_api_test.go
Comment thread plugins/semanticcache/plugin_core_test.go
Comment thread plugins/semanticcache/plugin_embedding_test.go Outdated
Comment thread plugins/semanticcache/plugin_responses_test.go Outdated
Comment thread plugins/semanticcache/search.go
Comment on lines 342 to +365
if err := json.Unmarshal([]byte(chunkStr), &cachedResponse); err != nil {
plugin.logger.Warn("%s Failed to unmarshal stream chunk %d, skipping: %v", PluginLoggerPrefix, i, err)
plugin.logger.Warn("Failed to unmarshal stream chunk %d, skipping: %v", i, err)
continue
}

// Ensure RequestType is set on every chunk so downstream consumers
// (logging, telemetry, etc.) correctly identify this as a streaming response.
// (logging, telemetry) correctly identify this as a streaming response.
if ef := cachedResponse.GetExtraFields(); ef != nil && ef.RequestType == "" {
ef.RequestType = req.RequestType
}

// Add cache debug to only the last chunk
if i == len(streamArray)-1 {
ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
extraFields := cachedResponse.GetExtraFields()
cacheDebug := schemas.BifrostCacheDebug{
CacheHit: true,
HitType: bifrost.Ptr(string(cacheType)),
CacheID: bifrost.Ptr(result.ID),
RequestedProvider: bifrost.Ptr(string(requestedProvider)),
RequestedModel: bifrost.Ptr(requestedModel),
}
if cacheType == CacheTypeSemantic {
cacheDebug.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
cacheDebug.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
cacheDebug.Threshold = &threshold
cacheDebug.Similarity = &similarity
cacheDebug.InputTokens = &inputTokens
} else {
cacheDebug.ProviderUsed = nil
cacheDebug.ModelUsed = nil
cacheDebug.Threshold = nil
cacheDebug.Similarity = nil
cacheDebug.InputTokens = nil
}
extraFields.CacheDebug = &cacheDebug
// stampCacheDebugForHit marks this chunk as the cache-hit final
// chunk; cache.PostLLMHook keys off CacheDebug.CacheHit=true to
// set BifrostContextKeyStreamEndIndicator on the root ctx
// synchronously (same goroutine as logging.PostLLMHook).
//
// We deliberately do NOT call ctx.Root().SetValue here. Doing
// so races against the receiver's PostLLMHook for the previous
// chunk: the cache replay can advance to iteration N (and
// write the indicator) while the receiver is still running
// PostLLMHooks for chunk N-1, poisoning that chunk's
// IsFinalChunk read and causing duplicate "final" events.
plugin.stampCacheDebugForHit(state, cachedResponse.GetExtraFields(), result.ID, requestedProvider, requestedModel, cacheType, threshold, similarity, inputTokens)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on malformed cached stream chunks.

Skipping a bad chunk here turns a corrupt cache entry into a truncated stream. If the malformed chunk is the final one, stampCacheDebugForHit never runs, so downstream final-chunk handling and log finalization can be missed entirely. Treat any decode failure as a cache miss by validating the full chunk list before starting replay.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/search.go` around lines 342 - 365, The loop currently
unmarshals each chunk (json.Unmarshal into cachedResponse) and skips malformed
ones, which can truncate the stream and skip stampCacheDebugForHit; instead,
pre-validate the entire streamArray before replay: iterate over all chunkStr
entries, attempt json.Unmarshal into a temporary struct (and ensure
ExtraFields/RequestType are present), and if any unmarshal/validation fails
treat it as a cache miss (log the error and abort replay) so that
plugin.stampCacheDebugForHit (plugin.stampCacheDebugForHit(state,
cachedResponse.GetExtraFields(), result.ID, ...)) and downstream final-chunk
handling are guaranteed to run only for fully-valid cached streams.

Comment thread plugins/semanticcache/main.go
Comment thread plugins/semanticcache/main.go
Comment thread plugins/semanticcache/main.go
Comment thread plugins/semanticcache/main.go
Comment thread plugins/semanticcache/stream.go
Comment thread plugins/semanticcache/utils.go
Comment thread plugins/semanticcache/utils.go
Comment thread plugins/semanticcache/utils.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-30-fix_semantic_cache_fixes branch from 2ee69fd to 59459da Compare May 5, 2026 15:29
Comment thread plugins/semanticcache/utils.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
plugins/semanticcache/search.go (1)

344-355: ⚠️ Potential issue | 🟠 Major

Still fail closed on malformed cached stream chunks.

Skipping a bad chunk turns a corrupt cache entry into a partial replay. If the malformed chunk is the last one, stampCacheDebugForHit never runs, so downstream final-chunk handling/log finalization can be skipped entirely. Abort replay on the first decode/validation failure and treat the entry as a cache miss instead.

Also applies to: 364-377

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/search.go` around lines 344 - 355, The current
goroutine that iterates streamArray and unmarshals into schemas.BifrostResponse
(inside the anonymous func that closes streamChan) silently skips malformed
chunks by logging via plugin.logger.Warn and continuing; change this to fail
closed: on the first json.Unmarshal or validation error for a chunk (in the loop
over streamArray) immediately log the error, abort the replay by closing
streamChan and returning (so no further chunks are sent), and ensure the cache
is treated as a miss by invoking the same failure path used elsewhere (so
stampCacheDebugForHit is not called); apply the same behavior to the later block
handling indices 364-377 to keep consistent fail-closed semantics.
🧹 Nitpick comments (3)
ui/app/workspace/config/views/pluginsForm.tsx (2)

98-98: 💤 Low value

Filename uses camelCase instead of PascalCase.

Per coding guidelines, React component files should use PascalCase for filenames (PluginsForm.tsx instead of pluginsForm.tsx). Since this is an existing file, this can be addressed in a separate housekeeping refactor.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ui/app/workspace/config/views/pluginsForm.tsx` at line 98, The file name uses
camelCase but React component files should be PascalCase; rename the file to
PluginsForm.tsx and update all imports/usages that reference the old filename
(e.g., any import of PluginsForm from "pluginsForm" or similar) to the new
PascalCase module name so the exported component PluginsForm continues to be
imported correctly throughout the codebase; ensure build/test pass after the
rename.

315-351: 💤 Low value

Consider adding data-testid attributes for E2E test compatibility.

The new provider Select and ModelMultiselect components lack data-testid attributes. Per coding guidelines, new interactive elements should include them (e.g., data-testid="semantic-cache-provider-select", data-testid="semantic-cache-embedding-model-select").

Based on learnings, this can be deferred to a dedicated PR that also updates related E2E tests.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ui/app/workspace/config/views/pluginsForm.tsx` around lines 315 - 351, The
Select for choosing provider (rendered with the
Select/SelectTrigger/SelectContent/SelectItem components) and the
ModelMultiselect (props: inputId="embedding_model", isSingleSelect, provider,
value, onChange) are missing data-testid attributes required for E2E tests; add
data-testid="semantic-cache-provider-select" to the provider Select
trigger/component and data-testid="semantic-cache-embedding-model-select" to the
ModelMultiselect (or its input wrapper) while preserving existing props and
behavior (updateCacheConfigLocal and cacheConfig usage remain unchanged).
plugins/semanticcache/plugin_core_test.go (1)

454-480: ⚡ Quick win

custom_ttl still doesn't exercise TTL behavior.

This branch only re-checks the generic direct-hit path, so a regression in resolveTTL / expires_at handling would still pass. Use a short TTL and assert the entry expires, or inspect the stored expires_at metadata directly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/plugin_core_test.go` around lines 454 - 480, The test's
custom_ttl branch only re-checks a direct cache hit and doesn't validate TTL
expiry; update the test for the "custom_ttl" tt case to actually exercise
resolveTTL/expires_at by configuring a short TTL (e.g., 1s) for that plugin,
then either: (A) after the first successful cached response (response2 from
setup.Client.ChatCompletionRequest), wait slightly longer than the short TTL and
issue a new ChatCompletionRequest and assert a cache miss (i.e., AssertCacheMiss
or that CacheDebug.CacheHit is false), or (B) directly read the stored cache
metadata for that request (inspect response2.ExtraFields.CacheDebug or the
underlying cache entry) and assert the expires_at value equals now + configured
TTL (and that it is within an acceptable delta). Modify the test helper or
fixture that builds the "custom_ttl" config and add the sleep+re-request or
metadata assertion to ensure resolveTTL/expires_at behavior is verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/semanticcache/main.go`:
- Around line 354-358: createCacheState is persisting cache state too early
(called before isConversationHistoryThresholdExceeded and before the ParamsHash
empty-path in PostLLMHook), causing stale cacheStates to accumulate for requests
that are never cacheable; change the flow so you do not persist the cacheState
up-front—either create a non-persisted temporary state or delay calling
plugin.createCacheState until after
plugin.isConversationHistoryThresholdExceeded returns false and after you verify
ParamsHash != "" (i.e., only persist the state when the request is actually
cacheable in PostLLMHook and related paths referenced around
createCacheState/isConversationHistoryThresholdExceeded).
- Around line 625-640: In stampCacheDebugForMiss, ensure all hit-only fields are
cleared when stamping a miss: set CacheHit=false and CacheID as already done,
and explicitly nil-out or reset HitType, Threshold, Similarity, CacheHitLatency,
ProviderUsed, ModelUsed and InputTokens on extraFields.CacheDebug so no previous
hit data leaks into miss responses; update the function (stampCacheDebugForMiss)
to clear those fields on cd (the local CacheDebug variable) before returning.

---

Duplicate comments:
In `@plugins/semanticcache/search.go`:
- Around line 344-355: The current goroutine that iterates streamArray and
unmarshals into schemas.BifrostResponse (inside the anonymous func that closes
streamChan) silently skips malformed chunks by logging via plugin.logger.Warn
and continuing; change this to fail closed: on the first json.Unmarshal or
validation error for a chunk (in the loop over streamArray) immediately log the
error, abort the replay by closing streamChan and returning (so no further
chunks are sent), and ensure the cache is treated as a miss by invoking the same
failure path used elsewhere (so stampCacheDebugForHit is not called); apply the
same behavior to the later block handling indices 364-377 to keep consistent
fail-closed semantics.

---

Nitpick comments:
In `@plugins/semanticcache/plugin_core_test.go`:
- Around line 454-480: The test's custom_ttl branch only re-checks a direct
cache hit and doesn't validate TTL expiry; update the test for the "custom_ttl"
tt case to actually exercise resolveTTL/expires_at by configuring a short TTL
(e.g., 1s) for that plugin, then either: (A) after the first successful cached
response (response2 from setup.Client.ChatCompletionRequest), wait slightly
longer than the short TTL and issue a new ChatCompletionRequest and assert a
cache miss (i.e., AssertCacheMiss or that CacheDebug.CacheHit is false), or (B)
directly read the stored cache metadata for that request (inspect
response2.ExtraFields.CacheDebug or the underlying cache entry) and assert the
expires_at value equals now + configured TTL (and that it is within an
acceptable delta). Modify the test helper or fixture that builds the
"custom_ttl" config and add the sleep+re-request or metadata assertion to ensure
resolveTTL/expires_at behavior is verified.

In `@ui/app/workspace/config/views/pluginsForm.tsx`:
- Line 98: The file name uses camelCase but React component files should be
PascalCase; rename the file to PluginsForm.tsx and update all imports/usages
that reference the old filename (e.g., any import of PluginsForm from
"pluginsForm" or similar) to the new PascalCase module name so the exported
component PluginsForm continues to be imported correctly throughout the
codebase; ensure build/test pass after the rename.
- Around line 315-351: The Select for choosing provider (rendered with the
Select/SelectTrigger/SelectContent/SelectItem components) and the
ModelMultiselect (props: inputId="embedding_model", isSingleSelect, provider,
value, onChange) are missing data-testid attributes required for E2E tests; add
data-testid="semantic-cache-provider-select" to the provider Select
trigger/component and data-testid="semantic-cache-embedding-model-select" to the
ModelMultiselect (or its input wrapper) while preserving existing props and
behavior (updateCacheConfigLocal and cacheConfig usage remain unchanged).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 849d029e-ae40-4d26-b642-e346b85f6365

📥 Commits

Reviewing files that changed from the base of the PR and between 2ee69fd and 59459da.

📒 Files selected for processing (55)
  • .claude/skills/docs-writer/SKILL.md
  • .gitignore
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • docs/features/semantic-caching.mdx
  • docs/migration-guides/v1.5.0.mdx
  • docs/openapi/openapi.json
  • docs/openapi/openapi.yaml
  • docs/openapi/paths/management/cache.yaml
  • framework/logstore/matviews.go
  • framework/logstore/rdb.go
  • framework/logstore/tables.go
  • framework/modelcatalog/sync.go
  • framework/vectorstore/weaviate.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/semanticcache/main.go
  • plugins/semanticcache/main_test.go
  • plugins/semanticcache/plugin_api_test.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_conversation_config_test.go
  • plugins/semanticcache/plugin_core_test.go
  • plugins/semanticcache/plugin_cross_cache_test.go
  • plugins/semanticcache/plugin_default_cache_key_test.go
  • plugins/semanticcache/plugin_edge_cases_test.go
  • plugins/semanticcache/plugin_embedding_test.go
  • plugins/semanticcache/plugin_image_generation_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_nil_content_test.go
  • plugins/semanticcache/plugin_no_mutation_test.go
  • plugins/semanticcache/plugin_no_store_test.go
  • plugins/semanticcache/plugin_normalization_test.go
  • plugins/semanticcache/plugin_paths_test.go
  • plugins/semanticcache/plugin_responses_test.go
  • plugins/semanticcache/plugin_streaming_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/semanticcache/search.go
  • plugins/semanticcache/state.go
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/utils.go
  • transports/bifrost-http/handlers/cache.go
  • transports/bifrost-http/handlers/cache_test.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/middlewares.go
  • ui/app/workspace/config/views/pluginsForm.tsx
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • ui/components/filters/logsFilterSidebar.tsx
  • ui/lib/constants/logs.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
💤 Files with no reviewable changes (1)
  • plugins/logging/operations.go
✅ Files skipped from review due to trivial changes (13)
  • .gitignore
  • plugins/logging/main.go
  • ui/lib/constants/logs.ts
  • transports/bifrost-http/handlers/logging.go
  • .claude/skills/docs-writer/SKILL.md
  • framework/modelcatalog/sync.go
  • framework/logstore/rdb.go
  • ui/lib/types/logs.ts
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • ui/app/workspace/logs/page.tsx
  • ui/lib/types/schemas.ts
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/plugin_no_store_test.go
🚧 Files skipped from review as they are similar to previous changes (25)
  • ui/lib/store/apis/logsApi.ts
  • docs/openapi/paths/management/cache.yaml
  • docs/migration-guides/v1.5.0.mdx
  • docs/openapi/openapi.json
  • transports/bifrost-http/handlers/middlewares.go
  • ui/lib/types/config.ts
  • core/schemas/context.go
  • plugins/semanticcache/plugin_streaming_test.go
  • plugins/semanticcache/state.go
  • plugins/semanticcache/plugin_image_generation_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/semanticcache/plugin_paths_test.go
  • docs/features/semantic-caching.mdx
  • ui/components/filters/logsFilterSidebar.tsx
  • plugins/semanticcache/plugin_api_test.go
  • plugins/semanticcache/plugin_normalization_test.go
  • framework/logstore/matviews.go
  • docs/openapi/openapi.yaml
  • plugins/semanticcache/plugin_embedding_test.go
  • plugins/semanticcache/plugin_default_cache_key_test.go
  • plugins/semanticcache/utils.go
  • plugins/semanticcache/plugin_nil_content_test.go
  • transports/bifrost-http/handlers/cache.go
  • plugins/semanticcache/test_utils.go
  • transports/bifrost-http/handlers/cache_test.go

Comment on lines +354 to 358
// Create state up front so a reused/retried request ID never inherits stale fields.
state := plugin.createCacheState(requestID)

if plugin.isConversationHistoryThresholdExceeded(state, req) {
return req, nil, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Defer storing cacheState until the request is actually cacheable.

createCacheState() persists state before the threshold/metadata/hash skip paths. Those requests later hit the ParamsHash == "" early return in PostLLMHook, so their state is never cleared on the normal request path and only disappears when the background reaper runs. On high-volume uncached traffic, that becomes avoidable cacheStates growth.

Also applies to: 364-374

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/main.go` around lines 354 - 358, createCacheState is
persisting cache state too early (called before
isConversationHistoryThresholdExceeded and before the ParamsHash empty-path in
PostLLMHook), causing stale cacheStates to accumulate for requests that are
never cacheable; change the flow so you do not persist the cacheState
up-front—either create a non-persisted temporary state or delay calling
plugin.createCacheState until after
plugin.isConversationHistoryThresholdExceeded returns false and after you verify
ParamsHash != "" (i.e., only persist the state when the request is actually
cacheable in PostLLMHook and related paths referenced around
createCacheState/isConversationHistoryThresholdExceeded).

Comment on lines +625 to +640
func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
if isStream && !isFinalChunk {
return
}
if extraFields.CacheDebug == nil {
extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
}
cd := extraFields.CacheDebug
cd.CacheHit = false
cd.CacheID = bifrost.Ptr(storageID)
if state.EmbeddingsInputTokens > 0 {
inputTokens := state.EmbeddingsInputTokens
cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
cd.InputTokens = &inputTokens
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clear hit-only cache_debug fields when stamping misses.

This path only flips CacheHit=false and sets CacheID. If extraFields.CacheDebug is already populated, HitType, Threshold, Similarity, CacheHitLatency, and the requested/provider/model fields leak through, which makes the new cache_hit_types filtering and UI badges internally inconsistent on misses.

Suggested fix
  cd := extraFields.CacheDebug
  cd.CacheHit = false
  cd.CacheID = bifrost.Ptr(storageID)
+ cd.HitType = nil
+ cd.RequestedProvider = nil
+ cd.RequestedModel = nil
+ cd.CacheHitLatency = nil
+ cd.Threshold = nil
+ cd.Similarity = nil
+ cd.ProviderUsed = nil
+ cd.ModelUsed = nil
+ cd.InputTokens = nil
  if state.EmbeddingsInputTokens > 0 {
  	inputTokens := state.EmbeddingsInputTokens
  	cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
  	cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
  	cd.InputTokens = &inputTokens
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
if isStream && !isFinalChunk {
return
}
if extraFields.CacheDebug == nil {
extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
}
cd := extraFields.CacheDebug
cd.CacheHit = false
cd.CacheID = bifrost.Ptr(storageID)
if state.EmbeddingsInputTokens > 0 {
inputTokens := state.EmbeddingsInputTokens
cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
cd.InputTokens = &inputTokens
}
func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
if isStream && !isFinalChunk {
return
}
if extraFields.CacheDebug == nil {
extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
}
cd := extraFields.CacheDebug
cd.CacheHit = false
cd.CacheID = bifrost.Ptr(storageID)
cd.HitType = nil
cd.RequestedProvider = nil
cd.RequestedModel = nil
cd.CacheHitLatency = nil
cd.Threshold = nil
cd.Similarity = nil
cd.ProviderUsed = nil
cd.ModelUsed = nil
cd.InputTokens = nil
if state.EmbeddingsInputTokens > 0 {
inputTokens := state.EmbeddingsInputTokens
cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
cd.InputTokens = &inputTokens
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/main.go` around lines 625 - 640, In
stampCacheDebugForMiss, ensure all hit-only fields are cleared when stamping a
miss: set CacheHit=false and CacheID as already done, and explicitly nil-out or
reset HitType, Threshold, Similarity, CacheHitLatency, ProviderUsed, ModelUsed
and InputTokens on extraFields.CacheDebug so no previous hit data leaks into
miss responses; update the function (stampCacheDebugForMiss) to clear those
fields on cd (the local CacheDebug variable) before returning.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
plugins/semanticcache/main.go (2)

354-358: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Defer createCacheState until the request is confirmed cacheable.

createCacheState(requestID) is called at Line 355 before isConversationHistoryThresholdExceeded (Line 357). When the threshold check returns early, the state is created but never cleared on the normal request path — it only gets reaped by the background cleanup loop. Under high-volume traffic with many uncached requests, this causes avoidable cacheStates growth.

Consider moving state creation after the threshold check, or lazily creating it only when caching will proceed.

🛠️ Proposed fix sketch
 func (plugin *Plugin) PreLLMHook(ctx *schemas.BifrostContext, req *schemas.BifrostRequest) (*schemas.BifrostRequest, *schemas.LLMPluginShortCircuit, error) {
 	// ... cache key and request ID checks ...
 
 	if !isSemanticCacheSupportedRequestType(req.RequestType) {
 		return req, nil, nil
 	}
 
-	// Create state up front so a reused/retried request ID never inherits stale fields.
-	state := plugin.createCacheState(requestID)
-
-	if plugin.isConversationHistoryThresholdExceeded(state, req) {
+	if plugin.isConversationHistoryThresholdExceeded(req) {
 		return req, nil, nil
 	}
 
+	// Create state only after confirming the request is cacheable.
+	state := plugin.createCacheState(requestID)
+
 	performDirectSearch, performSemanticSearch := plugin.resolveCacheTypes(ctx)
 	// ... rest of the method ...

This requires updating isConversationHistoryThresholdExceeded to not depend on state, or passing a temporary/stack-local struct for the check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/main.go` around lines 354 - 358,
createCacheState(requestID) is being called before the cacheability check,
causing unused entries to accumulate in cacheStates when
isConversationHistoryThresholdExceeded(req) returns early; move or defer
creation of state by calling createCacheState only after
isConversationHistoryThresholdExceeded returns false (i.e., after the request is
confirmed cacheable), or change isConversationHistoryThresholdExceeded to accept
no state (or a temporary stack-local struct) so the threshold check can run
without mutating/allocating cache state; update call sites to
createCacheState(requestID) only when you will proceed to caching and ensure any
code paths that relied on state are adjusted accordingly.

625-641: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clear hit-only fields when stamping cache misses.

stampCacheDebugForMiss sets CacheHit=false and CacheID, but if extraFields.CacheDebug was previously populated (e.g., from an earlier hook or partial state), hit-only fields like HitType, Threshold, Similarity, CacheHitLatency, RequestedProvider, and RequestedModel may leak through. This can cause inconsistent cache_hit_types filtering and UI badges.

🛠️ Proposed fix
 func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
 	if isStream && !isFinalChunk {
 		return
 	}
 	if extraFields.CacheDebug == nil {
 		extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
 	}
 	cd := extraFields.CacheDebug
 	cd.CacheHit = false
 	cd.CacheID = bifrost.Ptr(storageID)
+	// Clear hit-only fields to prevent stale data from leaking on misses
+	cd.HitType = nil
+	cd.Threshold = nil
+	cd.Similarity = nil
+	cd.CacheHitLatency = nil
+	cd.RequestedProvider = nil
+	cd.RequestedModel = nil
 	if state.EmbeddingsInputTokens > 0 {
 		inputTokens := state.EmbeddingsInputTokens
 		cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
 		cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
 		cd.InputTokens = &inputTokens
+	} else {
+		cd.ProviderUsed = nil
+		cd.ModelUsed = nil
+		cd.InputTokens = nil
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/semanticcache/main.go` around lines 625 - 641, stampCacheDebugForMiss
must clear any hit-only fields on extraFields.CacheDebug to avoid leaking prior
hit data; update the function (stampCacheDebugForMiss) to explicitly zero or nil
out HitType, Threshold, Similarity, CacheHitLatency, RequestedProvider, and
RequestedModel on cd before setting CacheHit=false and CacheID, leaving the
existing embeddings-related assignments intact so only those hit-specific fields
are cleared for a miss.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@plugins/semanticcache/main.go`:
- Around line 354-358: createCacheState(requestID) is being called before the
cacheability check, causing unused entries to accumulate in cacheStates when
isConversationHistoryThresholdExceeded(req) returns early; move or defer
creation of state by calling createCacheState only after
isConversationHistoryThresholdExceeded returns false (i.e., after the request is
confirmed cacheable), or change isConversationHistoryThresholdExceeded to accept
no state (or a temporary stack-local struct) so the threshold check can run
without mutating/allocating cache state; update call sites to
createCacheState(requestID) only when you will proceed to caching and ensure any
code paths that relied on state are adjusted accordingly.
- Around line 625-641: stampCacheDebugForMiss must clear any hit-only fields on
extraFields.CacheDebug to avoid leaking prior hit data; update the function
(stampCacheDebugForMiss) to explicitly zero or nil out HitType, Threshold,
Similarity, CacheHitLatency, RequestedProvider, and RequestedModel on cd before
setting CacheHit=false and CacheID, leaving the existing embeddings-related
assignments intact so only those hit-specific fields are cleared for a miss.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: af4ddbe1-9b1e-4d53-b76a-4f26cb9584fc

📥 Commits

Reviewing files that changed from the base of the PR and between 59459da and d03a719.

📒 Files selected for processing (55)
  • .claude/skills/docs-writer/SKILL.md
  • .gitignore
  • core/schemas/bifrost.go
  • core/schemas/context.go
  • core/schemas/context_test.go
  • docs/features/semantic-caching.mdx
  • docs/migration-guides/v1.5.0.mdx
  • docs/openapi/openapi.json
  • docs/openapi/openapi.yaml
  • docs/openapi/paths/management/cache.yaml
  • framework/logstore/matviews.go
  • framework/logstore/rdb.go
  • framework/logstore/tables.go
  • framework/modelcatalog/sync.go
  • framework/vectorstore/weaviate.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/semanticcache/main.go
  • plugins/semanticcache/main_test.go
  • plugins/semanticcache/plugin_api_test.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_conversation_config_test.go
  • plugins/semanticcache/plugin_core_test.go
  • plugins/semanticcache/plugin_cross_cache_test.go
  • plugins/semanticcache/plugin_default_cache_key_test.go
  • plugins/semanticcache/plugin_edge_cases_test.go
  • plugins/semanticcache/plugin_embedding_test.go
  • plugins/semanticcache/plugin_image_generation_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_nil_content_test.go
  • plugins/semanticcache/plugin_no_mutation_test.go
  • plugins/semanticcache/plugin_no_store_test.go
  • plugins/semanticcache/plugin_normalization_test.go
  • plugins/semanticcache/plugin_paths_test.go
  • plugins/semanticcache/plugin_responses_test.go
  • plugins/semanticcache/plugin_streaming_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/semanticcache/search.go
  • plugins/semanticcache/state.go
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/utils.go
  • transports/bifrost-http/handlers/cache.go
  • transports/bifrost-http/handlers/cache_test.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/middlewares.go
  • ui/app/workspace/config/views/pluginsForm.tsx
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • ui/components/filters/logsFilterSidebar.tsx
  • ui/lib/constants/logs.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
💤 Files with no reviewable changes (1)
  • plugins/logging/operations.go
✅ Files skipped from review due to trivial changes (17)
  • .gitignore
  • ui/lib/constants/logs.ts
  • .claude/skills/docs-writer/SKILL.md
  • ui/app/workspace/logs/page.tsx
  • core/schemas/context.go
  • docs/openapi/openapi.json
  • transports/bifrost-http/handlers/cache.go
  • ui/lib/types/schemas.ts
  • framework/logstore/tables.go
  • transports/bifrost-http/handlers/logging.go
  • framework/modelcatalog/sync.go
  • docs/features/semantic-caching.mdx
  • plugins/semanticcache/plugin_conversation_config_test.go
  • transports/bifrost-http/handlers/middlewares.go
  • plugins/semanticcache/plugin_normalization_test.go
  • plugins/semanticcache/plugin_cross_cache_test.go
  • plugins/semanticcache/plugin_no_store_test.go
🚧 Files skipped from review as they are similar to previous changes (30)
  • docs/openapi/openapi.yaml
  • core/schemas/context_test.go
  • plugins/semanticcache/main_test.go
  • plugins/semanticcache/state.go
  • plugins/logging/main.go
  • ui/lib/types/logs.ts
  • docs/migration-guides/v1.5.0.mdx
  • plugins/semanticcache/plugin_default_cache_key_test.go
  • ui/lib/types/config.ts
  • framework/vectorstore/weaviate.go
  • docs/openapi/paths/management/cache.yaml
  • framework/logstore/matviews.go
  • plugins/semanticcache/plugin_no_mutation_test.go
  • plugins/semanticcache/plugin_image_generation_test.go
  • ui/components/filters/logsFilterSidebar.tsx
  • framework/logstore/rdb.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • transports/bifrost-http/handlers/cache_test.go
  • plugins/semanticcache/plugin_api_test.go
  • core/schemas/bifrost.go
  • ui/app/workspace/config/views/pluginsForm.tsx
  • plugins/semanticcache/plugin_streaming_test.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/plugin_edge_cases_test.go
  • plugins/semanticcache/stream.go
  • plugins/semanticcache/plugin_embedding_test.go
  • plugins/semanticcache/plugin_paths_test.go
  • plugins/semanticcache/utils.go
  • plugins/semanticcache/search.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants