refactor: semantic cache plugin#3210
Conversation
This stack of pull requests is managed by Graphite.
🧪 Test Suite Available: This PR can be tested by a repository admin.
📝 Walkthrough (Summary by CodeRabbit)
Adds cache-hit-type filtering end-to-end, converts cache-clear APIs to use storage cache IDs, and implements a large refactor of the semantic-cache plugin: deterministic direct IDs, per-request state with cleanup, rewritten streaming accumulation/replay, deterministic hashing/metadata, unified async writes, TTL parsing, and broad test and UI/schema updates.
Changes:
- End-to-end cache-hit filtering & logs
- Semantic cache plugin refactor
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Plugin as SemanticCache Plugin
    participant Embedding as Embedding Executor
    participant Store as Vector/Chunk Store
    Client->>Plugin: Request (PreLLMHook)
    Plugin->>Plugin: resolve cache_key, paramsHash, create cacheState
    alt direct lookup
        Plugin->>Store: GetChunk(directCacheID)
        Store-->>Plugin: chunk / miss
    else semantic lookup
        Plugin->>Embedding: generate embedding
        Embedding-->>Plugin: embedding + inputTokens
        Plugin->>Store: GetNearest(filters: cache_key, params_hash, plugin_marker, maybe provider/model)
        Store-->>Plugin: nearest result / miss
    end
    Plugin-->>Client: short-circuit response if hit
    Client->>Plugin: Upstream provider result (PostLLMHook)
    Plugin->>Plugin: shouldSkipCaching? resolve storageID & embedding-write eligibility
    Plugin->>Store: async Add (stream or non-stream) with unified metadata
    Store-->>Plugin: ack
```
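The direct-lookup path above relies on a deterministic storage ID. As a rough, self-contained illustration of the idea (the helper name, hash inputs, and separator below are assumptions for this sketch, not the plugin's actual implementation), hashing the cache key together with the params hash yields a stable ID that every identical request maps to:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// directCacheID derives a stable storage ID from the request's cache key and
// parameter hash, so a repeat of the same request always resolves to the same
// entry. The NUL separator avoids ambiguous concatenations such as
// ("ab", "c") colliding with ("a", "bc").
func directCacheID(cacheKey, paramsHash string) string {
	sum := sha256.Sum256([]byte(cacheKey + "\x00" + paramsHash))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(directCacheID("tenant-a", "hash-1") == directCacheID("tenant-a", "hash-1")) // true: same inputs, same ID
	fmt.Println(directCacheID("tenant-a", "hash-1") == directCacheID("tenant-a", "hash-2")) // false: different params hash
}
```

Because the ID is a pure function of the inputs, GetChunk can skip the embedding step entirely on the direct path, which is the point of the split shown in the diagram.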
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Confidence Score: 4/5. Safe to merge with two caveats: cacheState for cache-hit requests is not cleared until the 60-minute reaper fires, and the cacheState reaper keys on CreatedAt rather than LastSeenAt, so a streaming request running longer than 60 minutes could have its state evicted mid-flight, silently skipping the final cache write. Neither issue corrupts data or poses a security risk; both result in cache misses at worst. The refactoring is thorough, and the new cacheState struct is a clear improvement over stringly-typed context keys. Relevant files: plugins/semanticcache/state.go (cacheState CreatedAt reaper key), plugins/semanticcache/main.go (cacheState not cleared on the cache-hit path).
Reviews (4) — last reviewed commit: "fix: semantic cache fixes".
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ui/app/workspace/config/views/pluginsForm.tsx (1)
249-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win — Allow disabling an already-enabled plugin even when no embedding providers exist.
Line 249 disables the toggle when embeddingProviders.length === 0, which also blocks turning the plugin OFF if it is currently enabled.
Proposed fix:

```diff
- disabled={!isVectorStoreEnabled || providersLoading || embeddingProviders.length === 0}
+ disabled={!isVectorStoreEnabled || providersLoading || (embeddingProviders.length === 0 && !originalCacheEnabled)}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/app/workspace/config/views/pluginsForm.tsx` around lines 249 - 253, The toggle is currently disabled whenever embeddingProviders.length === 0, which prevents turning an already-enabled plugin off; change the disabled prop to only block enabling (e.g., disabled={providersLoading || (!isVectorStoreEnabled && embeddingProviders.length === 0)}) and add a guard in onCheckedChange to ignore attempts to enable when embeddingProviders.length === 0 (e.g., if (checked && embeddingProviders.length === 0) return;) before calling handleSemanticCacheToggle(checked), leaving isVectorStoreEnabled, providersLoading, embeddingProviders, onCheckedChange, and handleSemanticCacheToggle as the referenced symbols to locate and update.
🧹 Nitpick comments (2)
transports/bifrost-http/handlers/middlewares.go (1)
52-79: ⚡ Quick win — Remove the now-dead skip/goto branch in CorsMiddleware.
After disabling request-completion logging, this block is a no-op: the skip-path check and goto corsFlow no longer change behavior. Please delete the dead control-flow and commented logger block to keep this middleware readable.
♻️ Suggested cleanup:

```diff
-	// startTime := time.Now()
-	// skip logging if it's a /health check request
-	if slices.IndexFunc(loggingSkipPaths, func(path string) bool {
-		return strings.HasPrefix(string(ctx.RequestURI()), path)
-	}) != -1 {
-		goto corsFlow
-	}
-	// defer func() {
-	// ...
-	// }()
-corsFlow:
+	// Request-completion logging intentionally disabled.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@transports/bifrost-http/handlers/middlewares.go` around lines 52 - 79, In CorsMiddleware, remove the dead skip-path check and goto along with the entirely commented-out deferred request-completion logging block: delete the slices.IndexFunc(...) check referencing loggingSkipPaths and the goto corsFlow label, remove the commented defer block that builds the request log (including references to startTime, logger.LogHTTPRequest and schemas.* fields), and remove the remaining corsFlow: label so the middleware has no dead control-flow or commented legacy code.

ui/app/workspace/config/views/pluginsForm.tsx (1)
324-332: ⚡ Quick win — Add a test selector to the new embedding model control.
The new ModelMultiselect is interactive but has no data-testid, which makes E2E targeting inconsistent.
Proposed fix:

```diff
 <ModelMultiselect
   inputId="embedding_model"
+  data-testid="semantic-cache-embedding-model-multiselect"
   isSingleSelect
   provider={cacheConfig.provider || undefined}
   value={cacheConfig.embedding_model ?? ""}
```

As per coding guidelines: "Add data-testid to all new interactive elements in React components for E2E test compatibility" and use data-testid="<entity>-<element>-<qualifier>".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/app/workspace/config/views/pluginsForm.tsx` around lines 324 - 332, The ModelMultiselect instance rendering the embedding model picker lacks a data-testid; add a data-testid prop to the component (ModelMultiselect) using the project's naming convention "<entity>-<element>-<qualifier>" so E2E tests can target it reliably (e.g., use something like data-testid="cacheconfig-embedding_model-input" or similar consistent name), leaving the other props (inputId, provider, value, onChange, placeholder, disabled) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@framework/logstore/rdb.go`:
- Around line 195-217: The code currently skips adding any predicate when
filters.CacheHitTypes is non-empty but none pass the allowlist, causing
unfiltered results; change the branch so that after building valid you check if
len(filters.CacheHitTypes) > 0 && len(valid) == 0 and in that case add a safe
always-false predicate to baseQuery (e.g. baseQuery = baseQuery.Where("1=0") or
dialect-appropriate FALSE) so queries like cache_hit_types=foo return no rows;
update the block around filters.CacheHitTypes, baseQuery, and the Dialector
check (and keep the existing regex/json extraction logic unchanged when valid is
non-empty).
In `@plugins/semanticcache/main.go`:
- Around line 710-724: ClearCacheForRequestID currently deletes using the raw
requestID but new entries use deterministic directCacheID via
resolveStorageIDAndEmbedding, so callers passing
schemas.BifrostContextKeyRequestID miss entries; update
Plugin.ClearCacheForRequestID to accept either a directCacheID or compute the
correct storage ID by calling resolveStorageIDAndEmbedding (or the helper that
produces directCacheID) for the given requestID and then call
plugin.store.Delete(ctx, plugin.config.VectorStoreNamespace, computedStorageID);
ensure logging uses the actual storage key deleted and preserve the
timeout/cancel pattern already present.
- Around line 285-286: WaitForPendingOperations() is blocked by the long-lived
runStreamCleanupLoop because Init increments plugin.waitGroup for that
goroutine; remove the forever-loop from the same wait group used to flush
pending writes. Concretely, stop calling plugin.waitGroup.Add(1) for
runStreamCleanupLoop in Init (or move that Add/Done into a separate
longLivedWaitGroup variable or into Cleanup()), so runStreamCleanupLoop is not
waited on by WaitForPendingOperations(); keep WaitForPendingOperations() waiting
only for short-lived write/flush goroutines so tests can call it to flush
pending cache writes.
In `@plugins/semanticcache/plugin_cache_type_test.go`:
- Around line 617-619: The tests still call the removed 3-argument
performDirectSearch; update the remaining call sites in
TestDefaultDirectSearchSetsStorageIDForDeterministicWrites and
TestPerformDirectSearchDisablesScanFallbackForLegacyLookup (also the calls
around lines 858-860) to match the new performDirectSearch signature in
plugins/semanticcache/search.go by adding the new parameter(s) and adapting to
any changed return values; ensure you pass the appropriate
context/req/plugin/storageID or options values that the new signature requires
and update the assertions to reflect the new outputs.
In `@plugins/semanticcache/stream.go`:
- Around line 62-70: The reaper currently uses the first chunk timestamp
(Chunks[0].Timestamp) to determine staleness, causing active streams to be
removed; update the StreamAccumulator struct to include a lastSeen time field,
set/update that field inside addStreamChunk (and when creating the accumulator),
and change the reaping logic (the code referenced around lines 141-150) to
compare streamAccumulatorMaxAge against accumulator.lastSeen instead of
Chunks[0].Timestamp so last activity, not the first chunk, controls expiry.
In `@ui/app/workspace/config/views/pluginsForm.tsx`:
- Around line 126-134: The effect that runs when embeddingProviders changes
currently forces provider to embeddingProviders[0].name and can overwrite a
user's in-progress selection; change the setCacheConfig call inside the effect
so it only sets the default provider when there is no existing provider (i.e.,
when prev.provider is null/undefined/empty) rather than unconditionally—inside
the setCacheConfig updater for the effect referencing embeddingProviders and
semanticCachePlugin?.config, short-circuit and return prev if prev.provider
already exists, otherwise populate provider with embeddingProviders[0].name and
fill embedding_model/dimension defaults.
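The rdb.go item above (return no rows when every requested cache-hit type fails the allowlist) can be sketched independently of the ORM. The function name, allowlist values, and column name below are illustrative; in the real handler the returned fragment would feed something like baseQuery.Where:

```go
package main

import "fmt"

// allowed mirrors the handler's cache-hit-type allowlist; values are illustrative.
var allowed = map[string]bool{"direct": true, "semantic": true}

// cacheHitPredicate returns the SQL fragment for the requested cache-hit
// types: no filter when none were requested, an IN clause over the valid
// subset, or an always-false predicate when every requested value was
// rejected — so a bogus filter returns no rows instead of all rows.
func cacheHitPredicate(requested []string) (clause string, args []any) {
	if len(requested) == 0 {
		return "", nil // nothing requested: no filtering
	}
	var valid []any
	for _, t := range requested {
		if allowed[t] {
			valid = append(valid, t)
		}
	}
	if len(valid) == 0 {
		return "1=0", nil // requested but nothing valid: match nothing
	}
	return "cache_hit_type IN ?", valid
}

func main() {
	c, _ := cacheHitPredicate([]string{"foo"})
	fmt.Println(c) // 1=0
}
```

The key property is the three-way split: "no filter requested" and "filter requested but entirely invalid" must produce opposite results, which is what the original branch conflated.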
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dacf00c4-f836-45fb-a591-e2b29e8069ed
📒 Files selected for processing (20)
.gitignore, core/schemas/bifrost.go, framework/logstore/matviews.go, framework/logstore/rdb.go, framework/logstore/tables.go, framework/modelcatalog/sync.go, plugins/semanticcache/main.go, plugins/semanticcache/plugin_cache_type_test.go, plugins/semanticcache/search.go, plugins/semanticcache/stream.go, plugins/semanticcache/utils.go, transports/bifrost-http/handlers/logging.go, transports/bifrost-http/handlers/middlewares.go, ui/app/workspace/config/views/pluginsForm.tsx, ui/app/workspace/logs/page.tsx, ui/app/workspace/logs/sheets/logDetailView.tsx, ui/components/filters/logsFilterSidebar.tsx, ui/lib/constants/logs.ts, ui/lib/store/apis/logsApi.ts, ui/lib/types/logs.ts
Force-pushed: 76e9d94 to 2ee69fd (Compare)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/semanticcache/plugin_edge_cases_test.go (1)
350-355: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win — Use a unique cache key per subtest here.
All content-variation cases reuse content-variations-test, and this loop never clears the store between subtests. That lets earlier semantic entries bleed into later cases, so a case may stop validating its own first write path.
Suggested fix:

```diff
- ctx := CreateContextWithCacheKey(t, "content-variations-test")
+ ctx := CreateContextWithCacheKey(t, "content-variations-test-"+tt.name)
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/plugin_edge_cases_test.go` around lines 350 - 355, The subtests reuse the same cache key ("content-variations-test") causing state leakage between cases; update the context creation in the test loop so each subtest gets a unique cache key (e.g., derive from tt.name or t.Name(), or append a UUID/timestamp) when calling CreateContextWithCacheKey so each run uses an isolated store and prevents earlier semantic entries from affecting later cases.
🧹 Nitpick comments (5)
plugins/semanticcache/utils.go (1)
5-16: ⚡ Quick win — Switch the cache serialization path to sonic.
These marshal/unmarshal calls are on the cache read/write path, so they should use the repo-standard hot-path JSON library instead of encoding/json.
Suggested fix:

```diff
- "encoding/json"
+ sonic "github.com/bytedance/sonic"

- responseData, err := json.Marshal(res)
+ responseData, err := sonic.Marshal(res)

- if err := json.Unmarshal([]byte(v), &stringArray); err != nil {
+ if err := sonic.Unmarshal([]byte(v), &stringArray); err != nil {
```

As per coding guidelines — **/*.go: Use github.com/bytedance/sonic for JSON marshaling in hot paths; use encoding/json for custom marshaling in schemas. Also applies to: 433-445, 495-520.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/utils.go` around lines 5 - 16, Replace the hot-path JSON usage of encoding/json in plugins/semanticcache/utils.go with the repo-standard fast library: remove the "encoding/json" import and add "github.com/bytedance/sonic" to the imports, then change all json.Marshal/json.Unmarshal calls in this file (including the cache read/write sites around the ranges mentioned: roughly lines 433-445 and 495-520) to use sonic.Marshal/sonic.Unmarshal instead; leave any custom schema marshaling that intentionally uses encoding/json (schemas package) unchanged.

plugins/semanticcache/plugin_nil_content_test.go (1)
91-92: ⚡ Quick win — Assert the nil-content path stays error-free.
This test currently passes even if extractTextForEmbedding starts returning an error. Since the regression you care about is "nil content should be handled safely," please fail the test when err != nil.
💡 Tighten the assertion:

```diff
 text, err := plugin.extractTextForEmbedding(nil, tt.request)
- t.Logf("text=%q, err=%v", text, err)
+ if err != nil {
+ 	t.Fatalf("extractTextForEmbedding returned error: %v", err)
+ }
+ t.Logf("text=%q", text)
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/plugin_nil_content_test.go` around lines 91 - 92, The test must fail if extractTextForEmbedding returns an error for the nil-content case: after calling plugin.extractTextForEmbedding(nil, tt.request) (symbols: extractTextForEmbedding, plugin, tt.request) assert that err is nil and fail the test when it's not (e.g. use t.Fatalf or require.NoError) before proceeding to inspect text so the nil-content regression is caught.

transports/bifrost-http/handlers/cache.go (1)
55-69: 💤 Low value — Minor validation inconsistency with the clearCache handler.
The clearCacheByKey handler validates the type assertion (!ok) but doesn't check for empty string like clearCache does on line 41. Consider adding an empty string check for consistency, though an empty cache key is unlikely to match anything meaningful.
♻️ Optional: Add empty string validation for consistency:

```diff
 func (h *CacheHandler) clearCacheByKey(ctx *fasthttp.RequestCtx) {
 	cacheKey, ok := ctx.UserValue("cacheKey").(string)
-	if !ok {
+	if !ok || cacheKey == "" {
 		SendError(ctx, fasthttp.StatusBadRequest, "Invalid cache key")
 		return
 	}
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@transports/bifrost-http/handlers/cache.go` around lines 55 - 69, The clearCacheByKey handler currently only type-asserts cacheKey but doesn't reject empty strings; update clearCacheByKey to mirror clearCache by checking cacheKey == "" after the type assertion and call SendError(ctx, fasthttp.StatusBadRequest, "Invalid cache key") if empty, before invoking h.plugin.ClearCacheForKey(cacheKey), leaving SendError and SendJSON calls unchanged.

transports/bifrost-http/handlers/cache_test.go (2)
78-89: ⚡ Quick win — Assert no plugin call when the cacheId user value is missing.
This test checks status code only; add a call-count assertion to prevent regressions where the handler still invokes clearing logic on invalid input.
Proposed test hardening:

```diff
 func TestClearCache_MissingUserValue(t *testing.T) {
 	clearer := &fakeCacheClearer{}
 	h := &CacheHandler{plugin: clearer}
 @@
 	if got := ctx.Response.StatusCode(); got != fasthttp.StatusBadRequest {
 		t.Fatalf("expected 400 when cacheId user value missing, got %d", got)
 	}
+	if len(clearer.idCalls) != 0 {
+		t.Fatalf("expected no Clear calls when cacheId is missing, got %v", clearer.idCalls)
+	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@transports/bifrost-http/handlers/cache_test.go` around lines 78 - 89, Add an assertion that the fake cache clearer was not invoked when the user value "cacheId" is missing: after calling CacheHandler.clearCache (in TestClearCache_MissingUserValue) assert that the fakeCacheClearer call-count field (e.g., callCount / clearedCalls on the fakeCacheClearer) is zero so the test fails if CacheHandler.clearCache still calls clearer.Clear...; if the fake doesn't already expose a counter, add one to fakeCacheClearer and increment it in its Clear method, then assert it remains 0.
112-139: ⚡ Quick win — Add negative-path coverage for clearCacheByKey (empty/missing key).
clearCache already guards invalid path values, but clearCacheByKey tests currently miss equivalent bad-input assertions. Adding them improves contract symmetry and catches accidental plugin calls.
Proposed additional tests:

```diff
+func TestClearCacheByKey_RejectsEmptyKey(t *testing.T) {
+	clearer := &fakeCacheClearer{}
+	h := &CacheHandler{plugin: clearer}
+
+	ctx := newCacheCtx("cacheKey", "")
+	h.clearCacheByKey(ctx)
+
+	if got := ctx.Response.StatusCode(); got != fasthttp.StatusBadRequest {
+		t.Fatalf("expected 400 for empty key, got %d", got)
+	}
+	if len(clearer.keyCalls) != 0 {
+		t.Fatalf("expected no Clear calls on bad key, got %v", clearer.keyCalls)
+	}
+}
+
+func TestClearCacheByKey_MissingUserValue(t *testing.T) {
+	clearer := &fakeCacheClearer{}
+	h := &CacheHandler{plugin: clearer}
+
+	ctx := &fasthttp.RequestCtx{}
+	h.clearCacheByKey(ctx)
+
+	if got := ctx.Response.StatusCode(); got != fasthttp.StatusBadRequest {
+		t.Fatalf("expected 400 when cacheKey user value missing, got %d", got)
+	}
+	if len(clearer.keyCalls) != 0 {
+		t.Fatalf("expected no Clear calls when cacheKey is missing, got %v", clearer.keyCalls)
+	}
+}
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@transports/bifrost-http/handlers/cache_test.go` around lines 112 - 139, Add a new test that covers the negative path for CacheHandler.clearCacheByKey: create a fakeCacheClearer and CacheHandler as in TestClearCacheByKey_OK, call h.clearCacheByKey with a context made via newCacheCtx using an empty string (or missing) key, assert that ctx.Response.StatusCode() is fasthttp.StatusBadRequest (or the handler's validation status) and assert that the fake clearer's keyCalls slice remains empty (i.e., ClearCacheForKey was not invoked); place this alongside TestClearCacheByKey_OK and TestClearCacheByKey_PluginErrorReturns500 to ensure symmetry of input validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/schemas/context.go`:
- Around line 153-158: The Root method currently only returns a single level of
valueDelegate which leaves intermediate pooled scopes in the chain; modify
BifrostContext.Root to traverse/unwarp the delegation chain by following
valueDelegate repeatedly until nil (e.g., loop while bc.valueDelegate != nil
updating bc = bc.valueDelegate) and then return the ultimate root; update the
method on type BifrostContext (Root) so it always resolves to the real request
root even when multiple scoped contexts were derived.
In `@plugins/semanticcache/plugin_api_test.go`:
- Around line 352-358: The test currently does a blocking receive <-sc.Stream
before calling ctx.Cancel(), which can hang if the first chunk never arrives;
change that receive to a guarded receive using a timeout (e.g. select between
sc.Stream and time.After or a context with timeout) so the test fails fast if no
first chunk is produced, then call ctx.Cancel() and proceed to the existing
drain/timeout logic; reference the sc.Stream receive and ctx.Cancel() locations
when making the change.
In `@plugins/semanticcache/plugin_core_test.go`:
- Around line 14-16: The test TestSemanticCacheBasicFunctionality is
timing-sensitive and should not run in parallel; remove the t.Parallel() call
(the parallelization invoked in TestSemanticCacheBasicFunctionality) so the test
runs serially and avoids amplified CPU/network contention that makes the ratio
assertion flaky in CI.
In `@plugins/semanticcache/plugin_embedding_test.go`:
- Around line 84-86: The test is supposed to exercise the "no cache key" path
but calling CreateContextWithCacheKey(t, "") still sets CacheKey (derived from
t.Name()); replace that call so the context truly has no CacheKey — e.g. use
context.Background() (or a newly added helper like CreateContextWithoutCacheKey)
in the test instead of CreateContextWithCacheKey(t, "") so the CacheKey field
remains unset and the uncached path is exercised.
In `@plugins/semanticcache/plugin_responses_test.go`:
- Around line 307-308: Replace the inconsistent context creation in the "no
cache key" test by using newBaseTestContext() instead of
CreateContextWithCacheKey(t, ""); locate the test that currently calls
CreateContextWithCacheKey(t, "") and change it to newBaseTestContext() so the
test uses an explicit base (no-key) context and avoids accidentally supplying a
cache key.
In `@plugins/semanticcache/search.go`:
- Around line 342-365: The loop currently unmarshals each chunk (json.Unmarshal
into cachedResponse) and skips malformed ones, which can truncate the stream and
skip stampCacheDebugForHit; instead, pre-validate the entire streamArray before
replay: iterate over all chunkStr entries, attempt json.Unmarshal into a
temporary struct (and ensure ExtraFields/RequestType are present), and if any
unmarshal/validation fails treat it as a cache miss (log the error and abort
replay) so that plugin.stampCacheDebugForHit
(plugin.stampCacheDebugForHit(state, cachedResponse.GetExtraFields(), result.ID,
...)) and downstream final-chunk handling are guaranteed to run only for
fully-valid cached streams.
- Around line 167-179: The switch on embedding currently handles EmbeddingStr,
EmbeddingArray and Embedding2DArray but omits EmbeddingInt8Array and
EmbeddingInt32Array; add cases for embedding.EmbeddingInt8Array and
embedding.EmbeddingInt32Array that convert those numeric arrays to []float32
using helper functions (e.g. toFloat32EmbeddingFromInt8 and
toFloat32EmbeddingFromInt32) and return the converted slice along with
inputTokens and nil error so those formats no longer fall through to the final
error branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9762ab5e-8d60-4d60-b04f-6edf04c58a4a
📒 Files selected for processing (53)
.claude/skills/docs-writer/SKILL.md, .gitignore, core/schemas/bifrost.go, core/schemas/context.go, docs/features/semantic-caching.mdx, docs/migration-guides/v1.5.0.mdx, docs/openapi/openapi.json, docs/openapi/openapi.yaml, docs/openapi/paths/management/cache.yaml, framework/logstore/matviews.go, framework/logstore/rdb.go, framework/logstore/tables.go, framework/modelcatalog/sync.go, framework/vectorstore/weaviate.go, plugins/logging/main.go, plugins/logging/operations.go, plugins/semanticcache/main.go, plugins/semanticcache/main_test.go, plugins/semanticcache/plugin_api_test.go, plugins/semanticcache/plugin_cache_type_test.go, plugins/semanticcache/plugin_conversation_config_test.go, plugins/semanticcache/plugin_core_test.go, plugins/semanticcache/plugin_cross_cache_test.go, plugins/semanticcache/plugin_default_cache_key_test.go, plugins/semanticcache/plugin_edge_cases_test.go, plugins/semanticcache/plugin_embedding_test.go, plugins/semanticcache/plugin_image_generation_test.go, plugins/semanticcache/plugin_integration_test.go, plugins/semanticcache/plugin_nil_content_test.go, plugins/semanticcache/plugin_no_store_test.go, plugins/semanticcache/plugin_normalization_test.go, plugins/semanticcache/plugin_paths_test.go, plugins/semanticcache/plugin_responses_test.go, plugins/semanticcache/plugin_streaming_test.go, plugins/semanticcache/plugin_vectorstore_test.go, plugins/semanticcache/search.go, plugins/semanticcache/state.go, plugins/semanticcache/stream.go, plugins/semanticcache/test_utils.go, plugins/semanticcache/utils.go, transports/bifrost-http/handlers/cache.go, transports/bifrost-http/handlers/cache_test.go, transports/bifrost-http/handlers/logging.go, transports/bifrost-http/handlers/middlewares.go, ui/app/workspace/config/views/pluginsForm.tsx, ui/app/workspace/logs/page.tsx, ui/app/workspace/logs/sheets/logDetailView.tsx, ui/components/filters/logsFilterSidebar.tsx, ui/lib/constants/logs.ts, ui/lib/store/apis/logsApi.ts, ui/lib/types/config.ts, ui/lib/types/logs.ts, ui/lib/types/schemas.ts
💤 Files with no reviewable changes (1)
- plugins/logging/operations.go
✅ Files skipped from review due to trivial changes (6)
- .gitignore
- ui/lib/types/logs.ts
- transports/bifrost-http/handlers/logging.go
- core/schemas/bifrost.go
- .claude/skills/docs-writer/SKILL.md
- transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (10)
- ui/lib/constants/logs.ts
- ui/app/workspace/logs/page.tsx
- framework/logstore/matviews.go
- ui/components/filters/logsFilterSidebar.tsx
- ui/app/workspace/logs/sheets/logDetailView.tsx
- framework/modelcatalog/sync.go
- ui/lib/store/apis/logsApi.ts
- framework/logstore/tables.go
- plugins/semanticcache/stream.go
- plugins/semanticcache/main.go
```diff
 if err := json.Unmarshal([]byte(chunkStr), &cachedResponse); err != nil {
-	plugin.logger.Warn("%s Failed to unmarshal stream chunk %d, skipping: %v", PluginLoggerPrefix, i, err)
+	plugin.logger.Warn("Failed to unmarshal stream chunk %d, skipping: %v", i, err)
 	continue
 }

 // Ensure RequestType is set on every chunk so downstream consumers
-// (logging, telemetry, etc.) correctly identify this as a streaming response.
+// (logging, telemetry) correctly identify this as a streaming response.
 if ef := cachedResponse.GetExtraFields(); ef != nil && ef.RequestType == "" {
 	ef.RequestType = req.RequestType
 }

 // Add cache debug to only the last chunk
 if i == len(streamArray)-1 {
-	ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
-	extraFields := cachedResponse.GetExtraFields()
-	cacheDebug := schemas.BifrostCacheDebug{
-		CacheHit:          true,
-		HitType:           bifrost.Ptr(string(cacheType)),
-		CacheID:           bifrost.Ptr(result.ID),
-		RequestedProvider: bifrost.Ptr(string(requestedProvider)),
-		RequestedModel:    bifrost.Ptr(requestedModel),
-	}
-	if cacheType == CacheTypeSemantic {
-		cacheDebug.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
-		cacheDebug.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
-		cacheDebug.Threshold = &threshold
-		cacheDebug.Similarity = &similarity
-		cacheDebug.InputTokens = &inputTokens
-	} else {
-		cacheDebug.ProviderUsed = nil
-		cacheDebug.ModelUsed = nil
-		cacheDebug.Threshold = nil
-		cacheDebug.Similarity = nil
-		cacheDebug.InputTokens = nil
-	}
-	extraFields.CacheDebug = &cacheDebug
+	// stampCacheDebugForHit marks this chunk as the cache-hit final
+	// chunk; cache.PostLLMHook keys off CacheDebug.CacheHit=true to
+	// set BifrostContextKeyStreamEndIndicator on the root ctx
+	// synchronously (same goroutine as logging.PostLLMHook).
+	//
+	// We deliberately do NOT call ctx.Root().SetValue here. Doing
+	// so races against the receiver's PostLLMHook for the previous
+	// chunk: the cache replay can advance to iteration N (and
+	// write the indicator) while the receiver is still running
+	// PostLLMHooks for chunk N-1, poisoning that chunk's
+	// IsFinalChunk read and causing duplicate "final" events.
+	plugin.stampCacheDebugForHit(state, cachedResponse.GetExtraFields(), result.ID, requestedProvider, requestedModel, cacheType, threshold, similarity, inputTokens)
```
Fail closed on malformed cached stream chunks.
Skipping a bad chunk here turns a corrupt cache entry into a truncated stream. If the malformed chunk is the final one, stampCacheDebugForHit never runs, so downstream final-chunk handling and log finalization can be missed entirely. Treat any decode failure as a cache miss by validating the full chunk list before starting replay.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@plugins/semanticcache/search.go` around lines 342 - 365, The loop currently
unmarshals each chunk (json.Unmarshal into cachedResponse) and skips malformed
ones, which can truncate the stream and skip stampCacheDebugForHit; instead,
pre-validate the entire streamArray before replay: iterate over all chunkStr
entries, attempt json.Unmarshal into a temporary struct (and ensure
ExtraFields/RequestType are present), and if any unmarshal/validation fails
treat it as a cache miss (log the error and abort replay) so that
plugin.stampCacheDebugForHit (plugin.stampCacheDebugForHit(state,
cachedResponse.GetExtraFields(), result.ID, ...)) and downstream final-chunk
handling are guaranteed to run only for fully-valid cached streams.
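The fail-closed behavior the review asks for can be sketched as a standalone pre-validation pass. Everything below is illustrative: `streamChunk` stands in for the plugin's real `schemas.BifrostResponse`, and `validateCachedStream` is a hypothetical helper, not plugin API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamChunk is a stand-in for the plugin's response type; only the
// field needed for validation is modeled here.
type streamChunk struct {
	RequestType string `json:"request_type"`
}

// validateCachedStream decodes every cached chunk before replay begins.
// Any decode failure aborts up front, so a corrupt cache entry is
// treated as a cache miss rather than a silently truncated stream.
func validateCachedStream(raw []string) ([]streamChunk, error) {
	chunks := make([]streamChunk, 0, len(raw))
	for i, s := range raw {
		var c streamChunk
		if err := json.Unmarshal([]byte(s), &c); err != nil {
			return nil, fmt.Errorf("chunk %d malformed, treating entry as cache miss: %w", i, err)
		}
		chunks = append(chunks, c)
	}
	return chunks, nil
}

func main() {
	valid := []string{`{"request_type":"chat_stream"}`, `{"request_type":"chat_stream"}`}
	if chunks, err := validateCachedStream(valid); err == nil {
		fmt.Printf("replaying %d chunks\n", len(chunks))
	}

	corrupt := []string{`{"request_type":"chat_stream"}`, `{not json`}
	if _, err := validateCachedStream(corrupt); err != nil {
		fmt.Println("falling back to cache miss:", err)
	}
}
```

Because validation happens before any chunk reaches the stream channel, the final-chunk stamping path either runs for a fully valid entry or not at all, which is exactly the fail-closed property the review wants.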
Force-pushed from 2ee69fd to 59459da
Actionable comments posted: 2
♻️ Duplicate comments (1)
plugins/semanticcache/search.go (1)
344-355: ⚠️ Potential issue | 🟠 Major: Still fail closed on malformed cached stream chunks.
Skipping a bad chunk turns a corrupt cache entry into a partial replay. If the malformed chunk is the last one, `stampCacheDebugForHit` never runs, so downstream final-chunk handling/log finalization can be skipped entirely. Abort replay on the first decode/validation failure and treat the entry as a cache miss instead.
Also applies to: 364-377
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/search.go` around lines 344 - 355, The current goroutine that iterates streamArray and unmarshals into schemas.BifrostResponse (inside the anonymous func that closes streamChan) silently skips malformed chunks by logging via plugin.logger.Warn and continuing; change this to fail closed: on the first json.Unmarshal or validation error for a chunk (in the loop over streamArray) immediately log the error, abort the replay by closing streamChan and returning (so no further chunks are sent), and ensure the cache is treated as a miss by invoking the same failure path used elsewhere (so stampCacheDebugForHit is not called); apply the same behavior to the later block handling indices 364-377 to keep consistent fail-closed semantics.
🧹 Nitpick comments (3)
ui/app/workspace/config/views/pluginsForm.tsx (2)
98-98: 💤 Low value: Filename uses camelCase instead of PascalCase.
Per coding guidelines, React component files should use PascalCase for filenames (`PluginsForm.tsx` instead of `pluginsForm.tsx`). Since this is an existing file, this can be addressed in a separate housekeeping refactor.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ui/app/workspace/config/views/pluginsForm.tsx` at line 98, The file name uses camelCase but React component files should be PascalCase; rename the file to PluginsForm.tsx and update all imports/usages that reference the old filename (e.g., any import of PluginsForm from "pluginsForm" or similar) to the new PascalCase module name so the exported component PluginsForm continues to be imported correctly throughout the codebase; ensure build/test pass after the rename.
315-351: 💤 Low value: Consider adding `data-testid` attributes for E2E test compatibility.
The new provider `Select` and `ModelMultiselect` components lack `data-testid` attributes. Per coding guidelines, new interactive elements should include them (e.g., `data-testid="semantic-cache-provider-select"`, `data-testid="semantic-cache-embedding-model-select"`). Based on learnings, this can be deferred to a dedicated PR that also updates related E2E tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ui/app/workspace/config/views/pluginsForm.tsx` around lines 315 - 351, The Select for choosing provider (rendered with the Select/SelectTrigger/SelectContent/SelectItem components) and the ModelMultiselect (props: inputId="embedding_model", isSingleSelect, provider, value, onChange) are missing data-testid attributes required for E2E tests; add data-testid="semantic-cache-provider-select" to the provider Select trigger/component and data-testid="semantic-cache-embedding-model-select" to the ModelMultiselect (or its input wrapper) while preserving existing props and behavior (updateCacheConfigLocal and cacheConfig usage remain unchanged).
plugins/semanticcache/plugin_core_test.go (1)
454-480: ⚡ Quick win: `custom_ttl` still doesn't exercise TTL behavior.
This branch only re-checks the generic direct-hit path, so a regression in `resolveTTL`/`expires_at` handling would still pass. Use a short TTL and assert the entry expires, or inspect the stored `expires_at` metadata directly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/plugin_core_test.go` around lines 454 - 480, The test's custom_ttl branch only re-checks a direct cache hit and doesn't validate TTL expiry; update the test for the "custom_ttl" tt case to actually exercise resolveTTL/expires_at by configuring a short TTL (e.g., 1s) for that plugin, then either: (A) after the first successful cached response (response2 from setup.Client.ChatCompletionRequest), wait slightly longer than the short TTL and issue a new ChatCompletionRequest and assert a cache miss (i.e., AssertCacheMiss or that CacheDebug.CacheHit is false), or (B) directly read the stored cache metadata for that request (inspect response2.ExtraFields.CacheDebug or the underlying cache entry) and assert the expires_at value equals now + configured TTL (and that it is within an acceptable delta). Modify the test helper or fixture that builds the "custom_ttl" config and add the sleep+re-request or metadata assertion to ensure resolveTTL/expires_at behavior is verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/semanticcache/main.go`:
- Around line 354-358: createCacheState is persisting cache state too early
(called before isConversationHistoryThresholdExceeded and before the ParamsHash
empty-path in PostLLMHook), causing stale cacheStates to accumulate for requests
that are never cacheable; change the flow so you do not persist the cacheState
up-front—either create a non-persisted temporary state or delay calling
plugin.createCacheState until after
plugin.isConversationHistoryThresholdExceeded returns false and after you verify
ParamsHash != "" (i.e., only persist the state when the request is actually
cacheable in PostLLMHook and related paths referenced around
createCacheState/isConversationHistoryThresholdExceeded).
- Around line 625-640: In stampCacheDebugForMiss, ensure all hit-only fields are
cleared when stamping a miss: set CacheHit=false and CacheID as already done,
and explicitly nil-out or reset HitType, Threshold, Similarity, CacheHitLatency,
ProviderUsed, ModelUsed and InputTokens on extraFields.CacheDebug so no previous
hit data leaks into miss responses; update the function (stampCacheDebugForMiss)
to clear those fields on cd (the local CacheDebug variable) before returning.
---
Duplicate comments:
In `@plugins/semanticcache/search.go`:
- Around line 344-355: The current goroutine that iterates streamArray and
unmarshals into schemas.BifrostResponse (inside the anonymous func that closes
streamChan) silently skips malformed chunks by logging via plugin.logger.Warn
and continuing; change this to fail closed: on the first json.Unmarshal or
validation error for a chunk (in the loop over streamArray) immediately log the
error, abort the replay by closing streamChan and returning (so no further
chunks are sent), and ensure the cache is treated as a miss by invoking the same
failure path used elsewhere (so stampCacheDebugForHit is not called); apply the
same behavior to the later block handling indices 364-377 to keep consistent
fail-closed semantics.
---
Nitpick comments:
In `@plugins/semanticcache/plugin_core_test.go`:
- Around line 454-480: The test's custom_ttl branch only re-checks a direct
cache hit and doesn't validate TTL expiry; update the test for the "custom_ttl"
tt case to actually exercise resolveTTL/expires_at by configuring a short TTL
(e.g., 1s) for that plugin, then either: (A) after the first successful cached
response (response2 from setup.Client.ChatCompletionRequest), wait slightly
longer than the short TTL and issue a new ChatCompletionRequest and assert a
cache miss (i.e., AssertCacheMiss or that CacheDebug.CacheHit is false), or (B)
directly read the stored cache metadata for that request (inspect
response2.ExtraFields.CacheDebug or the underlying cache entry) and assert the
expires_at value equals now + configured TTL (and that it is within an
acceptable delta). Modify the test helper or fixture that builds the
"custom_ttl" config and add the sleep+re-request or metadata assertion to ensure
resolveTTL/expires_at behavior is verified.
In `@ui/app/workspace/config/views/pluginsForm.tsx`:
- Line 98: The file name uses camelCase but React component files should be
PascalCase; rename the file to PluginsForm.tsx and update all imports/usages
that reference the old filename (e.g., any import of PluginsForm from
"pluginsForm" or similar) to the new PascalCase module name so the exported
component PluginsForm continues to be imported correctly throughout the
codebase; ensure build/test pass after the rename.
- Around line 315-351: The Select for choosing provider (rendered with the
Select/SelectTrigger/SelectContent/SelectItem components) and the
ModelMultiselect (props: inputId="embedding_model", isSingleSelect, provider,
value, onChange) are missing data-testid attributes required for E2E tests; add
data-testid="semantic-cache-provider-select" to the provider Select
trigger/component and data-testid="semantic-cache-embedding-model-select" to the
ModelMultiselect (or its input wrapper) while preserving existing props and
behavior (updateCacheConfigLocal and cacheConfig usage remain unchanged).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 849d029e-ae40-4d26-b642-e346b85f6365
📒 Files selected for processing (55)
.claude/skills/docs-writer/SKILL.md, .gitignore, core/schemas/bifrost.go, core/schemas/context.go, core/schemas/context_test.go, docs/features/semantic-caching.mdx, docs/migration-guides/v1.5.0.mdx, docs/openapi/openapi.json, docs/openapi/openapi.yaml, docs/openapi/paths/management/cache.yaml, framework/logstore/matviews.go, framework/logstore/rdb.go, framework/logstore/tables.go, framework/modelcatalog/sync.go, framework/vectorstore/weaviate.go, plugins/logging/main.go, plugins/logging/operations.go, plugins/semanticcache/main.go, plugins/semanticcache/main_test.go, plugins/semanticcache/plugin_api_test.go, plugins/semanticcache/plugin_cache_type_test.go, plugins/semanticcache/plugin_conversation_config_test.go, plugins/semanticcache/plugin_core_test.go, plugins/semanticcache/plugin_cross_cache_test.go, plugins/semanticcache/plugin_default_cache_key_test.go, plugins/semanticcache/plugin_edge_cases_test.go, plugins/semanticcache/plugin_embedding_test.go, plugins/semanticcache/plugin_image_generation_test.go, plugins/semanticcache/plugin_integration_test.go, plugins/semanticcache/plugin_nil_content_test.go, plugins/semanticcache/plugin_no_mutation_test.go, plugins/semanticcache/plugin_no_store_test.go, plugins/semanticcache/plugin_normalization_test.go, plugins/semanticcache/plugin_paths_test.go, plugins/semanticcache/plugin_responses_test.go, plugins/semanticcache/plugin_streaming_test.go, plugins/semanticcache/plugin_vectorstore_test.go, plugins/semanticcache/search.go, plugins/semanticcache/state.go, plugins/semanticcache/stream.go, plugins/semanticcache/test_utils.go, plugins/semanticcache/utils.go, transports/bifrost-http/handlers/cache.go, transports/bifrost-http/handlers/cache_test.go, transports/bifrost-http/handlers/logging.go, transports/bifrost-http/handlers/middlewares.go, ui/app/workspace/config/views/pluginsForm.tsx, ui/app/workspace/logs/page.tsx, ui/app/workspace/logs/sheets/logDetailView.tsx, ui/components/filters/logsFilterSidebar.tsx, ui/lib/constants/logs.ts, ui/lib/store/apis/logsApi.ts, ui/lib/types/config.ts, ui/lib/types/logs.ts, ui/lib/types/schemas.ts
💤 Files with no reviewable changes (1)
- plugins/logging/operations.go
✅ Files skipped from review due to trivial changes (13)
- .gitignore
- plugins/logging/main.go
- ui/lib/constants/logs.ts
- transports/bifrost-http/handlers/logging.go
- .claude/skills/docs-writer/SKILL.md
- framework/modelcatalog/sync.go
- framework/logstore/rdb.go
- ui/lib/types/logs.ts
- ui/app/workspace/logs/sheets/logDetailView.tsx
- ui/app/workspace/logs/page.tsx
- ui/lib/types/schemas.ts
- plugins/semanticcache/stream.go
- plugins/semanticcache/plugin_no_store_test.go
🚧 Files skipped from review as they are similar to previous changes (25)
- ui/lib/store/apis/logsApi.ts
- docs/openapi/paths/management/cache.yaml
- docs/migration-guides/v1.5.0.mdx
- docs/openapi/openapi.json
- transports/bifrost-http/handlers/middlewares.go
- ui/lib/types/config.ts
- core/schemas/context.go
- plugins/semanticcache/plugin_streaming_test.go
- plugins/semanticcache/state.go
- plugins/semanticcache/plugin_image_generation_test.go
- plugins/semanticcache/plugin_vectorstore_test.go
- plugins/semanticcache/plugin_paths_test.go
- docs/features/semantic-caching.mdx
- ui/components/filters/logsFilterSidebar.tsx
- plugins/semanticcache/plugin_api_test.go
- plugins/semanticcache/plugin_normalization_test.go
- framework/logstore/matviews.go
- docs/openapi/openapi.yaml
- plugins/semanticcache/plugin_embedding_test.go
- plugins/semanticcache/plugin_default_cache_key_test.go
- plugins/semanticcache/utils.go
- plugins/semanticcache/plugin_nil_content_test.go
- transports/bifrost-http/handlers/cache.go
- plugins/semanticcache/test_utils.go
- transports/bifrost-http/handlers/cache_test.go
```go
// Create state up front so a reused/retried request ID never inherits stale fields.
state := plugin.createCacheState(requestID)

if plugin.isConversationHistoryThresholdExceeded(state, req) {
	return req, nil, nil
```
Defer storing cacheState until the request is actually cacheable.
createCacheState() persists state before the threshold/metadata/hash skip paths. Those requests later hit the ParamsHash == "" early return in PostLLMHook, so their state is never cleared on the normal request path and only disappears when the background reaper runs. On high-volume uncached traffic, that becomes avoidable cacheStates growth.
Also applies to: 364-374
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@plugins/semanticcache/main.go` around lines 354 - 358, createCacheState is
persisting cache state too early (called before
isConversationHistoryThresholdExceeded and before the ParamsHash empty-path in
PostLLMHook), causing stale cacheStates to accumulate for requests that are
never cacheable; change the flow so you do not persist the cacheState
up-front—either create a non-persisted temporary state or delay calling
plugin.createCacheState until after
plugin.isConversationHistoryThresholdExceeded returns false and after you verify
ParamsHash != "" (i.e., only persist the state when the request is actually
cacheable in PostLLMHook and related paths referenced around
createCacheState/isConversationHistoryThresholdExceeded).
```go
func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
	if isStream && !isFinalChunk {
		return
	}
	if extraFields.CacheDebug == nil {
		extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
	}
	cd := extraFields.CacheDebug
	cd.CacheHit = false
	cd.CacheID = bifrost.Ptr(storageID)
	if state.EmbeddingsInputTokens > 0 {
		inputTokens := state.EmbeddingsInputTokens
		cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
		cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
		cd.InputTokens = &inputTokens
	}
```
Clear hit-only cache_debug fields when stamping misses.
This path only flips CacheHit=false and sets CacheID. If extraFields.CacheDebug is already populated, HitType, Threshold, Similarity, CacheHitLatency, and the requested/provider/model fields leak through, which makes the new cache_hit_types filtering and UI badges internally inconsistent on misses.
Suggested fix

```diff
 cd := extraFields.CacheDebug
 cd.CacheHit = false
 cd.CacheID = bifrost.Ptr(storageID)
+cd.HitType = nil
+cd.RequestedProvider = nil
+cd.RequestedModel = nil
+cd.CacheHitLatency = nil
+cd.Threshold = nil
+cd.Similarity = nil
+cd.ProviderUsed = nil
+cd.ModelUsed = nil
+cd.InputTokens = nil
 if state.EmbeddingsInputTokens > 0 {
 	inputTokens := state.EmbeddingsInputTokens
 	cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
 	cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
 	cd.InputTokens = &inputTokens
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
	if isStream && !isFinalChunk {
		return
	}
	if extraFields.CacheDebug == nil {
		extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
	}
	cd := extraFields.CacheDebug
	cd.CacheHit = false
	cd.CacheID = bifrost.Ptr(storageID)
	cd.HitType = nil
	cd.RequestedProvider = nil
	cd.RequestedModel = nil
	cd.CacheHitLatency = nil
	cd.Threshold = nil
	cd.Similarity = nil
	cd.ProviderUsed = nil
	cd.ModelUsed = nil
	cd.InputTokens = nil
	if state.EmbeddingsInputTokens > 0 {
		inputTokens := state.EmbeddingsInputTokens
		cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
		cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
		cd.InputTokens = &inputTokens
	}
}
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@plugins/semanticcache/main.go` around lines 625 - 640, In
stampCacheDebugForMiss, ensure all hit-only fields are cleared when stamping a
miss: set CacheHit=false and CacheID as already done, and explicitly nil-out or
reset HitType, Threshold, Similarity, CacheHitLatency, ProviderUsed, ModelUsed
and InputTokens on extraFields.CacheDebug so no previous hit data leaks into
miss responses; update the function (stampCacheDebugForMiss) to clear those
fields on cd (the local CacheDebug variable) before returning.
Force-pushed from 59459da to d03a719
♻️ Duplicate comments (2)
plugins/semanticcache/main.go (2)
354-358: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: Defer `createCacheState` until the request is confirmed cacheable.
`createCacheState(requestID)` is called at Line 355 before `isConversationHistoryThresholdExceeded` (Line 357). When the threshold check returns early, the state is created but never cleared on the normal request path; it only gets reaped by the background cleanup loop. Under high-volume traffic with many uncached requests, this causes avoidable `cacheStates` growth. Consider moving state creation after the threshold check, or lazily creating it only when caching will proceed.
🛠️ Proposed fix sketch
```diff
 func (plugin *Plugin) PreLLMHook(ctx *schemas.BifrostContext, req *schemas.BifrostRequest) (*schemas.BifrostRequest, *schemas.LLMPluginShortCircuit, error) {
 	// ... cache key and request ID checks ...
 	if !isSemanticCacheSupportedRequestType(req.RequestType) {
 		return req, nil, nil
 	}
-	// Create state up front so a reused/retried request ID never inherits stale fields.
-	state := plugin.createCacheState(requestID)
-
-	if plugin.isConversationHistoryThresholdExceeded(state, req) {
+	if plugin.isConversationHistoryThresholdExceeded(req) {
 		return req, nil, nil
 	}
+	// Create state only after confirming the request is cacheable.
+	state := plugin.createCacheState(requestID)
+
 	performDirectSearch, performSemanticSearch := plugin.resolveCacheTypes(ctx)
 	// ... rest of the method ...
```

This requires updating `isConversationHistoryThresholdExceeded` to not depend on `state`, or passing a temporary/stack-local struct for the check.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/main.go` around lines 354 - 358, createCacheState(requestID) is being called before the cacheability check, causing unused entries to accumulate in cacheStates when isConversationHistoryThresholdExceeded(req) returns early; move or defer creation of state by calling createCacheState only after isConversationHistoryThresholdExceeded returns false (i.e., after the request is confirmed cacheable), or change isConversationHistoryThresholdExceeded to accept no state (or a temporary stack-local struct) so the threshold check can run without mutating/allocating cache state; update call sites to createCacheState(requestID) only when you will proceed to caching and ensure any code paths that relied on state are adjusted accordingly.
625-641: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: Clear hit-only fields when stamping cache misses.
`stampCacheDebugForMiss` sets `CacheHit=false` and `CacheID`, but if `extraFields.CacheDebug` was previously populated (e.g., from an earlier hook or partial state), hit-only fields like `HitType`, `Threshold`, `Similarity`, `CacheHitLatency`, `RequestedProvider`, and `RequestedModel` may leak through. This can cause inconsistent `cache_hit_types` filtering and UI badges.
🛠️ Proposed fix
```diff
 func (plugin *Plugin) stampCacheDebugForMiss(state *cacheState, extraFields *schemas.BifrostResponseExtraFields, storageID string, isStream, isFinalChunk bool) {
 	if isStream && !isFinalChunk {
 		return
 	}
 	if extraFields.CacheDebug == nil {
 		extraFields.CacheDebug = &schemas.BifrostCacheDebug{}
 	}
 	cd := extraFields.CacheDebug
 	cd.CacheHit = false
 	cd.CacheID = bifrost.Ptr(storageID)
+	// Clear hit-only fields to prevent stale data from leaking on misses
+	cd.HitType = nil
+	cd.Threshold = nil
+	cd.Similarity = nil
+	cd.CacheHitLatency = nil
+	cd.RequestedProvider = nil
+	cd.RequestedModel = nil
 	if state.EmbeddingsInputTokens > 0 {
 		inputTokens := state.EmbeddingsInputTokens
 		cd.ProviderUsed = bifrost.Ptr(string(plugin.config.Provider))
 		cd.ModelUsed = bifrost.Ptr(plugin.config.EmbeddingModel)
 		cd.InputTokens = &inputTokens
+	} else {
+		cd.ProviderUsed = nil
+		cd.ModelUsed = nil
+		cd.InputTokens = nil
 	}
 }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/semanticcache/main.go` around lines 625 - 641, stampCacheDebugForMiss must clear any hit-only fields on extraFields.CacheDebug to avoid leaking prior hit data; update the function (stampCacheDebugForMiss) to explicitly zero or nil out HitType, Threshold, Similarity, CacheHitLatency, RequestedProvider, and RequestedModel on cd before setting CacheHit=false and CacheID, leaving the existing embeddings-related assignments intact so only those hit-specific fields are cleared for a miss.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: af4ddbe1-9b1e-4d53-b76a-4f26cb9584fc
📒 Files selected for processing (55)
.claude/skills/docs-writer/SKILL.md, .gitignore, core/schemas/bifrost.go, core/schemas/context.go, core/schemas/context_test.go, docs/features/semantic-caching.mdx, docs/migration-guides/v1.5.0.mdx, docs/openapi/openapi.json, docs/openapi/openapi.yaml, docs/openapi/paths/management/cache.yaml, framework/logstore/matviews.go, framework/logstore/rdb.go, framework/logstore/tables.go, framework/modelcatalog/sync.go, framework/vectorstore/weaviate.go, plugins/logging/main.go, plugins/logging/operations.go, plugins/semanticcache/main.go, plugins/semanticcache/main_test.go, plugins/semanticcache/plugin_api_test.go, plugins/semanticcache/plugin_cache_type_test.go, plugins/semanticcache/plugin_conversation_config_test.go, plugins/semanticcache/plugin_core_test.go, plugins/semanticcache/plugin_cross_cache_test.go, plugins/semanticcache/plugin_default_cache_key_test.go, plugins/semanticcache/plugin_edge_cases_test.go, plugins/semanticcache/plugin_embedding_test.go, plugins/semanticcache/plugin_image_generation_test.go, plugins/semanticcache/plugin_integration_test.go, plugins/semanticcache/plugin_nil_content_test.go, plugins/semanticcache/plugin_no_mutation_test.go, plugins/semanticcache/plugin_no_store_test.go, plugins/semanticcache/plugin_normalization_test.go, plugins/semanticcache/plugin_paths_test.go, plugins/semanticcache/plugin_responses_test.go, plugins/semanticcache/plugin_streaming_test.go, plugins/semanticcache/plugin_vectorstore_test.go, plugins/semanticcache/search.go, plugins/semanticcache/state.go, plugins/semanticcache/stream.go, plugins/semanticcache/test_utils.go, plugins/semanticcache/utils.go, transports/bifrost-http/handlers/cache.go, transports/bifrost-http/handlers/cache_test.go, transports/bifrost-http/handlers/logging.go, transports/bifrost-http/handlers/middlewares.go, ui/app/workspace/config/views/pluginsForm.tsx, ui/app/workspace/logs/page.tsx, ui/app/workspace/logs/sheets/logDetailView.tsx, ui/components/filters/logsFilterSidebar.tsx, ui/lib/constants/logs.ts, ui/lib/store/apis/logsApi.ts, ui/lib/types/config.ts, ui/lib/types/logs.ts, ui/lib/types/schemas.ts
💤 Files with no reviewable changes (1)
- plugins/logging/operations.go
✅ Files skipped from review due to trivial changes (17)
- .gitignore
- ui/lib/constants/logs.ts
- .claude/skills/docs-writer/SKILL.md
- ui/app/workspace/logs/page.tsx
- core/schemas/context.go
- docs/openapi/openapi.json
- transports/bifrost-http/handlers/cache.go
- ui/lib/types/schemas.ts
- framework/logstore/tables.go
- transports/bifrost-http/handlers/logging.go
- framework/modelcatalog/sync.go
- docs/features/semantic-caching.mdx
- plugins/semanticcache/plugin_conversation_config_test.go
- transports/bifrost-http/handlers/middlewares.go
- plugins/semanticcache/plugin_normalization_test.go
- plugins/semanticcache/plugin_cross_cache_test.go
- plugins/semanticcache/plugin_no_store_test.go
🚧 Files skipped from review as they are similar to previous changes (30)
- docs/openapi/openapi.yaml
- core/schemas/context_test.go
- plugins/semanticcache/main_test.go
- plugins/semanticcache/state.go
- plugins/logging/main.go
- ui/lib/types/logs.ts
- docs/migration-guides/v1.5.0.mdx
- plugins/semanticcache/plugin_default_cache_key_test.go
- ui/lib/types/config.ts
- framework/vectorstore/weaviate.go
- docs/openapi/paths/management/cache.yaml
- framework/logstore/matviews.go
- plugins/semanticcache/plugin_no_mutation_test.go
- plugins/semanticcache/plugin_image_generation_test.go
- ui/components/filters/logsFilterSidebar.tsx
- framework/logstore/rdb.go
- plugins/semanticcache/plugin_vectorstore_test.go
- ui/app/workspace/logs/sheets/logDetailView.tsx
- transports/bifrost-http/handlers/cache_test.go
- plugins/semanticcache/plugin_api_test.go
- core/schemas/bifrost.go
- ui/app/workspace/config/views/pluginsForm.tsx
- plugins/semanticcache/plugin_streaming_test.go
- plugins/semanticcache/test_utils.go
- plugins/semanticcache/plugin_edge_cases_test.go
- plugins/semanticcache/stream.go
- plugins/semanticcache/plugin_embedding_test.go
- plugins/semanticcache/plugin_paths_test.go
- plugins/semanticcache/utils.go
- plugins/semanticcache/search.go

Summary
This PR refactors the semantic cache plugin to simplify its internal state management, improves cache lookup correctness, and adds a new
`cache_hit_types` filter to the logs API and UI. The direct cache lookup path is now a single deterministic point-fetch by a UUIDv5 `directCacheID` (replacing the previous dual-path of chunk lookup + legacy metadata scan), and several context keys are consolidated. The UI gains a "Local Caching" filter sidebar section and cache hit type badges in the log detail view.
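The deterministic point-fetch works because a UUIDv5 is a pure function of its inputs. A minimal sketch of such an ID derivation, assuming a pipe-joined name and a zeroed namespace (the plugin's actual namespace, field order, and separator may differ):

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"strings"
)

// uuidV5 derives a deterministic RFC 4122 version-5 UUID: the SHA-1 of
// namespace || name, with the version and variant bits forced.
func uuidV5(namespace [16]byte, name string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant

	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// generateDirectCacheID joins the identity fields named in the PR and hashes
// them into a UUIDv5, so the same request always maps to the same chunk ID.
// Field order, separator, and namespace here are illustrative assumptions.
func generateDirectCacheID(provider, model, cacheKey, requestHash, paramsHash string) string {
	var ns [16]byte // fixed namespace; the real plugin would pin its own
	name := strings.Join([]string{provider, model, cacheKey, requestHash, paramsHash}, "|")
	return uuidV5(ns, name)
}

func main() {
	a := generateDirectCacheID("openai", "gpt-4o", "team-a", "rh1", "ph1")
	b := generateDirectCacheID("openai", "gpt-4o", "team-a", "rh1", "ph1")
	fmt.Println(a == b) // true: an O(1) GetChunk point-fetch needs no vector search
}
```

Because the ID is recomputed identically on the read path, a direct hit reduces to a single `GetChunk` by key rather than a metadata scan.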
Semantic cache plugin refactor:
- Replaces the dual direct-lookup path (`performDirectChunkLookup` + `performLegacyDirectSearch`) with a single `performDirectSearch` that does an O(1) `GetChunk` by deterministic `directCacheID` (UUIDv5 derived from provider, model, cacheKey, requestHash, paramsHash). `generateDirectCacheID` now returns an error instead of silently falling back to a string concatenation, making failures explicit.
- `request_hash` is no longer stored as a top-level metadata field; it is encoded into the `directCacheID` instead.
- Context keys are consolidated (`directCacheIDKey`, `paramsHashKey`, `embeddingsKey`, `embeddingsInputTokensKey`), removing stale keys like `requestIDKey`, `requestHashKey`, `isCacheHitKey`, and `cacheHitTypeKey`.
- `shouldSkipCaching` is extracted into its own method; cache-hit detection now reads `CacheDebug.CacheHit` from the response rather than a context flag.
- `buildUnifiedMetadata` no longer accepts `requestHash` as a parameter. `addSingleResponse` is renamed to `addNonStreamingResponse`.
- The `StreamAccumulator` fields `HasError` and `FinalTimestamp`, and `FinishReason` on `StreamChunk`, are removed; error streams are handled by early return in `PostLLMHook`.
- Stream replay now selects on `ctx.Done()` to prevent goroutine leaks on dropped consumers.
- A `runStreamCleanupLoop` goroutine (started by `Init`, stopped by `Cleanup` via `stopCh`) replaces the one-shot cleanup call, periodically reaping stale stream accumulators.
- `buildResponseFromResult` now accepts `threshold`, `similarity`, and `inputTokens` as pointers, and `attachCacheDebug` is extracted as a shared helper for both streaming and non-streaming paths.
- `isExpiredEntry` is extracted as a standalone function.
- `chunkSortKey` replaces the large inline sort comparator in `processAccumulatedStream`.
- New `hashSortedSet`/`sortedStringSet` helpers prevent MCP's randomized map iteration from perturbing the request hash.
- `extractAttachmentsForCaching` is extracted so attachment URLs are included in the cache key metadata rather than the embedding text.
- `extractTextForEmbedding` no longer returns a `paramsHash`; callers compute it once via `buildRequestMetadataForCaching` + `hashMap`.
- `generateEmbedding` moved from `utils.go` to `search.go`.
- `generateRequestHash` now accepts prebuilt metadata to avoid recomputing it. `removeField` no longer mutates the input slice's backing array.
- Adds `PronunciationDictionaryLocators`, `TimestampGranularities`, `Include`, `AdditionalFormats`, and `InputImages` to their respective parameter metadata extractors.
- Context key names change from `semantic_cache_*` to `semantic_cache-*` (underscore → hyphen separator after the plugin prefix).
- `SelectFields` no longer includes `request_hash`; `VectorStoreProperties` no longer includes a `request_hash` entry.
- `CacheByModel` and `CacheByProvider` default-value log messages are added.
Log filtering — `cache_hit_types`:
- Adds `CacheHitTypes []string` to `SearchFilters` in `framework/logstore/tables.go`.
- `applyFilters` in `rdb.go` applies a JSON path filter on `cache_debug` for both SQLite (`json_extract`) and PostgreSQL (`substring` regex) dialects, restricted to the allowlist `["direct", "semantic"]`.
- `canUseMatViewFilters` excludes queries with `CacheHitTypes` set from the materialized-view fast path.
- HTTP handlers (`getLogs`, `getLogsStats`, `parseHistogramFilters`) parse a `cache_hit_types` comma-separated query parameter.
UI:
- `LogsFilterSidebar` gains a "Local Caching" section with checkboxes for "Direct cache" and "Semantic cache".
- `cache_hit_types` is added to URL state, filter state, and the `buildFilterParams` API helper.
- The log detail view shows a cache hit type badge derived from `cache_debug.hit_type`.
- The semantic cache plugin form checks embedding support (`EmbeddingSupportedProviders` for built-ins; `custom_provider_config.allowed_requests.embedding` for custom providers), shows an error message when no embedding provider is configured, and disables the toggle accordingly.
- Model selection uses `ModelMultiselect` (single-select mode) scoped to the selected provider.
- An `EmbeddingSupportedProviders` constant is added to `ui/lib/constants/logs.ts`.
Misc:
- `CorsMiddleware` and an auth debug log are commented out.
- `transports/bifrost-http/v1.5.x` is added to `.gitignore`.
- Minor changes in `core/schemas/bifrost.go` and `framework/modelcatalog/sync.go`; an addition in `sync.go`.

Type of change
Affected areas
How to test
- Verify `hit_type` in `cache_debug` on cache-hit responses.
- Query `/logs?cache_hit_types=direct` and `/logs?cache_hit_types=semantic` and confirm only matching entries are returned.

Breaking changes
The public semantic cache context key names have changed from `semantic_cache_*` to `semantic_cache-*`. Any caller setting `CacheKey`, `CacheTTLKey`, `CacheThresholdKey`, `CacheTypeKey`, or `CacheNoStoreKey` via the old string values will no longer be recognized by the plugin; update all call sites to use the exported constants from the plugin package rather than raw string literals.

`request_hash` is no longer stored as a top-level metadata field in the vector store. Existing cache entries written by prior versions will not be found by the new direct-search path (they will be treated as misses and re-populated).

`ClearCacheForRequestID` is documented as currently broken for entries written by the new direct-search path; callers should not rely on it until the TODO is resolved.

Related issues
N/A
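The allowlist-before-SQL behavior noted under security considerations can be illustrated for the SQLite dialect; in this sketch the column name, JSON path, and function name are assumptions, and values are bound as parameters so user input never reaches the path expression:

```go
package main

import (
	"fmt"
	"strings"
)

// buildCacheHitTypeClause turns requested hit types into a parameterized
// json_extract IN-clause. Disallowed values are dropped before any SQL text
// is assembled, and the surviving values travel only as bind arguments.
func buildCacheHitTypeClause(types []string) (string, []any) {
	allowed := map[string]bool{"direct": true, "semantic": true}
	var marks []string
	var args []any
	for _, t := range types {
		if allowed[t] {
			marks = append(marks, "?")
			args = append(args, t)
		}
	}
	if len(marks) == 0 {
		return "", nil // nothing valid requested: no filter at all
	}
	return fmt.Sprintf("json_extract(cache_debug, '$.hit_type') IN (%s)",
		strings.Join(marks, ", ")), args
}

func main() {
	clause, args := buildCacheHitTypeClause([]string{"direct", "'; DROP TABLE logs;--"})
	fmt.Println(clause, args)
}
```

The injection attempt above never reaches the generated SQL: it fails the allowlist and is discarded before the clause is built.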
Security considerations
The `CacheHitTypes` filter allowlists values to `"direct"` and `"semantic"` before interpolating them into SQL, preventing arbitrary input from reaching the JSON path expression.

Checklist
- Read `docs/contributing/README.md` and followed the guidelines