
feat: multimodal embeddings #2559

Open
TejasGhatte wants to merge 1 commit into main from 04-03-feat_multimodal_embeddings

Conversation


@TejasGhatte TejasGhatte commented Apr 8, 2026

Summary

This PR introduces comprehensive multimodal embedding support across the Bifrost platform, enabling text, image, audio, video, and file inputs for embedding generation. The implementation standardizes embedding input handling through a new Contents structure while maintaining backward compatibility.

Changes

  • New embedding input structure: Replaced the legacy Text/Texts/Embedding/Embeddings fields with a unified Contents array supporting multiple modalities
  • Multimodal content support: Added EmbeddingContentPart types for text, image, audio, video, file, and token inputs
  • Provider implementations: Updated all embedding providers (OpenAI, Cohere, Gemini, Vertex, Bedrock, HuggingFace) to handle multimodal content
  • Validation framework: Added comprehensive validation for embedding inputs and content parts
  • Test infrastructure: Added multimodal embedding test scenarios and provider-specific test suites
  • Backward compatibility: Maintained support for existing embedding request formats through conversion layers

Key design decisions:

  • Each EmbeddingContent entry maps to one output embedding, allowing batch processing
  • Parts within a single content entry are aggregated into one embedding by the provider
  • Provider-specific limitations are handled gracefully (e.g., Vertex batch restrictions, HuggingFace text-only)
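The batch semantics described above can be sketched with illustrative stand-in types; these names are assumptions based on this PR's summary, and the real schema in core/schemas/embedding.go may differ in detail:

```go
package main

import "fmt"

// Minimal mirror of the schema this PR describes; type and field names
// are assumptions based on the PR summary, not the exact Bifrost types.
type EmbeddingMediaPart struct {
	URL  *string
	Data *string // e.g. a base64 data URI
}

type EmbeddingContentPart struct {
	Type  string // "text", "image", "audio", "video", "file", "token"
	Text  *string
	Image *EmbeddingMediaPart
}

// One EmbeddingContent (a group of parts) maps to one output embedding;
// parts within the group are aggregated by the provider.
type EmbeddingContent []EmbeddingContentPart

type EmbeddingInput struct {
	Contents []EmbeddingContent
}

func ptr[T any](v T) *T { return &v }

// buildSample returns a batch with one text-only content and one
// text+image content, which should yield two output embeddings.
func buildSample() EmbeddingInput {
	return EmbeddingInput{Contents: []EmbeddingContent{
		{{Type: "text", Text: ptr("hello world")}},
		{
			{Type: "text", Text: ptr("a cat photo")},
			{Type: "image", Image: &EmbeddingMediaPart{URL: ptr("https://example.com/cat.png")}},
		},
	}}
}

func main() {
	fmt.Println(len(buildSample().Contents)) // one embedding per content entry
}
```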

Type of change

  • Feature
  • Bug fix
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Test multimodal embedding functionality across providers:

# Core embedding tests
go test ./core/internal/llmtests -run TestMultimodalEmbedding -v

# Provider-specific tests
go test ./core/providers/cohere -run TestToCohereEmbeddingRequest -v
go test ./core/providers/gemini -run TestToGeminiEmbeddingRequest -v
go test ./core/providers/vertex -run TestToVertexGeminiEmbeddingRequest -v

# Integration tests
cd tests/integrations/python
python -m pytest tests/test_cohere.py::TestCohereIntegration::test_09_image_embedding -v
python -m pytest tests/test_google.py::TestGoogleIntegration::test_45_multimodal_embedding_text_and_image -v

# Full test suite
go test ./...

Environment variables for multimodal testing:

  • VERTEX_API_KEY, VERTEX_PROJECT_ID, VERTEX_REGION for Vertex multimodal models
  • COHERE_API_KEY for Cohere multimodal embeddings
  • GEMINI_API_KEY for Gemini embedding models

Screenshots/Recordings

N/A - Backend API changes only

Breaking changes

  • Yes
  • No

The new Contents structure is additive. Existing embedding requests using text, texts, embedding, or embeddings fields continue to work through automatic conversion.
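The conversion layer for legacy fields could be sketched roughly as below, using stand-in types (the legacy embedding/embeddings vector fields and the actual Bifrost helper names are omitted or assumed):

```go
package main

import "fmt"

// Illustrative sketch of the legacy->Contents normalization this PR
// describes; the real Bifrost field and type names may differ.
type EmbeddingContentPart struct {
	Type string
	Text *string
}
type EmbeddingContent []EmbeddingContentPart
type EmbeddingInput struct {
	Text     *string            // legacy single text
	Texts    []string           // legacy batch of texts
	Contents []EmbeddingContent // new unified field
}

// normalizeLegacy populates Contents from legacy fields when Contents
// is empty, so existing callers keep working unchanged.
func normalizeLegacy(in *EmbeddingInput) {
	if len(in.Contents) > 0 {
		return // already in the new format
	}
	switch {
	case in.Text != nil:
		in.Contents = []EmbeddingContent{{{Type: "text", Text: in.Text}}}
	case len(in.Texts) > 0:
		for i := range in.Texts {
			in.Contents = append(in.Contents,
				EmbeddingContent{{Type: "text", Text: &in.Texts[i]}})
		}
	}
}

func main() {
	in := EmbeddingInput{Texts: []string{"a", "b"}}
	normalizeLegacy(&in)
	fmt.Println(len(in.Contents)) // each legacy text becomes its own content
}
```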

Related issues

Addresses multimodal embedding support requirements and standardizes embedding input handling across providers.

Security considerations

  • Image/media content validation ensures proper data URI and URL handling
  • Base64 decoding includes safety checks for malformed data
  • Provider-specific media type restrictions prevent unsupported content submission
  • No new authentication or PII handling introduced

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable


coderabbitai Bot commented Apr 8, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Multimodal embeddings (text, image, audio, video) across providers; new embedding params: task_type, title, auto_truncate, truncate, max_tokens
    • Logs UI: view/copy structured embedding inputs; embedding inputs persisted and redacted for large media
  • Bug Fixes

    • Stricter embedding input validation with clearer rejection errors
    • Broader, more robust handling of diverse embedding encodings and dimension reporting
  • Tests

    • Expanded multimodal embedding unit and integration coverage across providers

Walkthrough

Refactors embeddings from one-of input fields to a typed Contents-based multimodal schema, adds EmbeddingInput validation, updates provider converters/routing to support multimodal inputs and typed embedding outputs, and propagates changes through tests, logging, tracing, caching, UI, and DB migrations.

Changes

Multimodal Embeddings (single cohesive DAG)

Layer / File(s) Summary
Data Shape
core/schemas/embedding.go, core/schemas/trace.go
Replace one-of EmbeddingInput with Contents []EmbeddingContent; add typed parts/media, validation helpers, new EmbeddingParameters fields, and EmbeddingsByType; change trace attribute key string.
Validation & Core API
core/bifrost.go, transports/bifrost-http/handlers/inference.go
Require req.Input != nil && len(req.Input.Contents) > 0, call req.Input.Validate() early and return structured BifrostError on failure; update recognized embedding param keys.
Provider Conversion Impl
core/providers/*/embedding.go, core/providers/*/types.go
Rework provider converters to consume/build Contents, add conversion helpers (OpenAI, Gemini, Cohere, Vertex, Bedrock, HuggingFace, etc.), change several converter signatures to return errors, and map responses into EmbeddingsByType.
Provider Routing & Wiring
core/providers/vertex/vertex.go, core/providers/gemini/gemini.go, core/providers/azure/azure.go, core/providers/openai/openai.go
Add model-type routing (Vertex Gemini vs native multimodal), choose embedding endpoints accordingly, simplify request-builder closures, and adjust auth/path assembly.
Response Parsing & Utilities
core/internal/llmtests/response_validation.go, core/internal/llmtests/utils.go, core/providers/bedrock/invoke.go
Unify embedding count/dimension checks across typed fields; implement extraction of vectors from Float/Base64/Int8/Uint8 (including base64→float32 decoding); update Bedrock/Titan handling to use Float fields.
Tests & Test Infra
core/internal/llmtests/*, core/providers/*/*_test.go, tests/integrations/python/*
Add multimodal embedding test runner and scenario flag/model config; update tests to use Contents; add unit/integration tests for Cohere/Gemini/Vertex multimodal conversions and Python integration tests.
Logging, Tracing, Caching & Storage
framework/tracing/llmspan.go, plugins/logging/*, plugins/semanticcache/*, framework/logstore/*, ui/*
Populate span input text from Contents, persist structured EmbeddingInput in logs with migration/backfill, redact large inline media, update semanticcache decoding to typed embeddings/sonic, and add UI view for structured embedding input.
Miscellaneous
core/schemas/serialization_test.go, plugins/semanticcache/go.mod, plugins/semanticcache/plugin_core_test.go
Adjust serialization tests to EmbeddingsByType, promote sonic to direct dependency, and remove obsolete flattening test.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Transport as HTTP Transport
    participant Bifrost as Bifrost
    participant Schema as Schema Validator
    participant ProviderConv as Provider Converter
    participant ProviderAPI as Provider API

    Client->>Transport: POST /v2/embed (EmbeddingInput.Contents)
    Transport->>Bifrost: Forward EmbeddingRequest
    Bifrost->>Schema: req.Input.Validate()
    alt validation fails
        Schema-->>Bifrost: Validation error
        Bifrost-->>Transport: Error (BifrostError)
        Transport-->>Client: 4xx/Error
    else validation passes
        Schema-->>Bifrost: Valid
        Bifrost->>ProviderConv: Convert to provider format (may error)
        alt conversion fails
            ProviderConv-->>Bifrost: Conversion error
            Bifrost-->>Transport: Error response
            Transport-->>Client: 4xx/Error
        else conversion succeeds
            ProviderConv->>ProviderAPI: Send provider request (text/image/video...)
            ProviderAPI-->>ProviderConv: Provider response (typed embeddings)
            ProviderConv->>Bifrost: Convert → BifrostEmbeddingResponse
            Bifrost-->>Transport: 200 OK + embeddings
            Transport-->>Client: 200 OK
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested reviewers

  • akshaydeo
  • danpiths
  • SamstyleGhost

Poem

"🐰 Hop, I stitched contents part by part,
Text and image now play their part.
From schema roots to provider streams,
Multimodal dreams in tidy beams.
Nibble a carrot, ship the charts! 🥕"



CLAassistant commented Apr 8, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


tejas ghatte does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Collaborator Author

TejasGhatte commented Apr 8, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@TejasGhatte TejasGhatte marked this pull request as ready for review April 8, 2026 04:29

greptile-apps Bot commented Apr 8, 2026

Confidence Score: 3/5

Not safe to merge without addressing the historical embedding output deserialization breakage, the nil-interface panic in the Vertex text path, and the incorrect Cohere response-type classification.

Three distinct bugs affect correctness on shipped paths: EmbeddingsByType has no custom JSON unmarshaler so all historical embedding_output rows silently deserialise to nil; non-text content passed to a Vertex text-embedding model causes a nil-interface panic rather than a clean error; and ToCohereEmbeddingResponse misclassifies multi-type embedding responses as embeddings_floats.

core/schemas/embedding.go (EmbeddingsByType serialisation contract), core/providers/vertex/embedding.go (nil return from ToVertexEmbeddingRequest), core/providers/cohere/embedding.go (responseType logic in ToCohereEmbeddingResponse), framework/logstore/migrations.go (no migration for embedding_output format change).

Important Files Changed

Filename Overview
core/schemas/embedding.go Replaces EmbeddingInput/EmbeddingStruct with new Contents-based EmbeddingInput and EmbeddingsByType; EmbeddingsByType has no custom JSON serializer, breaking deserialization of historical EmbeddingOutput log rows.
core/providers/cohere/embedding.go Full multimodal rewrite; ToCohereEmbeddingResponse has incorrect responseType classification when float + quantised embedding types co-exist.
core/providers/vertex/embedding.go Adds Gemini and native multimodal paths; ToVertexEmbeddingRequest silently returns nil on non-text parts, causing a nil-interface panic in the vertex.go call site.
core/providers/gemini/embedding.go Comprehensive multimodal rewrite; handles single and batch embed-content responses correctly via geminiResponseEmbeddings helper.
core/providers/bedrock/embedding.go Titan path updated to Contents-based input; Cohere path now delegates to the shared Cohere converter, removing duplicate mapping logic.
core/providers/openai/embedding.go Adds ToOpenAIEmbeddingRequest/Response converters; now returns errors for unsupported content types; OpenaiEmbeddingResponse added with proper marshal/unmarshal.
framework/logstore/migrations.go Adds migration for embedding_input column with backfill from old input_history format; no corresponding migration for embedding_output format change.
plugins/semanticcache/utils.go Updated to use Contents-based EmbeddingInput; adds proper base64 IEEE-754 float32 decoding for the semantic cache embedding path.
plugins/logging/utils.go Replaces chat-message-based embedding logging with new extractEmbeddingInput; adds redactEmbeddingMediaData for large-payload threshold handling.
transports/bifrost-http/integrations/openai.go Embedding response converter now returns OpenAI-format response via ToOpenAIEmbeddingResponse instead of the raw BifrostEmbeddingResponse.
transports/bifrost-http/integrations/cohere.go Embedding response now converted via ToCohereEmbeddingResponse before returning to clients.

Comments Outside Diff (1)

  1. core/providers/vertex/embedding.go, line 936-944 (link)

    P1 Nil *VertexEmbeddingRequest returned as interface causes panic

    When ToVertexEmbeddingRequest encounters a non-text part it executes return nil. In the vertex.go call site the return type is (providerUtils.RequestBodyWithExtraParams, error), so the nil *VertexEmbeddingRequest pointer is wrapped in a non-nil interface value (type present, value nil). CheckContextAndGetRequestBody then calls GetExtraParams() on it, dereferencing the nil receiver and panicking. The function should return a proper error instead of a silent nil, matching the pattern used by ToVertexMultimodalEmbeddingRequest and ToVertexGeminiEmbeddingRequest.
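The typed-nil-in-interface trap described here is general Go behavior and easy to reproduce in isolation; the types below are stand-ins, not the actual Bifrost interfaces:

```go
package main

import "fmt"

// Stand-in for providerUtils.RequestBodyWithExtraParams.
type RequestBody interface {
	GetExtraParams() map[string]any
}

type vertexReq struct{ extra map[string]any }

// Pointer receiver: calling this on a nil *vertexReq panics when it
// reads v.extra.
func (v *vertexReq) GetExtraParams() map[string]any { return v.extra }

// build mimics a converter that returns a typed nil pointer on failure.
func build(ok bool) RequestBody {
	var r *vertexReq // nil pointer
	if ok {
		r = &vertexReq{extra: map[string]any{}}
	}
	// Even when r is nil, the returned interface is NON-nil: it holds
	// (type=*vertexReq, value=nil), so a caller's `!= nil` check passes.
	return r
}

func main() {
	body := build(false)
	fmt.Println(body == nil) // → false
	// body.GetExtraParams() would now dereference the nil receiver and panic,
	// which is why the converter should return an explicit error instead.
}
```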


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

911-940: ⚠️ Potential issue | 🟠 Major

Preserve legacy embedding inputs before enforcing Contents.

This now rejects direct callers that still populate legacy embedding fields (Input.Text, Input.Texts, Input.Embedding, Input.Embeddings), because the empty-Contents guard runs before any compatibility normalization and Validate() immediately assumes Contents is already populated. That breaks the stack’s stated backward-compatibility contract for embedding requests. Please normalize legacy fields into Contents before this check, or keep the legacy-field presence check here until that conversion has run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 911 - 940, The empty-Contents guard in the
embedding request path rejects legacy callers because legacy fields
(req.Input.Text, req.Input.Texts, req.Input.Embedding, req.Input.Embeddings) are
not normalized to req.Input.Contents before the check; move or call the
compatibility normalization that populates req.Input.Contents for legacy fields
before the if that checks (req.Input == nil || len(req.Input.Contents) == 0) and
isLargePayloadPassthrough(ctx), or alternatively change that guard to also
accept legacy-field presence and run the normalization immediately prior to
calling req.Input.Validate(); ensure Validate() continues to run against the
normalized Contents so legacy callers remain supported.
🧹 Nitpick comments (9)
core/schemas/embedding_multimodal_test.go (1)

9-27: Add one happy-path validation case here.

These tests only pin failure paths. If Validate() regressed to reject every non-empty Contents value, this file would still stay green. A single valid text case and a single valid media case would lock down the acceptance contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/embedding_multimodal_test.go` around lines 9 - 27, Add a
happy-path validation test for both a text embedding and a media embedding:
create a TestEmbeddingInputValidateAcceptsNonEmpty that constructs an
EmbeddingInput with a non-empty Contents containing a single
EmbeddingContentPart of Type EmbeddingContentPartTypeText with Text set and
assert Validate() returns no error; and create a
TestEmbeddingContentPartValidateAcceptsSingleMedia that constructs an
EmbeddingContentPart of Type EmbeddingContentPartTypeImage with Image set to an
EmbeddingMediaPart (use Ptr("https://example.com/img.png") or the existing
helper) and assert part.Validate() returns nil; reference
EmbeddingInput.Validate, EmbeddingContentPart.Validate,
EmbeddingContentPartTypeImage/Text, and EmbeddingMediaPart to locate code.
plugins/semanticcache/test_utils.go (1)

557-560: Use bifrost.Ptr() here instead of taking the address of a loop copy.

This keeps the helper aligned with the repo convention and removes the extra temp variable.

♻️ Suggested cleanup
-			for i, text := range texts {
-				t := text
-				contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: &t}}
-			}
+			for i, text := range texts {
+				contents[i] = schemas.EmbeddingContent{{
+					Type: schemas.EmbeddingContentPartTypeText,
+					Text: bifrost.Ptr(text),
+				}}
+			}

Based on learnings, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically, and apply this consistently in test utilities.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/semanticcache/test_utils.go` around lines 557 - 560, The loop in the
test helper creates a loop-local copy (t := text) and takes its address for
EmbeddingContent.Text; replace that pattern with bifrost.Ptr(text) (i.e., set
Text: bifrost.Ptr(text)) to follow the repo convention and remove the temporary
variable; update the assignment in the loop that populates contents (and ensure
the bifrost package is imported if not already) so schemas.EmbeddingContent uses
bifrost.Ptr instead of &t.
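The Ptr-style helper this convention refers to is typically a one-line generic; this is the common pattern, not necessarily Bifrost's exact definition:

```go
package main

import "fmt"

// Ptr returns a pointer to a copy of v. Because the argument is copied,
// the returned pointer never aliases a loop variable, which is why it is
// preferred over `t := text; &t` in range loops.
func Ptr[T any](v T) *T { return &v }

func main() {
	texts := []string{"a", "b"}
	ptrs := make([]*string, len(texts))
	for i, t := range texts {
		ptrs[i] = Ptr(t) // safe even with pre-Go-1.22 loop-variable semantics
	}
	fmt.Println(*ptrs[0], *ptrs[1]) // → a b
}
```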
core/internal/llmtests/embedding.go (1)

61-65: Use bifrost.Ptr(...) for pointer creation in test payloads.

This works, but for repo consistency prefer bifrost.Ptr(text) over taking the address of a local variable.

Suggested diff
 		contents := make([]schemas.EmbeddingContent, len(testTexts))
 		for i, text := range testTexts {
-			t := text
-			contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: &t}}
+			contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: bifrost.Ptr(text)}}
 		}

Based on learnings: In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&), including test utilities.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/embedding.go` around lines 61 - 65, Replace
local-address pointer creation in the test payload by using bifrost.Ptr to match
repo conventions: in the loop that builds contents (variable contents of type
[]schemas.EmbeddingContent) switch from creating a local t and taking &t to
using bifrost.Ptr(text) for the Text field (the place where
schemas.EmbeddingContent and schemas.EmbeddingContentPartTypeText are set).
Ensure bifrost is imported if not already.
core/providers/cohere/embedding_multimodal_test.go (1)

11-16: Use the pointer helper consistently in these fixtures.

This file already uses schemas.Ptr(...) for media fields, so using the same helper for Text avoids mixing two pointer-construction styles in the same test.

♻️ Suggested cleanup
-	text := "hello"
 	req := ToCohereEmbeddingRequest(&schemas.BifrostEmbeddingRequest{
 		Model: "embed-v4.0",
 		Input: &schemas.EmbeddingInput{
-			Contents: []schemas.EmbeddingContent{{{Type: schemas.EmbeddingContentPartTypeText, Text: &text}}},
+			Contents: []schemas.EmbeddingContent{{{Type: schemas.EmbeddingContentPartTypeText, Text: schemas.Ptr("hello")}}},
 		},
 	})
-	caption := "caption"
 	req := ToCohereEmbeddingRequest(&schemas.BifrostEmbeddingRequest{
 		Model: "embed-v4.0",
 		Input: &schemas.EmbeddingInput{
 			Contents: []schemas.EmbeddingContent{{
-				{Type: schemas.EmbeddingContentPartTypeText, Text: &caption},
+				{Type: schemas.EmbeddingContentPartTypeText, Text: schemas.Ptr("caption")},
 				{Type: schemas.EmbeddingContentPartTypeImage, Image: &schemas.EmbeddingMediaPart{URL: schemas.Ptr("https://example.com/cat.png")}},
 			}},
 		},
 	})

Based on learnings, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically, including test utilities, using an equivalent helper when that is what is already in scope.

Also applies to: 25-30

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/cohere/embedding_multimodal_test.go` around lines 11 - 16,
Replace direct address-of pointer usage for the EmbeddingInput.Text field with
the existing pointer helper used elsewhere in the test: instead of &text use
schemas.Ptr(text) (or bifrost.Ptr if that helper is in scope) when constructing
the EmbeddingContent in ToCohereEmbeddingRequest/
schemas.BifrostEmbeddingRequest; update the other occurrences noted (around
lines 25-30) so all pointer construction in this fixture is done via the same
Ptr helper for consistency.
core/providers/gemini/types.go (1)

1135-1137: Verify the new embedding flags use the same wire casing as the SDK.

This file already normalizes snake_case Gemini payloads in several places, but documentOcr / audioTrackExtraction currently only bind camelCase. If the SDK emits document_ocr / audio_track_extraction, those options will be silently dropped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/types.go` around lines 1135 - 1137, The DocumentOCR and
AudioTrackExtraction fields only accept camelCase JSON keys and will miss
snake_case keys; update the struct that defines DocumentOCR and
AudioTrackExtraction in core/providers/gemini/types.go to accept both wire
casings by implementing a custom UnmarshalJSON for that struct: parse into a
secondary map[string]json.RawMessage (or a small alias struct), check for both
"documentOcr" and "document_ocr" (and "audioTrackExtraction" and
"audio_track_extraction"), and set the DocumentOCR and AudioTrackExtraction
boolean pointers accordingly, preserving existing behavior for other fields.
core/providers/cohere/embedding.go (1)

117-127: Silent nil return on conversion error loses error context.

When cohereContentBlockFromEmbeddingPart fails at line 121 or 135, the function returns nil without any indication of why the conversion failed. This makes debugging difficult.

Consider logging the error or restructuring to return an error.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/cohere/embedding.go` around lines 117 - 127, The conversion
loop silently returns nil on errors from cohereContentBlockFromEmbeddingPart,
losing context; update the surrounding function (the routine building
CohereEmbeddingInput and setting cohereReq.Inputs) to propagate errors instead
of returning nil: change the function signature to return (..., error) if
needed, capture the error from cohereContentBlockFromEmbeddingPart and return it
(or wrap it with context) rather than returning nil, and ensure both occurrences
(the loop at the first branch and the similar block at line ~135) follow the
same error propagation so callers can log or handle the specific conversion
error.
core/providers/gemini/embedding.go (2)

140-163: Redundant nil checks and potential logic issue in applyGeminiEmbeddingParams.

Lines 152-154 and 158-160 delete from req.ExtraParams but the preceding lines (151, 157) assign to the same map. The deletion is inside redundant nil checks after just assigning to the map. Additionally, the logic appears to extract a bool pointer from ExtraParams, then assign it back to ExtraParams, then delete it—which leaves the value unchanged.

If the intent is to promote these fields to top-level request fields (like DocumentOCR on the request struct), the current code doesn't achieve that.

♻️ Suggested fix to clarify intent

If the goal is to extract these params and set top-level fields on the request, you'd need fields like req.DocumentOCR and req.AudioTrackExtraction. If those don't exist, consider removing this block or clarifying the intent:

-	if params.ExtraParams != nil {
-		if documentOCR, ok := schemas.SafeExtractBoolPointer(params.ExtraParams["documentOcr"]); ok {
-			req.ExtraParams["documentOcr"] = documentOCR
-			if req.ExtraParams != nil {
-				delete(req.ExtraParams, "documentOcr")
-			}
-		}
-		if audioTrackExtraction, ok := schemas.SafeExtractBoolPointer(params.ExtraParams["audioTrackExtraction"]); ok {
-			req.ExtraParams["audioTrackExtraction"] = audioTrackExtraction
-			if req.ExtraParams != nil {
-				delete(req.ExtraParams, "audioTrackExtraction")
-			}
-		}
-	}
+	// documentOcr and audioTrackExtraction are passed through ExtraParams
+	// and handled by Gemini's API directly - no special extraction needed
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/embedding.go` around lines 140 - 163, The function
applyGeminiEmbeddingParams currently reads boolean values from
params.ExtraParams with schemas.SafeExtractBoolPointer, assigns them back into
req.ExtraParams and then immediately deletes them — the nil checks and deletes
are redundant and the code does not promote these values to top-level fields;
fix by either (A) removing this whole extraction/delete block if these flags
should remain only in ExtraParams, or (B) if you intend to promote them, add
top-level bool fields on the request (e.g., DocumentOCR and AudioTrackExtraction
on GeminiEmbeddingRequest), set req.DocumentOCR and req.AudioTrackExtraction
from the extracted pointers, and then delete those keys from req.ExtraParams;
update applyGeminiEmbeddingParams to use params.ExtraParams ->
SafeExtractBoolPointer -> set top-level req fields -> delete keys from
req.ExtraParams (no redundant nil checks).

409-421: Silent nil return on conversion error obscures failure cause.

When geminiContentToEmbeddingContent returns an error at lines 413 or 427, the function returns nil without propagating the error. The caller cannot distinguish between a nil request input and a conversion failure.

Consider either returning an error or logging the failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/embedding.go` around lines 409 - 421, The code silently
returns nil when geminiContentToEmbeddingContent fails, losing error context;
update the function that builds bifrostReq (the block handling request.Requests
and uses geminiContentToEmbeddingContent, bifrostReq, and
applyBifrostEmbeddingParams) to propagate the conversion error instead of
returning nil—i.e., change the return path to return (nil, err) or wrap and
return a descriptive error so callers can distinguish conversion failures, and
if the function signature cannot change, at minimum log the error with context
(including which req failed) before returning; ensure all callsites of this
function are updated to handle the propagated error when you adjust the
signature.
core/providers/vertex/embedding.go (1)

96-109: Same redundant ExtraParams extraction pattern as Gemini provider.

This block extracts boolean pointers from params.ExtraParams, assigns them back to req.ExtraParams (which is the same map), then deletes them. The net effect is no change. This mirrors the same issue in core/providers/gemini/embedding.go.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vertex/embedding.go` around lines 96 - 109, The code
redundantly reads boolean pointers from params.ExtraParams and writes them back
to req.ExtraParams then deletes keys from the same map, causing no net effect;
update the block around schemas.SafeExtractBoolPointer so you either (A) simply
copy values from params.ExtraParams into req.ExtraParams without deleting, or
(B) if the intent is to move keys, ensure you delete from params.ExtraParams
(not req.ExtraParams) and only perform the delete when params.ExtraParams and
req.ExtraParams are different maps; adjust the handling for the keys
"documentOcr" and "audioTrackExtraction" accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/account.go`:
- Around line 44-45: AllProviderConfigs' Vertex provider config is missing the
multimodal embedding flags so tests skip that scenario; update the Vertex entry
in AllProviderConfigs to set Scenarios.MultimodalEmbedding = true and set
MultimodalEmbeddingModel = "multimodalembedding@001" (and any corresponding
field assignment on the Vertex provider config struct) so the Scenarios and
model whitelist match the API key whitelist referenced earlier; locate the
Vertex provider config near the AllProviderConfigs variable and add those fields
to the config and its Scenarios struct assignment.

In `@core/internal/llmtests/embedding_multimodal.go`:
- Around line 47-86: validateNonEmptyVectors misses the case where a provider
returns a collapsed batch (one resp.Data entry with Embedding.Embedding2DArray);
update validateNonEmptyVectors to normalize that shape before iterating: if
len(resp.Data) == 1 and resp.Data[0].Embedding.Embedding2DArray != nil, treat
each row in Embedding2DArray as its own embedding entry (either by expanding
into a temporary slice of items or by iterating the 2D array directly), then use
getEmbeddingVector (or a small helper that extracts the vector from the 2D array
rows) to validate non-empty vectors and consistent dimensions for every expanded
embedding; also update the final t.Logf to report the expanded embedding count
instead of len(resp.Data).

In `@core/providers/bedrock/embedding.go`:
- Around line 11-13: The conversion function ToBedrockTitanEmbeddingRequest is
incorrectly collapsing the batch dimension by concatenating all Contents into a
single InputText; instead add a guard that asserts or returns an error if the
incoming request contains more than one content item (i.e., len(req.Contents) !=
1) so we only accept single-content requests here, and remove the logic that
stitches multiple Contents together (the code that builds a single InputText
from all parts); this ensures the core layer must fan out and prevents silently
returning one embedding for a multi-item request.

In `@core/providers/cohere/embedding.go`:
- Around line 195-200: The conversion currently silently drops extra images when
handling req.Images (case len(req.Images) > 0) and builds
bifrostReq.Input.Contents with only req.Images[0]; add explicit validation to
reject multiple images by checking if len(req.Images) > 1 and returning an error
(or propagate an appropriate conversion error) before constructing
bifrostReq.Input.Contents; reference the existing symbols req.Images,
bifrostReq.Input.Contents, EmbeddingContentPartTypeImage and EmbeddingMediaPart
when locating the change in core/providers/cohere/embedding.go.

In `@core/providers/gemini/embedding.go`:
- Around line 173-176: The current check conflates an empty contents slice with
an actual error and wraps a nil error (from EmbeddingInputToContents) causing
the %!w(<nil>) output; change the logic in embedding.go after calling
EmbeddingInputToContents to first handle err != nil and return fmt.Errorf("error
converting embedding input to contents: %w", err), and if err == nil but
len(contents) == 0 return a distinct non-wrapping error (e.g.,
fmt.Errorf("embedding input produced no contents")) so the message is clear when
contents is empty.

In `@core/providers/openai/embedding.go`:
- Around line 67-100: The converter currently flattens all parts across
bifrostReq.Input.Contents into global texts/tokenBatches, losing
per-EmbeddingContent boundaries. Instead, iterate the entries of
bifrostReq.Input.Contents and build one OpenAIEmbeddingInput per content: for
each content, ensure all parts are the same type (error if mixed); for text
parts, concatenate their strings into a single Text (e.g., join with a space)
and set OpenAIEmbeddingInput.Text; for token parts, concatenate the token
slices into a single Embedding and set OpenAIEmbeddingInput.Embedding. Finally,
produce one input per content (a slice of OpenAIEmbeddingInput) rather than
flattening across contents, so that each EmbeddingContent maps to one OpenAI
result (update the code around bifrostReq.Input.Contents, OpenAIEmbeddingInput,
and the switch that currently sets Text/Texts/Embeddings).
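A sketch of the per-content grouping, using simplified stand-ins (`part` and `openAIInput` approximate `schemas.EmbeddingContentPart` and `OpenAIEmbeddingInput`, which carry more fields):

```go
package main

import (
	"fmt"
	"strings"
)

type part struct {
	Text   *string
	Tokens []int
}

type openAIInput struct {
	Text   string
	Tokens []int
}

// buildInputs emits exactly one input per content entry, so each
// EmbeddingContent maps to one embedding in the provider response.
func buildInputs(contents [][]part) ([]openAIInput, error) {
	inputs := make([]openAIInput, 0, len(contents))
	for i, content := range contents {
		var texts []string
		var tokens []int
		for _, p := range content {
			switch {
			case p.Text != nil && tokens == nil:
				texts = append(texts, *p.Text)
			case p.Tokens != nil && texts == nil:
				tokens = append(tokens, p.Tokens...)
			default:
				return nil, fmt.Errorf("content %d has mixed or empty parts", i)
			}
		}
		if texts != nil {
			inputs = append(inputs, openAIInput{Text: strings.Join(texts, " ")})
		} else {
			inputs = append(inputs, openAIInput{Tokens: tokens})
		}
	}
	return inputs, nil
}

func main() {
	hello, world := "hello", "world"
	inputs, err := buildInputs([][]part{
		{{Text: &hello}, {Text: &world}},          // aggregated into one text input
		{{Tokens: []int{1, 2}}, {Tokens: []int{3}}}, // aggregated into one token input
	})
	fmt.Println(len(inputs), err) // 2 <nil>
}
```

Two content entries in, two inputs out, regardless of how many parts each entry contains.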

In `@core/providers/vertex/embedding.go`:
- Around line 149-152: Remove the stray debug print by deleting the
fmt.Println("b64", b64) call in core/providers/vertex/embedding.go; keep the
logic that calls extractBase64FromDataURI(*part.Image.Data) and assigns the
result to img.BytesBase64Encoded (variable b64), and ensure no other fmt.Println
debug statements remain in the same function or nearby code that handles
part.Image.Data or img.BytesBase64Encoded.

In `@core/providers/vertex/vertex.go`:
- Around line 1395-1402: The Embedding() method currently always fetches OAuth
credentials instead of supporting API-key auth like ChatStream() and
GenerateContent(); update Embedding() to first check key.Value.GetValue() and,
if non-empty, build authQuery and append it to the Gemini URL (use
getCompleteURLForGeminiEndpoint with ":embedContent" for
isGeminiEmbedding2Request) before making the request, otherwise fall back to the
existing OAuth flow (the same logic used in ChatStream()/GenerateContent());
ensure authQuery is used consistently when constructing the request URL so
VERTEX_API_KEY-only setups succeed.

In `@core/schemas/trace.go`:
- Line 219: The AttrDimensions constant was changed and that is a
telemetry-breaking change; revert AttrDimensions back to the existing legacy key
("gen_ai.embeddings.dimension.count") and introduce a new constant (e.g.,
AttrDimensionsV2 or AttrEmbeddingDimensionsKey) with the new wire key for
dual-writing during migration; update any writers to emit both constants where
needed but keep AttrDimensions unchanged to preserve
dashboards/filters/exporters that rely on it.

In `@plugins/logging/utils.go`:
- Around line 422-431: The loop building contentBlocks for logging currently
only appends parts where part.Type == schemas.EmbeddingContentPartTypeText,
dropping image/audio/video/file parts and producing empty user messages. Update
the block that iterates request.EmbeddingRequest.Input.Contents to also handle
non-text parts: append a schemas.ChatContentBlock for each one (e.g., with Type
set to a suitable value, or to schemas.ChatContentBlockTypeText with a
placeholder Text) that records part.Type and any available metadata (filename,
MIME type, size, description, or a generated placeholder like "[image]" /
"[audio]" / "[file]"). The new branch should reference
request.EmbeddingRequest.Input.Contents, contentBlocks, part.Type, and
schemas.ChatContentBlock so multimodal embedding inputs are preserved in the
log history.

In `@plugins/semanticcache/utils.go`:
- Around line 447-456: extractTextForEmbedding currently returns an empty string
for embedding requests that contain only non-text parts, which leads
generateEmbeddingsForStorage and performSemanticSearch to call
generateEmbedding("") and produce identical cache keys. Mirror the
Chat/Responses behavior: check len(textParts) > 0 and return a sentinel/error
(or a boolean indicating "no text") instead of an empty string, then update the
callers (generateEmbeddingsForStorage and performSemanticSearch) to skip
semantic caching / embedding generation when that sentinel is returned, so
multimodal-only embedding requests do not collide in the semantic cache.

In `@tests/integrations/python/tests/test_cohere.py`:
- Around line 159-196: Both tests request Cohere-only embedding features and
must skip when provider/model lacks those capabilities: update
test_06_custom_dimensions_embedding and test_07_multiple_embedding_types to
check supports_capability(provider, model, "<capability>") (from config_loader)
before running; for test_06 check capability for "embeddings_custom_dimensions"
or the equivalent capability used elsewhere (or guard specifically for
Cohere/embed-v4.0), and for test_07 check capability for "embeddings_int8" (or
guard for Cohere), calling pytest.skip(...) if unsupported — mirror the
capability-guard pattern used in test_pydanticai.py and use
get_provider_cohere_client/format_provider_model as before.
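The capability-guard pattern can be sketched like this; the capability table and keys below are assumptions taken from this comment, and the real lookup lives in config_loader:

```python
# Stand-in capability table; the real data comes from config_loader / config.yml.
CAPABILITIES = {
    ("cohere", "embed-v4.0"): {"embeddings_custom_dimensions", "embeddings_int8"},
    ("openai", "text-embedding-3-small"): set(),
}


def supports_capability(provider: str, model: str, capability: str) -> bool:
    """Return True when the provider/model pair advertises the capability."""
    return capability in CAPABILITIES.get((provider, model), set())


# In the test body, guard before creating the client:
#
#   if not supports_capability(provider, model, "embeddings_custom_dimensions"):
#       pytest.skip("custom output_dimension not supported by this provider/model")
```

Skipping before client construction keeps unsupported providers out of the failure report while still exercising the transport on providers that do support the knob.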

In `@tests/integrations/python/tests/test_google.py`:
- Around line 3091-3106: The test 'test_45' is inconsistent: the docstring says
"a text part" but the request builds contents with two identical text parts
using types.Content(parts=[types.Part(text=...), types.Part(text=...),
types.Part.from_bytes(...)]). Fix by either updating the docstring to say "two
text parts" or change the second types.Part(text=...) to a distinct text string
that reflects the intended test scenario so the intent and payload match.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 911-940: The empty-Contents guard in the embedding request path
rejects legacy callers because legacy fields (req.Input.Text, req.Input.Texts,
req.Input.Embedding, req.Input.Embeddings) are not normalized to
req.Input.Contents before the check; move or call the compatibility
normalization that populates req.Input.Contents for legacy fields before the if
that checks (req.Input == nil || len(req.Input.Contents) == 0) and
isLargePayloadPassthrough(ctx), or alternatively change that guard to also
accept legacy-field presence and run the normalization immediately prior to
calling req.Input.Validate(); ensure Validate() continues to run against the
normalized Contents so legacy callers remain supported.
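A sketch of the normalization ordering, with a simplified stand-in for `schemas.EmbeddingInput` (legacy `Text`/`Texts` fields plus the new `Contents` batch; the real struct also carries `Embedding`/`Embeddings` and richer part types):

```go
package main

import "fmt"

type embeddingInput struct {
	Text     *string
	Texts    []string
	Contents [][]string // each inner slice is one content entry
}

// normalizeLegacyInput populates Contents from legacy fields so the
// empty-Contents guard and Validate() both see the normalized shape.
func normalizeLegacyInput(in *embeddingInput) {
	if in == nil || len(in.Contents) > 0 {
		return
	}
	switch {
	case in.Text != nil:
		in.Contents = [][]string{{*in.Text}}
	case len(in.Texts) > 0:
		for _, t := range in.Texts {
			in.Contents = append(in.Contents, []string{t})
		}
	}
}

func main() {
	legacy := &embeddingInput{Texts: []string{"a", "b"}}
	normalizeLegacyInput(legacy) // must run before the empty-Contents guard
	if len(legacy.Contents) == 0 {
		fmt.Println("rejected")
		return
	}
	fmt.Println(len(legacy.Contents)) // 2
}
```

Running the normalization first means legacy callers pass the guard and `Validate()` always operates on `Contents`.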

---

Nitpick comments:
In `@core/internal/llmtests/embedding.go`:
- Around line 61-65: Replace local-address pointer creation in the test payload
by using bifrost.Ptr to match repo conventions: in the loop that builds contents
(variable contents of type []schemas.EmbeddingContent) switch from creating a
local t and taking &t to using bifrost.Ptr(text) for the Text field (the place
where schemas.EmbeddingContent and schemas.EmbeddingContentPartTypeText are
set). Ensure bifrost is imported if not already.

In `@core/providers/cohere/embedding_multimodal_test.go`:
- Around line 11-16: Replace direct address-of pointer usage for the
EmbeddingInput.Text field with the existing pointer helper used elsewhere in the
test: instead of &text use schemas.Ptr(text) (or bifrost.Ptr if that helper is
in scope) when constructing the EmbeddingContent in ToCohereEmbeddingRequest/
schemas.BifrostEmbeddingRequest; update the other occurrences noted (around
lines 25-30) so all pointer construction in this fixture is done via the same
Ptr helper for consistency.

In `@core/providers/cohere/embedding.go`:
- Around line 117-127: The conversion loop silently returns nil on errors from
cohereContentBlockFromEmbeddingPart, losing context; update the surrounding
function (the routine building CohereEmbeddingInput and setting
cohereReq.Inputs) to propagate errors instead of returning nil: change the
function signature to return (..., error) if needed, capture the error from
cohereContentBlockFromEmbeddingPart and return it (or wrap it with context)
rather than returning nil, and ensure both occurrences (the loop at the first
branch and the similar block at line ~135) follow the same error propagation so
callers can log or handle the specific conversion error.

In `@core/providers/gemini/embedding.go`:
- Around line 140-163: The function applyGeminiEmbeddingParams currently reads
boolean values from params.ExtraParams with schemas.SafeExtractBoolPointer,
assigns them back into req.ExtraParams and then immediately deletes them — the
nil checks and deletes are redundant and the code does not promote these values
to top-level fields; fix by either (A) removing this whole extraction/delete
block if these flags should remain only in ExtraParams, or (B) if you intend to
promote them, add top-level bool fields on the request (e.g., DocumentOCR and
AudioTrackExtraction on GeminiEmbeddingRequest), set req.DocumentOCR and
req.AudioTrackExtraction from the extracted pointers, and then delete those keys
from req.ExtraParams; update applyGeminiEmbeddingParams to use
params.ExtraParams -> SafeExtractBoolPointer -> set top-level req fields ->
delete keys from req.ExtraParams (no redundant nil checks).
- Around line 409-421: The code silently returns nil when
geminiContentToEmbeddingContent fails, losing error context; update the function
that builds bifrostReq (the block handling request.Requests and uses
geminiContentToEmbeddingContent, bifrostReq, and applyBifrostEmbeddingParams) to
propagate the conversion error instead of returning nil—i.e., change the return
path to return (nil, err) or wrap and return a descriptive error so callers can
distinguish conversion failures, and if the function signature cannot change, at
minimum log the error with context (including which req failed) before
returning; ensure all callsites of this function are updated to handle the
propagated error when you adjust the signature.

In `@core/providers/gemini/types.go`:
- Around line 1135-1137: The DocumentOCR and AudioTrackExtraction fields only
accept camelCase JSON keys and will miss snake_case keys; update the struct that
defines DocumentOCR and AudioTrackExtraction in core/providers/gemini/types.go
to accept both wire casings by implementing a custom UnmarshalJSON for that
struct: parse into a secondary map[string]json.RawMessage (or a small alias
struct), check for both "documentOcr" and "document_ocr" (and
"audioTrackExtraction" and "audio_track_extraction"), and set the DocumentOCR
and AudioTrackExtraction boolean pointers accordingly, preserving existing
behavior for other fields.

In `@core/providers/vertex/embedding.go`:
- Around line 96-109: The code redundantly reads boolean pointers from
params.ExtraParams and writes them back to req.ExtraParams then deletes keys
from the same map, causing no net effect; update the block around
schemas.SafeExtractBoolPointer so you either (A) simply copy values from
params.ExtraParams into req.ExtraParams without deleting, or (B) if the intent
is to move keys, ensure you delete from params.ExtraParams (not req.ExtraParams)
and only perform the delete when params.ExtraParams and req.ExtraParams are
different maps; adjust the handling for the keys "documentOcr" and
"audioTrackExtraction" accordingly.

In `@core/schemas/embedding_multimodal_test.go`:
- Around line 9-27: Add a happy-path validation test for both a text embedding
and a media embedding: create a TestEmbeddingInputValidateAcceptsNonEmpty that
constructs an EmbeddingInput with a non-empty Contents containing a single
EmbeddingContentPart of Type EmbeddingContentPartTypeText with Text set and
assert Validate() returns no error; and create a
TestEmbeddingContentPartValidateAcceptsSingleMedia that constructs an
EmbeddingContentPart of Type EmbeddingContentPartTypeImage with Image set to an
EmbeddingMediaPart (use Ptr("https://example.com/img.png") or the existing
helper) and assert part.Validate() returns nil; reference
EmbeddingInput.Validate, EmbeddingContentPart.Validate,
EmbeddingContentPartTypeImage/Text, and EmbeddingMediaPart to locate code.

In `@plugins/semanticcache/test_utils.go`:
- Around line 557-560: The loop in the test helper creates a loop-local copy (t
:= text) and takes its address for EmbeddingContent.Text; replace that pattern
with bifrost.Ptr(text) (i.e., set Text: bifrost.Ptr(text)) to follow the repo
convention and remove the temporary variable; update the assignment in the loop
that populates contents (and ensure the bifrost package is imported if not
already) so schemas.EmbeddingContent uses bifrost.Ptr instead of &t.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 00ea3a71-28fe-40ce-befa-ae03e2960b6e

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae64e6 and d148be8.

📒 Files selected for processing (37)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/embedding.go
  • core/internal/llmtests/embedding_multimodal.go
  • core/internal/llmtests/tests.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/embedding.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/embedding.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/cohere/types.go
  • core/providers/gemini/embedding.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/huggingface/embedding.go
  • core/providers/openai/embedding.go
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/providers/vertex/types.go
  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/embedding.go
  • core/schemas/embedding_multimodal_test.go
  • core/schemas/trace.go
  • framework/tracing/llmspan.go
  • plugins/logging/utils.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/utils.go
  • tests/integrations/python/config.yml
  • tests/integrations/python/tests/test_cohere.py
  • tests/integrations/python/tests/test_google.py
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/cohere.go

@TejasGhatte TejasGhatte force-pushed the 04-03-feat_multimodal_embeddings branch from d148be8 to 6f29fb7 Compare April 10, 2026 06:53
@coderabbitai coderabbitai Bot requested review from akshaydeo and danpiths April 10, 2026 06:54

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
core/internal/llmtests/response_validation.go (1)

1324-1348: ⚠️ Potential issue | 🟠 Major

Handle Base64 embeddings in the dimension path too.

Line 1325 now treats Base64 as valid vector data, but Lines 1337-1348 and 1521-1531 never derive dimensions from it. A Base64-only embedding will pass the presence check and then fail expected_dimensions with 0, while metrics record the wrong dimension. Either decode Base64 here as well or exclude it from hasData until dimensions can be computed.

As per coding guidelines, "core/internal/llmtests/**/*.go: Scenario-based LLM tests in core/internal/llmtests/ run against live provider APIs with dual-API testing."

Also applies to: 1519-1533

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/response_validation.go` around lines 1324 - 1348, The
presence check currently accepts embedding.Embedding.Base64 as valid data but
the dimensions-check logic (iterating response.Data and computing
actualDimensions from e.Float, e.Int8, e.Uint8, e.Binary, e.Ubinary) never
handles Base64, causing Base64-only embeddings to report dimension 0 and fail
expected_dimensions; update the dimension path to detect e.Base64, decode the
base64 payload to raw bytes and set actualDimensions = len(decodedBytes) (using
standard base64 decode), and update any other dimension/metrics computation that
reads actualDimensions (the loop over response.Data and the metrics block that
computes embedding dimensions) to use the decoded length so Base64-only
embeddings are validated correctly; alternatively, if you prefer not to decode
here, remove Base64 from the initial hasData check in the embedding presence
check (embedding.Embedding.Base64) so Base64-only embeddings are not treated as
having data.
ui/app/workspace/logs/sheets/logDetailView.tsx (1)

952-953: ⚠️ Potential issue | 🟠 Major

Use the same embedding object discriminator as the render path.

This helper checks log.object === "list", but the new embedding UI path at Line 766 treats embedding logs as "embedding". As written, Copy request body never enters the embedding branch for the logs this component now renders, and falls back to the unsupported-type toast instead.

Suggested fix
-		const isEmbedding = log.object === "list";
+		const isEmbedding = log.object === "embedding";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailView.tsx` around lines 952 - 953, The
isEmbedding discriminator currently checks log.object === "list" but the render
path uses "embedding", so update the condition in logDetailView.tsx (the
isEmbedding variable) to match the render path (e.g., check log.object ===
"embedding" or accept both "embedding" and "list" if backward compatibility is
needed) so the "Copy request body" flow enters the embedding branch; locate and
update the isEmbedding declaration and any dependent branches that gate the
embedding copy behavior to use the corrected discriminator.
core/providers/vertex/vertex.go (1)

1520-1548: ⚠️ Potential issue | 🟡 Minor

Preserve RawRequest capture in the new embedding response path.

These branches bypass providerUtils.HandleProviderResponse, so bifrostResponse.ExtraFields.RawRequest is never set for Vertex embeddings anymore when sendBackRawRequest is enabled. That regresses raw request logging/debugging for this operation; please keep the typed unmarshal, but restore the common raw request/response capture path before converting to the Bifrost response.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vertex/vertex.go` around lines 1520 - 1548, The Vertex
embedding branch skips the common raw request/response capture so
bifrostResponse.ExtraFields.RawRequest isn't populated when
provider.sendBackRawRequest is enabled; restore that capture before converting
to BifrostEmbeddingResponse by invoking the same handling used elsewhere (e.g.,
call providerUtils.HandleProviderResponse or otherwise unmarshal jsonBody into a
map and set bifrostResponse.ExtraFields.RawRequest) in both branches (the
gemini.GeminiEmbeddingResponse path and the VertexEmbeddingResponse path) using
the existing flags provider.sendBackRawRequest/provider.sendBackRawResponse and
functions providerUtils.ShouldSendBackRawResponse and providerUtils.EnrichError
to keep error handling consistent. Ensure the RawRequest is set prior to
assigning bifrostResponse.Model/ExtraFields so the typed unmarshalling
(VertexEmbeddingResponse.ToBifrostEmbeddingResponse /
gemini.ToBifrostEmbeddingResponse) remains unchanged.
core/providers/gemini/types.go (1)

1255-1279: ⚠️ Potential issue | 🟠 Major

Handle nested snake_case fields inside file_data.

This only aliases the outer file_data key. FileData still has camelCase-only tags, so payloads like {"file_data":{"file_uri":"...","mime_type":"..."}} will unmarshal to a non-nil FileData with empty fields. That makes the new snake_case path fail downstream even though the JSON shape is otherwise valid.

💡 Suggested follow-up
func (f *FileData) UnmarshalJSON(data []byte) error {
	type alias struct {
		DisplayName      string `json:"displayName,omitempty"`
		DisplayNameSnake string `json:"display_name,omitempty"`
		FileURI          string `json:"fileUri,omitempty"`
		FileURISnake     string `json:"file_uri,omitempty"`
		MIMEType         string `json:"mimeType,omitempty"`
		MIMETypeSnake    string `json:"mime_type,omitempty"`
	}

	var aux alias
	if err := sonic.Unmarshal(data, &aux); err != nil {
		return err
	}

	f.DisplayName = aux.DisplayName
	if f.DisplayName == "" {
		f.DisplayName = aux.DisplayNameSnake
	}
	f.FileURI = aux.FileURI
	if f.FileURI == "" {
		f.FileURI = aux.FileURISnake
	}
	f.MIMEType = aux.MIMEType
	if f.MIMEType == "" {
		f.MIMEType = aux.MIMETypeSnake
	}
	return nil
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/types.go` around lines 1255 - 1279, The outer aliasing
in PartAlias doesn't handle snake_case keys nested inside file_data because
FileData's struct tags are camelCase-only; implement a custom UnmarshalJSON on
the FileData type that defines an internal alias with both camelCase and
snake_case json tags (e.g., DisplayName/display_name, FileURI/file_uri,
MIMEType/mime_type), unmarshal into that aux, and then populate FileData's
fields by preferring the camelCase values and falling back to the snake_case
values when empty so that p.FileData (and any code using FileData) receives the
actual nested values.
♻️ Duplicate comments (4)
core/providers/cohere/embedding.go (1)

195-200: ⚠️ Potential issue | 🟠 Major

Reject images[] requests with more than one image.

This still converts only req.Images[0], so additional images are silently dropped instead of being rejected or documented as unsupported.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/cohere/embedding.go` around lines 195 - 200, The code in
embedding.go silently drops additional images by only converting req.Images[0];
update the handler that checks req.Images (the case starting with "case
len(req.Images) > 0") to explicitly reject requests where len(req.Images) > 1 by
returning an error (or appropriate response) instead of proceeding, and keep the
existing behavior for exactly one image by populating bifrostReq.Input.Contents
with a single schemas.EmbeddingContent entry using
schemas.EmbeddingContentPartTypeImage and schemas.EmbeddingMediaPart{URL: &url};
ensure the error message clearly states that multiple images are unsupported and
references req.Images so callers understand the validation.
plugins/semanticcache/utils.go (1)

450-459: ⚠️ Potential issue | 🟠 Major

Skip semantic caching when an embedding request has no text parts.

This branch still returns "" with no error for image/audio/file-only EmbeddingInput.Contents. The semantic-cache lookup/store paths then embed the empty string, so distinct multimodal-only embedding requests collapse onto the same semantic-cache vector. Please mirror the chat/responses branches here by failing fast (or returning a sentinel) and letting callers bypass semantic caching for those requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/semanticcache/utils.go` around lines 450 - 459, After collecting
textParts from req.EmbeddingRequest.Input.Contents (the loop that checks
part.Type == schemas.EmbeddingContentPartTypeText), detect when len(textParts)
== 0 and fail fast instead of returning an empty string: return a sentinel error
(e.g., ErrNoTextParts) or the existing "skip cache" sentinel used by
chat/responses, along with metadataHash, so callers can bypass semantic caching;
if ErrNoTextParts doesn't exist, add it as a package-level var and use it here.
tests/integrations/python/tests/test_cohere.py (1)

159-196: ⚠️ Potential issue | 🟠 Major

Gate these Cohere-only embedding features behind provider capabilities.

These two tests still run against every provider in the generic embeddings scenario, but output_dimension=512 and embedding_types=["float", "int8"] are feature-specific knobs. Without a capability check, the suite fails on unsupported providers with expected API errors instead of catching transport regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integrations/python/tests/test_cohere.py` around lines 159 - 196, Both
tests (test_06_custom_dimensions_embedding and test_07_multiple_embedding_types)
run for all providers but rely on Cohere-only features; add capability checks
and skip when the current provider/model doesn't support them. Use the existing
test helpers (e.g., get_provider_cohere_client, format_provider_model,
get_cross_provider_params_for_scenario) to query provider capabilities and gate:
in test_06 check for a capability like support for
output_dimension/custom_dimensions (or a provider capability flag for 512-d
embeddings) and call pytest.skip if missing; in test_07 check for support of
multiple embedding types / int8 embeddings and skip if not present. Ensure the
checks run before creating the client so unsupported providers are skipped
rather than failing.
core/providers/bedrock/embedding.go (1)

12-40: ⚠️ Potential issue | 🔴 Critical

Don’t flatten multiple Contents entries into one Titan request.

Contents is the batch dimension in this PR. Concatenating every content entry into one InputText silently turns an N-item request into a single Titan embedding, so the response shape no longer matches the caller’s input. This converter should accept exactly one content item and only aggregate parts within that item.

🔧 Safer shape guard
-// Titan produces a single embedding per request. All text parts across all content entries
-// are stitched together into one input string. Non-text parts are not supported.
+// Titan produces a single embedding per request. Batching across content entries
+// must be handled upstream; only parts within one content item are aggregated.
 func ToBedrockTitanEmbeddingRequest(bifrostReq *schemas.BifrostEmbeddingRequest) (*BedrockTitanEmbeddingRequest, error) {
 	if bifrostReq == nil {
 		return nil, fmt.Errorf("bifrost embedding request is nil")
 	}

 	if bifrostReq.Input == nil || len(bifrostReq.Input.Contents) == 0 {
 		return nil, fmt.Errorf("no input provided for Titan embedding")
 	}
+	if len(bifrostReq.Input.Contents) != 1 {
+		return nil, fmt.Errorf("amazon titan embedding models support exactly one content item per request")
+	}

 	if bifrostReq.Params != nil && bifrostReq.Params.Dimensions != nil {
 		return nil, fmt.Errorf("amazon Titan embedding models do not support custom dimensions parameter")
 	}

 	var sb strings.Builder
-	for _, content := range bifrostReq.Input.Contents {
-		for _, part := range content {
-			if part.Type != schemas.EmbeddingContentPartTypeText || part.Text == nil {
-				return nil, fmt.Errorf("amazon Titan embedding models only support text input")
-			}
-			sb.WriteString(*part.Text)
-			sb.WriteString(" \n")
-		}
+	for _, part := range bifrostReq.Input.Contents[0] {
+		if part.Type != schemas.EmbeddingContentPartTypeText || part.Text == nil {
+			return nil, fmt.Errorf("amazon Titan embedding models only support text input")
+		}
+		sb.WriteString(*part.Text)
+		sb.WriteString(" \n")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/embedding.go` around lines 12 - 40, The converter
ToBedrockTitanEmbeddingRequest currently concatenates all entries in
bifrostReq.Input.Contents into one BedrockTitanEmbeddingRequest.InputText,
collapsing the batch dimension; change it to require exactly one content item
(error if len(bifrostReq.Input.Contents) != 1) and only iterate over that single
content's parts to build InputText, preserving the 1:1 request/response shape;
keep the existing validation for part types
(schemas.EmbeddingContentPartTypeText) and the BedrockTitanEmbeddingRequest
construction unchanged.
🧹 Nitpick comments (5)
plugins/semanticcache/test_utils.go (1)

557-559: Prefer bifrost.Ptr over taking &t here.

This repo standardizes on the pointer helper in test utilities, and it lets you drop the extra temporary.

♻️ Suggested cleanup
 		Input: func() *schemas.EmbeddingInput {
 			contents := make([]schemas.EmbeddingContent, len(texts))
 			for i, text := range texts {
-				t := text
-				contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: &t}}
+				contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: bifrost.Ptr(text)}}
 			}
 			return &schemas.EmbeddingInput{Contents: contents}
 		}(),

Based on learnings, "prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/semanticcache/test_utils.go` around lines 557 - 559, Replace the
direct address operator usage (&t) when populating schemas.EmbeddingContent in
the loop over texts with the repo-standard pointer helper bifrost.Ptr(t);
specifically update the assignment that currently sets Text: &t to Text:
bifrost.Ptr(t) and add an import for the bifrost package if it's not already
present so the test_utils.go loop that builds contents uses bifrost.Ptr
consistently.
core/providers/fireworks/fireworks_test.go (1)

191-193: Use bifrost.Ptr for the text part pointer.

This test now takes the address of a temporary just to build *string. Prefer the repo pointer helper here for consistency.

♻️ Suggested cleanup
 		Input: &schemas.EmbeddingInput{
 			Contents: []schemas.EmbeddingContent{
-				{{Type: schemas.EmbeddingContentPartTypeText, Text: &text}},
+				{{Type: schemas.EmbeddingContentPartTypeText, Text: bifrost.Ptr(text)}},
 			},
 		},

Based on learnings, "prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/fireworks/fireworks_test.go` around lines 191 - 193, The test
currently takes the address of a temporary string (variable text) when
constructing the Contents slice of schemas.EmbeddingContent; replace that
pattern by using the repository pointer helper bifrost.Ptr to create the *string
instead. Update the EmbeddingContent construction (the Contents:
[]schemas.EmbeddingContent{...} entry where Type is
schemas.EmbeddingContentPartTypeText) to call bifrost.Ptr(yourText) rather than
&text so the pointer creation is consistent with other tests and utilities.
core/providers/gemini/embedding.go (2)

386-394: Error context lost on conversion failure.

When geminiContentToEmbeddingContent fails, the method returns nil without any indication of the error. This makes debugging difficult. The same pattern at lines 400-403.

Consider logging the error or changing the method signature to return an error if this aligns with other converter patterns in the stack.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/embedding.go` around lines 386 - 394, The conversion
loop using geminiContentToEmbeddingContent silently swallows errors by returning
nil; update the converter to propagate or log conversion failures: modify the
surrounding function that builds bifrostReq (reference bifrostReq,
geminiContentToEmbeddingContent and applyBifrostEmbeddingParams) to return an
error (or at minimum log the error via the existing logger) when
geminiContentToEmbeddingContent fails, and ensure callers handle the returned
error instead of getting a nil result; apply the same change to the analogous
block at lines 400-403 so errors are not lost.

153-155: Error message could be clearer when contents array is empty.

The error "bifrost request is nil or input is nil" also triggers when len(bifrostReq.Input.Contents) == 0, which may confuse debugging since Input is technically not nil in that case.

✏️ Suggested improvement
 	if bifrostReq == nil || bifrostReq.Input == nil || len(bifrostReq.Input.Contents) == 0 {
-		return nil, fmt.Errorf("bifrost request is nil or input is nil")
+		return nil, fmt.Errorf("bifrost request, input, or contents is nil/empty")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/embedding.go` around lines 153 - 155, The current
nil-check lumps three failure modes into one message; update the validation
around bifrostReq (the block testing bifrostReq, bifrostReq.Input, and
len(bifrostReq.Input.Contents)) to return distinct, clear errors for each case
(e.g., "bifrost request is nil", "bifrost request input is nil", and "bifrost
request input contents is empty") so callers can tell whether Input exists but
Contents is empty; adjust the fmt.Errorf calls accordingly where bifrostReq and
bifrostReq.Input.Contents are validated.
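The distinct-error validation suggested above can be sketched as follows. The struct types here are simplified stand-ins for the actual schemas types, just to show the shape of the check:

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the real request/input schemas.
type embeddingInput struct{ Contents []string }
type bifrostRequest struct{ Input *embeddingInput }

// validateEmbeddingRequest returns a distinct error for each failure
// mode instead of one lumped "request is nil or input is nil" message.
func validateEmbeddingRequest(req *bifrostRequest) error {
	switch {
	case req == nil:
		return errors.New("bifrost request is nil")
	case req.Input == nil:
		return errors.New("bifrost request input is nil")
	case len(req.Input.Contents) == 0:
		return errors.New("bifrost request input contents is empty")
	}
	return nil
}

func main() {
	fmt.Println(validateEmbeddingRequest(&bifrostRequest{Input: &embeddingInput{}}))
	// bifrost request input contents is empty
}
```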
core/providers/vertex/embedding.go (1)

40-46: Silent failure on non-text content parts may confuse users.

When a non-text part is encountered, the function silently returns nil. Since the doc comment states "All contents must be text-only," a logged warning or a clearer signal would help users understand why their multimodal input failed on the text embedding endpoint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vertex/embedding.go` around lines 40 - 46, The loop that
builds sb silently returns nil when encountering a non-text part (checking
part.Type != schemas.EmbeddingContentPartTypeText || part.Text == nil), which
hides why embedding failed; update this in the function that assembles content
(the loop using sb and iterating over content/part) to log a warning including
the offending part.Type (or part index) and return an explicit error or clear
failure value instead of nil so callers get a clear signal that non-text content
was rejected; use the package logger or existing request logger to emit the
warning and replace the silent return with a descriptive error return.
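The explicit-error behavior suggested for the Vertex text path can be sketched like this; `part` is a simplified stand-in for `schemas.EmbeddingContentPart`, and the error wording is illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// part is a simplified stand-in for schemas.EmbeddingContentPart.
type part struct {
	Type string
	Text *string
}

// textOnlyContent returns a descriptive error for non-text parts
// instead of silently returning an empty result.
func textOnlyContent(parts []part) (string, error) {
	var sb strings.Builder
	for i, p := range parts {
		if p.Type != "text" || p.Text == nil {
			return "", fmt.Errorf("content part %d has type %q: the text embedding endpoint accepts text-only content", i, p.Type)
		}
		sb.WriteString(*p.Text)
	}
	return sb.String(), nil
}

func main() {
	hello := "hello"
	out, err := textOnlyContent([]part{{Type: "text", Text: &hello}})
	fmt.Println(out, err) // hello <nil>
}
```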
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/utils.go`:
- Around line 627-669: getEmbeddingVector currently ignores bit-packed
embeddings (Binary / Ubinary) and returns "no valid embedding data found";
update getEmbeddingVector to detect and handle e.Binary and e.Ubinary similarly
to the base64 path: if either is present, treat the byte slice as little-endian
float32-packed data, verify length%4==0, iterate 4-byte chunks with
binary.LittleEndian.Uint32 and math.Float32frombits to populate and return a
[]float64, and include the same error handling as the base64 branch for bad
lengths; reference the function name getEmbeddingVector and the embedding fields
e.Binary and e.Ubinary when making the change.
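The little-endian float32 decoding described above can be sketched as below. `decodeLEFloat32` is a hypothetical helper name; the point is the byte-chunk iteration and length check that mirror the existing base64 branch:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodeLEFloat32 treats a byte slice as little-endian float32-packed
// data and expands it into []float64, validating the length first.
func decodeLEFloat32(raw []byte) ([]float64, error) {
	if len(raw)%4 != 0 {
		return nil, fmt.Errorf("packed embedding length %d is not a multiple of 4", len(raw))
	}
	out := make([]float64, 0, len(raw)/4)
	for i := 0; i < len(raw); i += 4 {
		bits := binary.LittleEndian.Uint32(raw[i : i+4])
		out = append(out, float64(math.Float32frombits(bits)))
	}
	return out, nil
}

func main() {
	// 1.0 and 2.0 encoded as little-endian float32 bytes.
	raw := []byte{0x00, 0x00, 0x80, 0x3f, 0x00, 0x00, 0x00, 0x40}
	vec, err := decodeLEFloat32(raw)
	fmt.Println(vec, err) // [1 2] <nil>
}
```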

In `@core/providers/cohere/embedding.go`:
- Around line 90-95: The converter is aliasing bifrostReq.Params.ExtraParams
into cohereReq.ExtraParams and then mutating it, causing side effects; instead,
when Params.ExtraParams is non-nil create a new map, copy each key/value into it
(e.g., make(map[string]interface{}, len(...)) and loop to assign), set
cohereReq.ExtraParams to that clone, and only mutate the clone; apply the same
cloning approach wherever the converters copy ExtraParams (the other converter
blocks referenced around the later ranges) so conversion functions remain pure
and do not modify the caller's map.
- Around line 334-367: cohereResp.Embeddings is never initialized and
hasFloat/hasBase64 are never set, causing a panic on the first append and
incorrect final assignment; fix by initializing cohereResp.Embeddings (or
embeddingData) before any appends, set hasFloat = true when encountering
emb.Float and hasBase64 = true when encountering emb.Base64 while iterating
bifrostResp.Data, append into the initialized embedding struct (e.g.,
embeddingData.Float, embeddingData.Base64, etc.), and after the loop assign
cohereResp.Embeddings = embeddingData only if hasFloat || hasBase64 (otherwise
leave nil or set appropriate type marker). Ensure you reference
CohereEmbeddingResponse, embeddingData, hasFloat, hasBase64 and the loop over
bifrostResp.Data when making the change.
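The map-cloning fix for the first item above (aliasing of `Params.ExtraParams`) can be sketched as a shallow copy; the function name is illustrative, not the repo's actual helper:

```go
package main

import "fmt"

// cloneExtraParams copies the caller's ExtraParams map before any
// mutation, so converter functions stay pure and never modify input.
func cloneExtraParams(src map[string]interface{}) map[string]interface{} {
	if src == nil {
		return nil
	}
	dst := make(map[string]interface{}, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	caller := map[string]interface{}{"input_type": "search_query"}
	req := cloneExtraParams(caller)
	delete(req, "input_type") // mutate only the clone
	fmt.Println(len(caller), len(req)) // 1 0
}
```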

In `@core/providers/gemini/embedding.go`:
- Around line 136-146: The code mutates the caller's params.ExtraParams map by
assigning it to req.ExtraParams and then deleting keys; fix this in the
embedding conversion by making a shallow copy of params.ExtraParams (handle nil)
and assign that copy to req.ExtraParams, then run schemas.SafeExtractBoolPointer
on the copied map for "documentOcr" and "audioTrackExtraction" and delete those
keys from the copy instead of the original; keep using the same helper
schemas.SafeExtractBoolPointer and set req.DocumentOCR and
req.AudioTrackExtraction accordingly.

In `@core/providers/huggingface/embedding.go`:
- Around line 98-111: The extractTextFromContent function currently concatenates
text parts without separators, collapsing boundaries (e.g., ["foo","bar"] ->
"foobar"); update extractTextFromContent to insert a delimiter (e.g., a single
space or newline) between parts when appending to the strings.Builder so each
part boundary is preserved while keeping the existing validation (check
part.Type == schemas.EmbeddingContentPartTypeText and part.Text != nil) and the
empty-content error; modify the loop in extractTextFromContent to write the
delimiter only when sb.Len() > 0 before appending the next *part.Text.
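The delimiter logic described above can be sketched as follows, using a single space as the assumed delimiter (the actual separator choice is up to the implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// joinTextParts writes a delimiter only when the builder already has
// content, so ["foo","bar"] becomes "foo bar" rather than "foobar".
func joinTextParts(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		if sb.Len() > 0 {
			sb.WriteString(" ")
		}
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinTextParts([]string{"foo", "bar"})) // foo bar
}
```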

In `@core/providers/openai/types.go`:
- Around line 53-58: The OpenAIEmbeddingInput flattening causes multiple input
parts from a single schemas.EmbeddingContent to be sent as separate upstream
inputs; update ToOpenAIEmbeddingRequest to preserve one EmbeddingContent per
OpenAI input by aggregating parts of each schemas.EmbeddingContent before
populating OpenAIEmbeddingInput.Texts and .Embeddings (and any tokenBatches
logic) so each element of request.Input.Contents maps to exactly one
OpenAIEmbeddingInput entry; locate and change the logic in
ToOpenAIEmbeddingRequest (core/providers/openai/embedding.go) that iterates
request.Input.Contents and currently appends parts into single slices to instead
combine/concat parts per content and push one OpenAIEmbeddingInput per content,
ensuring fields Text, Texts, Embedding, or Embeddings are set appropriately for
that aggregated content.
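The one-input-per-content invariant described above can be sketched with plain strings standing in for `schemas.EmbeddingContent` parts: parts are aggregated per content entry, so `len(inputs) == len(contents)` and each content maps to exactly one output embedding:

```go
package main

import (
	"fmt"
	"strings"
)

// contentsToInputs aggregates each content's parts into exactly one
// upstream input, preserving the batch dimension.
func contentsToInputs(contents [][]string) []string {
	inputs := make([]string, 0, len(contents))
	for _, parts := range contents {
		inputs = append(inputs, strings.Join(parts, " "))
	}
	return inputs
}

func main() {
	contents := [][]string{{"a", "b"}, {"c"}}
	fmt.Println(contentsToInputs(contents)) // [a b c]
}
```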

In `@core/providers/vertex/embedding.go`:
- Around line 95-105: The code mutates the input map by assigning
params.ExtraParams to req.ExtraParams and then deleting keys; change this to
create a shallow copy of params.ExtraParams and assign the copy to
req.ExtraParams, then perform the SafeExtractBoolPointer checks (for
"documentOcr" and "audioTrackExtraction") and delete those keys from the copied
map while setting req.DocumentOCR and req.AudioTrackExtraction; reference the
symbols params.ExtraParams, req.ExtraParams, schemas.SafeExtractBoolPointer,
req.DocumentOCR, and req.AudioTrackExtraction to locate and update the logic so
the original params.ExtraParams remains unmodified.

In `@framework/logstore/migrations.go`:
- Around line 2413-2453: The migration only reads input_history[0], dropping
subsequent legacy inputs; update the SQL in migrations.go (the UPDATE that sets
embedding_input for object_type='embedding') to iterate over all entries in
input_history instead of hard-coding the first element: for Postgres, replace
(input_history::jsonb -> 0 -> 'content' -> 'content_blocks') with a full
expansion like jsonb_array_elements(input_history::jsonb) -> 'content' ->
'content_blocks' (or nested jsonb_array_elements twice) so you aggregate text
blocks from every history item before jsonb_agg into embedding_input; for
SQLite, change json_extract(input_history, '$[0].content.content_blocks') and
its json_each to iterate over all array entries (e.g., json_each(input_history)
then json_each(json_extract(..., '$.content.content_blocks')) ) so
json_group_array collects blocks from every history entry while keeping the
existing filters and embedding_input/null checks. Ensure the aggregate still
builds objects with type='text' and non-empty text fields.

In `@plugins/logging/main.go`:
- Around line 207-208: The EmbeddingInput field currently stores full
schemas.EmbeddingContent (including Image/Audio/File/Video.Data) which can
contain large/base64 or sensitive blobs; before assigning to EmbeddingInput, run
the same truncation/redaction utility used for transcription and image inputs
(the threshold-based redactor used elsewhere) to strip or truncate raw Data
fields and large payloads, and assign the sanitized result to EmbeddingInput;
update all assignment sites including the other occurrences noted around the
block at lines 480-483 to use the redaction helper so logs never receive full
binary/base64 content.
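The threshold-based redaction referred to above can be sketched like this; the 256-byte threshold and the truncation message are illustrative, not the plugin's actual values:

```go
package main

import "fmt"

const maxLoggedDataBytes = 256 // assumed threshold for illustration

// redactData truncates large payloads (e.g., base64 image/audio Data)
// before they reach the log store, keeping a short marker instead.
func redactData(data string) string {
	if len(data) <= maxLoggedDataBytes {
		return data
	}
	return fmt.Sprintf("%s...(%d bytes truncated)", data[:maxLoggedDataBytes], len(data)-maxLoggedDataBytes)
}

func main() {
	big := make([]byte, 1000)
	for i := range big {
		big[i] = 'A'
	}
	fmt.Println(len(redactData(string(big))) < 1000) // true
}
```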

In `@tests/integrations/python/tests/test_cohere.py`:
- Line 50: The parameter annotation for assert_valid_cohere_embedding_response
uses an implicit optional (expected_dimensions: int = None); change it to an
explicit optional type (expected_dimensions: int | None = None or
expected_dimensions: Optional[int] = None) and, if using Optional, add the
necessary typing import; update the function signature and any related type
hints accordingly.

In `@ui/app/workspace/logs/sheets/logDetailView.tsx`:
- Around line 1016-1022: The current branch (inside the isEmbedding case)
flattens log.embedding_input into a texts array and drops non-text parts;
instead, preserve full EmbeddingContent semantics by assigning the original
embedding_input structure to requestBody.input when log.embedding_input exists
(i.e., requestBody.input = log.embedding_input) so each EmbeddingContent maps to
one output embedding; keep the existing text-only extraction and fallback only
when embedding_input is absent or empty (use the map/filter/flat logic only for
older logs that lack embedding_input) and ensure any type expectations for
requestBody.input are respected when switching to the full embedding_input
structure.

---

Outside diff comments:
In `@core/internal/llmtests/response_validation.go`:
- Around line 1324-1348: The presence check currently accepts
embedding.Embedding.Base64 as valid data but the dimensions-check logic
(iterating response.Data and computing actualDimensions from e.Float, e.Int8,
e.Uint8, e.Binary, e.Ubinary) never handles Base64, causing Base64-only
embeddings to report dimension 0 and fail expected_dimensions; update the
dimension path to detect e.Base64, decode the base64 payload to raw bytes and
set actualDimensions = len(decodedBytes) (using standard base64 decode), and
update any other dimension/metrics computation that reads actualDimensions (the
loop over response.Data and the metrics block that computes embedding
dimensions) to use the decoded length so Base64-only embeddings are validated
correctly; alternatively, if you prefer not to decode here, remove Base64 from
the initial hasData check in the embedding presence check
(embedding.Embedding.Base64) so Base64-only embeddings are not treated as having
data.

In `@core/providers/gemini/types.go`:
- Around line 1255-1279: The outer aliasing in PartAlias doesn't handle
snake_case keys nested inside file_data because FileData's struct tags are
camelCase-only; implement a custom UnmarshalJSON on the FileData type that
defines an internal alias with both camelCase and snake_case json tags (e.g.,
DisplayName/display_name, FileURI/file_uri, MIMEType/mime_type), unmarshal into
that aux, and then populate FileData's fields by preferring the camelCase values
and falling back to the snake_case values when empty so that p.FileData (and any
code using FileData) receives the actual nested values.

In `@core/providers/vertex/vertex.go`:
- Around line 1520-1548: The Vertex embedding branch skips the common raw
request/response capture so bifrostResponse.ExtraFields.RawRequest isn't
populated when provider.sendBackRawRequest is enabled; restore that capture
before converting to BifrostEmbeddingResponse by invoking the same handling used
elsewhere (e.g., call providerUtils.HandleProviderResponse or otherwise
unmarshal jsonBody into a map and set bifrostResponse.ExtraFields.RawRequest) in
both branches (the gemini.GeminiEmbeddingResponse path and the
VertexEmbeddingResponse path) using the existing flags
provider.sendBackRawRequest/provider.sendBackRawResponse and functions
providerUtils.ShouldSendBackRawResponse and providerUtils.EnrichError to keep
error handling consistent. Ensure the RawRequest is set prior to assigning
bifrostResponse.Model/ExtraFields so the typed unmarshalling
(VertexEmbeddingResponse.ToBifrostEmbeddingResponse /
gemini.ToBifrostEmbeddingResponse) remains unchanged.

In `@ui/app/workspace/logs/sheets/logDetailView.tsx`:
- Around line 952-953: The isEmbedding discriminator currently checks log.object
=== "list" but the render path uses "embedding", so update the condition in
logDetailView.tsx (the isEmbedding variable) to match the render path (e.g.,
check log.object === "embedding" or accept both "embedding" and "list" if
backward compatibility is needed) so the "Copy request body" flow enters the
embedding branch; locate and update the isEmbedding declaration and any
dependent branches that gate the embedding copy behavior to use the corrected
discriminator.

---

Duplicate comments:
In `@core/providers/bedrock/embedding.go`:
- Around line 12-40: The converter ToBedrockTitanEmbeddingRequest currently
concatenates all entries in bifrostReq.Input.Contents into one
BedrockTitanEmbeddingRequest.InputText, collapsing the batch dimension; change
it to require exactly one content item (error if len(bifrostReq.Input.Contents)
!= 1) and only iterate over that single content's parts to build InputText,
preserving the 1:1 request/response shape; keep the existing validation for part
types (schemas.EmbeddingContentPartTypeText) and the
BedrockTitanEmbeddingRequest construction unchanged.
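The exactly-one-content guard suggested above can be sketched with plain strings standing in for the schemas types; the error wording is illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// titanInputText rejects batched contents instead of silently
// concatenating them, since a Titan request carries one InputText.
func titanInputText(contents [][]string) (string, error) {
	if len(contents) != 1 {
		return "", fmt.Errorf("bedrock titan embeddings accept exactly one content per request, got %d", len(contents))
	}
	return strings.Join(contents[0], " "), nil
}

func main() {
	text, err := titanInputText([][]string{{"hello", "world"}})
	fmt.Println(text, err) // hello world <nil>
}
```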

In `@core/providers/cohere/embedding.go`:
- Around line 195-200: The code in embedding.go silently drops additional images
by only converting req.Images[0]; update the handler that checks req.Images (the
case starting with "case len(req.Images) > 0") to explicitly reject requests
where len(req.Images) > 1 by returning an error (or appropriate response)
instead of proceeding, and keep the existing behavior for exactly one image by
populating bifrostReq.Input.Contents with a single schemas.EmbeddingContent
entry using schemas.EmbeddingContentPartTypeImage and
schemas.EmbeddingMediaPart{URL: &url}; ensure the error message clearly states
that multiple images are unsupported and references req.Images so callers
understand the validation.

In `@plugins/semanticcache/utils.go`:
- Around line 450-459: After collecting textParts from
req.EmbeddingRequest.Input.Contents (the loop that checks part.Type ==
schemas.EmbeddingContentPartTypeText), detect when len(textParts) == 0 and fail
fast instead of returning an empty string: return a sentinel error (e.g.,
ErrNoTextParts) or the existing "skip cache" sentinel used by chat/responses,
along with metadataHash, so callers can bypass semantic caching; if
ErrNoTextParts doesn't exist, add it as a package-level var and use it here.

In `@tests/integrations/python/tests/test_cohere.py`:
- Around line 159-196: Both tests (test_06_custom_dimensions_embedding and
test_07_multiple_embedding_types) run for all providers but rely on Cohere-only
features; add capability checks and skip when the current provider/model doesn't
support them. Use the existing test helpers (e.g., get_provider_cohere_client,
format_provider_model, get_cross_provider_params_for_scenario) to query provider
capabilities and gate: in test_06 check for a capability like support for
output_dimension/custom_dimensions (or a provider capability flag for 512-d
embeddings) and call pytest.skip if missing; in test_07 check for support of
multiple embedding types / int8 embeddings and skip if not present. Ensure the
checks run before creating the client so unsupported providers are skipped
rather than failing.

---

Nitpick comments:
In `@core/providers/fireworks/fireworks_test.go`:
- Around line 191-193: The test currently takes the address of a temporary
string (variable text) when constructing the Contents slice of
schemas.EmbeddingContent; replace that pattern by using the repository pointer
helper bifrost.Ptr to create the *string instead. Update the EmbeddingContent
construction (the Contents: []schemas.EmbeddingContent{...} entry where Type is
schemas.EmbeddingContentPartTypeText) to call bifrost.Ptr(yourText) rather than
&text so the pointer creation is consistent with other tests and utilities.

In `@core/providers/gemini/embedding.go`:
- Around line 386-394: The conversion loop using geminiContentToEmbeddingContent
silently swallows errors by returning nil; update the converter to propagate or
log conversion failures: modify the surrounding function that builds bifrostReq
(reference bifrostReq, geminiContentToEmbeddingContent and
applyBifrostEmbeddingParams) to return an error (or at minimum log the error via
the existing logger) when geminiContentToEmbeddingContent fails, and ensure
callers handle the returned error instead of getting a nil result; apply the
same change to the analogous block at lines 400-403 so errors are not lost.
- Around line 153-155: The current nil-check lumps three failure modes into one
message; update the validation around bifrostReq (the block testing bifrostReq,
bifrostReq.Input, and len(bifrostReq.Input.Contents)) to return distinct, clear
errors for each case (e.g., "bifrost request is nil", "bifrost request input is
nil", and "bifrost request input contents is empty") so callers can tell whether
Input exists but Contents is empty; adjust the fmt.Errorf calls accordingly
where bifrostReq and bifrostReq.Input.Contents are validated.

In `@core/providers/vertex/embedding.go`:
- Around line 40-46: The loop that builds sb silently returns nil when
encountering a non-text part (checking part.Type !=
schemas.EmbeddingContentPartTypeText || part.Text == nil), which hides why
embedding failed; update this in the function that assembles content (the loop
using sb and iterating over content/part) to log a warning including the
offending part.Type (or part index) and return an explicit error or clear
failure value instead of nil so callers get a clear signal that non-text content
was rejected; use the package logger or existing request logger to emit the
warning and replace the silent return with a descriptive error return.

In `@plugins/semanticcache/test_utils.go`:
- Around line 557-559: Replace the direct address operator usage (&t) when
populating schemas.EmbeddingContent in the loop over texts with the
repo-standard pointer helper bifrost.Ptr(t); specifically update the assignment
that currently sets Text: &t to Text: bifrost.Ptr(t) and add an import for the
bifrost package if it's not already present so the test_utils.go loop that
builds contents uses bifrost.Ptr consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 51b97aa5-0857-43ae-ad75-271ffc131093

📥 Commits

Reviewing files that changed from the base of the PR and between d148be8 and 6f29fb7.

📒 Files selected for processing (50)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/embedding.go
  • core/internal/llmtests/embedding_multimodal.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/tests.go
  • core/internal/llmtests/utils.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/embedding.go
  • core/providers/bedrock/invoke.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/embedding.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/cohere/types.go
  • core/providers/fireworks/fireworks_test.go
  • core/providers/gemini/embedding.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/huggingface/embedding.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/openai/embedding.go
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/providers/vertex/types.go
  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/embedding.go
  • core/schemas/embedding_multimodal_test.go
  • core/schemas/serialization_test.go
  • core/schemas/trace.go
  • framework/logstore/migrations.go
  • framework/logstore/tables.go
  • framework/tracing/llmspan.go
  • plugins/logging/main.go
  • plugins/logging/utils.go
  • plugins/logging/writer.go
  • plugins/semanticcache/go.mod
  • plugins/semanticcache/plugin_core_test.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/utils.go
  • tests/integrations/python/config.yml
  • tests/integrations/python/tests/test_cohere.py
  • tests/integrations/python/tests/test_google.py
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/cohere.go
  • ui/app/workspace/logs/sheets/logDetailView.tsx
💤 Files with no reviewable changes (1)
  • plugins/semanticcache/plugin_core_test.go
✅ Files skipped from review due to trivial changes (6)
  • core/schemas/embedding_multimodal_test.go
  • core/providers/openai/openai.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/openai/embedding.go
🚧 Files skipped from review as they are similar to previous changes (14)
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/types.go
  • core/internal/llmtests/embedding.go
  • core/providers/gemini/gemini.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/trace.go
  • core/providers/gemini/gemini_test.go
  • transports/bifrost-http/handlers/inference.go
  • tests/integrations/python/config.yml
  • core/internal/llmtests/account.go
  • tests/integrations/python/tests/test_google.py
  • core/schemas/embedding.go
  • framework/tracing/llmspan.go
  • core/providers/vertex/types.go

Comment on lines +2413 to +2453
return tx.Exec(`
UPDATE logs
SET embedding_input = (
SELECT jsonb_agg(
jsonb_build_array(
jsonb_build_object('type', 'text', 'text', block->>'text')
)
)
FROM jsonb_array_elements(
(input_history::jsonb -> 0 -> 'content' -> 'content_blocks')
) AS block
WHERE block->>'type' = 'text'
AND (block->>'text') IS NOT NULL
AND (block->>'text') != ''
)
WHERE object_type = 'embedding'
AND input_history IS NOT NULL
AND input_history NOT IN ('', '[]', 'null')
AND embedding_input IS NULL
`).Error
case "sqlite":
return tx.Exec(`
UPDATE logs
SET embedding_input = (
SELECT json_group_array(
json_array(
json_object('type', 'text', 'text', json_extract(block.value, '$.text'))
)
)
FROM json_each(
json_extract(input_history, '$[0].content.content_blocks')
) AS block
WHERE json_extract(block.value, '$.type') = 'text'
AND json_extract(block.value, '$.text') IS NOT NULL
AND json_extract(block.value, '$.text') != ''
)
WHERE object_type = 'embedding'
AND input_history IS NOT NULL
AND input_history NOT IN ('', '[]', 'null')
AND embedding_input IS NULL
`).Error

⚠️ Potential issue | 🟠 Major

This backfill only preserves the first legacy embedding input.

Line 2422 and Line 2443 hard-code input_history[0], so historical rows with more than one legacy embedding input get truncated during migration. That breaks the backward-compatibility story for old plural embedding requests by silently dropping every entry after the first.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/logstore/migrations.go` around lines 2413 - 2453, The migration
only reads input_history[0], dropping subsequent legacy inputs; update the SQL
in migrations.go (the UPDATE that sets embedding_input for
object_type='embedding') to iterate over all entries in input_history instead of
hard-coding the first element: for Postgres, replace (input_history::jsonb -> 0
-> 'content' -> 'content_blocks') with a full expansion like
jsonb_array_elements(input_history::jsonb) -> 'content' -> 'content_blocks' (or
nested jsonb_array_elements twice) so you aggregate text blocks from every
history item before jsonb_agg into embedding_input; for SQLite, change
json_extract(input_history, '$[0].content.content_blocks') and its json_each to
iterate over all array entries (e.g., json_each(input_history) then
json_each(json_extract(..., '$.content.content_blocks')) ) so json_group_array
collects blocks from every history entry while keeping the existing filters and
embedding_input/null checks. Ensure the aggregate still builds objects with
type='text' and non-empty text fields.

@TejasGhatte TejasGhatte force-pushed the 04-03-feat_multimodal_embeddings branch from 6f29fb7 to 4f21ee8 Compare April 10, 2026 08:10

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/bedrock/invoke.go (1)

1002-1019: ⚠️ Potential issue | 🟠 Major

Preserve non-float embedding types in the invoke response shim.

This path always reads d.Embedding.Float and always emits response_type: "embeddings_floats", but the request path still accepts embedding_types. If the provider returns Int8, Uint8, Binary, Ubinary, or Base64, the compat response will come back empty or mislabeled instead of matching the requested type.

At minimum, branch on the populated EmbeddingsByType field and emit the matching native envelope, or fail loudly instead of silently flattening everything to float.

Based on learnings: In maximhq/bifrost, the active embedding representation is EmbeddingsByType with Float, Int8, Uint8, Binary, Ubinary, and Base64 fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 1002 - 1019, The shim always
reads d.Embedding.Float and emits a float-only envelope
(BedrockInvokeCohereEmbeddingResp with ResponseType "embeddings_floats") even
when the provider populated other types; update the embedding response
construction in the Titan/resp.Data path to inspect EmbeddingsByType (e.g.,
check resp.Data[*].Embedding.EmbeddingsByType and its Float, Int8, Uint8,
Binary, Ubinary, Base64 fields) and branch accordingly to return the matching
native envelope (preserving Int8/Uint8/Binary/Ubinary/Base64 formats) or return
an explicit error instead of always flattening to float; adjust the code paths
that currently reference d.Embedding.Float and resp.Data[0].Embedding.Float to
use the correct EmbeddingsByType fields and set ResponseType to the correct
native type when building BedrockInvokeCohereEmbeddingResp or
BedrockInvokeEmbeddingResp.
ui/app/workspace/logs/sheets/logDetailView.tsx (1)

947-953: ⚠️ Potential issue | 🟠 Major

Use the embedding object key in the copy-request-body path.

Line 952 still derives isEmbedding from "list", but this file renders embedding inputs behind log.object === "embedding". As written, real embedding logs will skip the new log.embedding_input copy branch, while list-model logs can be misclassified instead.

Suggested fix
-		const isEmbedding = log.object === "list";
+		const isEmbedding = log.object === "embedding";

Also applies to: 1016-1018

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailView.tsx` around lines 947 - 953, The
isEmbedding detection currently derives from log.object === "list", causing real
embedding logs to be missed and list-model logs misclassified; update the
isEmbedding checks (the variable named isEmbedding and any similar checks around
the copy-request-body branch that reference log.object) to use log.object ===
"embedding" (and include the chunk variant if needed, e.g., "embedding.chunk")
so the copy button uses log.embedding_input for true embedding logs; apply the
same replacement at the other occurrence near the copy-request-body logic (the
block around the second isEmbedding declaration).
core/providers/gemini/types.go (1)

1255-1279: ⚠️ Potential issue | 🟠 Major

file_data snake_case support is only half-implemented.

This new fallback recognizes top-level file_data, but it still unmarshals into FileData, whose fields only accept fileUri / mimeType / displayName. A Python-SDK payload like file_data: {"file_uri":"...","mime_type":"application/pdf"} will therefore produce a non-nil FileData with empty fields, so file-based multimodal inputs still fail after this branch. Please add snake_case handling for the nested FileData object as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/types.go` around lines 1255 - 1279, The code only falls
back to top-level file_data into aux.FileDataSnake but does not translate nested
snake_case fields into the canonical FileData, so Python-style payloads like
file_data: {"file_uri": "...", "mime_type": "..."} end up as empty fields.
Update the unmarshal/assignment logic for PartAlias/FileData handling: add a
FileDataSnake struct (or ensure aux.FileDataSnake exists) that defines
snake_case fields (file_uri, mime_type, display_name), then after
sonic.Unmarshal map those snake_case fields into the canonical FileData fields
(e.g., p.FileData.FileUri = aux.FileDataSnake.FileUri, p.FileData.MimeType =
aux.FileDataSnake.MimeType, p.FileData.DisplayName =
aux.FileDataSnake.DisplayName) when p.FileData is nil so nested snake_case
payloads are correctly populated; keep references to PartAlias, FileData,
FileDataSnake, p.FileData and sonic.Unmarshal to locate where to modify the
code.
🧹 Nitpick comments (1)
plugins/semanticcache/test_utils.go (1)

555-562: Use bifrost.Ptr(...) for pointer creation consistency.

At Line 559, pointer creation can follow repo convention for readability and consistency.

♻️ Proposed refactor
 		Input: func() *schemas.EmbeddingInput {
 			contents := make([]schemas.EmbeddingContent, len(texts))
 			for i, text := range texts {
-				t := text
-				contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: &t}}
+				contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: bifrost.Ptr(text)}}
 			}
 			return &schemas.EmbeddingInput{Contents: contents}
 		}(),

Based on learnings: In maximhq/bifrost, prefer using bifrost.Ptr() instead of &value for pointer construction consistency across code paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/semanticcache/test_utils.go` around lines 555 - 562, Replace direct
address-of usages when building the EmbeddingInput contents with the repo
helper: in the anonymous Input func that returns *schemas.EmbeddingInput (the
block creating contents := make([]schemas.EmbeddingContent, len(texts)) and
assigning contents[i] = schemas.EmbeddingContent{{Type:
schemas.EmbeddingContentPartTypeText, Text: &t}}), use bifrost.Ptr(t) instead of
&t so the Text pointer is created via bifrost.Ptr(...) for consistency with
project conventions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/embedding_multimodal.go`:
- Around line 107-110: The skip logic uses vertexNoBatch := testConfig.Provider
== schemas.Vertex which incorrectly prevents BatchImages/BatchMultimodal tests
for all Vertex configs; change the guard to target only the single-content
Gemini embedding route (e.g., check the model or a single-content flag instead
of Provider). Update vertexNoBatch to something like: vertexNoBatch :=
testConfig.Provider == schemas.Vertex && (testConfig.Model ==
"multimodalembedding@001" || testConfig.SingleContentGemini == true), and apply
the same change at the other occurrence (around the 208-210 block) so batch
image coverage still runs for Vertex multimodal models that support per-input
batching.

In `@core/internal/llmtests/response_validation.go`:
- Around line 1324-1326: The dimension-check logic incorrectly treats Base64
embeddings as empty because it manually inspects embedding.Embedding fields;
replace those manual checks with the existing getEmbeddingVector helper (from
core/internal/llmtests/utils.go) to extract the vector and then test len(vector)
> 0 and use that length for expected_dimensions; update all occurrences where
similar manual checks are used (the current hasData/embedding dimension
branches, and the other similar blocks noted) to call
getEmbeddingVector(embedding) instead of inspecting Base64/Binary/Int8/Uint8
fields directly so Base64 is counted and Binary/Ubinary remain intentionally
skipped.

In `@core/providers/bedrock/invoke.go`:
- Around line 397-409: The embedding request builder currently only sets
req.Input.Contents for r.InputText and r.Texts, leaving req.Input nil for
image-only or mixed requests; update the code in the embedding path that creates
req.Input so it also iterates r.Images and r.Inputs and appends corresponding
schemas.EmbeddingContent entries (using the proper EmbeddingContentPartType for
images/inputs and assigning the image URL or input payload pointers) so
req.Input is always non-nil when any of r.InputText, r.Texts, r.Images, or
r.Inputs are present; modify the section that constructs contents (the block
that sets contents := make([]schemas.EmbeddingContent, ...)) to include entries
for r.Images and r.Inputs and then set req.Input =
&schemas.EmbeddingInput{Contents: contents}.

In `@core/providers/gemini/embedding.go`:
- Around line 247-254: The current usage accounting reads only
embeddings[0].Statistics.TokenCount; update the logic in embedding.go (around
where geminiResp, embeddings and bifrostResp.Usage are handled) to sum
TokenCount across all embeddings that have Statistics (e.g., iterate embeddings
and accumulate Statistics.TokenCount when non-nil), and only if no per-embedding
statistics exist (sum == 0) fall back to
geminiResp.Metadata.BillableCharacterCount; then assign that summed value to
bifrostResp.Usage.PromptTokens and bifrostResp.Usage.TotalTokens so batched
embeddings are correctly accounted for.
- Around line 383-391: The loop currently overwrites bifrostReq.Params per item
so only the last item's embedding params survive; instead compute the params for
the first item (using applyBifrostEmbeddingParams) as the baseline, then for
each subsequent request compute its params and compare to the baseline; if any
mismatch is found, return an error (reject the batch) or only set
bifrostReq.Params if all items' params are identical. Reference the
request.Requests loop, geminiContentToEmbeddingContent,
applyBifrostEmbeddingParams, and bifrostReq.Params when locating where to
implement the baseline computation, per-item comparison, and early rejection.

In `@core/providers/vertex/embedding.go`:
- Around line 136-167: The loop that builds each multimodal instance currently
overwrites earlier parts (instance.Text, instance.Image, instance.Video) instead
of aggregating; change it to concatenate text parts (e.g., join successive
schemas.EmbeddingContentPartTypeText parts into instance.Text with a separator
like "\n\n") and treat image/video parts deterministically: if you want to
reject duplicates, return an error when instance.Image or instance.Video is
already set (with a clear message referencing
VertexMultimodalImageInput/VertexMultimodalVideoInput), or implement an explicit
merge strategy (e.g., collect multiple inputs into slices) rather than assigning
over them; update the switch handling of schemas.EmbeddingContentPartTypeImage
and ...Video (including use of extractBase64FromDataURI) to check for existing
instance.Image/instance.Video before assigning, and ensure the final instances =
append(instances, instance) receives the aggregated instance.
- Around line 212-237: The current converter emits multiple EmbeddingData
entries per `prediction` (from `prediction.TextEmbedding`,
`prediction.ImageEmbedding`, and `prediction.VideoEmbeddings`), breaking the 1
content → 1 embedding contract; update the code that builds `embeddings` (the
block creating schemas.EmbeddingData from `prediction`) so it produces exactly
one EmbeddingData per `prediction`: either (A) merge modality vectors into a
single vector (e.g., concatenate or compute a fixed-size average) and set that
as the single `Embedding` for the `EmbeddingData`, or (B) pick a single modality
by priority (e.g., prefer `TextEmbedding` then `ImageEmbedding` then the first
`VideoEmbeddings` entry) and use only that vector; implement the chosen strategy
where the current multiple appends occur (refer to `prediction.TextEmbedding`,
`prediction.ImageEmbedding`, `prediction.VideoEmbeddings`, `embeddings`, and
`schemas.EmbeddingData`) and ensure `idx` is incremented exactly once per
`prediction`; alternatively, if you opt to reject mixed-modality responses,
return an explicit error upstream when more than one modality is present instead
of appending multiple rows.

In `@core/providers/vertex/vertex.go`:
- Around line 1522-1540: After converting provider responses to bifrostResponse,
guard against nil before populating ExtraFields: check if bifrostResponse == nil
after calling gemini.ToBifrostEmbeddingResponse(...) and after
vertexResponse.ToBifrostEmbeddingResponse(), and if nil return a provider error
via providerUtils.EnrichError(providerUtils.NewBifrostOperationError(...)) (use
an appropriate schemas.ErrProviderResponse* code for "no embeddings/malformed
response") passing ctx, jsonBody, responseBody, provider.sendBackRawRequest and
provider.sendBackRawResponse so you don't dereference
bifrostResponse.ExtraFields on malformed-but-200 responses.

In `@core/schemas/embedding.go`:
- Around line 66-80: The EmbeddingMediaPart.Validate currently only checks for
non-nil Data/URL and allows empty payloads; update EmbeddingMediaPart.Validate
to treat empty values as unset by verifying that Data and URL are not only
non-nil but also contain non-zero length (e.g., check len of the underlying
string/byte slice pointed to by Data and the string pointed to by URL) and only
increment set when they are non-empty; if neither is non-empty or more than one
is set, return the existing error about exactly one of data or url being set (or
a similar clear error) so blank media payloads are rejected centrally.
- Around line 29-31: The EmbeddingInput struct's Contents field lacks a JSON tag
and therefore marshals as "Contents" instead of the expected snake_case
"contents"; update the EmbeddingInput definition by adding a JSON struct tag
(e.g., `json:"contents"` or `json:"contents,omitempty"`) to the Contents field
so it serializes as "contents" consistently across the schema (reference the
EmbeddingInput type and its Contents field).

In `@framework/logstore/tables.go`:
- Line 186: The EmbeddingInputParsed field is typed as schemas.EmbeddingContent
which doesn't exist; update the type to the actual exported embedding-content
type from core/schemas and adjust the import so the file uses that package
(e.g., import core/schemas package and replace schemas.EmbeddingContent with the
exported name from core/schemas). Ensure the field declaration
EmbeddingInputParsed []<CorrectExportedType> `gorm:"-"
json:"embedding_input,omitempty"` uses the exact exported identifier and that
any references to schemas in this file are updated to the correct package alias.

In `@plugins/semanticcache/utils.go`:
- Around line 450-459: The loop that builds textParts from
req.EmbeddingRequest.Input.Contents currently returns the raw joined string,
skipping normalizeText() and causing inconsistent caching; update this path to
normalize the text the same way as getNormalizedInputForCaching() by applying
normalizeText() (either to each *part.Text before appending to textParts or to
the final strings.Join(textParts, " ")) so the returned string is the normalized
value used for hashing and embedding lookup.

In `@tests/integrations/python/tests/test_google.py`:
- Around line 25-27: The module docstring's numbered list has drifted (14 → 45 →
46 → 15); update the two inserted items titled "Multimodal embedding - text +
image content (Gemini/Vertex gemini-embedding-2-preview)" and "Multimodal
embedding - batch contents (Gemini/Vertex gemini-embedding-2-preview)" to follow
the sequence (change 45 → 15 and 46 → 16) and then adjust any subsequent list
entries (e.g., the existing "15. List models") so numbering remains monotonic
(increment following items as needed).

In `@transports/bifrost-http/handlers/inference.go`:
- Around line 1085-1089: Remove the early transport-side validation that checks
req.Input == nil || len(req.Input.Contents) == 0 and the call to
req.Input.Validate() in the inference HTTP handler; this validation is owned by
core/bifrost.go which implements the large-payload passthrough exception. Either
delete these two checks from the top of the handler or, if you must validate at
transport layer, move the checks to after the Bifrost context conversion step
and mirror the core passthrough logic (allow large/multimodal inputs through
when the Bifrost context indicates passthrough) so the handler forwards inputs
as-is and lets provider/core rejections propagate.

---

Outside diff comments:
In `@core/providers/bedrock/invoke.go`:
- Around line 1002-1019: The shim always reads d.Embedding.Float and emits a
float-only envelope (BedrockInvokeCohereEmbeddingResp with ResponseType
"embeddings_floats") even when the provider populated other types; update the
embedding response construction in the Titan/resp.Data path to inspect
EmbeddingsByType (e.g., check resp.Data[*].Embedding.EmbeddingsByType and its
Float, Int8, Uint8, Binary, Ubinary, Base64 fields) and branch accordingly to
return the matching native envelope (preserving Int8/Uint8/Binary/Ubinary/Base64
formats) or return an explicit error instead of always flattening to float;
adjust the code paths that currently reference d.Embedding.Float and
resp.Data[0].Embedding.Float to use the correct EmbeddingsByType fields and set
ResponseType to the correct native type when building
BedrockInvokeCohereEmbeddingResp or BedrockInvokeEmbeddingResp.

In `@core/providers/gemini/types.go`:
- Around line 1255-1279: The code only falls back to top-level file_data into
aux.FileDataSnake but does not translate nested snake_case fields into the
canonical FileData, so Python-style payloads like file_data: {"file_uri": "...",
"mime_type": "..."} end up as empty fields. Update the unmarshal/assignment
logic for PartAlias/FileData handling: add a FileDataSnake struct (or ensure
aux.FileDataSnake exists) that defines snake_case fields (file_uri, mime_type,
display_name), then after sonic.Unmarshal map those snake_case fields into the
canonical FileData fields (e.g., p.FileData.FileUri = aux.FileDataSnake.FileUri,
p.FileData.MimeType = aux.FileDataSnake.MimeType, p.FileData.DisplayName =
aux.FileDataSnake.DisplayName) when p.FileData is nil so nested snake_case
payloads are correctly populated; keep references to PartAlias, FileData,
FileDataSnake, p.FileData and sonic.Unmarshal to locate where to modify the
code.

In `@ui/app/workspace/logs/sheets/logDetailView.tsx`:
- Around line 947-953: The isEmbedding detection currently derives from
log.object === "list", causing real embedding logs to be missed and list-model
logs misclassified; update the isEmbedding checks (the variable named
isEmbedding and any similar checks around the copy-request-body branch that
reference log.object) to use log.object === "embedding" (and include the chunk
variant if needed, e.g., "embedding.chunk") so the copy button uses
log.embedding_input for true embedding logs; apply the same replacement at the
other occurrence near the copy-request-body logic (the block around the second
isEmbedding declaration).

---

Nitpick comments:
In `@plugins/semanticcache/test_utils.go`:
- Around line 555-562: Replace direct address-of usages when building the
EmbeddingInput contents with the repo helper: in the anonymous Input func that
returns *schemas.EmbeddingInput (the block creating contents :=
make([]schemas.EmbeddingContent, len(texts)) and assigning contents[i] =
schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text:
&t}}), use bifrost.Ptr(t) instead of &t so the Text pointer is created via
bifrost.Ptr(...) for consistency with project conventions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ad84d434-521e-4a4d-8678-782f164b7906

📥 Commits

Reviewing files that changed from the base of the PR and between 6f29fb7 and 4f21ee8.

📒 Files selected for processing (51)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/embedding.go
  • core/internal/llmtests/embedding_multimodal.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/tests.go
  • core/internal/llmtests/utils.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/embedding.go
  • core/providers/bedrock/invoke.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/embedding.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/cohere/types.go
  • core/providers/fireworks/fireworks_test.go
  • core/providers/gemini/embedding.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/huggingface/embedding.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/openai/embedding.go
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/providers/vertex/types.go
  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/embedding.go
  • core/schemas/embedding_multimodal_test.go
  • core/schemas/serialization_test.go
  • core/schemas/trace.go
  • framework/logstore/migrations.go
  • framework/logstore/tables.go
  • framework/tracing/llmspan.go
  • plugins/logging/main.go
  • plugins/logging/utils.go
  • plugins/logging/writer.go
  • plugins/semanticcache/go.mod
  • plugins/semanticcache/plugin_core_test.go
  • plugins/semanticcache/test_utils.go
  • plugins/semanticcache/utils.go
  • tests/integrations/python/config.yml
  • tests/integrations/python/tests/test_cohere.py
  • tests/integrations/python/tests/test_google.py
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/cohere.go
  • ui/app/workspace/logs/sheets/logDetailView.tsx
  • ui/lib/types/logs.ts
💤 Files with no reviewable changes (1)
  • plugins/semanticcache/plugin_core_test.go
✅ Files skipped from review due to trivial changes (12)
  • core/schemas/embedding_multimodal_test.go
  • core/providers/azure/azure.go
  • core/providers/cohere/types.go
  • core/providers/vertex/embedding_multimodal_test.go
  • framework/logstore/migrations.go
  • plugins/semanticcache/go.mod
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/gemini/gemini.go
  • tests/integrations/python/tests/test_cohere.py
  • core/providers/gemini/gemini_test.go
  • core/bifrost.go
🚧 Files skipped from review as they are similar to previous changes (16)
  • plugins/logging/writer.go
  • core/internal/llmtests/embedding.go
  • core/internal/llmtests/tests.go
  • core/internal/llmtests/utils.go
  • plugins/logging/main.go
  • core/providers/cohere/cohere_test.go
  • core/providers/vertex/vertex_test.go
  • core/internal/llmtests/account.go
  • core/providers/fireworks/fireworks_test.go
  • tests/integrations/python/config.yml
  • core/providers/huggingface/huggingface_test.go
  • core/providers/openai/embedding.go
  • core/schemas/serialization_test.go
  • core/providers/cohere/embedding.go
  • core/providers/openai/openai.go
  • core/providers/bedrock/embedding.go

Comment thread core/internal/llmtests/embedding_multimodal.go Outdated
Comment thread core/internal/llmtests/response_validation.go
Comment on lines +397 to +409
inputText := r.InputText
req.Input = &schemas.EmbeddingInput{
Contents: []schemas.EmbeddingContent{
{{Type: schemas.EmbeddingContentPartTypeText, Text: &inputText}},
},
}
} else if len(r.Texts) > 0 {
req.Input = &schemas.EmbeddingInput{Texts: r.Texts}
contents := make([]schemas.EmbeddingContent, len(r.Texts))
for i, t := range r.Texts {
text := t
contents[i] = schemas.EmbeddingContent{{Type: schemas.EmbeddingContentPartTypeText, Text: &text}}
}
req.Input = &schemas.EmbeddingInput{Contents: contents}
⚠️ Potential issue | 🔴 Critical

Image-only and mixed Bedrock embed requests still arrive as nil input.

DetectInvokeRequestType now routes images/inputs bodies into the embedding path, but this converter only populates req.Input.Contents for inputText and texts. For image-only or mixed Cohere requests, req.Input stays nil and the core embedding validator rejects the request before provider-specific handling ever sees ExtraParams.

Please build schemas.EmbeddingContent entries from r.Images / r.Inputs here instead of leaving those payloads only in ExtraParams.
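A minimal sketch of the shape of the fix, using stand-in types (the real `EmbeddingContent`/`EmbeddingInput` definitions live in `core/schemas`, and the part-type constants and field names here are illustrative):

```go
package main

import "fmt"

// Stand-ins for the schemas types referenced in the review.
type EmbeddingContentPart struct {
	Type  string
	Text  *string
	Image *string // e.g. a base64 data URI or URL
}

type EmbeddingContent []EmbeddingContentPart

type EmbeddingInput struct {
	Contents []EmbeddingContent
}

func ptr[T any](v T) *T { return &v }

// buildContents converts the raw invoke fields (texts + images) into
// one EmbeddingContent per input, so the resulting input is non-nil
// whenever any modality is present -- including image-only requests.
func buildContents(texts, images []string) *EmbeddingInput {
	var contents []EmbeddingContent
	for _, t := range texts {
		contents = append(contents, EmbeddingContent{{Type: "text", Text: ptr(t)}})
	}
	for _, img := range images {
		contents = append(contents, EmbeddingContent{{Type: "image", Image: ptr(img)}})
	}
	if len(contents) == 0 {
		return nil
	}
	return &EmbeddingInput{Contents: contents}
}

func main() {
	in := buildContents(nil, []string{"data:image/png;base64,AAAA"})
	fmt.Println(in != nil, len(in.Contents))
}
```

The key point is that the image loop runs even when both text fields are empty, so the core validator no longer rejects image-only Cohere payloads before the provider sees them.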

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 397 - 409, The embedding
request builder currently only sets req.Input.Contents for r.InputText and
r.Texts, leaving req.Input nil for image-only or mixed requests; update the code
in the embedding path that creates req.Input so it also iterates r.Images and
r.Inputs and appends corresponding schemas.EmbeddingContent entries (using the
proper EmbeddingContentPartType for images/inputs and assigning the image URL or
input payload pointers) so req.Input is always non-nil when any of r.InputText,
r.Texts, r.Images, or r.Inputs are present; modify the section that constructs
contents (the block that sets contents := make([]schemas.EmbeddingContent, ...))
to include entries for r.Images and r.Inputs and then set req.Input =
&schemas.EmbeddingInput{Contents: contents}.

Comment thread core/providers/gemini/embedding.go Outdated
Comment thread core/providers/gemini/embedding.go Outdated
Comment thread core/schemas/embedding.go
Comment on lines +66 to +80
func (m *EmbeddingMediaPart) Validate() error {
if m == nil {
return fmt.Errorf("embedding media payload is nil")
}
set := 0
if m.Data != nil {
set++
}
if m.URL != nil {
set++
}
if set != 1 {
return fmt.Errorf("embedding media payload must set exactly one of data or url")
}
return nil

⚠️ Potential issue | 🟠 Major

Reject blank media payloads in the central validator.

Lines 71-79 only check whether Data/URL are non-nil, so {"data":""} and {"url":""} still pass validation even though they contain no usable media. That pushes malformed multimodal requests further downstream than this new validation layer intends.

Suggested fix
 func (m *EmbeddingMediaPart) Validate() error {
 	if m == nil {
 		return fmt.Errorf("embedding media payload is nil")
 	}
 	set := 0
 	if m.Data != nil {
+		if *m.Data == "" {
+			return fmt.Errorf("embedding media data is empty")
+		}
 		set++
 	}
 	if m.URL != nil {
+		if *m.URL == "" {
+			return fmt.Errorf("embedding media url is empty")
+		}
 		set++
 	}
 	if set != 1 {
 		return fmt.Errorf("embedding media payload must set exactly one of data or url")
 	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/embedding.go` around lines 66 - 80, The
EmbeddingMediaPart.Validate currently only checks for non-nil Data/URL and
allows empty payloads; update EmbeddingMediaPart.Validate to treat empty values
as unset by verifying that Data and URL are not only non-nil but also contain
non-zero length (e.g., check len of the underlying string/byte slice pointed to
by Data and the string pointed to by URL) and only increment set when they are
non-empty; if neither is non-empty or more than one is set, return the existing
error about exactly one of data or url being set (or a similar clear error) so
blank media payloads are rejected centrally.

Comment thread framework/logstore/tables.go
Comment thread plugins/semanticcache/utils.go Outdated
Comment thread tests/integrations/python/tests/test_google.py
Comment thread transports/bifrost-http/handlers/inference.go
@BearTS BearTS force-pushed the 04-03-feat_multimodal_embeddings branch from 4f21ee8 to 258c523 Compare May 4, 2026 06:53
Comment thread plugins/semanticcache/utils.go
Comment thread core/providers/cohere/embedding.go

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/providers/bedrock/invoke.go (2)

413-452: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Map max_tokens and truncate onto typed embedding params.

These fields are only forwarded via ExtraParams here, but cohere.ToCohereEmbeddingRequest() reads Params.MaxTokens and Params.Truncate. Bedrock-native Cohere embedding requests therefore drop both settings on the provider hop.

🛠️ Suggested fix
 	if r.Truncate != nil {
-		extraParams["truncate"] = *r.Truncate
+		// handled via typed params below
 	}
 	if len(r.Images) > 0 {
 		extraParams["images"] = r.Images
 	}
 	if len(r.Inputs) > 0 {
 		extraParams["inputs"] = r.Inputs
 	}
 	if r.MaxTokens != nil {
-		extraParams["max_tokens"] = *r.MaxTokens
+		// handled via typed params below
 	}
@@
 	params := &schemas.EmbeddingParameters{
 		Dimensions: dimensions,
+		MaxTokens:  r.MaxTokens,
+		Truncate:   r.Truncate,
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 413 - 452, The code currently
only forwards MaxTokens and Truncate via extraParams, but
cohere.ToCohereEmbeddingRequest expects typed fields on
schemas.EmbeddingParameters; update the mapping so that if r.MaxTokens != nil
set params.MaxTokens = *r.MaxTokens and if r.Truncate != nil set params.Truncate
= *r.Truncate (in addition to keeping them in extraParams if desired) before
assigning req.Params, ensuring schemas.EmbeddingParameters carries these typed
values for downstream cohere conversion.

999-1011: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve typed Cohere embeddings in native Bedrock responses.

This branch always emits embeddings_floats from d.Embedding.Float. If the Bifrost response carries Base64, Int8, Uint8, Binary, or Ubinary, Bedrock clients either lose the requested type or receive empty float vectors instead of the native embeddings_by_type envelope.
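A sketch of the type-detection branch the comment describes, using a stand-in struct (the real `EmbeddingsByType` and response-envelope types are in `core/schemas` and the Bedrock provider; field names here are illustrative):

```go
package main

import "fmt"

// Illustrative stand-in for Cohere's embeddings-by-type payload.
type EmbeddingsByType struct {
	Float  []float64
	Int8   []int8
	Uint8  []uint8
	Base64 *string
}

// responseType picks the native Cohere envelope tag instead of
// unconditionally flattening every response to "embeddings_floats".
func responseType(e EmbeddingsByType) string {
	switch {
	case len(e.Float) > 0:
		return "embeddings_floats"
	case len(e.Int8) > 0 || len(e.Uint8) > 0 || e.Base64 != nil:
		// Non-float data must be preserved in its native form.
		return "embeddings_by_type"
	default:
		return "" // caller should surface an explicit error, not empty floats
	}
}

func main() {
	b64 := "AAEC"
	fmt.Println(responseType(EmbeddingsByType{Float: []float64{0.1}}))
	fmt.Println(responseType(EmbeddingsByType{Base64: &b64}))
}
```

The empty-string default is the important case: today that path silently emits empty float vectors, whereas an explicit error would make the lost-type condition visible to Bedrock clients.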

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 999 - 1011, The current branch
unconditionally converts embeddings from resp.Data using d.Embedding.Float and
returns a BedrockInvokeCohereEmbeddingResp with ResponseType
"embeddings_floats"; instead, detect which embedding field is present on each
resp.Data item (e.g., d.Embedding.Float, d.Embedding.Base64, d.Embedding.Int8,
d.Embedding.Uint8, d.Embedding.Binary, d.Embedding.Ubinary) and preserve the
native typed data in the response envelope rather than forcing float
conversion—update the logic around resp.Data iteration to build a payload that
keeps the original type and set ResponseType to "embeddings_by_type" (or the
appropriate native-type tag) when non-float embeddings are present so Bedrock
clients receive the original typed embeddings intact while still falling back to
the float path for Float embeddings; ensure you return a
BedrockInvokeCohereEmbeddingResp that contains the corresponding typed field(s)
rather than always populating Embeddings from d.Embedding.Float.
♻️ Duplicate comments (1)
core/internal/llmtests/response_validation.go (1)

1345-1360: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Base64 embeddings still validate as 0-dimensional.

These switches now accept Base64 as real embedding data indirectly via the earlier hasData check, but they never count it here. Any provider returning base64-encoded embeddings will fail expected_dimensions, and embedding_dimensions will be recorded as 0 even though the embedding is valid. Reuse getEmbeddingVector(...) in both places so validation and metrics stay aligned with the active embedding representations.

Proposed fix
 	if expectedDimensions, ok := expectations.ProviderSpecific["expected_dimensions"].(int); ok {
 		for i, embedding := range response.Data {
-			e := embedding.Embedding
-			var actualDimensions int
-			switch {
-			case len(e.Float) > 0:
-				actualDimensions = len(e.Float)
-			case len(e.Int8) > 0:
-				actualDimensions = len(e.Int8)
-			case len(e.Uint8) > 0:
-				actualDimensions = len(e.Uint8)
-			case len(e.Binary) > 0:
-				actualDimensions = len(e.Binary)
-			case len(e.Ubinary) > 0:
-				actualDimensions = len(e.Ubinary)
-			}
+			vec, err := getEmbeddingVector(embedding)
+			if err != nil {
+				result.Passed = false
+				result.Errors = append(result.Errors,
+					fmt.Sprintf("Embedding %d: failed to extract vector: %v", i, err))
+				continue
+			}
+			actualDimensions := len(vec)
 			if actualDimensions != expectedDimensions {
 				result.Passed = false
 				result.Errors = append(result.Errors,
 					fmt.Sprintf("Embedding %d has %d dimensions, expected %d", i, actualDimensions, expectedDimensions))
 			}
@@
 	result.MetricsCollected["has_data"] = response.Data != nil
 	result.MetricsCollected["embedding_count"] = len(response.Data)
 	result.MetricsCollected["has_usage"] = response.Usage != nil
 	if len(response.Data) > 0 {
-		e := response.Data[0].Embedding
-		var dimensions int
-		switch {
-		case len(e.Float) > 0:
-			dimensions = len(e.Float)
-		case len(e.Int8) > 0:
-			dimensions = len(e.Int8)
-		case len(e.Uint8) > 0:
-			dimensions = len(e.Uint8)
-		case len(e.Binary) > 0:
-			dimensions = len(e.Binary)
-		case len(e.Ubinary) > 0:
-			dimensions = len(e.Ubinary)
+		if vec, err := getEmbeddingVector(response.Data[0]); err == nil {
+			result.MetricsCollected["embedding_dimensions"] = len(vec)
 		}
-		result.MetricsCollected["embedding_dimensions"] = dimensions
 	}
 }

Also applies to: 1530-1545

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/response_validation.go` around lines 1345 - 1360, The
expected-dimensions check currently inspects only
Float/Int8/Uint8/Binary/Ubinary fields and thus treats Base64 embeddings as
zero-length; update the logic that computes actualDimensions (the block inside
the if for expectations.ProviderSpecific["expected_dimensions"] and the similar
block around lines 1530-1545) to call and reuse getEmbeddingVector(...) to
derive the embedding vector length instead of the manual switch, so both
validation and recorded embedding_dimensions use the same representation; use
the returned slice length from getEmbeddingVector(response.Data[i].Embedding)
(handling any error/ok result per existing helper convention) to set
actualDimensions and keep metric/validation consistent.
🧹 Nitpick comments (1)
core/providers/openai/types.go (1)

134-135: 💤 Low value

Update the outdated inline comment.

The comment states "Can be string or []string" but OpenAIEmbeddingInput now supports four shapes: string, []string, []int, and [][]int.

📝 Suggested documentation fix
 type OpenAIEmbeddingRequest struct {
 	Model string                `json:"model"`
-	Input *OpenAIEmbeddingInput `json:"input"` // Can be string or []string
+	Input *OpenAIEmbeddingInput `json:"input"` // One-of: string, []string, []int (tokens), or [][]int (token batches)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/types.go` around lines 134 - 135, The inline comment on
the Input field of the embedding request is outdated; update the comment for
Input *OpenAIEmbeddingInput in core/providers/openai/types.go (next to the Model
field) to list all supported shapes: string, []string, []int, and [][]int (or
similar phrasing) so it accurately documents the OpenAIEmbeddingInput type's
four accepted forms.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@core/providers/bedrock/invoke.go`:
- Around line 413-452: The code currently only forwards MaxTokens and Truncate
via extraParams, but cohere.ToCohereEmbeddingRequest expects typed fields on
schemas.EmbeddingParameters; update the mapping so that if r.MaxTokens != nil
set params.MaxTokens = *r.MaxTokens and if r.Truncate != nil set params.Truncate
= *r.Truncate (in addition to keeping them in extraParams if desired) before
assigning req.Params, ensuring schemas.EmbeddingParameters carries these typed
values for downstream cohere conversion.
- Around line 999-1011: The current branch unconditionally converts embeddings
from resp.Data using d.Embedding.Float and returns a
BedrockInvokeCohereEmbeddingResp with ResponseType "embeddings_floats"; instead,
detect which embedding field is present on each resp.Data item (e.g.,
d.Embedding.Float, d.Embedding.Base64, d.Embedding.Int8, d.Embedding.Uint8,
d.Embedding.Binary, d.Embedding.Ubinary) and preserve the native typed data in
the response envelope rather than forcing float conversion—update the logic
around resp.Data iteration to build a payload that keeps the original type and
set ResponseType to "embeddings_by_type" (or the appropriate native-type tag)
when non-float embeddings are present so Bedrock clients receive the original
typed embeddings intact while still falling back to the float path for Float
embeddings; ensure you return a BedrockInvokeCohereEmbeddingResp that contains
the corresponding typed field(s) rather than always populating Embeddings from
d.Embedding.Float.
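The first point above (forwarding `MaxTokens`/`Truncate` as typed fields rather than only via extraParams) can be sketched as follows; every type name here is a hypothetical stand-in for the actual Bedrock request and `schemas.EmbeddingParameters` structs:

```go
package main

import "fmt"

// bedrockCohereReq stands in for the Bedrock-side request; only the two
// fields under discussion are modeled.
type bedrockCohereReq struct {
	MaxTokens *int
	Truncate  *string
}

// embeddingParameters stands in for schemas.EmbeddingParameters.
type embeddingParameters struct {
	MaxTokens   *int
	Truncate    *string
	ExtraParams map[string]any
}

// mapTypedParams carries MaxTokens/Truncate over as typed fields so a
// downstream cohere conversion sees them, while still mirroring them into
// ExtraParams for callers that read the untyped map.
func mapTypedParams(r bedrockCohereReq) embeddingParameters {
	p := embeddingParameters{ExtraParams: map[string]any{}}
	if r.MaxTokens != nil {
		p.MaxTokens = r.MaxTokens
		p.ExtraParams["max_tokens"] = *r.MaxTokens
	}
	if r.Truncate != nil {
		p.Truncate = r.Truncate
		p.ExtraParams["truncate"] = *r.Truncate
	}
	return p
}

func main() {
	mt, tr := 512, "END"
	p := mapTypedParams(bedrockCohereReq{MaxTokens: &mt, Truncate: &tr})
	fmt.Println(*p.MaxTokens, *p.Truncate) // 512 END
}
```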

---

Duplicate comments:
In `@core/internal/llmtests/response_validation.go`:
- Around line 1345-1360: The expected-dimensions check currently inspects only
Float/Int8/Uint8/Binary/Ubinary fields and thus treats Base64 embeddings as
zero-length; update the logic that computes actualDimensions (the block inside
the if for expectations.ProviderSpecific["expected_dimensions"] and the similar
block around lines 1530-1545) to call and reuse getEmbeddingVector(...) to
derive the embedding vector length instead of the manual switch, so both
validation and recorded embedding_dimensions use the same representation; use
the returned slice length from getEmbeddingVector(response.Data[i].Embedding)
(handling any error/ok result per existing helper convention) to set
actualDimensions and keep metric/validation consistent.

---

Nitpick comments:
In `@core/providers/openai/types.go`:
- Around line 134-135: The inline comment on the Input field of the embedding
request is outdated; update the comment for Input *OpenAIEmbeddingInput in
core/providers/openai/types.go (next to the Model field) to list all supported
shapes: string, []string, []int, and [][]int (or similar phrasing) so it
accurately documents the OpenAIEmbeddingInput type's four accepted forms.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d8894dcf-a494-49dc-a489-43c10a4908c1

📥 Commits

Reviewing files that changed from the base of the PR and between 4f21ee8 and 258c523.

📒 Files selected for processing (34)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/embedding.go
  • core/internal/llmtests/embedding_multimodal.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/tests.go
  • core/internal/llmtests/utils.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/embedding.go
  • core/providers/bedrock/invoke.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/embedding.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/cohere/types.go
  • core/providers/fireworks/fireworks_test.go
  • core/providers/gemini/embedding.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/huggingface/embedding.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/openai/embedding.go
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/providers/vertex/types.go
  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/embedding.go
  • core/schemas/embedding_multimodal_test.go
  • core/schemas/serialization_test.go
  • core/schemas/trace.go
✅ Files skipped from review due to trivial changes (12)
  • core/providers/cohere/types.go
  • core/schemas/embedding_multimodal_test.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/internal/llmtests/embedding.go
  • core/providers/openai/openai.go
  • core/providers/gemini/types.go
  • core/providers/vertex/vertex_test.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/gemini/gemini_test.go
  • core/schemas/embedding.go
  • core/bifrost.go
  • core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/fireworks/fireworks_test.go
  • core/providers/openai/embedding.go
  • core/providers/gemini/gemini.go
  • core/internal/llmtests/tests.go
  • core/internal/llmtests/utils.go
  • core/schemas/trace.go
  • core/providers/vertex/vertex.go

@TejasGhatte TejasGhatte changed the base branch from v1.5.0 to graphite-base/2559 May 7, 2026 06:06
@TejasGhatte TejasGhatte force-pushed the graphite-base/2559 branch from f42eac2 to f83ea4a Compare May 7, 2026 06:06
@TejasGhatte TejasGhatte force-pushed the 04-03-feat_multimodal_embeddings branch from 258c523 to 3ce9372 Compare May 7, 2026 06:06
@TejasGhatte TejasGhatte changed the base branch from graphite-base/2559 to main May 7, 2026 06:06
Comment thread core/schemas/embedding.go
Comment on lines +226 to 233
type EmbeddingsByType struct {
Float []float64 `json:"float,omitempty"` // Float embeddings
Int8 []int8 `json:"int8,omitempty"` // Int8 embeddings
Uint8 []uint8 `json:"uint8,omitempty"` // Uint8 embeddings
Binary []int8 `json:"binary,omitempty"` // Binary embeddings
Ubinary []uint8 `json:"ubinary,omitempty"` // Unsigned binary embeddings
Base64 *string `json:"base64,omitempty"` // Base64 embeddings
}
P1 Historical EmbeddingOutput data silently unreadable after migration

EmbeddingsByType has no custom UnmarshalJSON. Old rows in the embedding_output column stored EmbeddingData.Embedding via the old EmbeddingStruct.UnmarshalJSON, which serialised as a raw JSON array or string (e.g. [0.1, 0.2] or "base64…"). Trying to decode those values into the new struct with field names ({"float":[…]}) fails — sonic returns an error and DeserializeFields silently sets EmbeddingOutputParsed = nil. All historical embedding outputs will show as empty in the UI.

A migration for embedding_output analogous to migrationAddEmbeddingInputColumn is needed, or EmbeddingsByType needs a custom UnmarshalJSON that recognises the legacy []float64 / string forms.

Comment on lines +367 to +370
responseType := "embeddings_by_type"
if hasFloat && !hasBase64 {
responseType = "embeddings_floats"
}
P1 responseType is wrong when float and quantised types are both present

hasFloat && !hasBase64 only guards against the base64 case. If the response contains float and Int8/Uint8/Binary/Ubinary embeddings (a valid multi-type Cohere response), the condition is still true and responseType is set to "embeddings_floats" instead of "embeddings_by_type". The check should require that float is the only populated type.

Suggested change
responseType := "embeddings_by_type"
if hasFloat && !hasBase64 {
responseType = "embeddings_floats"
}
responseType := "embeddings_by_type"
if hasFloat && !hasBase64 &&
len(cohereResp.Embeddings.Int8) == 0 &&
len(cohereResp.Embeddings.Uint8) == 0 &&
len(cohereResp.Embeddings.Binary) == 0 &&
len(cohereResp.Embeddings.Ubinary) == 0 {
responseType = "embeddings_floats"
}

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/gemini/types.go (1)

1267-1292: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

file_data snake_case support still drops nested fields.

This fallback only switches the outer key from fileData to file_data. If the nested object also uses snake_case (file_uri, mime_type, display_name), FileData still unmarshals to empty fields because it only has camelCase tags. That means Python-SDK-style file parts still fail on this path.

💡 Suggested follow-up
 type FileData struct {
 	DisplayName string `json:"displayName,omitempty"`
 	FileURI     string `json:"fileUri,omitempty"`
 	MIMEType    string `json:"mimeType,omitempty"`
 }
+
+func (f *FileData) UnmarshalJSON(data []byte) error {
+	type alias struct {
+		DisplayName      string `json:"displayName,omitempty"`
+		DisplayNameSnake string `json:"display_name,omitempty"`
+		FileURI          string `json:"fileUri,omitempty"`
+		FileURISnake     string `json:"file_uri,omitempty"`
+		MIMEType         string `json:"mimeType,omitempty"`
+		MIMETypeSnake    string `json:"mime_type,omitempty"`
+	}
+
+	var aux alias
+	if err := sonic.Unmarshal(data, &aux); err != nil {
+		return err
+	}
+
+	if aux.DisplayName != "" {
+		f.DisplayName = aux.DisplayName
+	} else {
+		f.DisplayName = aux.DisplayNameSnake
+	}
+	if aux.FileURI != "" {
+		f.FileURI = aux.FileURI
+	} else {
+		f.FileURI = aux.FileURISnake
+	}
+	if aux.MIMEType != "" {
+		f.MIMEType = aux.MIMEType
+	} else {
+		f.MIMEType = aux.MIMETypeSnake
+	}
+	return nil
+}
♻️ Duplicate comments (3)
core/providers/gemini/embedding.go (1)

396-407: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don’t collapse mixed per-item params onto the first batch entry.

BifrostEmbeddingRequest only has one shared Params, but this path always derives it from request.Requests[0]. If later Gemini batch items carry different taskType, title, outputDimensionality, documentOcr, or audioTrackExtraction, that information is silently dropped during reverse conversion. Either verify all entries agree before collapsing them, or reject heterogeneous batches here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/gemini/embedding.go` around lines 396 - 407, The conversion in
the Gemini->Bifrost path currently collapses per-item params into a single
Params using applyBifrostEmbeddingParams(bifrostReq.Params, request.Requests[0])
which silently drops differing per-request fields; update the logic in the
function handling request.Requests so that before collapsing you iterate
request.Requests and verify that taskType, title, outputDimensionality,
documentOcr, and audioTrackExtraction (the per-item fields) are identical across
all entries — if they are identical set bifrostReq.Params once from any entry,
otherwise return an error/reject the batch (or alternatively return a clear
validation error) so heterogeneous batches are not silently lost; reference
geminiContentToEmbeddingContent and BifrostEmbeddingRequest when implementing
the check.
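The agreement check the prompt describes can be sketched over hypothetical per-item fields (the names below follow the comment, not the actual Gemini batch types):

```go
package main

import "fmt"

// geminiBatchItem is a hypothetical per-item batch entry; only the fields
// the review says must agree are modeled.
type geminiBatchItem struct {
	TaskType             string
	Title                string
	OutputDimensionality *int
}

// sharedParams returns the common params when every entry agrees, or an
// error so heterogeneous batches are rejected instead of silently collapsed
// onto the first entry.
func sharedParams(items []geminiBatchItem) (geminiBatchItem, error) {
	if len(items) == 0 {
		return geminiBatchItem{}, fmt.Errorf("empty batch")
	}
	first := items[0]
	for i, it := range items[1:] {
		if it.TaskType != first.TaskType || it.Title != first.Title ||
			!intPtrEq(it.OutputDimensionality, first.OutputDimensionality) {
			return geminiBatchItem{}, fmt.Errorf(
				"batch item %d carries params differing from item 0; per-item params cannot be represented on a single shared Params", i+1)
		}
	}
	return first, nil
}

func intPtrEq(a, b *int) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

func main() {
	_, err := sharedParams([]geminiBatchItem{{TaskType: "RETRIEVAL_QUERY"}, {TaskType: "CLASSIFICATION"}})
	fmt.Println(err != nil) // true
}
```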
core/providers/vertex/embedding.go (2)

212-237: ⚠️ Potential issue | 🟠 Major

Emit exactly one embedding row per prediction.

This block can append a text embedding, an image embedding, and multiple video embeddings for a single prediction, which breaks the 1 content -> 1 output invariant and misaligns Data[index] with the original Input.Contents entry.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/vertex/embedding.go` around lines 212 - 237, The code
currently appends multiple embedding rows per prediction (text, image, and every
video), breaking the 1 content -> 1 output invariant; change the logic in the
embedding construction (the block that references prediction.TextEmbedding,
prediction.ImageEmbedding, prediction.VideoEmbeddings, schemas.EmbeddingData and
idx) to pick exactly one embedding per prediction—e.g., choose
prediction.TextEmbedding if present, else prediction.ImageEmbedding, else the
first entry of prediction.VideoEmbeddings—and append a single
schemas.EmbeddingData with that chosen vector and increment idx once. Ensure you
copy the chosen slice into EmbeddingsByType{Float: append([]float64(nil), ...)}
as before and remove the multiple appends/loop.

136-167: ⚠️ Potential issue | 🟠 Major

Aggregate parts instead of overwriting them.

One EmbeddingContent is supposed to aggregate its parts into one embedding, but instance.Text = part.Text, instance.Image = img, and instance.Video = vid keep only the last value of each modality. Earlier text fragments or duplicate media parts are silently discarded.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/vertex/embedding.go` around lines 136 - 167, The loop
currently overwrites instance.Text/instance.Image/instance.Video for each part,
losing earlier parts; change it to aggregate parts: concatenate text fragments
into instance.Text (e.g., append with separator or join), and collect image and
video inputs into slices (e.g., instance.Images []VertexMultimodalImageInput and
instance.Videos []VertexMultimodalVideoInput) instead of single pointers; update
the Vertex multimodal instance type and any downstream consumers that reference
instance.Image/instance.Video to handle slices, and in the switch cases append
the created img/vid values rather than assigning them.
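The aggregate-not-overwrite pattern looks roughly like this; the types are simplified stand-ins for the schema and instance structs, and the sketch also folds in the gs:// URI check this review requests in a separate comment:

```go
package main

import (
	"fmt"
	"strings"
)

// part is a simplified stand-in for an EmbeddingContentPart.
type part struct {
	Type string
	Text string
	URL  string // for media parts, e.g. a gs:// URI
}

// instance is a simplified stand-in for the Vertex multimodal instance,
// with media collected into a slice instead of a single pointer.
type instance struct {
	Text   string
	Images []string
}

// buildInstance aggregates instead of overwriting: text fragments are
// joined, media parts are appended to a slice, so no part is silently
// dropped when a content entry carries several of the same modality.
func buildInstance(parts []part) (instance, error) {
	var texts []string
	var inst instance
	for _, p := range parts {
		switch p.Type {
		case "text":
			texts = append(texts, p.Text)
		case "image":
			if !strings.HasPrefix(p.URL, "gs://") {
				return instance{}, fmt.Errorf("image URL must be a GCS URI (gs://...), got %q", p.URL)
			}
			inst.Images = append(inst.Images, p.URL)
		}
	}
	inst.Text = strings.Join(texts, " \n")
	return inst, nil
}

func main() {
	inst, err := buildInstance([]part{
		{Type: "text", Text: "a"},
		{Type: "image", URL: "gs://bucket/x.png"},
		{Type: "text", Text: "b"},
	})
	fmt.Printf("%q %d %v\n", inst.Text, len(inst.Images), err) // "a \nb" 1 <nil>
}
```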
🧹 Nitpick comments (2)
core/providers/openai/openai.go (1)

1960-1961: ⚡ Quick win

Let core populate provider/model attribution fields.

Provider and ResolvedModelUsed are owned by the centralized extra-field population path for unary responses. Setting them here makes embeddings inconsistent with the other OpenAI handlers and risks freezing provider-layer values before the core enricher runs.

Based on learnings: Provider, OriginalModelRequested, and ResolvedModelUsed on BifrostResponseExtraFields are populated centrally by (*BifrostResponse).PopulateExtraFields in core/bifrost.go.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/openai/openai.go` around lines 1960 - 1961, Remove the
premature population of provider/model attribution in
core/providers/openai/openai.go by deleting the assignments to
response.ExtraFields.Provider and response.ExtraFields.ResolvedModelUsed (and
any similar direct writes such as OriginalModelRequested) so the centralized
population path can run; locate the block where response.ExtraFields is being
set in the OpenAI handler and remove those provider/model assignments, leaving
other extra-field writes intact and relying on
(*BifrostResponse).PopulateExtraFields in core/bifrost.go to populate Provider,
OriginalModelRequested, and ResolvedModelUsed.
core/providers/azure/azure.go (1)

880-883: ⚡ Quick win

Let core populate ExtraFields.Provider.

This field is supposed to be filled centrally on successful responses. Setting it in the provider duplicates that contract and makes this path easier to drift from the rest of the stack.

♻️ Suggested cleanup
 response := openaiResp.ToBifrostEmbeddingResponse()
 
- response.ExtraFields.Provider = provider.GetProviderKey()
 response.ExtraFields.Latency = latency.Milliseconds()
 response.ExtraFields.ProviderResponseHeaders = providerResponseHeaders

Based on learnings: In maximhq/bifrost, the fields OriginalModelRequested, Provider, and ResolvedModelUsed on BifrostResponseExtraFields are populated centrally by (*BifrostResponse).PopulateExtraFields in core/bifrost.go for all response paths.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/azure/azure.go` around lines 880 - 883, Remove the
provider-level population of ExtraFields.Provider in the Azure embedding
response path: after calling openaiResp.ToBifrostEmbeddingResponse() you should
not set response.ExtraFields.Provider = provider.GetProviderKey(); leave other
fields (e.g., response.ExtraFields.Latency = latency.Milliseconds()) intact. The
core centralizer (*BifrostResponse).PopulateExtraFields is responsible for
filling OriginalModelRequested, Provider, and ResolvedModelUsed, so delete the
assignment of Provider here to avoid duplicating/contradicting that contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/providers/gemini/embedding.go`:
- Around line 190-206: The loop is incorrectly copying the aggregate
bifrostResp.Usage.PromptTokens into each GeminiEmbedding.Statistics causing
overcounting; remove the per-item assignment inside the loop in embedding.go (do
not set ContentEmbeddingStatistics within the for i, embedding := range
bifrostResp.Data block) and instead only populate geminiResp.Embeddings with
Values there; after the loop, if there is exactly one embedding
(len(geminiResp.Embeddings) == 1) set geminiResp.Embedding and then assign
Statistics on that single embedding using bifrostResp.Usage.PromptTokens (via
ContentEmbeddingStatistics) while keeping the aggregate usage in
geminiResp.Metadata as you already do; do not create per-item Statistics from
aggregate Usage when batching.

In `@core/providers/openai/embedding.go`:
- Around line 135-160: The loop over each content entry can silently drop
entries when all parts have nil/empty Text and no Tokens, causing index
mismatches; after the inner loop that builds sb and tokens (the loop iterating
over part in content and using sb, tokens, texts, tokenBatches), add a guard: if
sb.Len() == 0 && len(tokens) == 0 return an error (e.g. fmt.Errorf("openai
embedding received empty content entry: all parts have nil/empty Text and no
Tokens")) so the caller is notified instead of silently discarding the input;
this mirrors the defensive behavior in extractTextFromContent used in the
HuggingFace converter.

In `@core/providers/openai/openai.go`:
- Around line 1945-1959: The large-payload branch must use the same
OpenAI→Bifrost conversion as the normal path: instead of unmarshalling directly
into schemas.BifrostEmbeddingResponse, unmarshal into OpenaiEmbeddingResponse
and call its ToBifrostEmbeddingResponse() to produce the Bifrost response; also
populate rawRequest/rawResponse via customResponseHandler or
providerUtils.HandleProviderResponse just like the normal branch, and use
providerUtils.EnrichError for error handling to keep schema and multimodal/typed
embedding fields consistent across both paths (refer to OpenaiEmbeddingResponse,
ToBifrostEmbeddingResponse, customResponseHandler,
providerUtils.HandleProviderResponse, and providerUtils.EnrichError).

In `@core/providers/vertex/embedding.go`:
- Around line 148-149: Validate that part.Image.URL and part.Video.URL start
with "gs://” before setting img.GCSUri or vid.GCSUri: use
strings.HasPrefix(*part.Image.URL, "gs://") and
strings.HasPrefix(*part.Video.URL, "gs://") and if the check fails return an
error (e.g., fmt.Errorf("image/video URL must be a GCS URI (gs://...)")) instead
of assigning; add the necessary strings import and apply the same validation
logic to both the image assignment (img.GCSUri) and the video assignment
(vid.GCSUri) so non-gs:// URIs are rejected locally rather than sent to Vertex.

In `@core/schemas/embedding.go`:
- Around line 226-232: The EmbeddingsByType struct currently declares Binary
[]int8 and Ubinary []uint8 which breaks the public contract; change these fields
to Binary []byte and Ubinary []byte (keeping their json tags and other fields
unchanged) so the struct matches the repo-wide contract and callers/converters
expecting packed-binary embeddings continue to work; update any related comments
or docs referencing Binary/Ubinary types if present.
- Around line 117-152: The EmbeddingContentPart validator currently only checks
p.Tokens for nil, allowing an empty slice (Tokens: []int{}) to pass; update the
EmbeddingContentPartTypeTokens branch in the Validate method to reject empty
token slices by checking len(p.Tokens) == 0 and returning an error like
"embedding content part type %q requires non-empty tokens payload" when
zero-length, instead of only checking for nil.
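The two validation points above (rejecting zero-length token slices, and erroring on a content entry whose parts are all empty rather than silently dropping it) can be sketched over a simplified content part; the names approximate the schemas package rather than reproduce it:

```go
package main

import "fmt"

// contentPart is a simplified stand-in for EmbeddingContentPart.
type contentPart struct {
	Type   string
	Text   *string
	Tokens []int
}

// validatePart rejects a zero-length token slice, not just a nil one.
func validatePart(p contentPart) error {
	if p.Type == "tokens" && len(p.Tokens) == 0 {
		return fmt.Errorf("embedding content part type %q requires non-empty tokens payload", p.Type)
	}
	return nil
}

// validateEntry guards against an entry whose parts are all empty, which
// would otherwise be silently dropped and misalign inputs with outputs.
func validateEntry(parts []contentPart) error {
	textLen, tokenCount := 0, 0
	for _, p := range parts {
		if err := validatePart(p); err != nil {
			return err
		}
		if p.Text != nil {
			textLen += len(*p.Text)
		}
		tokenCount += len(p.Tokens)
	}
	if textLen == 0 && tokenCount == 0 {
		return fmt.Errorf("embedding received empty content entry: all parts have nil/empty Text and no Tokens")
	}
	return nil
}

func main() {
	// Tokens: []int{} is non-nil but empty, and now fails validation.
	fmt.Println(validateEntry([]contentPart{{Type: "tokens", Tokens: []int{}}}))
}
```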

---

Duplicate comments:
In `@core/providers/gemini/embedding.go`:
- Around line 396-407: The conversion in the Gemini->Bifrost path currently
collapses per-item params into a single Params using
applyBifrostEmbeddingParams(bifrostReq.Params, request.Requests[0]) which
silently drops differing per-request fields; update the logic in the function
handling request.Requests so that before collapsing you iterate request.Requests
and verify that taskType, title, outputDimensionality, documentOcr, and
audioTrackExtraction (the per-item fields) are identical across all entries — if
they are identical set bifrostReq.Params once from any entry, otherwise return
an error/reject the batch (or alternatively return a clear validation error) so
heterogeneous batches are not silently lost; reference
geminiContentToEmbeddingContent and BifrostEmbeddingRequest when implementing
the check.

In `@core/providers/vertex/embedding.go`:
- Around line 212-237: The code currently appends multiple embedding rows per
prediction (text, image, and every video), breaking the 1 content -> 1 output
invariant; change the logic in the embedding construction (the block that
references prediction.TextEmbedding, prediction.ImageEmbedding,
prediction.VideoEmbeddings, schemas.EmbeddingData and idx) to pick exactly one
embedding per prediction—e.g., choose prediction.TextEmbedding if present, else
prediction.ImageEmbedding, else the first entry of
prediction.VideoEmbeddings—and append a single schemas.EmbeddingData with that
chosen vector and increment idx once. Ensure you copy the chosen slice into
EmbeddingsByType{Float: append([]float64(nil), ...)} as before and remove the
multiple appends/loop.
- Around line 136-167: The loop currently overwrites
instance.Text/instance.Image/instance.Video for each part, losing earlier parts;
change it to aggregate parts: concatenate text fragments into instance.Text
(e.g., append with separator or join), and collect image and video inputs into
slices (e.g., instance.Images []VertexMultimodalImageInput and instance.Videos
[]VertexMultimodalVideoInput) instead of single pointers; update the Vertex
multimodal instance type and any downstream consumers that reference
instance.Image/instance.Video to handle slices, and in the switch cases append
the created img/vid values rather than assigning them.

---

Nitpick comments:
In `@core/providers/azure/azure.go`:
- Around line 880-883: Remove the provider-level population of
ExtraFields.Provider in the Azure embedding response path: after calling
openaiResp.ToBifrostEmbeddingResponse() you should not set
response.ExtraFields.Provider = provider.GetProviderKey(); leave other fields
(e.g., response.ExtraFields.Latency = latency.Milliseconds()) intact. The core
centralizer (*BifrostResponse).PopulateExtraFields is responsible for filling
OriginalModelRequested, Provider, and ResolvedModelUsed, so delete the
assignment of Provider here to avoid duplicating/contradicting that contract.

In `@core/providers/openai/openai.go`:
- Around line 1960-1961: Remove the premature population of provider/model
attribution in core/providers/openai/openai.go by deleting the assignments to
response.ExtraFields.Provider and response.ExtraFields.ResolvedModelUsed (and
any similar direct writes such as OriginalModelRequested) so the centralized
population path can run; locate the block where response.ExtraFields is being
set in the OpenAI handler and remove those provider/model assignments, leaving
other extra-field writes intact and relying on
(*BifrostResponse).PopulateExtraFields in core/bifrost.go to populate Provider,
OriginalModelRequested, and ResolvedModelUsed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2d61229a-9fb6-4a1c-b16b-18d6d4c77bf9

📥 Commits

Reviewing files that changed from the base of the PR and between 258c523 and 3ce9372.

📒 Files selected for processing (33)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/embedding.go
  • core/internal/llmtests/embedding_multimodal.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/tests.go
  • core/internal/llmtests/utils.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/embedding.go
  • core/providers/bedrock/invoke.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/embedding.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/cohere/types.go
  • core/providers/fireworks/fireworks_test.go
  • core/providers/gemini/embedding.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/huggingface/embedding.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/mistral/custom_provider_test.go
  • core/providers/openai/embedding.go
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/embedding_multimodal_test.go
  • core/providers/vertex/types.go
  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/embedding.go
  • core/schemas/embedding_multimodal_test.go
✅ Files skipped from review due to trivial changes (8)
  • core/providers/cohere/types.go
  • core/schemas/embedding_multimodal_test.go
  • core/internal/llmtests/tests.go
  • core/providers/gemini/gemini.go
  • core/providers/cohere/embedding_multimodal_test.go
  • core/providers/huggingface/huggingface_test.go
  • core/providers/gemini/embedding_multimodal_test.go
  • core/providers/cohere/embedding.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • core/providers/vertex/embedding_multimodal_test.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/utils.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/embedding_multimodal.go
  • core/providers/vertex/vertex.go
  • core/bifrost.go

Comment on lines 190 to 206
for i, embedding := range bifrostResp.Data {
	geminiEmbedding := GeminiEmbedding{Values: append([]float64(nil), embedding.Embedding.Float...)}

	// Add statistics if available (token count from usage metadata)
	if bifrostResp.Usage != nil {
		geminiEmbedding.Statistics = &ContentEmbeddingStatistics{
			TokenCount: int32(bifrostResp.Usage.PromptTokens),
		}
	}

	geminiResp.Embeddings[i] = geminiEmbedding
}

// Set metadata if available (for Vertex API compatibility)
if len(geminiResp.Embeddings) == 1 {
	geminiResp.Embedding = &geminiResp.Embeddings[0]
}
if bifrostResp.Usage != nil {
	geminiResp.Metadata = &EmbedContentMetadata{
		BillableCharacterCount: int32(bifrostResp.Usage.PromptTokens),
	}
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t copy aggregate usage into every embedding’s Statistics.

Lines 190-196 assign the same bifrostResp.Usage.PromptTokens to every embedding item. For batched responses that value is aggregate request usage, so clients that sum per-item stats—and this file now does on the reverse path—will overcount by len(bifrostResp.Data). Only populate Statistics when you have true per-item counts, or restrict it to the single-embedding case and keep the aggregate in Metadata.

💡 Minimal safe change
 	for i, embedding := range bifrostResp.Data {
 		geminiEmbedding := GeminiEmbedding{Values: append([]float64(nil), embedding.Embedding.Float...)}
-		if bifrostResp.Usage != nil {
+		if bifrostResp.Usage != nil && len(bifrostResp.Data) == 1 {
 			geminiEmbedding.Statistics = &ContentEmbeddingStatistics{
 				TokenCount: int32(bifrostResp.Usage.PromptTokens),
 			}
 		}
 		geminiResp.Embeddings[i] = geminiEmbedding
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/gemini/embedding.go` around lines 190 - 206, The loop is
incorrectly copying the aggregate bifrostResp.Usage.PromptTokens into each
GeminiEmbedding.Statistics causing overcounting; remove the per-item assignment
inside the loop in embedding.go (do not set ContentEmbeddingStatistics within
the for i, embedding := range bifrostResp.Data block) and instead only populate
geminiResp.Embeddings with Values there; after the loop, if there is exactly one
embedding (len(geminiResp.Embeddings) == 1) set geminiResp.Embedding and then
assign Statistics on that single embedding using bifrostResp.Usage.PromptTokens
(via ContentEmbeddingStatistics) while keeping the aggregate usage in
geminiResp.Metadata as you already do; do not create per-item Statistics from
aggregate Usage when batching.
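The guarded conversion can be sketched in isolation. The types below are illustrative stand-ins for the Bifrost/Gemini structs, not the actual schema; the point is the len(data) == 1 guard:

```go
package main

import "fmt"

// Hypothetical minimal stand-ins for the Bifrost/Gemini types.
type Usage struct{ PromptTokens int }

type Item struct{ Float []float64 }

type Stats struct{ TokenCount int32 }

type GeminiEmbedding struct {
	Values     []float64
	Statistics *Stats
}

// convert attaches token statistics only in the single-embedding case, so
// aggregate request usage is never duplicated across batch items.
func convert(data []Item, usage *Usage) []GeminiEmbedding {
	out := make([]GeminiEmbedding, len(data))
	for i, d := range data {
		e := GeminiEmbedding{Values: append([]float64(nil), d.Float...)}
		if usage != nil && len(data) == 1 {
			e.Statistics = &Stats{TokenCount: int32(usage.PromptTokens)}
		}
		out[i] = e
	}
	return out
}

func main() {
	u := &Usage{PromptTokens: 10}
	batch := convert([]Item{{[]float64{1}}, {[]float64{2}}}, u)
	single := convert([]Item{{[]float64{1}}}, u)
	fmt.Println(batch[0].Statistics == nil)  // batch items carry no per-item stats
	fmt.Println(single[0].Statistics != nil) // a single embedding keeps its token count
}
```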

Comment on lines +50 to +56
	var embStruct EmbeddingStruct
	switch {
	case d.Embedding.Base64 != nil:
		embStruct.EmbeddingStr = d.Embedding.Base64
	case d.Embedding.Float != nil:
		embStruct.EmbeddingArray = d.Embedding.Float
	}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Silent data drop for Int8/Uint8/Binary/Ubinary embedding types.

EmbeddingsByType includes Int8, Uint8, Binary, and Ubinary fields, but only Float and Base64 are mapped. If embeddings from another provider (e.g., Cohere int8) flow through this converter, the output EmbeddingStruct will be entirely empty — no error, no data.

Add an explicit error for unhandled types so callers discover the issue rather than receiving silent zeroes:

🛡️ Proposed fix
 	switch {
 	case d.Embedding.Base64 != nil:
 		embStruct.EmbeddingStr = d.Embedding.Base64
 	case d.Embedding.Float != nil:
 		embStruct.EmbeddingArray = d.Embedding.Float
+	case d.Embedding.Int8 != nil || d.Embedding.Uint8 != nil || d.Embedding.Binary != nil || d.Embedding.Ubinary != nil:
+		return nil, fmt.Errorf("openai embedding response format does not support int8/uint8/binary embedding types")
 	}

Note: update the function signature to (*OpenaiEmbeddingResponse, error) accordingly, or handle the error at a higher level if a lossy conversion is preferred.
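A sketch of the defensive mapping, using simplified stand-in types rather than the real schema structs (names here are illustrative, not the PR's):

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the EmbeddingsByType union and the OpenAI shape.
type embeddingsByType struct {
	Float  []float64
	Base64 *string
	Int8   []int8
	Uint8  []uint8
}

type embeddingStruct struct {
	EmbeddingStr   *string
	EmbeddingArray []float64
}

var errUnsupported = errors.New("openai embedding response format does not support int8/uint8/binary embedding types")

// toOpenAI maps the supported variants and fails loudly on the rest instead
// of silently returning an empty struct.
func toOpenAI(e embeddingsByType) (embeddingStruct, error) {
	var out embeddingStruct
	switch {
	case e.Base64 != nil:
		out.EmbeddingStr = e.Base64
	case e.Float != nil:
		out.EmbeddingArray = e.Float
	case e.Int8 != nil || e.Uint8 != nil:
		return out, errUnsupported
	}
	return out, nil
}

func main() {
	_, err := toOpenAI(embeddingsByType{Int8: []int8{1, 2}})
	fmt.Println(err != nil) // an int8 payload now surfaces an error instead of empty data
}
```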

Comment on lines +135 to +160
	for _, part := range content {
		switch part.Type {
		case schemas.EmbeddingContentPartTypeText:
			if part.Text != nil {
				if sb.Len() > 0 {
					sb.WriteString(" \n")
				}
				sb.WriteString(*part.Text)
			}
		case schemas.EmbeddingContentPartTypeTokens:
			if part.Tokens != nil {
				tokens = append(tokens, part.Tokens...)
			}
		default:
			return nil, fmt.Errorf("openai embedding does not support %q input", part.Type)
		}
	}
	if sb.Len() > 0 && len(tokens) > 0 {
		return nil, fmt.Errorf("openai embedding does not support mixing text and token inputs within a single content entry")
	}
	if sb.Len() > 0 {
		texts = append(texts, sb.String())
	} else if len(tokens) > 0 {
		tokenBatches = append(tokenBatches, tokens)
	}
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Content entries with all-nil or all-empty-string Text parts are silently dropped, causing an index mismatch.

When part.Text is nil (line 138) or *part.Text is "" (contributes nothing to sb), the write is skipped. If every part in a content entry hits this path, the entry produces neither a texts element nor a tokenBatches element and is silently discarded. The caller then receives fewer embeddings than the number of input contents with no error to signal the mismatch.

extractTextFromContent in the HuggingFace converter correctly errors in these cases; the same defensive approach should be applied here:

🛡️ Proposed fix
 			case schemas.EmbeddingContentPartTypeText:
-				if part.Text != nil {
-					if sb.Len() > 0 {
-						sb.WriteString(" \n")
-					}
-					sb.WriteString(*part.Text)
+				if part.Text == nil {
+					return nil, fmt.Errorf("openai embedding: text part has nil Text pointer")
 				}
+				if sb.Len() > 0 {
+					sb.WriteString(" \n")
+				}
+				sb.WriteString(*part.Text)
 			case schemas.EmbeddingContentPartTypeTokens:
-				if part.Tokens != nil {
-					tokens = append(tokens, part.Tokens...)
+				if part.Tokens == nil {
+					return nil, fmt.Errorf("openai embedding: tokens part has nil Tokens slice")
 				}
+				tokens = append(tokens, part.Tokens...)

Then, after the inner loop, add a guard for an entirely empty content entry:

+		if sb.Len() == 0 && len(tokens) == 0 {
+			return nil, fmt.Errorf("openai embedding: content entry at index %d produced no usable input", /* index */)
+		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/openai/embedding.go` around lines 135 - 160, The loop over
each content entry can silently drop entries when all parts have nil/empty Text
and no Tokens, causing index mismatches; after the inner loop that builds sb and
tokens (the loop iterating over part in content and using sb, tokens, texts,
tokenBatches), add a guard: if sb.Len() == 0 && len(tokens) == 0 return an error
(e.g. fmt.Errorf("openai embedding received empty content entry: all parts have
nil/empty Text and no Tokens")) so the caller is notified instead of silently
discarding the input; this mirrors the defensive behavior in
extractTextFromContent used in the HuggingFace converter.
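The guard can be illustrated on a stripped-down text-only flattener; flatten and ptr below are hypothetical helpers, not code from the PR, but they show how the post-loop check keeps outputs aligned one-to-one with inputs:

```go
package main

import (
	"fmt"
	"strings"
)

func ptr(s string) *string { return &s }

// flatten joins the text parts of one content entry. An entry whose parts
// yield nothing is rejected instead of being silently dropped, so the caller
// never receives fewer results than it sent entries.
func flatten(parts []*string) (string, error) {
	var sb strings.Builder
	for _, p := range parts {
		if p == nil || *p == "" {
			continue
		}
		if sb.Len() > 0 {
			sb.WriteString(" \n")
		}
		sb.WriteString(*p)
	}
	if sb.Len() == 0 {
		return "", fmt.Errorf("content entry produced no usable input")
	}
	return sb.String(), nil
}

func main() {
	_, err := flatten([]*string{nil, ptr("")})
	fmt.Println(err != nil) // an all-nil/empty entry errors instead of vanishing

	out, _ := flatten([]*string{ptr("a"), ptr("b")})
	fmt.Println(out == "a \nb") // non-empty parts are joined with the separator
}
```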

Comment on lines +1945 to +1959
	openaiResp := &OpenaiEmbeddingResponse{}

	var rawRequest, rawResponse interface{}

	if customResponseHandler != nil {
		rawRequest, rawResponse, bifrostErr = customResponseHandler(body, response, jsonData, sendBackRawRequest, sendBackRawResponse)
		rawRequest, rawResponse, bifrostErr = customResponseHandler(body, openaiResp, jsonData, sendBackRawRequest, sendBackRawResponse)
	} else {
		rawRequest, rawResponse, bifrostErr = providerUtils.HandleProviderResponse(body, response, jsonData, sendBackRawRequest, sendBackRawResponse)
		rawRequest, rawResponse, bifrostErr = providerUtils.HandleProviderResponse(body, openaiResp, jsonData, sendBackRawRequest, sendBackRawResponse)
	}

	if bifrostErr != nil {
		return nil, providerUtils.EnrichError(ctx, bifrostErr, jsonData, body, sendBackRawRequest, sendBackRawResponse)
	}

	response := openaiResp.ToBifrostEmbeddingResponse()

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the same OpenAI→Bifrost conversion in the large-payload path.

The normal path now decodes into OpenaiEmbeddingResponse and converts with ToBifrostEmbeddingResponse(), but the large-payload branch above at Line 1888 still unmarshals directly into schemas.BifrostEmbeddingResponse. That makes large embedding responses follow a different schema path and can drop multimodal/typed embedding fields exactly on the oversized responses this PR is adding.

💡 Suggested fix
 	if len(lpResult.ResponseBody) > 0 {
-		response := &schemas.BifrostEmbeddingResponse{}
-		if err := sonic.Unmarshal(lpResult.ResponseBody, response); err != nil {
+		openaiResp := &OpenaiEmbeddingResponse{}
+		if err := sonic.Unmarshal(lpResult.ResponseBody, openaiResp); err != nil {
 			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderResponseUnmarshal, err)
 		}
+		response := openaiResp.ToBifrostEmbeddingResponse()
 		response.ExtraFields = schemas.BifrostResponseExtraFields{Latency: lpResult.Latency}
 		return response, nil
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/openai/openai.go` around lines 1945 - 1959, The large-payload
branch must use the same OpenAI→Bifrost conversion as the normal path: instead
of unmarshalling directly into schemas.BifrostEmbeddingResponse, unmarshal into
OpenaiEmbeddingResponse and call its ToBifrostEmbeddingResponse() to produce the
Bifrost response; also populate rawRequest/rawResponse via customResponseHandler
or providerUtils.HandleProviderResponse just like the normal branch, and use
providerUtils.EnrichError for error handling to keep schema and multimodal/typed
embedding fields consistent across both paths (refer to OpenaiEmbeddingResponse,
ToBifrostEmbeddingResponse, customResponseHandler,
providerUtils.HandleProviderResponse, and providerUtils.EnrichError).

Comment on lines +148 to +149
} else if part.Image.URL != nil {
img.GCSUri = part.Image.URL

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate gs:// URLs before assigning gcsUri.

part.Image.URL is copied straight into gcsUri, and part.Video.URL is only checked for nil. That lets https://.../s3://... inputs through even though this request shape is explicitly GCS-only, so the failure gets deferred to Vertex instead of being rejected locally.

Proposed fix
 				if part.Image.Data != nil {
 					b64 := extractBase64FromDataURI(*part.Image.Data)
 					img.BytesBase64Encoded = &b64
 				} else if part.Image.URL != nil {
+					if !strings.HasPrefix(*part.Image.URL, "gs://") {
+						return nil, fmt.Errorf("vertex multimodal embedding requires a GCS URI for image input")
+					}
 					img.GCSUri = part.Image.URL
 				} else {
 					return nil, fmt.Errorf("image part must set data or url")
 				}
@@
-				if part.Video.URL == nil {
+				if part.Video.URL == nil || !strings.HasPrefix(*part.Video.URL, "gs://") {
 					return nil, fmt.Errorf("vertex multimodal embedding requires a GCS URI for video input")
 				}
 				vid := &VertexMultimodalVideoInput{GCSUri: part.Video.URL}

Also applies to: 158-161

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/providers/vertex/embedding.go` around lines 148 - 149, Validate that
part.Image.URL and part.Video.URL start with "gs://" before setting img.GCSUri
or vid.GCSUri: use strings.HasPrefix(*part.Image.URL, "gs://") and
strings.HasPrefix(*part.Video.URL, "gs://") and if the check fails return an
error (e.g., fmt.Errorf("image/video URL must be a GCS URI (gs://...)")) instead
of assigning; add the necessary strings import and apply the same validation
logic to both the image assignment (img.GCSUri) and the video assignment
(vid.GCSUri) so non-gs:// URIs are rejected locally rather than sent to Vertex.
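A sketch of the local check; requireGCSUri is a hypothetical helper name (the PR inlines the prefix check), but the rejection behavior is the same:

```go
package main

import (
	"fmt"
	"strings"
)

// requireGCSUri accepts only gs:// URIs, mirroring the Vertex multimodal
// embedding requirement, so bad inputs fail locally rather than at Vertex.
func requireGCSUri(uri, kind string) (string, error) {
	if !strings.HasPrefix(uri, "gs://") {
		return "", fmt.Errorf("vertex multimodal embedding requires a GCS URI for %s input, got %q", kind, uri)
	}
	return uri, nil
}

func main() {
	_, err := requireGCSUri("https://example.com/cat.png", "image")
	fmt.Println(err != nil) // https URL rejected locally

	u, err := requireGCSUri("gs://bucket/cat.png", "image")
	fmt.Println(err == nil && u == "gs://bucket/cat.png") // gs:// URI accepted
}
```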

Comment thread core/schemas/embedding.go
Comment on lines +117 to +152
if p.Tokens != nil {
set++
}
if set != 1 {
return fmt.Errorf("embedding content part must set exactly one modality")
}

switch p.Type {
case EmbeddingContentPartTypeText:
if p.Text == nil {
return fmt.Errorf("embedding content part type %q requires text payload", p.Type)
}
case EmbeddingContentPartTypeImage:
if p.Image == nil {
return fmt.Errorf("embedding content part type %q requires image payload", p.Type)
}
return p.Image.Validate()
case EmbeddingContentPartTypeAudio:
if p.Audio == nil {
return fmt.Errorf("embedding content part type %q requires audio payload", p.Type)
}
return p.Audio.Validate()
case EmbeddingContentPartTypeFile:
if p.File == nil {
return fmt.Errorf("embedding content part type %q requires file payload", p.Type)
}
return p.File.Validate()
case EmbeddingContentPartTypeVideo:
if p.Video == nil {
return fmt.Errorf("embedding content part type %q requires video payload", p.Type)
}
return p.Video.Validate()
case EmbeddingContentPartTypeTokens:
if p.Tokens == nil {
return fmt.Errorf("embedding content part type %q requires tokens payload", p.Type)
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject empty token inputs in the central validator.

Tokens: []int{} currently passes validation because this branch only checks for nil. That lets an empty token payload through the new shared validation layer and pushes an invalid embedding request downstream.

Suggested fix
 	if p.Tokens != nil {
+		if len(p.Tokens) == 0 {
+			return fmt.Errorf("embedding content part type %q requires at least one token", p.Type)
+		}
 		set++
 	}
@@
 	case EmbeddingContentPartTypeTokens:
-		if p.Tokens == nil {
+		if len(p.Tokens) == 0 {
 			return fmt.Errorf("embedding content part type %q requires tokens payload", p.Type)
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

	if p.Tokens != nil {
		if len(p.Tokens) == 0 {
			return fmt.Errorf("embedding content part type %q requires at least one token", p.Type)
		}
		set++
	}
	if set != 1 {
		return fmt.Errorf("embedding content part must set exactly one modality")
	}

	switch p.Type {
	case EmbeddingContentPartTypeText:
		if p.Text == nil {
			return fmt.Errorf("embedding content part type %q requires text payload", p.Type)
		}
	case EmbeddingContentPartTypeImage:
		if p.Image == nil {
			return fmt.Errorf("embedding content part type %q requires image payload", p.Type)
		}
		return p.Image.Validate()
	case EmbeddingContentPartTypeAudio:
		if p.Audio == nil {
			return fmt.Errorf("embedding content part type %q requires audio payload", p.Type)
		}
		return p.Audio.Validate()
	case EmbeddingContentPartTypeFile:
		if p.File == nil {
			return fmt.Errorf("embedding content part type %q requires file payload", p.Type)
		}
		return p.File.Validate()
	case EmbeddingContentPartTypeVideo:
		if p.Video == nil {
			return fmt.Errorf("embedding content part type %q requires video payload", p.Type)
		}
		return p.Video.Validate()
	case EmbeddingContentPartTypeTokens:
		if len(p.Tokens) == 0 {
			return fmt.Errorf("embedding content part type %q requires tokens payload", p.Type)
		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/schemas/embedding.go` around lines 117 - 152, The EmbeddingContentPart
validator currently only checks p.Tokens for nil, allowing an empty slice
(Tokens: []int{}) to pass; update the EmbeddingContentPartTypeTokens branch in
the Validate method to reject empty token slices by checking len(p.Tokens) == 0
and returning an error like "embedding content part type %q requires non-empty
tokens payload" when zero-length, instead of only checking for nil.
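The tightened check can be sketched in isolation; validateTokens is a hypothetical helper, but it shows why a single len comparison covers both the nil and the empty-slice cases:

```go
package main

import "fmt"

// validateTokens rejects nil and empty token slices alike: in Go,
// len of a nil slice is 0, so one comparison handles both cases.
func validateTokens(tokens []int) error {
	if len(tokens) == 0 {
		return fmt.Errorf("embedding content part type %q requires at least one token", "tokens")
	}
	return nil
}

func main() {
	fmt.Println(validateTokens(nil) != nil)      // nil slice rejected
	fmt.Println(validateTokens([]int{}) != nil)  // empty slice rejected
	fmt.Println(validateTokens([]int{7}) == nil) // non-empty slice passes
}
```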

Comment thread core/schemas/embedding.go
Comment on lines +226 to +232
type EmbeddingsByType struct {
	Float   []float64 `json:"float,omitempty"`   // Float embeddings
	Int8    []int8    `json:"int8,omitempty"`    // Int8 embeddings
	Uint8   []uint8   `json:"uint8,omitempty"`   // Uint8 embeddings
	Binary  []int8    `json:"binary,omitempty"`  // Binary embeddings
	Ubinary []uint8   `json:"ubinary,omitempty"` // Unsigned binary embeddings
	Base64  *string   `json:"base64,omitempty"`  // Base64 embeddings

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep Binary and Ubinary on the published []byte contract.

These two fields are the packed-binary embedding variants in the live schema. Declaring them as []int8 / []uint8 changes the public type and will diverge from callers and converters that now expect the repo-wide EmbeddingsByType contract.

Suggested fix
 type EmbeddingsByType struct {
 	Float   []float64 `json:"float,omitempty"`   // Float embeddings
 	Int8    []int8    `json:"int8,omitempty"`    // Int8 embeddings
 	Uint8   []uint8   `json:"uint8,omitempty"`   // Uint8 embeddings
-	Binary  []int8    `json:"binary,omitempty"`  // Binary embeddings
-	Ubinary []uint8   `json:"ubinary,omitempty"` // Unsigned binary embeddings
+	Binary  []byte    `json:"binary,omitempty"`  // Binary embeddings
+	Ubinary []byte    `json:"ubinary,omitempty"` // Unsigned binary embeddings
 	Base64  *string   `json:"base64,omitempty"`  // Base64 embeddings
 }

Based on learnings: In maximhq/bifrost, the active embedding representation is EmbeddingsByType with fields Float []float64, Int8 []int8, Uint8 []uint8, Binary []byte, Ubinary []byte, and Base64 *string.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

type EmbeddingsByType struct {
	Float   []float64 `json:"float,omitempty"`   // Float embeddings
	Int8    []int8    `json:"int8,omitempty"`    // Int8 embeddings
	Uint8   []uint8   `json:"uint8,omitempty"`   // Uint8 embeddings
	Binary  []byte    `json:"binary,omitempty"`  // Binary embeddings
	Ubinary []byte    `json:"ubinary,omitempty"` // Unsigned binary embeddings
	Base64  *string   `json:"base64,omitempty"`  // Base64 embeddings
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/schemas/embedding.go` around lines 226 - 232, The EmbeddingsByType
struct currently declares Binary []int8 and Ubinary []uint8 which breaks the
public contract; change these fields to Binary []byte and Ubinary []byte
(keeping their json tags and other fields unchanged) so the struct matches the
repo-wide contract and callers/converters expecting packed-binary embeddings
continue to work; update any related comments or docs referencing Binary/Ubinary
types if present.
