
feat: add model alias #2355

Merged
akshaydeo merged 2 commits into v1.5.0 from 03-27-feat_add_model_alias on Apr 2, 2026

Conversation

@Pratham-Mishra04
Collaborator

Summary

Briefly explain the purpose of this PR and the problem it solves.

Changes

  • What was changed and why
  • Any notable design decisions or trade-offs

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Describe the steps to validate this change. Include commands and expected outcomes.

# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build

If adding new configs or environment variables, document them here.

Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

Breaking changes

  • Yes
  • No

If yes, describe impact and migration instructions.

Related issues

Link related issues and discussions. Example: Closes #123

Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai
Contributor

coderabbitai Bot commented Mar 28, 2026

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4450e3c5-2e16-4787-bda3-bed9330a3f9f

📥 Commits

Reviewing files that changed from the base of the PR and between 99bcebe and 5fa7f30.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (183)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/image_edit.go
  • core/internal/llmtests/image_generation.go
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/speech_synthesis.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/internal/llmtests/transcription_stream.go
  • core/internal/llmtests/video.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/batch.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/anthropic/models.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/text.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/utils.go
  • core/providers/azure/azure.go
  • core/providers/azure/files.go
  • core/providers/azure/models.go
  • core/providers/azure/utils.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/invoke.go
  • core/providers/bedrock/models.go
  • core/providers/bedrock/rerank_test.go
  • core/providers/bedrock/s3.go
  • core/providers/bedrock/signer.go
  • core/providers/bedrock/text.go
  • core/providers/cerebras/cerebras.go
  • core/providers/cohere/chat.go
  • core/providers/cohere/cohere.go
  • core/providers/cohere/errors.go
  • core/providers/cohere/models.go
  • core/providers/elevenlabs/elevenlabs.go
  • core/providers/elevenlabs/errors.go
  • core/providers/elevenlabs/models.go
  • core/providers/gemini/batch.go
  • core/providers/gemini/errors.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/models.go
  • core/providers/gemini/videos.go
  • core/providers/groq/groq.go
  • core/providers/huggingface/errors.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/responses.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/huggingface/utils.go
  • core/providers/mistral/mistral.go
  • core/providers/mistral/models.go
  • core/providers/mistral/transcription.go
  • core/providers/nebius/errors.go
  • core/providers/nebius/nebius.go
  • core/providers/ollama/ollama.go
  • core/providers/openai/batch.go
  • core/providers/openai/errors.go
  • core/providers/openai/errors_test.go
  • core/providers/openai/files.go
  • core/providers/openai/images.go
  • core/providers/openai/large_payload.go
  • core/providers/openai/models.go
  • core/providers/openai/openai.go
  • core/providers/openai/transcription.go
  • core/providers/openai/videos.go
  • core/providers/openrouter/openrouter.go
  • core/providers/parasail/parasail.go
  • core/providers/perplexity/chat.go
  • core/providers/perplexity/perplexity.go
  • core/providers/replicate/errors.go
  • core/providers/replicate/files.go
  • core/providers/replicate/images.go
  • core/providers/replicate/models.go
  • core/providers/replicate/replicate.go
  • core/providers/replicate/utils.go
  • core/providers/replicate/videos.go
  • core/providers/runway/errors.go
  • core/providers/runway/runway.go
  • core/providers/runway/videos.go
  • core/providers/sgl/sgl.go
  • core/providers/utils/large_response.go
  • core/providers/utils/make_request_test.go
  • core/providers/utils/models.go
  • core/providers/utils/utils.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/errors.go
  • core/providers/vertex/models.go
  • core/providers/vertex/rerank.go
  • core/providers/vertex/rerank_test.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/utils.go
  • core/providers/vllm/vllm.go
  • core/providers/xai/errors.go
  • core/providers/xai/xai.go
  • core/schemas/account.go
  • core/schemas/bifrost.go
  • core/schemas/chatcompletions.go
  • core/schemas/models.go
  • core/schemas/models_test.go
  • core/schemas/tracer.go
  • core/utils.go
  • docs/features/litellm-compat.mdx
  • docs/openapi/schemas/management/providers.yaml
  • docs/providers/provider-routing.mdx
  • docs/providers/supported-providers/azure.mdx
  • docs/providers/supported-providers/bedrock.mdx
  • docs/providers/supported-providers/replicate.mdx
  • docs/providers/supported-providers/vertex.mdx
  • docs/quickstart/gateway/provider-configuration.mdx
  • docs/quickstart/go-sdk/provider-configuration.mdx
  • framework/configstore/clientconfig.go
  • framework/configstore/encryption_test.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/encryption_test.go
  • framework/configstore/tables/key.go
  • framework/configstore/tables/virtualkey.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/sync.go
  • framework/streaming/accumulator_test.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/streaming/types.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/governance/model_provider_governance_test.go
  • plugins/governance/store.go
  • plugins/litellmcompat/texttochat.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/logging/utils.go
  • plugins/logging/writer.go
  • plugins/maxim/main.go
  • plugins/mocker/main.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/telemetry/main.go
  • tests/governance/e2e_test.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/provider_keys.go
  • transports/bifrost-http/handlers/wsresponses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/cursor.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/config_test.go
  • transports/config.schema.json
  • transports/schema_test/config_schema_test.go
  • ui/app/globals.css
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/columns.tsx
  • ui/app/workspace/providers/dialogs/addNewKeySheet.tsx
  • ui/app/workspace/providers/fragments/apiKeysFormFragment.tsx
  • ui/app/workspace/routing-rules/tree/page.tsx
  • ui/app/workspace/routing-rules/tree/views/routingTreeView.tsx
  • ui/app/workspace/routing-rules/views/routingRulesView.tsx
  • ui/components/filters/filterPopover.tsx
  • ui/components/prompts/promptsView.tsx
  • ui/components/ui/headersTable.tsx
  • ui/lib/constants/config.ts
  • ui/lib/schemas/providerForm.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
  • ui/lib/utils/validation.ts
  • ui/package.json
"Refactor early, refactor often. Just as you might weed and rearrange a garden, rewrite, rework, and re-architect code when it needs it. Fix the root of the problem."
📝 Walkthrough

Walkthrough

Responses and errors now carry OriginalModelRequested and ResolvedModelUsed instead of ModelRequested/ModelDeployment; keys gain a top-level Aliases map that workers resolve against; list-models filtering/backfill is centralized into a ListModelsPipeline; and many provider error/stream helpers were simplified to omit provider/request/model provenance.
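For orientation, a minimal, self-contained sketch of the alias-resolution idea described above. KeyAliases and Resolve mirror the names in this summary; the real declarations in core/schemas may differ in detail.

package main

import "fmt"

// KeyAliases maps a user-facing model name to the provider-side name
// (for example an Azure deployment); it stands in for the type added on Key.
type KeyAliases map[string]string

// Resolve returns the provider-side name for model, falling back to the
// requested name when no alias is configured.
func (a KeyAliases) Resolve(model string) string {
	if resolved, ok := a[model]; ok && resolved != "" {
		return resolved
	}
	return model
}

func main() {
	aliases := KeyAliases{"gpt-4o": "my-gpt4o-deployment"}
	original := "gpt-4o"
	resolved := aliases.Resolve(original)
	fmt.Printf("original=%s resolved=%s\n", original, resolved)
}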

Changes

  • Core extra-fields & worker routing (core/bifrost.go, core/utils.go, core/schemas/bifrost.go, core/schemas/models.go) — Introduce PopulateExtraFields; replace ModelRequested/ModelDeployment with OriginalModelRequested/ResolvedModelUsed; workers record the original model, resolve via Key.Aliases.Resolve, set the request model, and populate extra fields.
  • Key schema, persistence & migrations (core/schemas/account.go, framework/configstore/..., framework/configstore/tables/*, framework/configstore/migrations.go, transports/config.schema.json) — Add the Key.Aliases type, validation, and Resolve; remove provider-specific deployments; add Replicate.UseDeploymentsEndpoint; update DB persistence, migrations, and hashing/backfill to store aliases.
  • List-models pipeline util (core/providers/utils/models.go) — New ListModelsPipeline (FilterModel, BackfillModels, ShouldEarlyExit, DefaultMatchFns, ToDisplayName) centralizes the alias/allowlist/blacklist/backfill logic used by providers; a toy sketch of the idea follows this table.
  • Providers: list-models & model mapping (core/providers/*/models.go, core/providers/*/types.go) — Adopt the pipeline: ToBifrostListModelsResponse(..., aliases map[string]string, ...); emit standardized IDs (providerKey/ResolvedID) and an optional Alias; remove bespoke per-provider filtering/backfill.
  • Provider error plumbing & streaming (core/providers/*/*errors.go, core/providers/*/*stream.go, core/providers/*/utils.go) — Remove provider/request/model params from many parse-error helpers and NewBifrost*Error calls; stop setting Provider/ModelRequested/RequestType in most ExtraFields; streaming chunks retain only ChunkIndex/Latency/Raw.
  • Streaming, large-response & helper signatures (core/providers/utils/large_response.go, core/providers/utils/*, core/providers/*/*stream.go, core/providers/utils/utils.go) — Simplify helper signatures (dropping provider/model/requestType args); update FinalizeResponseWithLargeDetection and streaming cancellation/timeout/error flows to omit provenance.
  • Streaming/tracing payloads & types (framework/streaming/types.go, core/schemas/tracer.go, framework/streaming/*) — Replace the single Model with RequestedModel + ResolvedModel across streaming/tracing; ExtraFields are populated as OriginalModelRequested/ResolvedModelUsed.
  • Transports, handlers & integrations (transports/*, transports/bifrost-http/integrations/*, transports/bifrost-http/handlers/*) — Validate/accept aliases on key create/update; adapt Claude/passthrough detection, cursor model selection, and pricing fallback to prefer the Alias/ResolvedModel fields.
  • Logging, telemetry & storage (plugins/logging/*, plugins/telemetry/*, framework/logstore/*, plugins/logging/writer.go) — Logging/telemetry extract the requested and resolved models; add alias support, a new GetAvailableAliases, applyModelAlias, an alias column in logs, and alias-based filters/queries.
  • Tests, internal LLM tests & plugins (core/internal/llmtests/*, plugins/*, tests/*, many tests) — Update tests/fixtures and assertions to use OriginalModelRequested/ResolvedModelUsed and to use key Aliases where fixtures previously used deployments/ModelRequested.
  • Docs & schemas (docs/**, transports/config.schema.json, config.schema.json) — Replace deployments with aliases in docs, the OpenAPI/config schema, and examples; update quickstarts and provider docs to alias-based config.
  • Misc refactors & signature trims (assorted provider files: images.go, files.go, s3.go, signer.go, videos.go, many *errors.go) — Remove provider-name args from many error constructors, lowercase some messages, and simplify signatures without changing behavior.
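To make the filter/backfill idea concrete, a toy pass in the spirit of the ListModelsPipeline described above; the real FilterModel/BackfillModels signatures and match functions are not shown on this page, so everything below is an assumption.

// listModels filters a provider's raw model IDs against an allowlist, then
// backfills user-facing alias names whose resolved targets were returned.
func listModels(raw []string, aliases map[string]string, allowed map[string]bool) []string {
	out := make([]string, 0, len(raw))
	seen := make(map[string]bool, len(raw))
	for _, id := range raw {
		if len(allowed) > 0 && !allowed[id] {
			continue // allowlist/blacklist filtering would hook in here
		}
		out = append(out, id)
		seen[id] = true
	}
	// Backfill: expose each aliased user-facing name whose resolved model the
	// provider actually serves, so clients can request the alias directly.
	for userFacing, resolved := range aliases {
		if seen[resolved] && !seen[userFacing] {
			out = append(out, userFacing)
			seen[userFacing] = true
		}
	}
	return out
}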

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Gateway as Gateway
    participant KeyStore as KeyStore
    participant Worker as Worker
    participant Provider as Provider

    Client->>Gateway: Request(model = user-facing)
    Gateway->>KeyStore: Fetch key (includes Aliases)
    KeyStore-->>Gateway: Return key
    Gateway->>Worker: Enqueue request (OriginalModelRequested = user model)
    Worker->>Worker: original := OriginalModelRequested
    Worker->>Worker: resolved := key.Aliases.Resolve(original)
    Worker->>Worker: req.SetModel(resolved)
    Worker->>Provider: Send request (model = resolved)
    Provider-->>Worker: Response / Error
    Worker->>Worker: resp/err.PopulateExtraFields(reqType, provider, original, resolved)
    Worker-->>Gateway: Return (ExtraFields: original_model_requested, resolved_model_used)
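The PopulateExtraFields step in the diagram amounts to a field-by-field copy onto the outgoing response or error. A stand-in sketch with local types rather than the real core/schemas ones:

// extraFields stands in for schemas.BifrostResponseExtraFields.
type extraFields struct {
	RequestType            string
	Provider               string
	OriginalModelRequested string
	ResolvedModelUsed      string
}

type bifrostResponse struct {
	ExtraFields extraFields
}

// populateExtraFields records both the user-facing model and the
// alias-resolved model on the payload, as the worker does in the diagram.
func (r *bifrostResponse) populateExtraFields(reqType, provider, original, resolved string) {
	r.ExtraFields.RequestType = reqType
	r.ExtraFields.Provider = provider
	r.ExtraFields.OriginalModelRequested = original
	r.ExtraFields.ResolvedModelUsed = resolved
}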

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

"I hopped through code with nimble paws,
Aliases mapped beneath my claws,
Requested and resolved — names paired anew,
Pipelines sift and listings grew,
🐇✨ — a rabbit's cheer for clearer view!"


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Collaborator Author

Pratham-Mishra04 commented Mar 28, 2026

@Pratham-Mishra04 Pratham-Mishra04 linked an issue Mar 28, 2026 that may be closed by this pull request
1 task
Comment thread core/providers/bedrock/bedrock.go Outdated
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from 0024488 to f45f847 Compare March 30, 2026 05:00
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-26-feat_adds_routing_tree_view branch from 4d4ec62 to 85e5c09 Compare March 30, 2026 06:45
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch 2 times, most recently from ea47e45 to 086f1ea Compare March 30, 2026 07:38
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-26-feat_adds_routing_tree_view branch from 85e5c09 to 8f18d92 Compare March 30, 2026 07:38
@Pratham-Mishra04 Pratham-Mishra04 marked this pull request as ready for review March 30, 2026 07:47
Collaborator Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai Bot commented Mar 30, 2026

✅ Actions performed

Full review triggered.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-26-feat_adds_routing_tree_view branch from 8f18d92 to ec8511b Compare March 30, 2026 08:14
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from 086f1ea to b617cb3 Compare March 30, 2026 08:14
Collaborator Author

@greptile full review

Contributor

@greptile

@greptile-apps
Contributor

greptile-apps Bot commented Mar 30, 2026

Confidence Score: 5/5

  • Safe to merge; all remaining findings are non-blocking style/quality issues (P2).
  • All previously flagged P1 concerns (missing ReplicateKeyConfig in merge functions, OpenRouter alias value not prefix-normalized, migration backfill) are resolved. Only two P2 items remain: a dead conditional branch in selectKeyFromProviderForModel (both branches return the same error string), and a minor race in tryRequest's ctx.Done path where OriginalModelRequested could show the resolved model rather than the original alias in timeout errors — neither blocks correctness for the primary user flow. Per scoring guidance, an all-P2 set defaults to 5/5.
  • core/bifrost.go — the duplicate error-message conditional at the bottom of selectKeyFromProviderForModel, and the ctx.Done model-capture race in tryRequest.

Important Files Changed

  • core/schemas/account.go — Introduces KeyAliases (a unified map[string]string) replacing the per-provider Deployments maps; removes AzureKeyConfig.Deployments, VertexKeyConfig.Deployments, and BedrockKeyConfig.Deployments; adds ReplicateKeyConfig.UseDeploymentsEndpoint; validates alias keys case-insensitively.
  • core/bifrost.go — Key selection removes per-provider deployment checks; adds validateKey and alias resolution via key.Aliases.Resolve(model) followed by req.SetModel(resolvedModel) in requestWorker; PopulateExtraFields is used uniformly for OriginalModelRequested/ResolvedModelUsed. A dead conditional in selectKeyFromProviderForModel and a race in tryRequest's ctx-done path are noted.
  • core/schemas/bifrost.go — Renames model_requested to original_model_requested, drops model_deployment, and adds resolved_model_used; adds PopulateExtraFields methods on BifrostResponse and BifrostError that do field-by-field assignment (preserving RawRequest/RawResponse/KeyStatuses).
  • framework/configstore/migrations.go — Adds migrationDropDeploymentColumnsAndAddAliases (copies per-provider deployment JSON into the unified aliases_json, drops the old columns, recomputes hashes) and migrationAddReplicateKeyConfigColumn (adds replicate_use_deployments_endpoint, backfills existing Replicate keys that had deployments → true); a backfill sketch follows this list.
  • framework/configstore/tables/key.go — Drops the AzureDeploymentsJSON, VertexDeploymentsJSON, BedrockDeploymentsJSON, and ReplicateDeploymentsJSON columns; adds a unified AliasesJSON (encrypted at rest via BeforeSave/AfterFind) and ReplicateUseDeploymentsEndpoint.
  • core/providers/utils/models.go — New shared ListModelsPipeline for filtering and backfilling ListModels responses across all providers; replaces duplicated per-provider filtering logic with a unified FilterModel/BackfillModels API supporting configurable match functions.
  • core/providers/azure/azure.go — Removes the getModelDeployment/validateKeyConfig helpers; all deployment-name lookups now use request.Model directly (the alias is already resolved upstream by requestWorker); completeRequest returns the raw error body on non-2xx responses for richer EnrichError output.
  • transports/bifrost-http/lib/config.go — Both mergeProviderKeys and reconcileProviderKeys now include Aliases: dbKey.Aliases; alias validation is called during config load; the previously noted ReplicateKeyConfig omission is confirmed fixed.
  • core/providers/openrouter/openrouter.go — OpenRouter prefix normalization now strips the "openrouter/" prefix from both alias keys and alias values before feeding them into the pipeline, resolving the previously flagged mismatch between API-returned IDs and user-supplied alias values.
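A rough sketch of the backfill idea behind migrationDropDeploymentColumnsAndAddAliases; the helper below is hypothetical and omits the column handling, GORM plumbing, and hash recomputation the real migration performs.

// mergeDeploymentsIntoAliases folds the old per-provider deployment maps into
// the single map persisted in the new unified aliases column.
func mergeDeploymentsIntoAliases(azure, vertex, bedrock, replicate map[string]string) map[string]string {
	aliases := make(map[string]string)
	for _, deployments := range []map[string]string{azure, vertex, bedrock, replicate} {
		for model, deployment := range deployments {
			aliases[model] = deployment
		}
	}
	return aliases
}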

Reviews (6): Last reviewed commit: "feat: add model alias"

@coderabbitai
Contributor

coderabbitai Bot commented Mar 30, 2026

✅ Actions performed

Full review triggered.

Comment thread framework/configstore/migrations.go
Comment thread transports/bifrost-http/lib/config.go
Comment thread core/providers/openrouter/openrouter.go
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (16)
core/providers/openai/videos.go (1)

135-208: ⚠️ Potential issue | 🔴 Critical

Remove unused providerName parameter from function signature and callsite.

The parseVideoGenerationFormDataBodyFromRequest function at core/providers/openai/videos.go:132 declares a providerName parameter that is never used in the function body, causing a Go compilation failure.

Fixes required at two locations

1. Function signature in core/providers/openai/videos.go:132:

-func parseVideoGenerationFormDataBodyFromRequest(writer *multipart.Writer, openaiReq *OpenAIVideoGenerationRequest, providerName schemas.ModelProvider) *schemas.BifrostError {
+func parseVideoGenerationFormDataBodyFromRequest(writer *multipart.Writer, openaiReq *OpenAIVideoGenerationRequest) *schemas.BifrostError {

2. Callsite in core/providers/openai/openai.go:3548:

-	if err := parseVideoGenerationFormDataBodyFromRequest(writer, reqBody, providerName); err != nil {
+	if err := parseVideoGenerationFormDataBodyFromRequest(writer, reqBody); err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/videos.go` around lines 135 - 208, The function
parseVideoGenerationFormDataBodyFromRequest currently takes an unused
providerName parameter; remove that parameter from the function signature in
core/providers/openai/videos.go (function
parseVideoGenerationFormDataBodyFromRequest) and update its internal references
accordingly, then remove the corresponding argument at the callsite in
core/providers/openai/openai.go (the call that invokes
parseVideoGenerationFormDataBodyFromRequest around the location currently at
line ~3548) so the call matches the new signature; rebuild to confirm no other
callsites pass providerName.
core/schemas/chatcompletions.go (1)

62-71: ⚠️ Potential issue | 🟠 Major

Carry ResolvedModelUsed through the chat→text compatibility mapper.

This method now copies OriginalModelRequested, but it still rebuilds ExtraFields without the new resolved-model field. Any alias-resolved chat response that goes through ToTextCompletionResponse() will lose that metadata.

Proposed fix
 			ExtraFields: BifrostResponseExtraFields{
 				RequestType:             TextCompletionRequest,
 				ChunkIndex:              cr.ExtraFields.ChunkIndex,
 				Provider:                cr.ExtraFields.Provider,
-				OriginalModelRequested:           cr.ExtraFields.OriginalModelRequested,
+				OriginalModelRequested:  cr.ExtraFields.OriginalModelRequested,
+				ResolvedModelUsed:       cr.ExtraFields.ResolvedModelUsed,
 				Latency:                 cr.ExtraFields.Latency,
 				RawResponse:             cr.ExtraFields.RawResponse,
 				CacheDebug:              cr.ExtraFields.CacheDebug,
 				ProviderResponseHeaders: cr.ExtraFields.ProviderResponseHeaders,
 			},

Apply the same addition in all four return paths in this method.

Also applies to: 95-104, 131-140, 151-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/chatcompletions.go` around lines 62 - 71, The
ToTextCompletionResponse method in core/schemas/chatcompletions.go rebuilds
BifrostResponseExtraFields but omits the ResolvedModelUsed field, causing
alias-resolved chat responses to lose that metadata; update every place that
constructs BifrostResponseExtraFields inside ToTextCompletionResponse (the four
return paths around lines where ExtraFields is set) to include
ResolvedModelUsed: cr.ExtraFields.ResolvedModelUsed (mirroring how
OriginalModelRequested is copied) so the resolved model carries through the
chat→text compatibility mapper.
core/internal/llmtests/response_validation.go (1)

849-854: ⚠️ Potential issue | 🟡 Minor

Use model fallback in Claude status check to avoid false failures.

At Line 850, status validation uses only response.ExtraFields.ResolvedModelUsed, but this field can be empty in some paths while response.Model is populated. That can misclassify Claude responses and incorrectly require response.Status.

💡 Suggested fix
-	model := response.ExtraFields.ResolvedModelUsed
+	model := strings.TrimSpace(response.ExtraFields.ResolvedModelUsed)
+	if model == "" {
+		model = strings.TrimSpace(response.Model)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/response_validation.go` around lines 849 - 854, The
Claude detection currently only checks response.ExtraFields.ResolvedModelUsed
which can be empty; change the logic where model is set so it falls back to
response.Model when ResolvedModelUsed is empty (i.e., set model :=
response.ExtraFields.ResolvedModelUsed; if model == "" then model =
response.Model) and use that combined value in the provider check (the clause
that tests provider == schemas.OpenAI || provider == schemas.Azure &&
!strings.Contains(strings.ToLower(model), "claude")) before validating
response.Status to avoid false failures.
core/providers/ollama/ollama.go (1)

175-183: ⚠️ Potential issue | 🔴 Critical

Ollama's Responses() method leaves metadata fields unpopulated, degrading downstream pricing and cache attribution.

The converter ToBifrostResponsesResponse() only copies existing ExtraFields from the chat response and sets RequestType. It does not populate Provider, OriginalModelRequested, or ResolvedModelUsed — fields that are non-optional per the schema definition (no omitempty tag). Since ChatCompletion() (via OpenAI handler) also skips these fields, the converted response carries zero-valued metadata, causing silent degradation in pricing calculations and cache operations.

Compare with Anthropic: it manually assigns Latency and ProviderResponseHeaders after conversion. Ollama should similarly set:

  • ExtraFields.Provider = provider.GetProviderKey()
  • ExtraFields.OriginalModelRequested = request.Model
  • ExtraFields.ResolvedModelUsed = resolved model (from context if aliasing is in use)
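In code, the assignments above would look roughly like the following fragment inside Responses(); the variable names (chatResponse, request, provider) are assumed from the review text, not copied from the repository.

response := chatResponse.ToBifrostResponsesResponse()
// RequestType is already set by the converter; fill in the remaining fields.
response.ExtraFields.Provider = provider.GetProviderKey()
response.ExtraFields.OriginalModelRequested = request.Model
// When aliasing is in use, the resolved name should come from context;
// request.Model is only a stand-in when no alias applies.
response.ExtraFields.ResolvedModelUsed = request.Model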
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/ollama/ollama.go` around lines 175 - 183, The
OllamaProvider.Responses method must populate non-optional metadata fields after
converting chatResponse via ToBifrostResponsesResponse: set
response.ExtraFields.Provider = provider.GetProviderKey(),
response.ExtraFields.OriginalModelRequested = request.Model, and
response.ExtraFields.ResolvedModelUsed = the actual resolved model name (obtain
from the chatResponse/context where model aliasing is resolved or from
provider.ChatCompletion result); ensure these assignments occur before returning
response so Pricing and cache attribution receive Provider,
OriginalModelRequested, and ResolvedModelUsed values.
core/providers/anthropic/errors.go (1)

57-67: ⚠️ Potential issue | 🟠 Major

Keep error extra fields populated after the rename.

Removing meta here strips provider/request/model context from BifrostError, so alias-related failures lose the original-vs-resolved model breadcrumb and downstream logging/conversion no longer gets metadata-rich errors. The rename should switch these fields to OriginalModelRequested / ResolvedModelUsed, not drop them entirely.

As per coding guidelines, "Every provider implementation must include errors.go with an ErrorConverter function that parses provider-specific HTTP errors into *schemas.BifrostError with metadata."
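A sketch of what keeping the renamed fields could look like at this call site; whether provider, request type, and the original/resolved model pair are still reachable here is an assumption about the surrounding code.

bifrostErr := parseAnthropicError(resp)
// Re-attach the renamed provenance fields instead of dropping them outright.
bifrostErr.ExtraFields.Provider = providerName
bifrostErr.ExtraFields.RequestType = requestType
bifrostErr.ExtraFields.OriginalModelRequested = originalModel
bifrostErr.ExtraFields.ResolvedModelUsed = resolvedModel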

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/errors.go` around lines 57 - 67, parseAnthropicError
currently drops provider/model context when mapping AnthropicError to
*schemas.BifrostError; update parseAnthropicError to preserve the renamed
metadata fields by copying errorResp.OriginalModelRequested and
errorResp.ResolvedModelUsed into the corresponding fields on bifrostErr (e.g.,
bifrostErr.OriginalModelRequested and bifrostErr.ResolvedModelUsed) and, if the
provider error contains any additional meta map, copy it into
bifrostErr.Error.Extra (creating bifrostErr.Error if nil) so alias/model
breadcrumbs and extra context are retained when converting AnthropicError to
schemas.BifrostError.
core/providers/runway/errors.go (1)

12-37: ⚠️ Potential issue | 🟠 Major

Preserve provider/request metadata on Runway errors too.

This helper now returns a bare parsed error, so BifrostError.ExtraFields loses provider/model/request context exactly where the alias split makes that context more important. Please populate the renamed error extra fields here instead of returning only the message/status payload.

As per coding guidelines, "Every provider implementation must include errors.go with an ErrorConverter function that parses provider-specific HTTP errors into *schemas.BifrostError with metadata."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/runway/errors.go` around lines 12 - 37, parseRunwayError
currently returns only the parsed message/status and discards provider/request
metadata; update it to preserve and populate BifrostError.ExtraFields before
returning. After calling providerUtils.HandleProviderAPIError(resp, &errorResp)
ensure bifrostErr.ExtraFields is non-nil and add keys such as
"provider":"runway", "status_code" (from resp.StatusCode()), "method" and "url"
(from the request/resp if available), "request_id" or any provider-specific
identifiers from headers, and include the raw parsed payload (errorResp) under a
field like "provider_payload"; keep existing message trimming logic but do not
overwrite ExtraFields—use parseRunwayError, RunwayAPIError, and
providerUtils.HandleProviderAPIError as the insertion points to attach these
metadata fields.
core/providers/openai/files.go (1)

86-110: ⚠️ Potential issue | 🟡 Minor

Remove unused providerName parameter from ToBifrostFileRetrieveResponse.

The providerName parameter is declared but never used in the function body. The method only populates Latency in ExtraFields, consistent with the ToBifrostFileUploadResponse refactoring.

Note: Anthropic's ToBifrostFileRetrieveResponse already removed this parameter. Removing it from OpenAI requires updating call sites in openai/openai.go:4976 and azure/azure.go:1923.

Proposed fix

Remove parameter from function signature in core/providers/openai/files.go:86:

-func (r *OpenAIFileResponse) ToBifrostFileRetrieveResponse(providerName schemas.ModelProvider, latency time.Duration, sendBackRawRequest bool, sendBackRawResponse bool, rawRequest interface{}, rawResponse interface{}) *schemas.BifrostFileRetrieveResponse {
+func (r *OpenAIFileResponse) ToBifrostFileRetrieveResponse(latency time.Duration, sendBackRawRequest bool, sendBackRawResponse bool, rawRequest interface{}, rawResponse interface{}) *schemas.BifrostFileRetrieveResponse {

Update call sites:

  • openai/openai.go:4976: remove providerName argument
  • azure/azure.go:1923: remove providerName argument
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/files.go` around lines 86 - 110, Remove the unused
providerName parameter from the method signature of
ToBifrostFileRetrieveResponse in core/providers/openai/files.go and adjust its
callers to stop passing that argument; specifically delete the providerName
parameter from func (r *OpenAIFileResponse) ToBifrostFileRetrieveResponse(...)
and update the call sites that currently pass providerName (in openai/openai.go
and azure/azure.go) to call the method with the remaining arguments only.
core/providers/openai/errors.go (1)

51-52: ⚠️ Potential issue | 🟡 Minor

Remove stale comment.

This comment claims to "Set ExtraFields unconditionally" but there's no corresponding implementation. Per the PR changes, ExtraFields are now populated elsewhere. This orphaned comment is misleading.

🧹 Suggested fix
-	// Set ExtraFields unconditionally so provider/model/request metadata is always attached
-
 	return bifrostErr
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/errors.go` around lines 51 - 52, Remove the stale
comment in core/providers/openai/errors.go that states "Set ExtraFields
unconditionally" since ExtraFields are now populated elsewhere; locate the
comment near the errors.go error-handling code (where ExtraFields was
referenced) and delete that misleading comment so the file's comments match the
current implementation.
core/bifrost.go (1)

5085-5097: ⚠️ Potential issue | 🔴 Critical

Don't write stream success metadata through result.

Any successful streaming request reaches Line 5145 with result == nil, so this panics the worker before req.ResponseStream is sent. It also means successful chunks never get OriginalModelRequested / ResolvedModelUsed. Keep the result.GetExtraFields() mutation in the non-stream branch only, and attach stream metadata inside postHookRunner (or the chunk builder).

Minimal guard for the panic
-			result.GetExtraFields().Provider = provider.GetProviderKey()
-			result.GetExtraFields().RequestType = req.RequestType
-			result.GetExtraFields().OriginalModelRequested = originalModelRequested
-			result.GetExtraFields().ResolvedModelUsed = resolvedModel
-
-			if IsStreamRequestType(req.RequestType) {
+			if IsStreamRequestType(req.RequestType) {
 				// Send stream with context awareness to prevent deadlock
 				select {
 				case req.ResponseStream <- stream:
 					// Stream sent successfully
 				case <-req.Context.Done():
 					// Client no longer listening, log and continue
 					bifrost.logger.Debug("Client context cancelled while sending stream response")
 				case <-time.After(5 * time.Second):
 					// Timeout to prevent indefinite blocking
 					bifrost.logger.Warn("Timeout while sending stream response, client may have disconnected")
 				}
 			} else {
+				result.GetExtraFields().Provider = provider.GetProviderKey()
+				result.GetExtraFields().RequestType = req.RequestType
+				result.GetExtraFields().OriginalModelRequested = originalModelRequested
+				result.GetExtraFields().ResolvedModelUsed = resolvedModel
 				// Send response with context awareness to prevent deadlock
 				select {

Also applies to: 5145-5161

docs/quickstart/gateway/provider-configuration.mdx (3)

1282-1300: ⚠️ Potential issue | 🟡 Minor

Include project_number in the fine-tuned Vertex examples.

These examples now demonstrate fine-tuned model aliases, and the note below says fine-tuned models require Project Number. As written, the copied vertex_key_config blocks are incomplete.

Doc tweak
             "vertex_key_config": {
                 "project_id": "env.VERTEX_PROJECT_ID",
+                "project_number": "env.VERTEX_PROJECT_NUMBER",
                 "region": "us-central1",
                 "auth_credentials": "env.VERTEX_CREDENTIALS"
             }

Apply the same addition to both the API and config.json examples.

Also applies to: 1310-1326, 1339-1343

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/quickstart/gateway/provider-configuration.mdx` around lines 1282 - 1300,
Add the missing project_number field to the Vertex key configurations in the
examples so the fine-tuned model alias snippets are complete: update the
vertex_key_config objects (used in the curl POST to /api/providers and the
config.json examples) to include "project_number": "env.VERTEX_PROJECT_NUMBER"
alongside project_id, region and auth_credentials, and apply the same change to
the other occurrences referenced around lines 1310-1326 and 1339-1343 so the
fine-tuned model note is accurate.

865-877: ⚠️ Potential issue | 🟡 Minor

Refresh the raw payload examples for the slimmer extra_fields shape.

This PR removes provider/request metadata from response extra_fields, but both snippets still show extra_fields.provider. Readers enabling raw payloads will expect a field that no longer appears.

Doc tweak
 "extra_fields": {
-    "provider": "openai",
     "raw_response": {
         // Original OpenAI response here
     }
 }
...
 "extra_fields": {
-    "provider": "openai",
     "raw_request": {
         // Original request sent to OpenAI here
     }
 }

Also applies to: 941-953

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/quickstart/gateway/provider-configuration.mdx` around lines 865 - 877,
Update the example JSON shown for extra_fields.raw_response to match the slimmer
shape by removing the obsolete extra_fields.provider property and any
provider/request metadata; ensure both snippets that reference extra_fields
(including the second occurrence) only show extra_fields.raw_response (with the
original provider payload) and no extra_fields.provider field so the docs
reflect the new response shape.

1002-1020: ⚠️ Potential issue | 🟡 Minor

Update the Azure Web UI steps to use aliases.

The API and config.json examples in these sections have already switched to key-level aliases, but the Web UI steps still tell users to configure deployments. That leaves the UI path documenting the old mapping mechanism.

Doc tweak
-7. Configure **Deployments**: Map model names to deployment names
+7. Configure **Aliases** on the key: Map request model names to Azure deployment names
...
-4. Configure **Deployments**: Map model names to deployment names
+4. Configure **Aliases** on the key: Map request model names to Azure deployment names

Also applies to: 1091-1106

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/quickstart/gateway/provider-configuration.mdx` around lines 1002 - 1020,
The Azure Web UI instructions still instruct users to "Configure Deployments"
while the rest of the docs and config examples use key-level "aliases"; update
the Web UI step(s) in the Azure Entra ID section (the "Using Web UI" Tab under
the "Azure" model provider configuration) to replace any mention of "Configure
Deployments" or "Map model names to deployment names" with "Configure Aliases"
and instruct users to map model names to alias names (e.g., Model -> Alias), and
apply the same change to the other identical Web UI block referenced later in
the file (the repeated Azure UI steps).
core/providers/huggingface/huggingface.go (1)

1060-1064: ⚠️ Potential issue | 🟡 Minor

Start the fal-ai stream timer after the HTTP handshake.

Both handlers still derive chunk/completion latency from a startTime captured before client.Do, so the emitted values include connection/setup time. Move startTime and the initial lastChunkTime assignment into each streaming goroutine after the response stream is established.

Suggested fix
-	// Capture start time before making the HTTP request for latency calculation
-	startTime := time.Now()
-
 	// Make the request
 	err := provider.client.Do(req, resp)
 	if err != nil {
 		...
 	}

 	go func() {
 		defer providerUtils.ReleaseStreamingResponse(resp)
 		defer close(responseChan)
+
+		startTime := time.Now()
+		lastChunkTime := startTime

 		if resp.BodyStream() == nil {
 			...
 		}
-
-		lastChunkTime := startTime
 		chunkIndex := 0

Apply the same move in both ImageGenerationStream and ImageEditStream.

Based on learnings, "initialize the streaming startTime inside the streaming goroutine after client.Do and the stream is set up, so the final-chunk latency reflects post-handshake streaming time."

Also applies to: 1198-1200, 1239-1241, 1439-1443, 1577-1579, 1618-1620

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` around lines 1060 - 1064, The
stream latency timers are started before the HTTP handshake, causing
chunk/completion latency to include connection/setup time; for
ImageGenerationStream and ImageEditStream (and the other streaming handlers
noted), move the startTime := time.Now() and the initial lastChunkTime
assignment inside the streaming goroutine immediately after
provider.client.Do(req, resp) succeeds and the response stream is established so
all emitted latencies reflect post-handshake streaming time only. Locate the
startTime and lastChunkTime initializations in those handlers and relocate them
into the goroutine scope after the Do call and any stream-setup checks (e.g.,
after verifying resp/reader) so final-chunk and chunk latencies are computed
from that new start time.
core/providers/azure/azure.go (2)

203-243: ⚠️ Potential issue | 🔴 Critical

Guard key.AzureKeyConfig before dereferencing it.

completeRequest now reads key.AzureKeyConfig.Endpoint before proving the Azure config exists. TextCompletion, ChatCompletion, Responses, and Embedding all flow through this helper without their own nil guard, so a malformed or partially reconstructed Azure key now panics the request path instead of returning a configuration error. The same direct-deref pattern is also repeated in the new inline URL builders below.

🔧 Minimal fix for the shared helper
 func (provider *AzureProvider) completeRequest(
 	ctx *schemas.BifrostContext,
 	jsonData []byte,
 	path string,
 	key schemas.Key,
 	model string,
 ) ([]byte, time.Duration, map[string]string, *schemas.BifrostError) {
+	if key.AzureKeyConfig == nil {
+		return nil, 0, nil, providerUtils.NewConfigurationError("azure key config not set")
+	}
+
 	// Create the request with the JSON body
 	req := fasthttp.AcquireRequest()
 	resp := fasthttp.AcquireResponse()
 	defer fasthttp.ReleaseRequest(req)
@@
 	endpoint := key.AzureKeyConfig.Endpoint.GetValue()
 	if endpoint == "" {
 		return nil, 0, nil, providerUtils.NewConfigurationError("endpoint not set")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/azure/azure.go` around lines 203 - 243, completeRequest
currently dereferences key.AzureKeyConfig (e.g., key.AzureKeyConfig.Endpoint)
without guarding for nil which can panic; update completeRequest to first check
if key.AzureKeyConfig is nil and return a configuration error (use
providerUtils.NewConfigurationError) before calling getAzureAuthHeaders or
building any endpoint/URL strings, and apply the same nil-guard when
constructing any inline Azure URLs later in this function so all uses of
key.AzureKeyConfig (including Endpoint and other fields) are safe.

227-257: ⚠️ Potential issue | 🔴 Critical

Separate Azure deployment IDs from model-family checks.

These branches use the same string for two different jobs: deciding Anthropic vs OpenAI behavior and filling the Azure deployment path. Those values diverge as soon as a deployment is renamed. If you keep the canonical model ID here, custom deployments stop working; if you pass the resolved deployment name, schemas.IsAnthropicModel(...) becomes false for Anthropic deployments like claude-prod, and Chat/Responses get sent through the OpenAI URL/header path.

Also applies to: 523-528, 596-603, 643-647, 693-697, 762-768
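A sketch of the separation being asked for: keep the canonical model ID for the family check while the deployment identifier drives the request path. Variable names and the two URL helpers are illustrative assumptions, not the repository's actual code.

family := originalModel     // canonical model id, kept solely for family detection
deployment := request.Model // alias-resolved Azure deployment id, e.g. "claude-prod"
var url string
if schemas.IsAnthropicModel(family) {
	// hypothetical helper: Anthropic-style path and headers, still addressed by deployment
	url = buildAnthropicURL(endpoint, deployment, path)
} else {
	// hypothetical helper: standard Azure OpenAI deployment path
	url = buildOpenAIURL(endpoint, deployment, path)
}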

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/azure/azure.go` around lines 227 - 257, The code mixes the
model identifier used to detect Anthropic vs OpenAI
(schemas.IsAnthropicModel(model)) with the Azure deployment name used to build
the request path, causing renamed/custom deployments to break routing; fix by
keeping the model-family check using the canonical model id (the existing model
variable) but build the Azure URL using the deployment identifier (e.g., fetch a
deploymentName from key.AzureKeyConfig.Deployment/GetValue() or similar) instead
of reusing model/path for both purposes; update the branches around
schemas.IsAnthropicModel(model), the url construction (fmt.Sprintf("%s/%s",
endpoint, path) and the api-version branches), and the same pattern in the other
affected blocks (lines referenced in the comment) so Anthropic detection uses
model while the request path uses deploymentName.
framework/streaming/images.go (1)

18-23: ⚠️ Potential issue | 🟠 Major

Do not short-circuit on control-only completed chunks.

This early return skips the reconstruction path below. If the terminal event only carries completion metadata, the accumulated image/model from earlier chunks is dropped and the final response comes back effectively empty.

💡 Suggested fix
-		if chunks[i].Delta != nil && (chunks[i].FinishReason != nil || chunks[i].Delta.Type == schemas.ImageGenerationEventTypeCompleted || chunks[i].Delta.Type == schemas.ImageEditEventTypeCompleted) {
+		if chunks[i].Delta != nil &&
+			(chunks[i].FinishReason != nil || chunks[i].Delta.Type == schemas.ImageGenerationEventTypeCompleted || chunks[i].Delta.Type == schemas.ImageEditEventTypeCompleted) &&
+			(chunks[i].Delta.B64JSON != "" || chunks[i].Delta.URL != "") {
 			finalResponse := &schemas.BifrostImageGenerationResponse{
 				ID:      chunks[i].Delta.ID,
 				Created: chunks[i].Delta.CreatedAt,
 				Model:   chunks[i].Delta.ExtraFields.OriginalModelRequested,

Also applies to: 56-57

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/streaming/images.go` around lines 18 - 23, The loop over chunks
currently treats any chunk whose Delta exists and indicates completion as a
terminal event and returns a final response immediately, which drops previously
accumulated image/model data when the completion chunk is control-only; instead,
in the chunk processing code (the loop handling chunks[] and the branch that
constructs schemas.BifrostImageGenerationResponse), detect whether the
completion chunk actually contains image/model payloads and only use its data if
present—otherwise synthesize the final response from the last non-nil
image/model encountered in earlier chunks (i.e., accumulate
lastSeenImage/lastSeenModel while iterating and fall back to them on
control-only completion); apply the same change to the analogous check at the
other location noted (lines ~56-57) so control-only completed chunks do not
short-circuit reconstruction.
🟠 Major comments (20)
core/providers/anthropic/responses.go-1433-1439 (1)

1433-1439: ⚠️ Potential issue | 🟠 Major

Model propagation is incorrectly gated and can drop Response.Model.

If bifrostResp.Response.Model is set but ExtraFields.OriginalModelRequested is empty, streamMessage.Model remains unset. Prefer Response.Model first, then fallback to OriginalModelRequested.

💡 Proposed fix
-			if bifrostResp.ExtraFields.OriginalModelRequested != "" {
-				if bifrostResp.Response != nil && bifrostResp.Response.Model != "" {
-					streamMessage.Model = bifrostResp.Response.Model
-				} else {
-					streamMessage.Model = bifrostResp.ExtraFields.OriginalModelRequested
-				}
-			}
+			if bifrostResp.Response != nil && bifrostResp.Response.Model != "" {
+				streamMessage.Model = bifrostResp.Response.Model
+			} else if bifrostResp.ExtraFields.OriginalModelRequested != "" {
+				streamMessage.Model = bifrostResp.ExtraFields.OriginalModelRequested
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/responses.go` around lines 1433 - 1439, The current
logic only sets streamMessage.Model when ExtraFields.OriginalModelRequested is
non-empty, which can drop a valid bifrostResp.Response.Model; update the
assignment in the response handling so you first check
bifrostResp.Response.Model and set streamMessage.Model to that if non-empty,
otherwise fall back to bifrostResp.ExtraFields.OriginalModelRequested; adjust
the condition around bifrostResp.Response and ExtraFields to ensure
streamMessage.Model is always set to Response.Model when available, using the
symbols bifrostResp.Response.Model,
bifrostResp.ExtraFields.OriginalModelRequested, and streamMessage.Model to
locate and change the code.
core/providers/huggingface/transcription.go-145-148 (1)

145-148: ⚠️ Potential issue | 🟠 Major

Finish the requestedModel refactor here.

requestedModel is still mandatory on Lines 140-142, but this change zeros ExtraFields, so the adapter no longer uses that value after validating it. If the core does not re-inject transcription metadata later, HuggingFace responses will also lose original_model_requested.

Proposed fix
  bifrostResponse := &schemas.BifrostTranscriptionResponse{
-		Text:        response.Text,
-		ExtraFields: schemas.BifrostResponseExtraFields{},
+		Text: response.Text,
+		ExtraFields: schemas.BifrostResponseExtraFields{
+			OriginalModelRequested: requestedModel,
+		},
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/transcription.go` around lines 145 - 148, The
refactor zeroes ExtraFields so the validated requestedModel gets dropped;
preserve the original requested model by populating bifrostResponse.ExtraFields
with original_model_requested (using the requestedModel variable) when
constructing schemas.BifrostTranscriptionResponse so the adapter/core can still
see it; update the bifrostResponse initialization (and any related constructors
of schemas.BifrostResponseExtraFields or the field name) to include
original_model_requested rather than leaving ExtraFields empty, ensuring
requestedModel remains available after validation.
core/schemas/models.go-141-141 (1)

141-141: ⚠️ Potential issue | 🟠 Major

Public JSON field rename is a breaking API change.

At Line 141, replacing deployment with alias changes the response contract for /v1/models. Existing clients parsing deployment will break unless a compatibility path is provided.

💡 Safer transition option
 type Model struct {
   ID            string   `json:"id"`
   CanonicalSlug *string  `json:"canonical_slug,omitempty"`
   Name          *string  `json:"name,omitempty"`
+  // Deprecated: use Alias. Kept temporarily for backward compatibility.
+  Deployment    *string  `json:"deployment,omitempty"`
   Alias         *string  `json:"alias,omitempty"`
   ...
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/models.go` at line 141, You renamed the public JSON field from
deployment to alias, which is a breaking API change; restore compatibility by
keeping both JSON names: add a new Deployment *string
`json:"deployment,omitempty"` field (or restore the original field) alongside
Alias *string `json:"alias,omitempty"` in the same struct (the Model struct that
contains Alias) and update any constructors/marshal/unmarshal paths so both
fields are populated and kept in sync (when setting Alias also set Deployment
and vice versa) to preserve the existing /v1/models response contract while
still supporting the new alias name.
framework/modelcatalog/pricing.go-90-91 (1)

90-91: ⚠️ Potential issue | 🟠 Major

Ensure all streaming response producers populate ResolvedModelUsed for pricing lookup.

At lines 90-91, pricing depends on both OriginalModelRequested and ResolvedModelUsed for model resolution. Streaming responses created in framework/streaming/types.go (TextCompletion, Chat, Responses, Speech, Transcription, ImageGeneration) only populate OriginalModelRequested and leave ResolvedModelUsed empty. When CalculateCost is called on final stream chunks, the empty ResolvedModelUsed field causes the pricing lookup fallback (line 775–783 in pricing.go) to fail, and cost defaults to zero.

Add ResolvedModelUsed: p.Model to all ExtraFields initializations in framework/streaming/types.go alongside OriginalModelRequested.
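Concretely, each initialization would carry both fields along these lines (a fragment; p.Model is the accumulator field named above, and a separately tracked resolved name, if the accumulator gains one, belongs in ResolvedModelUsed instead):

ExtraFields: schemas.BifrostResponseExtraFields{
	OriginalModelRequested: p.Model,
	ResolvedModelUsed:      p.Model, // stand-in until the accumulator tracks the resolved name
},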

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 90 - 91, Streaming
ExtraFields only set OriginalModelRequested so ResolvedModelUsed is empty during
pricing lookup; update each ExtraFields initialization in
framework/streaming/types.go (for TextCompletion, Chat, Responses, Speech,
Transcription, ImageGeneration producers) to include ResolvedModelUsed: p.Model
alongside OriginalModelRequested so CalculateCost and the pricing lookup use the
actual resolved model; locate the ExtraFields constructions in those producer
structs/methods and add the ResolvedModelUsed field with p.Model.
core/internal/llmtests/account.go-290-294 (1)

290-294: ⚠️ Potential issue | 🟠 Major

Map Azure Whisper requests from whisper-1, not whisper.

Line 982 still configures Azure transcription with whisper-1. With the alias keyed as whisper, alias resolution never matches, so Azure gets the unresolved model instead of the whisper deployment name.

💡 Suggested fix
 				Aliases: schemas.KeyAliases{
-					"whisper":                   "whisper",
+					"whisper-1":                 "whisper",
 					"gpt-4o-mini-tts":           "gpt-4o-mini-tts",
 					"gpt-4o-mini-audio-preview": "gpt-4o-mini-audio-preview",
 				},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/account.go` around lines 290 - 294, The alias map
currently keys the whisper alias as "whisper" so alias resolution never matches
requests for "whisper-1"; update the KeyAliases entry in the Aliases map (the
map literal shown) to use "whisper-1": "whisper" instead of "whisper": "whisper"
so requests for model name "whisper-1" resolve to the "whisper" deployment;
locate and edit the Aliases map (KeyAliases) in account.go accordingly.
core/providers/vllm/vllm.go-77-80 (1)

77-80: ⚠️ Potential issue | 🟠 Major

Keep vllm attached to these direct error returns.

These call sites now build/return errors without provider.GetProviderKey(). Because baseURLOrError, request-body validation, and the TranscriptionStream HTTP-error path return immediately, there is no later enrichment step to put vllm back on the error, which makes OpenAI-compatible backends much harder to distinguish in telemetry and client errors.

Also applies to: 334-340, 426-426, 482-494

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vllm/vllm.go` around lines 77 - 80, Several early-return
NewBifrostOperationError calls in vllm.go (e.g., the baseURLOrError check,
request-body validation paths, and the TranscriptionStream HTTP-error path) drop
the provider key, so change those error returns to include the provider
identifier by passing provider.GetProviderKey() (or the literal "vllm" if
provider is not in scope) into providerUtils.NewBifrostOperationError; update
every direct return that currently calls NewBifrostOperationError(..., nil) to
NewBifrostOperationError(..., nil, provider.GetProviderKey()) (or equivalent
overload) so the error retains the "vllm" provider tag for telemetry.
core/providers/azure/utils.go-51-58 (1)

51-58: ⚠️ Potential issue | 🟠 Major

Don't overwrite the caller's model on Line 52.

This helper mutates the shared request before conversion, so anything that inspects request.Model after body generation will now see the resolved deployment instead of the original alias/model name. With this PR adding alias resolution, that can clobber the value the caller needs for original_model_requested bookkeeping.

♻️ Proposed fix
-		request.Model = deployment
-		reqBody, convErr := anthropic.ToAnthropicResponsesRequest(ctx, request)
+		requestCopy := *request
+		requestCopy.Model = deployment
+		reqBody, convErr := anthropic.ToAnthropicResponsesRequest(ctx, &requestCopy)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/azure/utils.go` around lines 51 - 58, The code is overwriting
the caller's request.Model before conversion (request.Model = deployment) which
mutates the shared request; instead avoid mutating request by calling
anthropic.ToAnthropicResponsesRequest with a copy or by passing the resolved
deployment separately (e.g., clone request into a local variable or construct a
new request struct with Model set to deployment) so that request.Model remains
the original alias; update the block around ToAnthropicResponsesRequest and the
handling of convErr/reqBody to use the copied request and keep
providerUtils.NewBifrostOperationError usage unchanged.
core/providers/nebius/nebius.go-319-321 (1)

319-321: ⚠️ Potential issue | 🟠 Major

Honor request-scoped raw echo flags on the error path.

Line 321 still passes provider.sendBackRawRequest / provider.sendBackRawResponse into EnrichError, while Line 331 resolves ShouldSendBackRawRequest / ShouldSendBackRawResponse from ctx for the success path. That makes non-200 responses ignore request-level overrides and can leak raw prompt/image payloads when the provider default is more permissive than the current request.

🛡️ Proposed fix
+	sendBackRawRequest := providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest)
+	sendBackRawResponse := providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse)
+
 	// Handle error response
 	if resp.StatusCode() != fasthttp.StatusOK {
-		return nil, providerUtils.EnrichError(ctx, parseNebiusImageError(resp), jsonData, nil, provider.sendBackRawRequest, provider.sendBackRawResponse)
+		return nil, providerUtils.EnrichError(ctx, parseNebiusImageError(resp), jsonData, nil, sendBackRawRequest, sendBackRawResponse)
 	}
@@
-	sendBackRawRequest, sendBackRawResponse := providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest), providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse)

Also applies to: 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/nebius/nebius.go` around lines 319 - 321, The error handling
branch calls providerUtils.EnrichError with the provider-level flags
provider.sendBackRawRequest/provider.sendBackRawResponse, which ignores
per-request overrides; change the error path in nebius.go (the call that uses
parseNebiusImageError and EnrichError) to resolve and pass the request-scoped
flags via providerUtils.ShouldSendBackRawRequest(ctx) and
providerUtils.ShouldSendBackRawResponse(ctx) (the same approach used on the
success path) so request-level overrides are honored when enriching errors.
core/providers/gemini/gemini.go-1219-1220 (1)

1219-1220: ⚠️ Potential issue | 🟠 Major

Don't bypass the raw-response opt-in in error logs.

These debug logs dump resp.Body() verbatim. That leaks upstream payloads into logs even when SendBackRawResponse is off, and those payloads can contain prompts, filenames, or other tenant data. Log status/code or a sanitized parsed error instead.

🛡️ Safer pattern
- provider.logger.Debug(fmt.Sprintf("error from %s provider: %s", providerName, string(resp.Body())))
+ provider.logger.Debug("gemini upstream error from %s: status=%d", providerName, resp.StatusCode())

Also applies to: 1982-1983, 2073-2074

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 1219 - 1220, The debug call
that logs resp.Body() verbatim (provider.logger.Debug(fmt.Sprintf("error from %s
provider: %s", providerName, string(resp.Body())))) must be changed to avoid
dumping raw response payloads when SendBackRawResponse is false; instead log a
sanitized message containing the HTTP status/code and/or the parsed error.
Update the error-path around the parsedErr assignment (parsedErr :=
providerUtils.EnrichError(ctx, parseGeminiError(resp), jsonData, nil,
provider.sendBackRawRequest, provider.sendBackRawResponse)) to pass or log only
the sanitized parsed error (or status) unless provider.sendBackRawResponse is
true, ensuring no direct resp.Body() string is written to logs in those
locations (also apply same change to the other occurrences noted).
core/providers/elevenlabs/elevenlabs.go-245-246 (1)

245-246: ⚠️ Potential issue | 🟠 Major

Don't dump ElevenLabs error payloads into logs.

These debug statements stringify the full upstream body. That bypasses the raw-response opt-in and can leak transcript/audio-related data into logs; keep the log to status/code or a sanitized parse result.

🛡️ Safer pattern
- provider.logger.Debug("error from %s provider: %s", provider.GetProviderKey(), string(resp.Body()))
+ provider.logger.Debug("elevenlabs upstream error from %s: status=%d", provider.GetProviderKey(), resp.StatusCode())

Also applies to: 535-536

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/elevenlabs/elevenlabs.go` around lines 245 - 246, The debug
call is logging the full ElevenLabs response body and can leak sensitive
transcript/audio data; update the logging in the error path (where
provider.logger.Debug, provider.GetProviderKey, parseElevenlabsError and
resp.Body() are used) to avoid dumping resp.Body(): log only safe metadata such
as HTTP status code and provider key, or a sanitized/partial parse (e.g.,
parseElevenlabsError() summary or an error code) and remove any direct
string(resp.Body()) output, keeping raw-response behavior governed by
sendBackRawResponse/sendBackRawRequest flags.
docs/quickstart/go-sdk/provider-configuration.mdx-487-489 (1)

487-489: ⚠️ Potential issue | 🟠 Major

Alias names must be added to the Models allowlist.

The examples document requests using alias names (e.g., "claude-3-sonnet", "fine-tuned-gemini-2.5-pro"), but the Models allowlist still lists only provider-native names. Key selection in selectKeyFromProviderForModel checks the original model name against key.Models.IsAllowed(model) before alias resolution occurs. Requests using undeclared alias names will fail key selection.

Add the alias keys to Models for each example, or explicitly document that only provider-native model names can be requested (aliases are internal to deployments only).

Also applies to: 508-510, 527-529, 547-548

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/quickstart/go-sdk/provider-configuration.mdx` around lines 487 - 489,
The examples use alias names (e.g., "claude-3-sonnet",
"fine-tuned-gemini-2.5-pro") but the key selection logic
(selectKeyFromProviderForModel -> key.Models.IsAllowed(model)) checks the
original model name before alias resolution, so requests with alias names will
fail; fix by adding each alias to the corresponding key.Models allowlist
alongside the provider-native model string used in Aliases (i.e., include
"claude-3-sonnet" next to "us.anthropic.claude-3-sonnet-20240229-v1:0", and
similarly add "fine-tuned-gemini-2.5-pro" next to its provider name) in the
example blocks, or alternatively add a clear note that only provider-native
model names are allowed by selectKeyFromProviderForModel.
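One concrete way to keep those examples routable is to allowlist the alias name next to the provider-native ID it maps to. A minimal sketch of that shape, assuming the schemas.Key fields used elsewhere in this PR (the Models field is shown as a plain string list for brevity; the real type may differ):

 key := schemas.Key{
 	Name:  "bedrock-key",
 	Value: "...",
 	// Allowlist both the alias callers will request and the native ID,
 	// because key selection checks the original model name before aliases resolve.
 	Models: []string{
 		"claude-3-sonnet",
 		"us.anthropic.claude-3-sonnet-20240229-v1:0",
 	},
 	Aliases: schemas.KeyAliases{
 		"claude-3-sonnet": "us.anthropic.claude-3-sonnet-20240229-v1:0",
 	},
 }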
framework/configstore/migrations.go-3554-3621 (1)

3554-3621: ⚠️ Potential issue | 🟠 Major

Missing hash recomputation after aliases backfill.

The migration copies data from legacy deployment columns into aliases_json, but does not recompute config_hash for affected keys. Since Aliases is now part of the hash input (as seen in migrationAddConfigHashColumn and migrationBackfillAllowedModelsWildcard), keys with populated aliases will have stale hashes after this migration runs.

This will cause unnecessary config reconciliation on the next sync cycle, as the stored hash won't match the newly computed hash.

🔧 Proposed fix: Add hash recomputation after data migration
 			// Drop legacy deployment columns
 			for _, col := range []string{
 				"azure_deployments_json",
 				"vertex_deployments_json",
 				"bedrock_deployments_json",
 				"replicate_deployments_json",
 			} {
 				if m.HasColumn(&tables.TableKey{}, col) {
 					if err := tx.Exec("ALTER TABLE config_keys DROP COLUMN " + col).Error; err != nil {
 						return err
 					}
 				}
 			}
+
+			// Recompute config_hash for keys that now have aliases populated.
+			// Without this, hashes computed before aliases_json existed are stale.
+			var keys []tables.TableKey
+			if err := tx.Where("aliases_json IS NOT NULL AND aliases_json != '' AND aliases_json != '{}'").Find(&keys).Error; err != nil {
+				return fmt.Errorf("failed to fetch keys for hash recomputation: %w", err)
+			}
+			for _, key := range keys {
+				schemaKey := schemas.Key{
+					Name:               key.Name,
+					Value:              key.Value,
+					Models:             key.Models,
+					Weight:             getWeight(key.Weight),
+					AzureKeyConfig:     key.AzureKeyConfig,
+					VertexKeyConfig:    key.VertexKeyConfig,
+					BedrockKeyConfig:   key.BedrockKeyConfig,
+					Aliases:            key.Aliases,
+					VLLMKeyConfig:      key.VLLMKeyConfig,
+					ReplicateKeyConfig: key.ReplicateKeyConfig,
+					Enabled:            key.Enabled,
+					UseForBatchAPI:     key.UseForBatchAPI,
+				}
+				hash, err := GenerateKeyHash(schemaKey)
+				if err != nil {
+					return fmt.Errorf("failed to generate hash for key %s: %w", key.Name, err)
+				}
+				if err := tx.Model(&key).Update("config_hash", hash).Error; err != nil {
+					return fmt.Errorf("failed to update hash for key %s: %w", key.Name, err)
+				}
+			}
 			return nil
 		},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 3554 - 3621, The migration
migrationDropDeploymentColumnsAndAddAliases updates aliases_json but never
recomputes the dependent config_hash, leaving stale hashes; after copying legacy
deployment columns into aliases_json (and before dropping legacy columns)
iterate the affected rows in the config_keys table (rows where aliases_json was
just set) and recompute and persist their config_hash using the same hashing
logic used by
migrationAddConfigHashColumn/migrationBackfillAllowedModelsWildcard (i.e., call
the existing helper that computes the config hash for a tables.TableKey/row or
replicate that logic), updating config_hash for each affected key so stored
hashes match the new aliases_json values.
core/bifrost.go-5074-5078 (1)

5074-5078: ⚠️ Potential issue | 🟠 Major

Resolve aliases before supported-key filtering.

By the time these lines run, selectKeyFromProviderForModel() / getKeysForBatchAndFileOps() have already filtered keys against the unresolved model string. A key that only allows the resolved model — including VLLM keys matching via ModelName — can be rejected before alias resolution ever happens.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5074 - 5078, The code resolves model aliases
after keys have already been filtered, causing keys that permit the resolved
model (via key.Aliases.Resolve or VLLM ModelName) to be incorrectly excluded by
selectKeyFromProviderForModel/getKeysForBatchAndFileOps; fix by moving alias
resolution earlier so the filters operate on the resolved model: call
key.Aliases.Resolve(model) prior to any key-selection/filtering and use the
resolvedModel (e.g., set req.SetModel(resolvedModel) and pass resolvedModel into
selectKeyFromProviderForModel/getKeysForBatchAndFileOps) while retaining
originalModelRequested for logging or audit.
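For reference, a minimal sketch of an alias-aware allowlist check that key selection could apply, assuming the key.Models.IsAllowed helper and the KeyAliases map mentioned above (the function name is illustrative):

 // keySupportsModel is illustrative: a key qualifies if either the requested
 // model or the model its alias resolves to is on the key's allowlist.
 func keySupportsModel(key schemas.Key, requestedModel string) bool {
 	if key.Models.IsAllowed(requestedModel) {
 		return true
 	}
 	if resolved, ok := key.Aliases[requestedModel]; ok && resolved != "" {
 		return key.Models.IsAllowed(resolved)
 	}
 	return false
 }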
core/bifrost.go-6326-6356 (1)

6326-6356: ⚠️ Potential issue | 🟠 Major

Use baseProviderType for provider-specific key validation.

For custom providers, providerKey is the user-defined alias, so none of these cases match. That skips Azure/Bedrock/Vertex/Replicate/VLLM-specific validation and can admit invalid keys into supportedKeys.

Minimal fix
-			switch providerKey {
+			switch baseProviderType {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 6326 - 6356, The switch is using providerKey
(which may be a user alias) so provider-specific checks like key.AzureKeyConfig,
key.BedrockKeyConfig, key.VertexKeyConfig, key.ReplicateKeyConfig and
key.VLLMKeyConfig never run for custom aliases; change the switch to use
baseProviderType (the canonical provider enum) for the case dispatch while
keeping providerKey in the log messages and continue logic, i.e. replace "switch
providerKey" with "switch baseProviderType" and keep the existing nil/empty
checks and bifrost.logger.Warn("key %s is not valid for provider: %s", key.ID,
providerKey) behaviour so validation runs for the actual provider type while
logs still show the alias and add no other behavior changes to supportedKeys.
core/bifrost.go-4076-4083 (1)

4076-4083: ⚠️ Potential issue | 🟠 Major

Preserve ResolvedModelUsed when re-wrapping primary/fallback errors.

These wrappers copy RawRequest, RawResponse, and KeyStatuses, but they drop the ResolvedModelUsed value that requestWorker already populated. That strips alias-resolution metadata right before the error is returned to the caller.

Also applies to: 4127-4134, 4140-4147, 4191-4198, 4240-4247, 4253-4260

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4076 - 4083, The re-wrapping of errors (e.g.,
when assigning primaryErr.ExtraFields = schemas.BifrostErrorExtraFields{...}) is
dropping the existing ResolvedModelUsed field; update each such assignment
(primaryErr.ExtraFields and the other similar blocks at the noted locations) to
copy ResolvedModelUsed from the original ExtraFields (e.g.,
ExtraFields.ResolvedModelUsed or primaryErr.ExtraFields.ResolvedModelUsed
populated by requestWorker) into the new schemas.BifrostErrorExtraFields so
alias-resolution metadata is preserved when returning the error.
transports/config.schema.json-1767-1773 (1)

1767-1773: ⚠️ Potential issue | 🟠 Major

Preserve schema compatibility for legacy deployment maps during this migration.

Adding aliases here is fine, but the adjacent bedrock_key_config, azure_key_config, and vertex_key_config definitions no longer accept the old deployments shape. Because this schema is the authoritative validator, existing config.json files using those fields will fail before operators can migrate them. Please keep the deprecated fields accepted for at least one release or add a pre-validation migration.

As per coding guidelines, transports/config.schema.json is the authoritative definition for all config.json fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1767 - 1773, The schema change
removed acceptance of the legacy "deployments" shape from the provider key
objects; update the transports/config.schema.json to preserve compatibility by
allowing the old "deployments" property for bedrock_key_config,
azure_key_config, and vertex_key_config (e.g., add a oneOf/anyOf branch or
include an optional "deployments" object definition alongside the new shape and
mark it deprecated in the description) so existing config.json files still
validate; reference the properties by name (bedrock_key_config,
azure_key_config, vertex_key_config, deployments, aliases) and ensure the
deprecated branch is documented in the schema description for at least one
release.
transports/config.schema.json-1859-1883 (1)

1859-1883: ⚠️ Potential issue | 🟠 Major

replicate_key_config should stay optional for the default false path.

Requiring a nested replicate_key_config.use_deployments_endpoint block turns what was previously an implicit false default into a schema-breaking rewrite for every existing Replicate key.

🛠️ Suggested schema relaxation
     "replicate_key": {
       "allOf": [
         {
           "$ref": "#/$defs/base_key"
         },
         {
           "type": "object",
           "properties": {
             "replicate_key_config": {
               "type": "object",
               "properties": {
                 "use_deployments_endpoint": {
                   "type": "boolean",
+                  "default": false,
                   "description": "Whether to use the deployments endpoint instead of the models endpoint (default: false)"
                 }
               },
-              "required": [
-                "use_deployments_endpoint"
-              ],
               "additionalProperties": false
             }
-          },
-          "required": [
-            "replicate_key_config"
-          ]
+          }
         }
       ]
     },
As per coding guidelines, `transports/config.schema.json` is the authoritative definition for all `config.json` fields.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1859 - 1883, The schema currently
makes replicate_key_config and its property use_deployments_endpoint required
under the replicate_key definition, which forces existing configs to include
this object; change the replicate_key schema so replicate_key_config is optional
and remove use_deployments_endpoint from the required array (leave its
type/description intact) so the default false path remains valid; update the
replicate_key allOf block that references replicate_key_config and required
arrays (symbols: replicate_key, replicate_key_config, use_deployments_endpoint)
accordingly.
core/providers/utils/models.go-149-150 (1)

149-150: ⚠️ Potential issue | 🟠 Major

Empty allowlists become wildcard as soon as any alias exists.

ShouldEarlyExit stops treating an empty whitelist as empty once Aliases is non-empty, and FilterModel then lets every provider model through because IsRestricted() is false. That widens the visible model set across every provider using this pipeline, instead of limiting exposure to the explicitly configured aliases.

Based on learnings, schemas.WhiteList uses an explicit wildcard/unrestricted state; an empty allowlist is not equivalent to *.

Also applies to: 210-221

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/models.go` around lines 149 - 150, The logic treats an
empty AllowedModels as a wildcard whenever Aliases is non-empty; update checks
in ShouldEarlyExit and the FilterModel-related block (the places using
p.AllowedModels.IsEmpty() and len(p.Aliases)) to treat an empty whitelist as
empty unless it is explicitly marked unrestricted. Concretely, replace uses of
p.AllowedModels.IsEmpty() in these conditions with a compound check that the
list is empty AND not unrestricted (e.g., p.AllowedModels.IsEmpty() &&
!p.AllowedModels.IsUnrestricted()), so the presence of Aliases no longer
implicitly makes the allowlist a wildcard.
core/providers/utils/models.go-223-229 (1)

223-229: ⚠️ Potential issue | 🟠 Major

Blacklist precedence is bypassed for aliased models.

Once resolveModelID rewrites a provider model to an alias key, the blacklist only checks that alias-facing name. A blacklist entry on the underlying provider ID will no longer suppress the model, and the same gap exists in alias backfill. That contradicts the “blacklist always wins” contract documented in this file.

Also applies to: 275-293, 305-319
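A possible shape for restoring that precedence, kept generic so it does not presume the exact blacklist type in this file (the predicate parameter stands in for whatever lookup the pipeline uses):

 // passesBlacklist is illustrative: "blacklist always wins" means checking
 // both the alias-facing name and the underlying provider model ID.
 func passesBlacklist(isBlacklisted func(string) bool, aliasName, providerModelID string) bool {
 	return !isBlacklisted(aliasName) && !isBlacklisted(providerModelID)
 }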

core/providers/vllm/utils.go-15-17 (1)

15-17: ⚠️ Potential issue | 🟠 Major

Keep provider attribution on parsed vLLM errors.

Resetting ExtraFields to an empty struct drops the only provider context on this fallback branch. Downstream helpers read BifrostError.ExtraFields.Provider, so these errors will now be logged and surfaced as providerless.

💡 Suggested fix
-		errorResp.ExtraFields = schemas.BifrostErrorExtraFields{}
+		errorResp.ExtraFields = schemas.BifrostErrorExtraFields{
+			Provider: schemas.VLLM,
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vllm/utils.go` around lines 15 - 17, The code in
core/providers/vllm/utils.go is clearing the provider attribution by assigning
errorResp.ExtraFields = schemas.BifrostErrorExtraFields{}, which removes
BifrostError.ExtraFields.Provider used downstream; instead preserve the existing
ExtraFields (do not overwrite it) or explicitly set only missing fields while
keeping Provider intact (e.g., avoid assigning a new empty
schemas.BifrostErrorExtraFields{} on the parsed errorResp so
BifrostError.ExtraFields.Provider remains available to downstream helpers).

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 30, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (13)
plugins/mocker/main.go (1)

856-917: ⚠️ Potential issue | 🔴 Critical

Guard model override per response type to avoid nil dereference panic.

content.Model override currently assumes mockResponse.ChatResponse is always non-nil. For ResponsesRequest/ResponsesStreamRequest, this panics at runtime.

💡 Suggested fix
-	// Override model if specified
-	if content.Model != nil {
-		mockResponse.ChatResponse.Model = *content.Model
-	}
+	// Override model if specified
+	if content.Model != nil {
+		switch {
+		case mockResponse.ChatResponse != nil:
+			mockResponse.ChatResponse.Model = *content.Model
+		case mockResponse.ResponsesResponse != nil:
+			mockResponse.ResponsesResponse.Model = *content.Model
+		case mockResponse.ResponsesStreamResponse != nil && mockResponse.ResponsesStreamResponse.Response != nil:
+			mockResponse.ResponsesStreamResponse.Response.Model = *content.Model
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/mocker/main.go` around lines 856 - 917, The code unconditionally
writes to mockResponse.ChatResponse.Model when content.Model != nil, which can
panic for ResponsesRequest/ResponsesStreamRequest because ChatResponse may be
nil; change the override to first check the request type or the target struct
before dereferencing: e.g., if content.Model != nil { if
mockResponse.ChatResponse != nil { mockResponse.ChatResponse.Model =
*content.Model } else if mockResponse.ResponsesResponse != nil { /* set the
appropriate model field on ResponsesResponse if it exists */ } else if
mockResponse.ResponsesStreamResponse != nil &&
mockResponse.ResponsesStreamResponse.Response != nil { /* set on
ResponsesStreamResponse.Response if it has a Model field */ } } ensuring you
only assign to existing structs (reference: content.Model,
mockResponse.ChatResponse, mockResponse.ResponsesResponse,
mockResponse.ResponsesStreamResponse, req.RequestType).
transports/bifrost-http/lib/config.go (1)

791-804: ⚠️ Potential issue | 🔴 Critical

Migration backfill is sequenced incorrectly: migrationBackfillAllowedModelsWildcard recomputes key hashes before the aliases_json column is populated.

The backfill migration (line 329) does recompute hashes for ALL keys, not just empty ones. However, it runs before migrationDropDeploymentColumnsAndAddAliases (line 350), which adds and populates the aliases_json column. When backfillAllowedModelsWildcard computes GenerateKeyHash(schemaKey) with Aliases: key.Aliases, the aliases_json column doesn't yet exist, so all keys get hashed with empty/null Aliases. After the next migration populates the Aliases field, those recomputed hashes are immediately stale. Pre-alias rows with non-empty ConfigHash will appear "changed" on restart, triggering false reconciliation and risking dashboard edits being overwritten.

Fix: Reorder migrations so migrationDropDeploymentColumnsAndAddAliases runs before migrationBackfillAllowedModelsWildcard, or add an additional hash recomputation step after Aliases is populated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/lib/config.go` around lines 791 - 804, The backfill
recomputes key hashes in migrationBackfillAllowedModelsWildcard before
aliases_json is created/populated by
migrationDropDeploymentColumnsAndAddAliases, causing stale hashes; fix by
reordering migrations so migrationDropDeploymentColumnsAndAddAliases runs before
migrationBackfillAllowedModelsWildcard OR add a follow-up step that re-runs
configstore.GenerateKeyHash(schema.Key{... Aliases: key.Aliases ...}) (i.e.,
recompute via GenerateKeyHash/GenerateKeyHash usage) after aliases_json is
populated so hashes reflect the populated Aliases field and prevent false
"changed" detection.
core/schemas/models.go (1)

137-143: ⚠️ Potential issue | 🟠 Major

Renaming deployment to alias is a wire-level breaking change.

schemas.Model is part of the exported response contract, so swapping the JSON key here will break existing /models consumers that still read deployment. Unless this ships behind a versioned API boundary, please keep accepting/emitting the old key for at least one compatibility window.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/models.go` around lines 137 - 143, The Model struct's wire name
change from "deployment" to "alias" is breaking; keep backwards compatibility by
implementing custom JSON marshaling/unmarshaling on Model: add MarshalJSON and
UnmarshalJSON methods that read either "deployment" or "alias" into the Alias
field on decode and that emit both "deployment" and "alias" keys on encode (or
at minimum emit the original "deployment" key alongside the new "alias" key).
Target the Model type and its Alias field in your implementations so consumers
of /models continue to accept the old key while the new key is supported.
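A self-contained sketch of such a compatibility shim, using a trimmed-down stand-in for schemas.Model (the real struct has more fields that a real shim would need to carry through):

 package schemascompat // illustrative package, not the real one

 import "encoding/json"

 // Model is a trimmed-down stand-in for schemas.Model.
 type Model struct {
 	ID    string
 	Alias string
 }

 // modelWire carries both the new "alias" key and the legacy "deployment" key.
 type modelWire struct {
 	ID         string `json:"id"`
 	Alias      string `json:"alias,omitempty"`
 	Deployment string `json:"deployment,omitempty"`
 }

 // MarshalJSON emits both keys so existing /models consumers keep working.
 func (m Model) MarshalJSON() ([]byte, error) {
 	return json.Marshal(modelWire{ID: m.ID, Alias: m.Alias, Deployment: m.Alias})
 }

 // UnmarshalJSON accepts either key, preferring "alias" when both are present.
 func (m *Model) UnmarshalJSON(data []byte) error {
 	var w modelWire
 	if err := json.Unmarshal(data, &w); err != nil {
 		return err
 	}
 	m.ID = w.ID
 	if w.Alias != "" {
 		m.Alias = w.Alias
 	} else {
 		m.Alias = w.Deployment
 	}
 	return nil
 }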
core/providers/gemini/errors.go (1)

39-79: ⚠️ Potential issue | 🟠 Major

Preserve metadata in Gemini error conversion.

After Line 39, parsed provider errors no longer attach request/provider/model context in this conversion path. Please restore metadata enrichment in the error-conversion flow (or enforce equivalent caller-side backfill with tests), otherwise alias-aware debugging/attribution can regress.

As per coding guidelines: core/providers/*/errors.go: “Every provider implementation must include errors.go with an ErrorConverter function that parses provider-specific HTTP errors into *schemas.BifrostError with metadata.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/errors.go` around lines 39 - 79, parseGeminiError
currently overwrites/returns bifrostErr without preserving the
request/provider/model metadata added by providerUtils.HandleProviderAPIError;
update the function so after each HandleProviderAPIError call (the ones that
populate bifrostErr) you preserve or copy any metadata/context fields (e.g.,
bifrostErr.RequestID, bifrostErr.Provider, bifrostErr.Model or a
bifrostErr.Metadata map) into the final returned bifrostErr before returning
from parseGeminiError (both in the []GeminiGenerationError path and single
GeminiGenerationError path), ensuring the parsed Code/Message are set on the
existing bifrostErr rather than replacing it so caller-side attribution and
alias-aware debugging remain intact.
core/providers/huggingface/errors.go (1)

13-57: ⚠️ Potential issue | 🟠 Major

Keep metadata enrichment in HuggingFace image error conversion.

With Line 13 signature simplification, this parser no longer carries request/provider/model context through the conversion path. Please restore metadata enrichment in this error-conversion flow (or guarantee equivalent caller-side backfill with coverage).

As per coding guidelines: core/providers/*/errors.go: “Every provider implementation must include errors.go with an ErrorConverter function that parses provider-specific HTTP errors into *schemas.BifrostError with metadata.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/errors.go` around lines 13 - 57,
parseHuggingFaceImageError currently strips request/provider/model context when
converting errors; restore metadata enrichment by ensuring the returned
*schemas.BifrostError contains provider metadata (e.g., Provider, Model,
RequestID) either by: 1) changing parseHuggingFaceImageError signature to accept
context params (e.g., provider, model, requestID) and set bifrostErr.Metadata or
bifrostErr.Provider/Model fields accordingly after calling
providerUtils.HandleProviderAPIError, or 2) if leaving signature unchanged,
extract and populate equivalent metadata into bifrostErr from resp headers or
the errorResp body before returning; update references to
parseHuggingFaceImageError callers if you change the signature so metadata is
passed through.
core/internal/llmtests/response_validation.go (1)

849-854: ⚠️ Potential issue | 🟠 Major

Use response.Model as fallback for Claude detection.

The status-gating logic only inspects ExtraFields.ResolvedModelUsed. If that field is empty on a valid Claude response, this path can incorrectly require response.Status and fail validation.

💡 Suggested fix
  provider := response.ExtraFields.Provider
- model := response.ExtraFields.ResolvedModelUsed
+ model := strings.TrimSpace(response.ExtraFields.ResolvedModelUsed)
+ if model == "" {
+   model = strings.TrimSpace(response.Model)
+ }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/response_validation.go` around lines 849 - 854, The
current check only uses response.ExtraFields.ResolvedModelUsed (assigned to
variable model) to detect "claude" and can miss cases where ResolvedModelUsed is
empty; change the logic so model is set to
response.ExtraFields.ResolvedModelUsed but if that's empty use response.Model as
a fallback before evaluating the provider and the
!strings.Contains(strings.ToLower(model), "claude") condition (the surrounding
condition and the response.Status check remain the same).
framework/configstore/clientconfig.go (1)

557-564: ⚠️ Potential issue | 🟡 Minor

Hash empty alias maps the same as omitted aliases.

Every other collection field here normalizes empty vs omitted values, but key.Aliases != nil makes aliases: {} change the key hash while a missing aliases field does not. That can cause needless config churn when serializers round-trip an empty object.

Suggested fix
 	// Hash Aliases
-	if key.Aliases != nil {
+	if len(key.Aliases) > 0 {
 		data, err := sonic.Marshal(key.Aliases)
 		if err != nil {
 			return "", err
 		}
 		hash.Write(data)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/clientconfig.go` around lines 557 - 564, The code
currently hashes key.Aliases when it is non-nil, which makes an empty map
(aliases: {}) change the hash while an omitted aliases field does not; update
the conditional to only marshal-and-write aliases when they are non-empty (e.g.,
if len(key.Aliases) > 0) so empty maps are treated the same as omitted, leaving
the sonic.Marshal and hash.Write usage intact for non-empty maps.
docs/providers/provider-routing.mdx (1)

471-499: ⚠️ Potential issue | 🟠 Major

The Bedrock inference-profile example is missing the arn half of the config.

The alias values shown here are only model/profile IDs, but Step 5 says Bifrost sends a full ARN. Without bedrock_key_config.arn, this example contradicts the Bedrock provider page in the same stack and points readers toward a broken setup.

Proposed doc fix
 {
   "providers": {
     "bedrock": {
       "keys": [
         {
           "name": "bedrock-key",
           "aliases": {
             "claude-sonnet": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
             "claude-opus": "us.anthropic.claude-3-opus-20240229-v1:0"
           },
           "bedrock_key_config": {
             "access_key": "your-access-key",
             "secret_key": "your-secret-key",
-            "region": "us-east-1"
+            "region": "us-east-1",
+            "arn": "arn:aws:bedrock:us-east-1:123456789012:inference-profile"
           }
         }
       ]
     }
   }
 }
@@
-5. **Sent to Bedrock**: Full ARN used in API call
+5. **Sent to Bedrock**: `arn + "/" + resolved alias` is used in the Bedrock runtime path

As per coding guidelines, "always check the stack if there is one for the current PR."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/provider-routing.mdx` around lines 471 - 499, The Bedrock
example is missing the ARN in the bedrock_key_config, so update the JSON snippet
to include the ARN/role identifier used when making API calls; specifically add
an "arn" (or "role_arn" if that matches our provider schema) field into the
bedrock_key_config object and ensure the aliases (e.g., "claude-sonnet" ->
"us.anthropic.claude-3-5-sonnet-20241022-v2:0") remain model IDs while the
bedrock_key_config.arn supplies the full ARN sent to Bedrock; modify the example
near the "bedrock" -> "keys" -> "bedrock_key_config" block so it aligns with the
Bedrock provider page and the Step 5 description.
docs/providers/supported-providers/azure.mdx (1)

157-165: ⚠️ Potential issue | 🟠 Major

Use models here, not allowed_models.

This section documents a key object, but allowed_models is the provider-config/governance field. Leaving it here points readers at a field the key object does not use, so it will not participate in key-level model filtering.

Proposed doc fix
 - `api_version` - API version to use (default: `2024-10-21`)
 - `aliases` - Map of model names to Azure deployment IDs (optional, set at key level)
-- `allowed_models` - List of allowed models to use from this key (optional)
+- `models` - List of allowed models to use from this key (optional)

As per coding guidelines, "always check the stack if there is one for the current PR."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/supported-providers/azure.mdx` around lines 157 - 165, Update
the configuration docs to use the key-level field name "models" instead of
"allowed_models": replace the bullet "- `allowed_models` - List of allowed
models to use from this key (optional)" with "- `models` - List of allowed
models to use from this key (optional)", ensuring the documentation refers to
the key object field "models" (not the provider/governance field
"allowed_models") so key-level model filtering is correctly described.
core/providers/openai/errors.go (1)

13-53: ⚠️ Potential issue | 🟠 Major

Keep metadata on provider API errors.

After this signature change, ParseOpenAIError has no way to attach provider/request/model context, so OpenAI-compatible providers now return bare status/code/message errors only. Either keep the metadata inputs here or have each caller reattach the provider, request type, and original/resolved model fields before returning.

As per coding guidelines, "core/providers/*/errors.go: Every provider implementation must include errors.go with an ErrorConverter function that parses provider-specific HTTP errors into *schemas.BifrostError with metadata."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/errors.go` around lines 13 - 53, ParseOpenAIError
currently has no way to attach provider/request/model metadata (per the
ErrorConverter contract) so callers get bare errors; update ParseOpenAIError (or
restore the ErrorConverter signature) to accept metadata inputs (e.g., provider
string, requestType string, originalModel and resolvedModel strings or a
metadata struct) and ensure it populates bifrostErr.ExtraFields (or the
appropriate fields on *schemas.BifrostError) with provider, request type, and
model info before returning; reference the ParseOpenAIError function and the
ErrorConverter type to locate where to add parameters and set metadata, or
alternatively require callers to call a helper that reattaches
provider/request/model to the returned *schemas.BifrostError.
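One caller-side shape that keeps the simplified parser while honoring the ErrorConverter contract (surrounding variables are assumed to be in scope; field names follow the ExtraFields usage elsewhere in this PR):

 // Illustrative only: reattach context on the parsed error before bubbling it up.
 bifrostErr := ParseOpenAIError(resp)
 if bifrostErr != nil {
 	bifrostErr.ExtraFields.Provider = providerName
 	bifrostErr.ExtraFields.RequestType = requestType
 	bifrostErr.ExtraFields.OriginalModelRequested = originalModelRequested
 	bifrostErr.ExtraFields.ResolvedModelUsed = resolvedModel
 }
 return bifrostErr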
docs/quickstart/gateway/provider-configuration.mdx (1)

1017-1019: ⚠️ Potential issue | 🟡 Minor

Use one term for Azure mappings in this section.

These examples now use aliases, but the walkthrough steps above still tell readers to configure “Deployments”. Please rename the steps or explicitly explain the UI/API naming difference.

Also applies to: 1037-1040, 1103-1105, 1123-1126

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/quickstart/gateway/provider-configuration.mdx` around lines 1017 - 1019,
The section mixes two terms—“Deployments” (used in the step list like items 6–8:
Endpoint, Deployments, API Version) and “aliases” (used in examples); pick one
term and make them consistent across the walkthrough: either rename the step
label “Configure Deployments” to “Configure aliases” or, better, add a single
clarifying parenthetical to the step (e.g., “Configure Deployments (called
‘aliases’ in the example/API)”) and update the other occurrences that the
reviewer flagged (the similar blocks around the other ranges) so every reference
uses the chosen term or the exact parenthetical explanation.
core/bifrost.go (1)

5145-5161: ⚠️ Potential issue | 🔴 Critical

Avoid dereferencing result on successful stream requests.

result is only populated in the non-stream branch above. When a stream succeeds, this path panics before the worker can send req.ResponseStream <- stream, so the request never gets a response. Keep the result.GetExtraFields() writes inside the non-stream branch; if the same metadata is needed for streams, stamp it in postHookRunner per chunk.

🛠️ Minimal fix for the panic
-			result.GetExtraFields().Provider = provider.GetProviderKey()
-			result.GetExtraFields().RequestType = req.RequestType
-			result.GetExtraFields().OriginalModelRequested = originalModelRequested
-			result.GetExtraFields().ResolvedModelUsed = resolvedModel
-
 			if IsStreamRequestType(req.RequestType) {
 				// Send stream with context awareness to prevent deadlock
 				select {
 				case req.ResponseStream <- stream:
@@
 				case <-time.After(5 * time.Second):
 					// Timeout to prevent indefinite blocking
 					bifrost.logger.Warn("Timeout while sending stream response, client may have disconnected")
 				}
 			} else {
+				result.GetExtraFields().Provider = provider.GetProviderKey()
+				result.GetExtraFields().RequestType = req.RequestType
+				result.GetExtraFields().OriginalModelRequested = originalModelRequested
+				result.GetExtraFields().ResolvedModelUsed = resolvedModel
+
 				// Send response with context awareness to prevent deadlock
 				select {
 				case req.Response <- result:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 5145 - 5161, The code is dereferencing result
for stream requests (inside the IsStreamRequestType branch) which causes a panic
because result is only set for non-streams; move the result.GetExtraFields()
assignments (Provider, RequestType, OriginalModelRequested, ResolvedModelUsed)
into the non-stream branch where result is guaranteed to be populated, and
remove any result.* writes from the block that sends req.ResponseStream (and
related select cases) so you never access result for streams; if you need the
same metadata for streaming responses, record/stamp it inside postHookRunner
per-chunk instead of using result during the initial stream send.
core/providers/gemini/gemini.go (1)

738-752: ⚠️ Potential issue | 🟠 Major

Drain the large upload body in this response-threshold path.

This helper is the one large-payload request path here that never calls providerUtils.DrainLargePayloadRemainder(ctx) after Gemini responds. When BifrostContextKeyLargePayloadMode and BifrostContextKeyLargeResponseThreshold are enabled together, an early upstream response can still leave unread bytes on the client body and surface proxy-side broken-pipe/502 errors.

Suggested fix
  latency, bifrostErr, wait := providerUtils.MakeRequestWithContext(ctx, streamingClient, req, resp)
  if bifrostErr != nil {
  	wait()
  	fasthttp.ReleaseResponse(resp)
  	return nil, bifrostErr
  }
+ if bodyReader != nil {
+ 	providerUtils.DrainLargePayloadRemainder(ctx)
+ }

  // Handle error response — materialize stream body for error parsing
  if resp.StatusCode() != fasthttp.StatusOK {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 738 - 752, The error path that
handles non-200 responses after calling providerUtils.BuildLargeResponseClient
and providerUtils.MakeRequestWithContext never drains any remaining large-upload
bytes, so add a call to providerUtils.DrainLargePayloadRemainder(ctx) before
calling wait() and fasthttp.ReleaseResponse(resp) in the block that checks
resp.StatusCode() != fasthttp.StatusOK; ensure DrainLargePayloadRemainder is
invoked regardless of parseGeminiError outcome so the connection/body is fully
consumed before providerUtils.EnrichError is returned.
🟠 Major comments (18)
transports/bifrost-http/integrations/anthropic.go-46-46 (1)

46-46: ⚠️ Potential issue | 🟠 Major

These fields are no longer reliable for Claude detection on aliased non-Anthropic routes.

OriginalModelRequested is now the user input and ResolvedModelUsed is the provider API identifier. For alias-backed Claude deployments, those can be values like "my-claude" and "prod-sonnet", so isClaudeModel(...) returns false even though the underlying model is Anthropic. That disables raw-response passthrough for Claude Code. Feed isClaudeModel a canonical resolved model name, or persist an explicit Claude/Anthropic marker during resolution.

Also applies to: 88-88, 116-116

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/integrations/anthropic.go` at line 46, The check in
shouldUsePassthrough is relying on resp.ExtraFields.OriginalModelRequested and
resp.ExtraFields.ResolvedModelUsed which are not reliable for alias-backed
Claude routes; feed isClaudeModel a canonical resolved provider model or attach
an explicit Anthropic/Claude marker during resolution instead. Update
shouldUsePassthrough (and the calls at the other occurrences referencing
resp.ExtraFields.OriginalModelRequested / ResolvedModelUsed) to use
resp.ExtraFields.Provider and a new canonicalResolvedModel or
resp.ExtraFields.IsAnthropic flag produced at model-resolution time, and change
the isClaudeModel invocation to use that canonicalResolvedModel or the explicit
marker so alias names like "my-claude" still detect Anthropic/Claude correctly.
plugins/logging/operations.go-981-985 (1)

981-985: ⚠️ Potential issue | 🟠 Major

Don't rebuild OriginalModelRequested from the mutable model column.

For streamed logs, this same file overwrites logstore.Log.Model with streamResponse.Data.Model on Lines 408-410 and 548-550. After alias resolution, that value is the resolved provider model/deployment, not the caller’s original model. Reusing it as OriginalModelRequested here will misprice RecalculateCosts for aliased streaming requests. Persist the original and resolved model separately, then populate OriginalModelRequested / ResolvedModelUsed from the correct source during reconstruction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 981 - 985, The code rebuilds
OriginalModelRequested from the mutable logstore.Log.Model (which gets
overwritten from streamResponse.Data.Model), causing mispricing; persist the
caller's original model separately and track the resolved provider
model/deployment as ResolvedModelUsed instead of reusing logstore.Log.Model.
Update the reconstruction in plugins/logging/operations.go to read
OriginalModelRequested from the original-model field you add (or an existing
unmodified field) and populate ResolvedModelUsed from the
resolved/model-from-stream (streamResponse.Data.Model) so extraFields
(OriginalModelRequested and ResolvedModelUsed) passed into RecalculateCosts use
the correct sources.
core/providers/bedrock/text.go-167-171 (1)

167-171: ⚠️ Potential issue | 🟠 Major

Prefer resolved model identity for Bedrock response-shape selection.

Using OriginalModelRequested first at Line 169 can pick the wrong response format when the original model is an alias without anthropic./mistral. hints. This branching should prioritize resolved model identity.

Proposed fix
-	// Use OriginalModelRequested from ExtraFields if available, otherwise use Model
+	// Prefer resolved provider model identity for format detection
 	model := bifrostResp.Model
-	if bifrostResp.ExtraFields.OriginalModelRequested != "" {
+	if bifrostResp.ExtraFields.ResolvedModelUsed != "" {
+		model = bifrostResp.ExtraFields.ResolvedModelUsed
+	} else if bifrostResp.ExtraFields.OriginalModelRequested != "" {
 		model = bifrostResp.ExtraFields.OriginalModelRequested
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/text.go` around lines 167 - 171, The current logic
prefers ExtraFields.OriginalModelRequested over the resolved bifrostResp.Model
which can select the wrong response shape for aliases; change the branching to
prefer bifrostResp.Model first (the resolved identity) and only fall back to
bifrostResp.ExtraFields.OriginalModelRequested when bifrostResp.Model is
empty—update the assignment around model := bifrostResp.Model and the subsequent
if-check to use a guard like if model == "" &&
bifrostResp.ExtraFields.OriginalModelRequested != "" { model =
bifrostResp.ExtraFields.OriginalModelRequested } so response-shape selection
uses the resolved model identity.
framework/configstore/rdb.go-51-53 (1)

51-53: ⚠️ Potential issue | 🟠 Major

Composite key reads still drop Aliases.

schemaKeyFromTableKey and the write paths now persist Aliases, but GetVirtualKeys, GetVirtualKeysPaginated, GetVirtualKey, GetVirtualKeyByValue, and GetGovernanceConfig later preload ProviderConfigs.Keys with Select("id, name, key_id, models_json, provider"). That projection never hydrates aliases_json, so associated keys come back with an empty alias map. Please add aliases_json there or switch those preloads to config_keys.*.

Also applies to: 75-77, 380-382, 552-554, 676-678

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/rdb.go` around lines 51 - 53, The composite key reads
are dropping Aliases because the GORM preload uses Select("id, name, key_id,
models_json, provider") which omits aliases_json; update each preload call in
GetVirtualKeys, GetVirtualKeysPaginated, GetVirtualKey, GetVirtualKeyByValue,
and GetGovernanceConfig (and the other similar occurrences around the referenced
regions) to either include "aliases_json" in the Select projection or change the
preload to use the full config_keys.* projection so ProviderConfigs.Keys are
hydrated with aliases; adjust the Select strings accordingly to ensure
dbKey.Aliases is populated when schemaKeyFromTableKey and write paths persist
it.
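A sketch of the narrower fix (widening the projection), assuming the preloads use GORM's conditional Preload as described above (variable names are illustrative):

 // Illustrative only: add aliases_json to the projection so associated keys
 // come back with their alias maps hydrated.
 err := db.
 	Preload("ProviderConfigs.Keys", func(tx *gorm.DB) *gorm.DB {
 		return tx.Select("id, name, key_id, models_json, provider, aliases_json")
 	}).
 	Find(&virtualKeys).Error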
core/providers/openrouter/openrouter.go-190-208 (1)

190-208: ⚠️ Potential issue | 🟠 Major

Strip the openrouter/ prefix from alias targets too.

FilterModel compares raw API IDs like openai/gpt-4 against alias targets, but normalizedAliases only normalizes the alias key. If a key is configured as Aliases{"gpt-4":"openrouter/openai/gpt-4"}, this path will miss the returned model and BackfillModels can emit a doubly prefixed ID.

Suggested fix
 normalizedAliases := make(map[string]string, len(key.Aliases))
 for k, v := range key.Aliases {
-	normalizedAliases[stripPrefix(k)] = v
+	normalizedAliases[stripPrefix(k)] = stripPrefix(v)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openrouter/openrouter.go` around lines 190 - 208, The alias
targets in normalizedAliases are not being normalized, so alias entries like
key.Aliases (e.g., "gpt-4":"openrouter/openai/gpt-4") miss matching in
FilterModel and can cause BackfillModels to emit double-prefixed IDs; update the
loop that builds normalizedAliases to apply stripPrefix to both the alias key
and the alias value (use stripPrefix(k) and stripPrefix(v)) so both sides are
normalized before storing in normalizedAliases.
core/providers/elevenlabs/elevenlabs.go-245-246 (1)

245-246: ⚠️ Potential issue | 🟠 Major

Stop logging raw ElevenLabs error bodies.

These debug lines dump the full provider response, which can include request-derived text or other user data. Log status/error summaries instead of resp.Body().

Also applies to: 535-536

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/elevenlabs/elevenlabs.go` around lines 245 - 246, The debug
log currently prints the full provider response body
(provider.logger.Debug(fmt.Sprintf("error from %s provider: %s",
provider.GetProviderKey(), string(resp.Body())))), which may expose user data;
replace that call to avoid logging resp.Body() and instead log a safe summary
such as the provider key and HTTP status or the sanitized error message returned
by parseElevenlabsError(resp) (e.g., provider.GetProviderKey() and
resp.StatusCode or parsedErr.Message). Update both occurrences (around
provider.logger.Debug at the shown location and the similar lines at 535-536) to
call logger.Debug or logger.Warn with the provider key plus status/parsed error
text, and continue to return providerUtils.EnrichError(...) unchanged.
core/providers/mistral/mistral.go-307-308 (1)

307-308: ⚠️ Potential issue | 🟠 Major

Avoid logging raw Mistral error bodies.

These debug lines emit the full provider error payload. Transcription errors can echo request-derived text or metadata, so this creates an unnecessary sensitive-data trail in logs.

Also applies to: 439-440

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/mistral/mistral.go` around lines 307 - 308, The debug call
currently logs the full response body (provider.logger.Debug("error from %s
provider: %s", providerName, string(resp.Body()))) which can leak sensitive
transcription data; update both occurrences (around provider.logger.Debug at the
shown block and the similar call at lines ~439-440) to avoid logging resp.Body()
directly — instead log non-sensitive context like providerName and
resp.StatusCode() or the error type/length, or pass the structured error from
openai.ParseOpenAIError(resp) into the log without the raw body; ensure you
remove or redact any raw resp.Body() usage in provider.logger.Debug while
keeping the call to return openai.ParseOpenAIError(resp).
framework/configstore/migrations.go-3557-3604 (1)

3557-3604: ⚠️ Potential issue | 🟠 Major

Recompute config_hash after populating aliases_json.

In this stack, upgraded installs have already run the older key-hash backfills before this new migration exists. This migration changes the effective key config by copying legacy deployment mappings into aliases_json, but it never refreshes config_hash, so those keys stay permanently dirty after upgrade.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 3557 - 3604,
migrationDropDeploymentColumnsAndAddAliases currently copies legacy deployment
columns into aliases_json but never updates the derived config_hash, leaving
rows out-of-sync; after the loop that updates aliases_json (and before dropping
the legacy columns) add logic inside the same transaction to recompute and
persist config_hash for every affected config_keys row (e.g., rows where
aliases_json was set by the migration or where any of the legacy columns were
non-null). Use the existing application helper that computes a key's config hash
(or re-serialize the key config and hash it) and run an UPDATE on config_keys
within tx to set config_hash to the new value (refer to
migrationDropDeploymentColumnsAndAddAliases, tables.TableKey, aliases_json,
config_hash, and config_keys) so hashes reflect the new aliases_json before
dropping legacy columns.
framework/configstore/migrations.go-3606-3614 (1)

3606-3614: ⚠️ Potential issue | 🟠 Major

Rollback is currently destructive.

The up migration drops the legacy deployment columns after copying their contents into aliases_json, but the rollback only drops aliases_json. That turns a downgrade into data loss. If reverse reconstruction is not implemented, this rollback should be a no-op instead of deleting the migrated data.

💡 Safer rollback shape
 		Rollback: func(tx *gorm.DB) error {
-			tx = tx.WithContext(ctx)
-			m := tx.Migrator()
-			if m.HasColumn(&tables.TableKey{}, "aliases_json") {
-				if err := m.DropColumn(&tables.TableKey{}, "aliases_json"); err != nil {
-					return err
-				}
-			}
+			// One-way migration: the provider-specific source columns have already
+			// been dropped, so removing aliases_json here would lose user data.
 			return nil
 		},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 3606 - 3614, The rollback
in the migration's Rollback func currently drops the new "aliases_json" column
(see Rollback func, tx/Migrator m and m.DropColumn(&tables.TableKey{},
"aliases_json")), which will lose data because the up migration removed the
legacy deployment columns; change the Rollback to be a no-op (simply return nil)
unless you implement a reverse migration that reconstructs the original legacy
columns from aliases_json—so remove or comment out the HasColumn/DropColumn
logic for "aliases_json" in Rollback and leave the function returning nil.
core/bifrost.go-4076-4083 (1)

4076-4083: ⚠️ Potential issue | 🟠 Major

Don’t rebuild ExtraFields on the final error paths.

This stack is wiring ResolvedModelUsed out of requestWorker, but these wrappers replace the whole ExtraFields struct and only copy three legacy fields back. That drops the new alias-resolution metadata again on every bubbled error from handleRequest/handleStreamRequest.

🩹 Preserve the existing struct instead of clobbering it
-			primaryErr.ExtraFields = schemas.BifrostErrorExtraFields{
-				RequestType:            req.RequestType,
-				Provider:               provider,
-				OriginalModelRequested: model,
-				RawRequest:             primaryErr.ExtraFields.RawRequest,
-				RawResponse:            primaryErr.ExtraFields.RawResponse,
-				KeyStatuses:            primaryErr.ExtraFields.KeyStatuses,
-			}
+			primaryErr.ExtraFields.RequestType = req.RequestType
+			primaryErr.ExtraFields.Provider = provider
+			primaryErr.ExtraFields.OriginalModelRequested = model

Apply the same pattern to the fallback and streaming branches in this file.

Also applies to: 4127-4134, 4140-4147, 4191-4198, 4240-4247, 4253-4260

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4076 - 4083, The current error-wrapping code in
the final error paths (e.g., where primaryErr.ExtraFields is reassigned)
replaces the entire ExtraFields struct and drops alias-resolution metadata like
ResolvedModelUsed; instead preserve the existing ExtraFields and only update the
specific legacy fields. Modify the error handling in the fallback and streaming
branches (the same pattern used elsewhere) to merge into primaryErr.ExtraFields
rather than reassigning it — e.g., access primaryErr.ExtraFields and set
RequestType, Provider, OriginalModelRequested, RawRequest, RawResponse,
KeyStatuses as needed while leaving other fields (such as ResolvedModelUsed)
intact; apply this change to the other listed blocks (around lines shown:
4127-4134, 4140-4147, 4191-4198, 4240-4247, 4253-4260) so wrapped errors retain
alias-resolution metadata.
core/schemas/bifrost.go-815-816 (1)

815-816: ⚠️ Potential issue | 🟠 Major

Avoid an in-place rename of these serialized fields.

This changes the JSON contract for every success and error payload in the stack. Existing clients and any persisted payloads that still read model_requested / model_deployment will silently lose data unless this ships additively or behind versioning.

Also applies to: 982-984
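
💡 Illustrative additive sketch (the struct below is a reduced stand-in for the affected ExtraFields type, and the new JSON key names are hypothetical placeholders, not taken from the PR)

package schemas

// Keep the legacy JSON keys while also emitting the new ones, so clients that
// still read model_requested / model_deployment keep working.
type extraFieldsSketch struct {
	// Legacy keys, preserved for existing consumers.
	ModelRequested  string `json:"model_requested,omitempty"`
	ModelDeployment string `json:"model_deployment,omitempty"`

	// New keys (placeholder names), populated alongside the legacy ones.
	OriginalModelRequested string `json:"original_model_requested,omitempty"`
	ResolvedModelUsed      string `json:"resolved_model_used,omitempty"`
}

// setModelFields writes both the old and new keys from one source of truth,
// keeping payloads backward-compatible during the transition.
func setModelFields(f *extraFieldsSketch, requested, resolved string) {
	f.ModelRequested = requested
	f.OriginalModelRequested = requested
	f.ModelDeployment = resolved
	f.ResolvedModelUsed = resolved
}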

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/bifrost.go` around lines 815 - 816, Restore the original JSON
field names instead of renaming them in-place: change the json tags on the
Existing struct fields OriginalModelRequested and ResolvedModelUsed back to the
previous names (e.g., "model_requested" and "model_deployment") so existing
clients/payloads aren’t broken, and if you need the new key names add new
distinct fields (e.g., OriginalModelRequestedNew / ResolvedModelUsedNew) with
the new json tags and populate both the old and new fields where objects are
created/serialized; apply the same treatment to the other affected fields
referenced around lines 982-984 to preserve backward compatibility.
core/providers/replicate/utils.go-179-184 (1)

179-184: ⚠️ Potential issue | 🟠 Major

Deployment endpoint routing lacks validation of model parameter.

buildPredictionURL now checks useDeploymentsEndpoint before isVersionID(model), but there is no upstream validation ensuring that when UseDeploymentsEndpoint=true, the model parameter is a deployment slug. If a request reaches here with a raw version ID (64-char hex) or owner/model reference while UseDeploymentsEndpoint=true, it will incorrectly generate /v1/deployments/<raw-value>/predictions URLs.

Add validation to ensure only normalized deployment slugs are passed to the deployments endpoint, or perform model slug normalization before calling this function.
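
💡 Guard sketch (stand-in helpers; the real buildPredictionURL signature differs and isDeploymentSlug is a hypothetical name)

package replicate

import (
	"fmt"
	"regexp"
)

var (
	versionIDRe      = regexp.MustCompile(`^[0-9a-f]{64}$`)
	deploymentSlugRe = regexp.MustCompile(`^[\w.-]+/[\w.-]+$`)
)

func isVersionID(model string) bool      { return versionIDRe.MatchString(model) }
func isDeploymentSlug(model string) bool { return deploymentSlugRe.MatchString(model) }

// buildPredictionURL refuses to build a deployments route unless the model
// looks like an owner/name slug. Shape alone cannot distinguish a deployment
// slug from an owner/model reference, so callers still need to normalize
// aliases to deployment slugs before reaching this point.
func buildPredictionURL(baseURL, model string, useDeploymentsEndpoint bool) (string, error) {
	if useDeploymentsEndpoint {
		if isVersionID(model) || !isDeploymentSlug(model) {
			return "", fmt.Errorf("model %q is not a deployment slug, cannot use deployments endpoint", model)
		}
		return fmt.Sprintf("%s/v1/deployments/%s/predictions", baseURL, model), nil
	}
	if isVersionID(model) {
		// Version IDs go in the request body against the generic endpoint.
		return fmt.Sprintf("%s/v1/predictions", baseURL), nil
	}
	return fmt.Sprintf("%s/v1/models/%s/predictions", baseURL, model), nil
}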

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/replicate/utils.go` around lines 179 - 184, The
buildPredictionURL function may be called with raw version IDs or owner/model
refs while useDeploymentsEndpoint is true, producing invalid
/v1/deployments/<raw>/predictions paths; update buildPredictionURL to validate
and normalize the model parameter before using it for the deployments endpoint:
inside buildPredictionURL (and/or call sites), check that model matches the
expected deployment slug format (not a 64-char hex via isVersionID and not an
owner/model ref) and either return an error or convert/normalize the model to a
deployment slug; ensure the validation references the useDeploymentsEndpoint
flag and the isVersionID helper so only normalized slugs are inserted into the
deployments path.
core/providers/gemini/gemini.go-1980-1983 (1)

1980-1983: ⚠️ Potential issue | 🟠 Major

Don't log raw upstream error bodies here.

These debug lines can capture prompt/file content because provider validation errors often echo invalid input. That bypasses the existing raw-request/raw-response gating and turns debug logging into a quiet PII leak.

Suggested fix
- provider.logger.Debug(fmt.Sprintf("error from %s provider: %s", provider.GetProviderKey(), string(resp.Body())))
+ provider.logger.Debug("error from %s provider: status=%d", provider.GetProviderKey(), resp.StatusCode())

Also applies to: 2071-2074

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 1980 - 1983, The debug log
currently prints the raw upstream error body (using resp.Body()) which can leak
prompt/file content; update the error handling in the block that calls
providerUtils.MaterializeStreamErrorBody and returns providerUtils.EnrichError
so it no longer logs string(resp.Body()). Instead log a sanitized message with
provider.GetProviderKey() and relevant non-sensitive metadata (e.g.,
resp.StatusCode() or a fixed "upstream error" tag) or remove the Debug call
entirely, and apply the same change to the analogous error-handling block that
mirrors these lines (the later block around the parseGeminiError/EnrichError
usage).
transports/config.schema.json-1767-1772 (1)

1767-1772: ⚠️ Potential issue | 🟠 Major

Keep the alias migration backward-compatible in the schema.

aliases is the right unified shape, but this schema now rejects existing configs in two ways: legacy Azure/Bedrock/Vertex deployments maps are no longer accepted, and Replicate keys now require replicate_key_config.use_deployments_endpoint even though absence can safely mean false. Because schema validation runs before any runtime migration logic, this becomes a startup-breaking config change for existing installs.

As per coding guidelines, transports/config.schema.json is the authoritative definition for all config.json fields.

Also applies to: 1859-1886, 2084-2124

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1767 - 1772, The schema change
rejects existing configs by replacing legacy provider-specific maps; update the
schema so "aliases" remains the unified shape but also accepts legacy forms:
allow a top-level "deployments" object with additionalProperties:string (to
accept Azure/Bedrock/Vertex legacy maps) and relax Replicate-related validation
so "replicate_key_config.use_deployments_endpoint" is optional (absence = false)
or accept Replicate keys under the old key shape; modify the schema blocks
around the "aliases" definition and the Replicate key config (the sections
referencing "aliases", "deployments", and
"replicate_key_config.use_deployments_endpoint") to permit both the new and
legacy shapes without breaking startup validation.
core/providers/utils/models.go-203-237 (1)

203-237: ⚠️ Potential issue | 🟠 Major

Deduplicate backfills with the same equivalence rules used during filtering.

FilterModel resolves equivalence via MatchFns, but BackfillModels only checks included[strings.ToLower(...)]. Providers with normalization-based matchers can therefore include a model during the filter pass and then backfill the same logical model again under an equivalent spelling.
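
💡 Equivalence-aware dedup sketch (simplified signatures; MatchFn stands in for the pipeline's MatchFns)

package utils

import "strings"

// MatchFn reports whether two model identifiers refer to the same logical model.
type MatchFn func(a, b string) bool

// alreadyIncluded applies the same equivalence rules the filter pass uses,
// instead of comparing raw lowercased strings, so normalized spellings cannot
// slip back in during backfill.
func alreadyIncluded(candidate string, included []string, matchFns []MatchFn) bool {
	for _, existing := range included {
		if strings.EqualFold(existing, candidate) {
			return true
		}
		for _, match := range matchFns {
			if match(existing, candidate) {
				return true
			}
		}
	}
	return false
}

// backfillModels only appends candidates that are not equivalent to anything
// already included.
func backfillModels(included, candidates []string, matchFns []MatchFn) []string {
	out := included
	for _, c := range candidates {
		if !alreadyIncluded(c, out, matchFns) {
			out = append(out, c)
		}
	}
	return out
}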

Also applies to: 266-273, 299-303

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/models.go` around lines 203 - 237, FilterModel uses
equivalence via resolveModelID and the MatchFns but BackfillModels (and the
other affected blocks around the 266-273 and 299-303 regions) deduplicate by raw
lowercased strings, so logically-equivalent names slip through; change the
backfill/dedup logic to use the same equivalence check as filtering by either
storing and checking the resolvedName/alias from resolveModelID or calling
matches(resolvedNameOrCandidate, entry, p.MatchFns) when populating/consulting
the included map (i.e., replace uses of strings.ToLower(...) keying with a
lookup that uses resolveModelID and/or calls matches against existing included
entries using MatchFns so equivalent spellings are treated as duplicates).
core/schemas/account.go-129-129 (1)

129-129: ⚠️ Potential issue | 🟠 Major

Ensure Key.Aliases is deep-copied when keys are selected or filtered.

The Aliases field is a map (reference type) that is shared across shallow copies when keys are appended to slices during filtering in selectKeyFromProviderForModel() and similar functions. Add defensive copying using maps.Copy() in the key selection/filtering paths or implement a mechanism (such as a CheckAndSetDefaults() on Key, or explicit deep-copy in GetKeysForProvider implementations) to prevent concurrent requests from sharing the same backing map, following the same pattern already applied to NetworkConfig.ExtraHeaders in ProviderConfig.CheckAndSetDefaults().
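
💡 Defensive-copy sketch (Go 1.21+ maps.Clone; the Key shape below is reduced to the relevant fields)

package schemas

import "maps"

// Key is a reduced stand-in for the real schema; only the fields the sketch
// needs are shown.
type Key struct {
	ID      string
	Aliases map[string]string
}

// cloneForSelection returns a copy of the key whose Aliases map no longer
// shares backing storage with the original, so concurrent requests cannot
// observe each other's mutations.
func cloneForSelection(k Key) Key {
	if k.Aliases != nil {
		k.Aliases = maps.Clone(k.Aliases)
	}
	return k
}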

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/account.go` at line 129, The Aliases map on Key must be
deep-copied wherever Key instances are chosen or returned to avoid shared
backing maps; update selectKeyFromProviderForModel(), GetKeysForProvider
implementations and any key-filtering/append paths to make a defensive copy of
Key.Aliases (e.g., use maps.Copy() into a new map) before storing or returning
the Key, or implement Key.CheckAndSetDefaults() to clone Aliases and call it in
those selection paths; follow the same pattern used for
NetworkConfig.ExtraHeaders in ProviderConfig.CheckAndSetDefaults() so concurrent
requests never share the original Aliases map.
core/providers/anthropic/anthropic.go-1334-1337 (1)

1334-1337: ⚠️ Potential issue | 🟠 Major

Apply the normal EnrichError policy on batch/file HTTP failures.

These branches return ParseAnthropicError(resp) directly, so they skip the same raw-payload handling the completion paths use via providerUtils.EnrichError(...). In practice, raw requests are never attached here, and whatever payload policy parseAnthropicError applies is forwarded without the usual sendBackRaw* gating.

🔧 Representative fix
 	if resp.StatusCode() != fasthttp.StatusOK {
 		provider.logger.Debug("error from %s provider: %s", providerName, string(resp.Body()))
-		return nil, ParseAnthropicError(resp)
+		decodedBody, _ := providerUtils.CheckAndDecodeBody(resp)
+		return nil, providerUtils.EnrichError(
+			ctx,
+			ParseAnthropicError(resp),
+			jsonData,
+			decodedBody,
+			sendBackRawRequest,
+			sendBackRawResponse,
+		)
 	}

Use nil for the request bytes on GET-only routes that do not build a request body.

Also applies to: 1423-1426, 1514-1517, 1596-1599, 1708-1711, 1893-1896, 1983-1986, 2083-2086, 2165-2168, 2275-2278

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/anthropic.go` around lines 1334 - 1337, The
error-handling branches currently return ParseAnthropicError(resp) directly
(seen around resp.StatusCode() != fasthttp.StatusOK and
provider.logger.Debug(..., providerName, string(resp.Body()))), bypassing the
standard EnrichError logic; change these returns to call
providerUtils.EnrichError(provider.logger, providerName, resp,
ParseAnthropicError(resp), nil) (use nil for request bytes on GET-only routes)
so the normal sendBackRaw* gating and payload policies are applied; update every
similar branch (e.g., the ones calling ParseAnthropicError at lines noted) to
use providerUtils.EnrichError with nil request bytes instead of returning
ParseAnthropicError(resp) directly.
core/providers/anthropic/anthropic.go-1190-1192 (1)

1190-1192: ⚠️ Potential issue | 🟠 Major

Preserve converter-populated error extra fields.

Line 1191 resets the entire BifrostErrorExtraFields. That also strips any non-metadata context ToBifrostResponsesStream already attached before hooks/logging see the error.

🔧 Proposed fix
-				bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{}

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 75d6cd7b-ae9c-406a-8827-566f5d05c7c8

📥 Commits

Reviewing files that changed from the base of the PR and between ec8511b and b617cb3.

📒 Files selected for processing (150)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/image_edit.go
  • core/internal/llmtests/image_generation.go
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/speech_synthesis.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/internal/llmtests/transcription_stream.go
  • core/internal/llmtests/video.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/batch.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/anthropic/models.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/text.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/utils.go
  • core/providers/azure/azure.go
  • core/providers/azure/files.go
  • core/providers/azure/models.go
  • core/providers/azure/utils.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/invoke.go
  • core/providers/bedrock/models.go
  • core/providers/bedrock/rerank_test.go
  • core/providers/bedrock/s3.go
  • core/providers/bedrock/signer.go
  • core/providers/bedrock/text.go
  • core/providers/cerebras/cerebras.go
  • core/providers/cohere/chat.go
  • core/providers/cohere/cohere.go
  • core/providers/cohere/errors.go
  • core/providers/cohere/models.go
  • core/providers/elevenlabs/elevenlabs.go
  • core/providers/elevenlabs/errors.go
  • core/providers/elevenlabs/models.go
  • core/providers/gemini/batch.go
  • core/providers/gemini/errors.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/models.go
  • core/providers/gemini/videos.go
  • core/providers/groq/groq.go
  • core/providers/huggingface/errors.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/responses.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/huggingface/utils.go
  • core/providers/mistral/mistral.go
  • core/providers/mistral/models.go
  • core/providers/mistral/transcription.go
  • core/providers/nebius/errors.go
  • core/providers/nebius/nebius.go
  • core/providers/ollama/ollama.go
  • core/providers/openai/batch.go
  • core/providers/openai/errors.go
  • core/providers/openai/errors_test.go
  • core/providers/openai/files.go
  • core/providers/openai/images.go
  • core/providers/openai/large_payload.go
  • core/providers/openai/models.go
  • core/providers/openai/openai.go
  • core/providers/openai/transcription.go
  • core/providers/openai/videos.go
  • core/providers/openrouter/openrouter.go
  • core/providers/parasail/parasail.go
  • core/providers/perplexity/chat.go
  • core/providers/perplexity/perplexity.go
  • core/providers/replicate/errors.go
  • core/providers/replicate/files.go
  • core/providers/replicate/images.go
  • core/providers/replicate/models.go
  • core/providers/replicate/replicate.go
  • core/providers/replicate/utils.go
  • core/providers/replicate/videos.go
  • core/providers/runway/errors.go
  • core/providers/runway/runway.go
  • core/providers/runway/videos.go
  • core/providers/sgl/sgl.go
  • core/providers/utils/large_response.go
  • core/providers/utils/models.go
  • core/providers/utils/utils.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/errors.go
  • core/providers/vertex/models.go
  • core/providers/vertex/rerank.go
  • core/providers/vertex/rerank_test.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/utils.go
  • core/providers/vllm/vllm.go
  • core/providers/xai/errors.go
  • core/providers/xai/xai.go
  • core/schemas/account.go
  • core/schemas/bifrost.go
  • core/schemas/chatcompletions.go
  • core/schemas/models.go
  • core/schemas/models_test.go
  • core/utils.go
  • docs/features/litellm-compat.mdx
  • docs/openapi/schemas/management/providers.yaml
  • docs/providers/provider-routing.mdx
  • docs/providers/supported-providers/azure.mdx
  • docs/providers/supported-providers/bedrock.mdx
  • docs/providers/supported-providers/replicate.mdx
  • docs/providers/supported-providers/vertex.mdx
  • docs/quickstart/gateway/provider-configuration.mdx
  • docs/quickstart/go-sdk/provider-configuration.mdx
  • framework/configstore/clientconfig.go
  • framework/configstore/encryption_test.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/encryption_test.go
  • framework/configstore/tables/key.go
  • framework/configstore/tables/virtualkey.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/streaming/accumulator_test.go
  • framework/streaming/images.go
  • framework/streaming/types.go
  • plugins/governance/model_provider_governance_test.go
  • plugins/litellmcompat/texttochat.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/mocker/main.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • tests/governance/e2e_test.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/provider_keys.go
  • transports/bifrost-http/handlers/wsresponses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/cursor.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/config_test.go
  • transports/config.schema.json
  • transports/schema_test/config_schema_test.go
  • ui/app/workspace/providers/fragments/apiKeysFormFragment.tsx
  • ui/lib/constants/config.ts
  • ui/lib/schemas/providerForm.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
  • ui/lib/utils/validation.ts
💤 Files with no reviewable changes (13)
  • core/providers/cohere/chat.go
  • core/providers/vertex/embedding.go
  • core/providers/perplexity/chat.go
  • core/providers/groq/groq.go
  • core/providers/bedrock/chat.go
  • core/providers/cerebras/cerebras.go
  • core/providers/sgl/sgl.go
  • core/providers/parasail/parasail.go
  • framework/configstore/tables/virtualkey.go
  • core/providers/huggingface/responses.go
  • core/providers/anthropic/text.go
  • core/providers/vertex/rerank_test.go
  • core/providers/replicate/files.go

Comment thread core/providers/azure/azure.go
Comment thread core/providers/bedrock/invoke.go Outdated
Comment thread framework/configstore/migrations.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from b617cb3 to 706f58a on March 30, 2026 08:59
Collaborator Author

@greptile

Comment thread core/schemas/bifrost.go

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
framework/configstore/clientconfig.go (1)

585-592: ⚠️ Potential issue | 🟠 Major

Don't hash an empty alias map as a real config change.

Lines 585-592 hash Aliases whenever the map is non-nil, so nil and {} generate different config_hash values. The rest of this file treats empty collections as hash-noops, and framework/configstore/migrations.go also treats aliases_json = '{}' as empty, so a key that round-trips through the DB with an empty alias object can look perpetually dirty.

🛠️ Minimal fix
-	if key.Aliases != nil {
+	if len(key.Aliases) > 0 {
 		data, err := sonic.Marshal(key.Aliases)
 		if err != nil {
 			return "", err
 		}
 		hash.Write(data)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/clientconfig.go` around lines 585 - 592, The code
currently hashes key.Aliases whenever the map is non-nil, causing nil and empty
maps to produce different config_hash values; update the Hash Aliases block in
the function that builds the config hash so it only marshals and writes aliases
when the map is non-nil AND contains entries (e.g., check len(key.Aliases) > 0),
keeping the existing sonic.Marshal error handling and hash.Write usage (refer to
the key.Aliases, sonic.Marshal, and hash.Write symbols).
framework/streaming/types.go (1)

399-409: ⚠️ Potential issue | 🟡 Minor

Always overwrite the image stream model.

Line 407 only assigns imageResp.Model inside the imageResp == nil branch. On the normal final-chunk path p.Data.ImageGenerationOutput is already populated, so alias-backed image streams keep the stale/provider model while the other stream branches now surface RequestedModel.

🛠️ Minimal fix
 		if imageResp == nil {
 			imageResp = &schemas.BifrostImageGenerationResponse{
 				Data: make([]schemas.ImageData, 0),
 			}
 			if p.RequestID != "" {
 				imageResp.ID = p.RequestID
 			}
-			if p.RequestedModel != "" {
-				imageResp.Model = p.RequestedModel
-			}
 		}
+		if p.RequestedModel != "" {
+			imageResp.Model = p.RequestedModel
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/streaming/types.go` around lines 399 - 409, The code only assigns
imageResp.Model when imageResp is nil, so existing p.Data.ImageGenerationOutput
can keep a stale provider model; after obtaining or creating imageResp (the
variable assigned from p.Data.ImageGenerationOutput), always overwrite its Model
with p.RequestedModel when p.RequestedModel is non-empty (and likewise ensure ID
is set from p.RequestID if needed) — i.e., move or add the assignment
imageResp.Model = p.RequestedModel (guarded by p.RequestedModel != "") outside
the imageResp == nil branch so both newly created and existing imageResp objects
get the requested model.
♻️ Duplicate comments (9)
core/providers/huggingface/huggingface.go (1)

870-873: ⚠️ Potential issue | 🟠 Major

Preserve the raw-audio guard on this decode-error path.

When isHFInferenceAudioRequest is true, jsonData is the uploaded audio bytes. Passing it into EnrichError(...) here can attach the raw audio to RawRequest and downstream logs on converter failures. Use nil for the request body on this branch, the same way the other hf-inference transcription error paths already do.

♻️ Proposed fix
  bifrostResponse, convErr := response.ToBifrostTranscriptionResponse(request.Model)
  if convErr != nil {
-		return nil, providerUtils.EnrichError(ctx, providerUtils.NewBifrostOperationError(schemas.ErrProviderResponseDecode, convErr), jsonData, responseBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+		requestBody := jsonData
+		if isHFInferenceAudioRequest {
+			requestBody = nil
+		}
+		return nil, providerUtils.EnrichError(
+			ctx,
+			providerUtils.NewBifrostOperationError(schemas.ErrProviderResponseDecode, convErr),
+			requestBody,
+			responseBody,
+			provider.sendBackRawRequest,
+			provider.sendBackRawResponse,
+		)
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` around lines 870 - 873, The
decode-error path after calling response.ToBifrostTranscriptionResponse
currently passes jsonData into providerUtils.EnrichError which can expose raw
audio when isHFInferenceAudioRequest is true; update this branch to pass nil as
the request body when isHFInferenceAudioRequest is true (otherwise keep
jsonData), i.e. call providerUtils.EnrichError(ctx,
providerUtils.NewBifrostOperationError(...), nil, responseBody,
provider.sendBackRawRequest, provider.sendBackRawResponse) when
isHFInferenceAudioRequest is true, referencing
response.ToBifrostTranscriptionResponse, isHFInferenceAudioRequest, jsonData,
responseBody and provider.sendBackRawRequest/sendBackRawResponse to locate the
change.
plugins/maxim/main.go (1)

614-619: ⚠️ Potential issue | 🟡 Minor

Use requested-model fallback when setting the Maxim model tag.
On Line 614 and Line 615, resolvedModel can be empty in some paths; this writes empty model tags even when originalModel is available.

Suggested patch
-		logger.AddTagToGeneration(generationID, "model", string(resolvedModel))
-		logger.AddTagToTrace(traceID, "model", string(resolvedModel))
-		if originalModel != "" && originalModel != resolvedModel {
+		modelTag := strings.TrimSpace(resolvedModel)
+		if modelTag == "" {
+			modelTag = strings.TrimSpace(originalModel)
+		}
+		if modelTag != "" {
+			logger.AddTagToGeneration(generationID, "model", modelTag)
+			logger.AddTagToTrace(traceID, "model", modelTag)
+		}
+		if originalModel != "" && resolvedModel != "" && originalModel != resolvedModel {
 			logger.AddTagToGeneration(generationID, "alias", originalModel)
 			logger.AddTagToTrace(traceID, "alias", originalModel)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/maxim/main.go` around lines 614 - 619, The model tag is written using
resolvedModel even when it's empty, causing empty tags; update the logic around
logger.AddTagToGeneration and logger.AddTagToTrace to use originalModel as a
fallback: when resolvedModel is non-empty, set the "model" tag to resolvedModel
(and only add "alias" if originalModel is non-empty and different from
resolvedModel); when resolvedModel is empty but originalModel is present, set
the "model" tag to originalModel and do not add an "alias". Adjust the code that
references generationID, traceID, resolvedModel, originalModel,
logger.AddTagToGeneration and logger.AddTagToTrace accordingly.
core/providers/bedrock/invoke.go (2)

451-460: ⚠️ Potential issue | 🔴 Critical

Prefer the resolved model before resp.Model for family dispatch.

core/providers/bedrock/bedrock.go:1369 writes request.Model back into resp.Model, so aliases like my-nova arrive here alias-shaped. Preferring resp.Model reopens the Nova/AI21 misrouting and can send non-Anthropic responses through toBedrockInvokeAnthropicResponse(...).

Suggested fix
-	model := ""
-	if resp.Model != "" {
-		model = resp.Model
-	} else {
-		extraFields := resp.ExtraFields
-		if extraFields.ResolvedModelUsed != "" {
-			model = extraFields.ResolvedModelUsed
-		} else if extraFields.OriginalModelRequested != "" {
-			model = extraFields.OriginalModelRequested
-		}
-	}
+	model := resp.ExtraFields.ResolvedModelUsed
+	if model == "" {
+		model = resp.Model
+	}
+	if model == "" {
+		model = resp.ExtraFields.OriginalModelRequested
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 451 - 460, The model selection
currently prefers resp.Model which may contain an alias and cause misrouting;
update the logic in the model resolution block (the local variable model,
resp.ExtraFields, and related code path that later calls
toBedrockInvokeAnthropicResponse) to prefer resp.ExtraFields.ResolvedModelUsed
first, then resp.ExtraFields.OriginalModelRequested, and finally fall back to
resp.Model; ensure any downstream family-dispatch decisions use this resolved
model variable instead of resp.Model directly.

628-643: ⚠️ Potential issue | 🔴 Critical

Early stream chunks still ignore the resolved model on the event itself.

The comment above this block is right: chunk-level resp.ExtraFields is the only model source that exists on every event. core/providers/bedrock/responses.go:177-201 only populates resp.Response.Model on Created/InProgress events, so Nova/AI21 delta chunks will still hit the Anthropic path here.

Suggested fix
-	model := ""
-	if resp.Response != nil {
-		if resp.Response.Model != "" {
-			model = resp.Response.Model
-		} else {
-			extraFields := resp.Response.ExtraFields
-			if extraFields.ResolvedModelUsed != "" {
-				model = extraFields.ResolvedModelUsed
-			} else if extraFields.OriginalModelRequested != "" {
-				model = extraFields.OriginalModelRequested
-			}
-		}
-	}
+	model := resp.ExtraFields.ResolvedModelUsed
+	if model == "" {
+		model = resp.ExtraFields.OriginalModelRequested
+	}
+	if model == "" && resp.Response != nil {
+		if resp.Response.ExtraFields.ResolvedModelUsed != "" {
+			model = resp.Response.ExtraFields.ResolvedModelUsed
+		} else if resp.Response.Model != "" {
+			model = resp.Response.Model
+		} else if resp.Response.ExtraFields.OriginalModelRequested != "" {
+			model = resp.Response.ExtraFields.OriginalModelRequested
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 628 - 643, The current logic
reads model only from resp.Response.ExtraFields, but chunk-level model info
lives on resp.ExtraFields for every streaming event; update the model selection
to first check resp.ExtraFields.ResolvedModelUsed then
resp.ExtraFields.OriginalModelRequested, then fall back to resp.Response.Model
and resp.Response.ExtraFields as before (modify the block that sets model,
referencing the resp variable, resp.ExtraFields, resp.Response,
ExtraFields.ResolvedModelUsed, and ExtraFields.OriginalModelRequested) so early
delta chunks correctly capture the resolved model and are routed to the right
provider.
core/providers/elevenlabs/elevenlabs.go (1)

267-268: ⚠️ Potential issue | 🟡 Minor

These parse/shape-error branches still bypass EnrichError(...).

After a 200 response, these branches return NewBifrostOperationError(...) directly, so the raw response bytes and forwarded headers are dropped again. The transport/decode branches nearby already enrich errors; these remaining parse/shape branches need the same treatment. For transcription, the request bytes can stay nil if you do not want to echo multipart audio.

Suggested pattern
-	return nil, providerUtils.NewBifrostOperationError("failed to parse with-timestamps response", err)
+	return nil, providerUtils.EnrichError(
+		ctx,
+		providerUtils.NewBifrostOperationError("failed to parse with-timestamps response", err),
+		jsonData,
+		body,
+		provider.sendBackRawRequest,
+		provider.sendBackRawResponse,
+	)

Use the same shape for transcription with responseBody as the response payload and nil request bytes where you do not want to surface uploaded audio.

Also applies to: 536-539, 552-558, 567-570

core/providers/bedrock/bedrock.go (2)

3424-3431: ⚠️ Potential issue | 🟠 Major

This still needs the plain-model escaping branch in getModelPath.

The helper still formats the plain model branch verbatim, so alias-resolved ARNs or any model identifier containing / generate the wrong /model/... route. It also prepends key.BedrockKeyConfig.ARN even when model is already an ARN.

🐛 Possible fix
 func (provider *BedrockProvider) getModelPath(basePath string, model string, key schemas.Key) string {
-	path := fmt.Sprintf("%s/%s", model, basePath)
-	// If ARN is present, Bedrock expects the ARN-scoped identifier
-	if key.BedrockKeyConfig != nil && key.BedrockKeyConfig.ARN != nil && key.BedrockKeyConfig.ARN.GetValue() != "" {
-		encodedModelIdentifier := url.PathEscape(fmt.Sprintf("%s/%s", key.BedrockKeyConfig.ARN.GetValue(), model))
-		path = fmt.Sprintf("%s/%s", encodedModelIdentifier, basePath)
-	}
-	return path
+	modelIdentifier := model
+	if key.BedrockKeyConfig != nil && key.BedrockKeyConfig.ARN != nil && key.BedrockKeyConfig.ARN.GetValue() != "" && !strings.HasPrefix(model, "arn:") {
+		modelIdentifier = fmt.Sprintf("%s/%s", key.BedrockKeyConfig.ARN.GetValue(), model)
+	}
+	return fmt.Sprintf("%s/%s", url.PathEscape(modelIdentifier), basePath)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 3424 - 3431, The getModelPath
function currently constructs the path without escaping the plain model branch
and may incorrectly prepend key.BedrockKeyConfig.ARN when the provided model is
already an ARN; update getModelPath to (1) detect if model is an ARN (e.g.,
starts with "arn:") and treat it as the scoped identifier without prepending the
key ARN, (2) otherwise url.PathEscape the plain model before inserting into the
"%s/%s" format, and (3) when using key.BedrockKeyConfig.ARN, url.PathEscape the
combined ARN/model identifier instead of inserting raw values; modify the logic
in getModelPath to choose between escaped plain model, escaped model ARN, or
escaped combined key ARN + model accordingly.

2160-2163: ⚠️ Potential issue | 🟡 Minor

These error strings still start with uppercase "S3 ...".

These provider API messages should start with s3, s3 head, s3 delete, etc.

🔧 Suggested casing fixes
- return nil, providerUtils.NewProviderAPIError(fmt.Sprintf("S3 upload failed: %s", string(body)), nil, resp.StatusCode, nil, nil)
+ return nil, providerUtils.NewProviderAPIError(fmt.Sprintf("s3 upload failed: %s", string(body)), nil, resp.StatusCode, nil, nil)

- return nil, providerUtils.NewProviderAPIError(fmt.Sprintf("S3 list failed: %s", string(body)), nil, resp.StatusCode, nil, nil)
+ return nil, providerUtils.NewProviderAPIError(fmt.Sprintf("s3 list failed: %s", string(body)), nil, resp.StatusCode, nil, nil)

- lastErr = providerUtils.NewProviderAPIError(fmt.Sprintf("S3 HEAD failed with status %d", resp.StatusCode), nil, resp.StatusCode, nil, nil)
+ lastErr = providerUtils.NewProviderAPIError(fmt.Sprintf("s3 head failed with status %d", resp.StatusCode), nil, resp.StatusCode, nil, nil)

- lastErr = providerUtils.NewProviderAPIError(fmt.Sprintf("S3 DELETE failed: %s", string(body)), nil, resp.StatusCode, nil, nil)
+ lastErr = providerUtils.NewProviderAPIError(fmt.Sprintf("s3 delete failed: %s", string(body)), nil, resp.StatusCode, nil, nil)

- lastErr = providerUtils.NewProviderAPIError(fmt.Sprintf("S3 GET failed: %s", string(body)), nil, resp.StatusCode, nil, nil)
+ lastErr = providerUtils.NewProviderAPIError(fmt.Sprintf("s3 get failed: %s", string(body)), nil, resp.StatusCode, nil, nil)
As per coding guidelines: "Error strings should be lowercase with no trailing punctuation (Go convention)."

Also applies to: 2297-2298, 2409-2411, 2513-2516, 2598-2601

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 2160 - 2163, The error
messages use uppercase "S3 ..." and punctuation; update the s3-related error
strings to be lowercase and without trailing punctuation: change the
provider.logger.Error call and the providerUtils.NewProviderAPIError message in
the s3 upload handling (the provider.logger.Error("s3 upload failed: %d",
resp.StatusCode) and NewProviderAPIError(fmt.Sprintf("s3 upload failed: %s",
...))) and apply the same lowercase, no-trailing-punctuation fix to the other s3
error messages in the file (the similar provider.logger.Error /
NewProviderAPIError occurrences for s3 head, s3 delete, etc.).
core/providers/azure/azure.go (1)

428-439: ⚠️ Potential issue | 🟠 Major

Forward the returned error body into EnrichError.

completeRequest now returns a copied error body, but these branches still call EnrichError(..., nil, ...). That makes sendBackRawResponse ineffective again for every failure through this helper, and because these calls still pass the provider defaults, request-level raw payload overrides are ignored too.

Suggested fix
 if err != nil {
-	return nil, providerUtils.EnrichError(ctx, err, jsonData, nil, provider.sendBackRawRequest, provider.sendBackRawResponse)
+	return nil, providerUtils.EnrichError(
+		ctx,
+		err,
+		jsonData,
+		responseBody,
+		providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+		providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+	)
 }

Also applies to: 544-555, 714-725, 849-860

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/azure/azure.go` around lines 428 - 439, The calls to
providerUtils.EnrichError after provider.completeRequest are passing nil for the
error body, which prevents sendBackRawResponse and request-level raw overrides
from receiving the returned error payload; update the EnrichError invocations in
the completion handling paths (e.g., the block after provider.completeRequest in
azure.go and the similar blocks around the other occurrences) to pass the copied
error body returned by completeRequest instead of nil (use the local variable
holding the returned error body), ensuring provider.sendBackRawResponse and any
request-level raw payload overrides are honored.
core/providers/replicate/replicate.go (1)

252-255: ⚠️ Potential issue | 🟠 Major

Keep polling expiration classified as a timeout.

This still returns NewBifrostOperationError(...), which downgrades a real polling timeout into a generic provider failure. Use NewBifrostTimeoutError(...) so timeout-specific handling and metrics keep working.

♻️ Suggested fix
-			return nil, nil, providerResponseHeaders, providerUtils.NewBifrostOperationError(
+			return nil, nil, providerResponseHeaders, providerUtils.NewBifrostTimeoutError(
 				schemas.ErrProviderRequestTimedOut,
 				fmt.Errorf("prediction polling timed out after %d seconds", timeoutSeconds))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/replicate/replicate.go` around lines 252 - 255, Replace the
generic operation error returned when polling expires with a timeout-specific
error so timeout handling/metrics remain accurate: in the pollCtx.Done() branch
(the block returning providerResponseHeaders and calling
providerUtils.NewBifrostOperationError), change the call to
providerUtils.NewBifrostTimeoutError(...) and pass the same
schemas.ErrProviderRequestTimedOut and the fmt.Errorf("prediction polling timed
out after %d seconds", timeoutSeconds) message so the semantics remain but the
error is classified as a timeout.
🧹 Nitpick comments (3)
docs/providers/supported-providers/azure.mdx (1)

171-179: Clarify field scope in “Configuration Details” to avoid config-shape confusion.

This section currently mixes azure_key_config fields and top-level key fields in one list. Consider splitting them into separate sublists for readability.

✏️ Suggested docs refactor
-**Configuration Details**:
-- `endpoint` - Azure OpenAI resource endpoint (required)
-- `client_id` - Azure Entra ID client ID (optional, for Service Principal auth)
-- `client_secret` - Azure Entra ID client secret (optional, for Service Principal auth)
-- `tenant_id` - Azure Entra ID tenant ID (optional, for Service Principal auth)
-- `scopes` - OAuth scopes for token requests (default: `["https://cognitiveservices.azure.com/.default"]`)
-- `api_version` - API version to use (default: `2024-10-21`)
-- `aliases` - Map of model names to Azure deployment IDs (optional, set at key level)
-- `allowed_models` - List of allowed models to use from this key (optional)
+**Configuration Details**:
+
+**`azure_key_config` fields**
+- `endpoint` - Azure OpenAI resource endpoint (required)
+- `client_id` - Azure Entra ID client ID (optional, for Service Principal auth)
+- `client_secret` - Azure Entra ID client secret (optional, for Service Principal auth)
+- `tenant_id` - Azure Entra ID tenant ID (optional, for Service Principal auth)
+- `scopes` - OAuth scopes for token requests (default: `["https://cognitiveservices.azure.com/.default"]`)
+- `api_version` - API version to use (default: `2024-10-21`)
+
+**Top-level key fields**
+- `aliases` - Map of model names to Azure deployment IDs (optional)
+- `allowed_models` - List of allowed models to use from this key (optional)

Based on learnings: When reviewing documentation PRs in a Graphite-managed stack, verify docs against stack-wide implementations before flagging unsupported behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/supported-providers/azure.mdx` around lines 171 - 179, The
"Configuration Details" list mixes fields that belong to the top-level provider
key and fields that belong to per-key credentials (azure_key_config), causing
config-shape confusion; split this single list into two sublists labeled
"Top-level key fields" and "azure_key_config fields" and move each symbol into
the correct list (e.g., keep provider-level settings like aliases,
allowed_models and api_version under top-level key fields if applicable, and
place credential-specific entries like endpoint, client_id, client_secret,
tenant_id, scopes under azure_key_config), update the section headings and brief
notes to clarify which fields are required vs optional and which object each
field belongs to.
core/schemas/bifrost.go (1)

801-1006: Avoid maintaining a second response-dispatch switch here.

PopulateExtraFields now has to stay in lockstep with GetExtraFields() across every concrete response type. A shared private selector/helper would make future response additions much less likely to miss metadata wiring in one of the two paths.
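
💡 Shared-selector sketch (reduced stand-in types; the real BifrostResponse has many more variants and a richer ExtraFields struct)

package schemas

// ExtraFields is a stand-in for the metadata struct both methods populate/read.
type ExtraFields struct {
	RequestType            string
	Provider               string
	OriginalModelRequested string
	ResolvedModelUsed      string
}

type chatResponse struct{ ExtraFields ExtraFields }
type imageResponse struct{ ExtraFields ExtraFields }

type BifrostResponse struct {
	Chat  *chatResponse
	Image *imageResponse
}

// extraFieldsRef is the single dispatch point: it returns a pointer to the
// ExtraFields of whichever concrete response is present, or nil.
func (r *BifrostResponse) extraFieldsRef() *ExtraFields {
	switch {
	case r.Chat != nil:
		return &r.Chat.ExtraFields
	case r.Image != nil:
		return &r.Image.ExtraFields
	default:
		return nil
	}
}

// Both accessors reuse the selector, so adding a new response type means
// updating one switch instead of two.
func (r *BifrostResponse) GetExtraFields() *ExtraFields { return r.extraFieldsRef() }

func (r *BifrostResponse) PopulateExtraFields(requestType, provider, requested, resolved string) {
	if ef := r.extraFieldsRef(); ef != nil {
		ef.RequestType = requestType
		ef.Provider = provider
		ef.OriginalModelRequested = requested
		ef.ResolvedModelUsed = resolved
	}
}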

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/bifrost.go` around lines 801 - 1006, PopulateExtraFields
duplicates the response-type dispatch from GetExtraFields; extract the common
selector into a single private helper (e.g., selectExtraFieldsFromResponse or
responseExtraFields) that returns a pointer/reference to the appropriate
ExtraFields struct for the first non-nil concrete response (or nil if none),
then update PopulateExtraFields to call that helper and set
RequestType/Provider/OriginalModelRequested/ResolvedModelUsed (compute
resolvedModel same as before). Also replace the duplicated switch in
GetExtraFields to call the same helper so both code paths use the single
selector; keep nil-safety and preserve existing behavior for all response types
(use the existing type identifiers like BifrostResponse, PopulateExtraFields,
GetExtraFields, and the various XxxResponse fields to locate code).
core/providers/utils/utils.go (1)

2087-2109: Remove the unused requestType parameter.

This parameter is not referenced in the function body and both callers (OpenAI and Replicate providers) pass the identical fixed value schemas.TextCompletionStreamRequest. Removing it would eliminate dead code while requiring updates to only two call sites.

♻️ Proposed signature cleanup
 func CreateBifrostTextCompletionChunkResponse(
 	id string,
 	usage *schemas.BifrostLLMUsage,
 	finishReason *string,
 	currentChunkIndex int,
-	requestType schemas.RequestType,
 ) *schemas.BifrostTextCompletionResponse {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/utils.go` around lines 2087 - 2109, The
CreateBifrostTextCompletionChunkResponse function has an unused requestType
parameter; remove the requestType parameter from the function signature and body
in utils.go (function CreateBifrostTextCompletionChunkResponse) and update all
call sites that pass schemas.TextCompletionStreamRequest (e.g., the OpenAI and
Replicate provider call sites) to stop passing that extra argument; ensure
compile by adjusting any imports or tests referencing the old signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/azure/azure.go`:
- Around line 428-434: The Azure OpenAI calls are interpolating request.Model
into paths (e.g., the provider.completeRequest call) but must use the actual
Azure deployment name; add a model->deployment resolution step (e.g., a helper
like resolveAzureDeploymentName(model, providerConfig) or a deploymentMap) and
use the resolved deployment ID when building URLs instead of request.Model;
apply this change before calls in the TextCompletion, Responses (streaming),
ChatCompletion, ChatCompletionStream, Embedding, SpeechGenerate,
SpeechGenerateStream, TranscribeAudio, GenerateImages, and EditImages code paths
(similar to the Anthropic handling in getRequestBodyForAnthropicResponses) so
all fmt.Sprintf("openai/deployments/%s/...", ...) use the resolved deployment
name.
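
A possible shape for that helper (resolveAzureDeploymentName and the alias-map parameter are assumptions for illustration, not the PR's actual code):

package azure

import "fmt"

// resolveAzureDeploymentName maps a logical model name to the Azure deployment
// ID from the key's alias map, falling back to the model name when no alias is
// configured.
func resolveAzureDeploymentName(model string, aliases map[string]string) string {
	if deployment, ok := aliases[model]; ok && deployment != "" {
		return deployment
	}
	return model
}

// buildChatCompletionsPath shows the intended call-site change: interpolate the
// resolved deployment, not the raw request model.
func buildChatCompletionsPath(model, apiVersion string, aliases map[string]string) string {
	deployment := resolveAzureDeploymentName(model, aliases)
	return fmt.Sprintf("openai/deployments/%s/chat/completions?api-version=%s", deployment, apiVersion)
}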

In `@core/providers/elevenlabs/models.go`:
- Around line 47-48: Backfilled model IDs returned by
ListModelsPipeline.BackfillModels are bare (e.g. "<id>") and lose the provider
prefix emitted by FilterModel (e.g. "elevenlabs/<id>"); update the append site
where bifrostResponse.Data = append(... pipeline.BackfillModels(included) ...)
to re-prefix each returned ID with the provider name "elevenlabs/" (or
alternatively fix ListModelsPipeline.BackfillModels to always return
fully-qualified IDs) so that IDs remain consistent with FilterModel and other
callers; locate the append call and either map over BackfillModels(included) to
prefix missing provider tokens or modify BackfillModels to emit the prefixed
form.
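
A possible shape for the re-prefixing step (simplified to plain string IDs; the real pipeline returns richer entries):

package elevenlabs

import "strings"

// prefixBackfilledModels qualifies bare model IDs with the provider name so
// backfilled entries stay consistent with the "elevenlabs/<id>" form emitted
// during filtering.
func prefixBackfilledModels(ids []string) []string {
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		if !strings.HasPrefix(id, "elevenlabs/") {
			id = "elevenlabs/" + id
		}
		out = append(out, id)
	}
	return out
}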

In `@core/providers/runway/runway.go`:
- Around line 212-214: The error branches that handle a successful HTTP status
but a malformed body must preserve the response payload when calling
EnrichError; update the CheckAndDecodeBody error path in Runway (where body, err
:= providerUtils.CheckAndDecodeBody(resp) is used) to pass the raw response
bytes instead of nil into providerUtils.EnrichError, and make the analogous
changes in the VideoRetrieve and VideoDownload flows so malformed-body branches
call EnrichError (not bypass it) with sendBackRawResponse enabled and with nil
for request bytes only where appropriate; ensure you reference the resp/raw
response byte variable used by CheckAndDecodeBody and pass that into EnrichError
so the response payload surfaces in logs and error payloads.

In `@framework/configstore/tables/key.go`:
- Around line 64-65: The ReplicateUseDeploymentsEndpoint pointer can be NULL and
AfterFind rebuilds ReplicateKeyConfig as nil, causing dereference panics when
code accesses key.ReplicateKeyConfig.UseDeploymentsEndpoint; fix by ensuring a
non-nil false default is persisted or reconstructed: either change the
ReplicateUseDeploymentsEndpoint field handling so it never round-trips as nil
(persist a false value by default or switch to a non-pointer bool in
ReplicateKeyConfig) or update AfterFind to allocate ReplicateKeyConfig and set
ReplicateUseDeploymentsEndpoint = ptr(false) when the DB value is NULL; update
references in ReplicateKeyConfig, AfterFind, and any code that constructs or
reads ReplicateUseDeploymentsEndpoint accordingly (e.g.,
ReplicateUseDeploymentsEndpoint, ReplicateKeyConfig, AfterFind).
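
A possible shape for the AfterFind defaulting (table struct reduced to the relevant columns; the real table and column names may differ):

package tables

import "gorm.io/gorm"

type ReplicateKeyConfig struct {
	UseDeploymentsEndpoint *bool
}

type TableKey struct {
	ReplicateUseDeploymentsEndpoint *bool
	ReplicateKeyConfig              *ReplicateKeyConfig `gorm:"-"`
}

// AfterFind rebuilds ReplicateKeyConfig with a non-nil false default when the
// stored value is NULL, so callers can dereference UseDeploymentsEndpoint safely.
func (k *TableKey) AfterFind(tx *gorm.DB) error {
	useDeployments := k.ReplicateUseDeploymentsEndpoint
	if useDeployments == nil {
		defaultFalse := false
		useDeployments = &defaultFalse
	}
	k.ReplicateKeyConfig = &ReplicateKeyConfig{UseDeploymentsEndpoint: useDeployments}
	return nil
}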

In `@framework/streaming/transcription.go`:
- Around line 196-204: The returned ProcessedStreamResponse structs are missing
the legacy Model field that the logging pipeline still reads; populate that
legacy field (e.g., Model) with the resolved model value (resolvedModel) in each
return (the block that builds &ProcessedStreamResponse and the similar block
later) so the old logging code continues to get model metadata until consumers
migrate; update both return sites that construct ProcessedStreamResponse to set
Model: resolvedModel.

---

Outside diff comments:
In `@framework/configstore/clientconfig.go`:
- Around line 585-592: The code currently hashes key.Aliases whenever the map is
non-nil, causing nil and empty maps to produce different config_hash values;
update the Hash Aliases block in the function that builds the config hash so it
only marshals and writes aliases when the map is non-nil AND contains entries
(e.g., check len(key.Aliases) > 0), keeping the existing sonic.Marshal error
handling and hash.Write usage (refer to the key.Aliases, sonic.Marshal, and
hash.Write symbols).

In `@framework/streaming/types.go`:
- Around line 399-409: The code only assigns imageResp.Model when imageResp is
nil, so existing p.Data.ImageGenerationOutput can keep a stale provider model;
after obtaining or creating imageResp (the variable assigned from
p.Data.ImageGenerationOutput), always overwrite its Model with p.RequestedModel
when p.RequestedModel is non-empty (and likewise ensure ID is set from
p.RequestID if needed) — i.e., move or add the assignment imageResp.Model =
p.RequestedModel (guarded by p.RequestedModel != "") outside the imageResp ==
nil branch so both newly created and existing imageResp objects get the
requested model.

---

Duplicate comments:
In `@core/providers/azure/azure.go`:
- Around line 428-439: The calls to providerUtils.EnrichError after
provider.completeRequest are passing nil for the error body, which prevents
sendBackRawResponse and request-level raw overrides from receiving the returned
error payload; update the EnrichError invocations in the completion handling
paths (e.g., the block after provider.completeRequest in azure.go and the
similar blocks around the other occurrences) to pass the copied error body
returned by completeRequest instead of nil (use the local variable holding the
returned error body), ensuring provider.sendBackRawResponse and any
request-level raw payload overrides are honored.

In `@core/providers/bedrock/bedrock.go`:
- Around line 3424-3431: The getModelPath function currently constructs the path
without escaping the plain model branch and may incorrectly prepend
key.BedrockKeyConfig.ARN when the provided model is already an ARN; update
getModelPath to (1) detect if model is an ARN (e.g., starts with "arn:") and
treat it as the scoped identifier without prepending the key ARN, (2) otherwise
url.PathEscape the plain model before inserting into the "%s/%s" format, and (3)
when using key.BedrockKeyConfig.ARN, url.PathEscape the combined ARN/model
identifier instead of inserting raw values; modify the logic in getModelPath to
choose between escaped plain model, escaped model ARN, or escaped combined key
ARN + model accordingly.
- Around line 2160-2163: The error messages use uppercase "S3 ..." and
punctuation; update the s3-related error strings to be lowercase and without
trailing punctuation: change the provider.logger.Error call and the
providerUtils.NewProviderAPIError message in the s3 upload handling (the
provider.logger.Error("s3 upload failed: %d", resp.StatusCode) and
NewProviderAPIError(fmt.Sprintf("s3 upload failed: %s", ...))) and apply the
same lowercase, no-trailing-punctuation fix to the other s3 error messages in
the file (the similar provider.logger.Error / NewProviderAPIError occurrences
for s3 head, s3 delete, etc.).

In `@core/providers/bedrock/invoke.go`:
- Around line 451-460: The model selection currently prefers resp.Model which
may contain an alias and cause misrouting; update the logic in the model
resolution block (the local variable model, resp.ExtraFields, and related code
path that later calls toBedrockInvokeAnthropicResponse) to prefer
resp.ExtraFields.ResolvedModelUsed first, then
resp.ExtraFields.OriginalModelRequested, and finally fall back to resp.Model;
ensure any downstream family-dispatch decisions use this resolved model variable
instead of resp.Model directly.
- Around line 628-643: The current logic reads model only from
resp.Response.ExtraFields, but chunk-level model info lives on resp.ExtraFields
for every streaming event; update the model selection to first check
resp.ExtraFields.ResolvedModelUsed then resp.ExtraFields.OriginalModelRequested,
then fall back to resp.Response.Model and resp.Response.ExtraFields as before
(modify the block that sets model, referencing the resp variable,
resp.ExtraFields, resp.Response, ExtraFields.ResolvedModelUsed, and
ExtraFields.OriginalModelRequested) so early delta chunks correctly capture the
resolved model and are routed to the right provider.

In `@core/providers/huggingface/huggingface.go`:
- Around line 870-873: The decode-error path after calling
response.ToBifrostTranscriptionResponse currently passes jsonData into
providerUtils.EnrichError which can expose raw audio when
isHFInferenceAudioRequest is true; update this branch to pass nil as the request
body when isHFInferenceAudioRequest is true (otherwise keep jsonData), i.e. call
providerUtils.EnrichError(ctx, providerUtils.NewBifrostOperationError(...), nil,
responseBody, provider.sendBackRawRequest, provider.sendBackRawResponse) when
isHFInferenceAudioRequest is true, referencing
response.ToBifrostTranscriptionResponse, isHFInferenceAudioRequest, jsonData,
responseBody and provider.sendBackRawRequest/sendBackRawResponse to locate the
change.

In `@core/providers/replicate/replicate.go`:
- Around line 252-255: Replace the generic operation error returned when polling
expires with a timeout-specific error so timeout handling/metrics remain
accurate: in the pollCtx.Done() branch (the block returning
providerResponseHeaders and calling providerUtils.NewBifrostOperationError),
change the call to providerUtils.NewBifrostTimeoutError(...) and pass the same
schemas.ErrProviderRequestTimedOut and the fmt.Errorf("prediction polling timed
out after %d seconds", timeoutSeconds) message so the semantics remain but the
error is classified as a timeout.

In `@plugins/maxim/main.go`:
- Around line 614-619: The model tag is written using resolvedModel even when
it's empty, causing empty tags; update the logic around
logger.AddTagToGeneration and logger.AddTagToTrace to use originalModel as a
fallback: when resolvedModel is non-empty, set the "model" tag to resolvedModel
(and only add "alias" if originalModel is non-empty and different from
resolvedModel); when resolvedModel is empty but originalModel is present, set
the "model" tag to originalModel and do not add an "alias". Adjust the code that
references generationID, traceID, resolvedModel, originalModel,
logger.AddTagToGeneration and logger.AddTagToTrace accordingly.

---

Nitpick comments:
In `@core/providers/utils/utils.go`:
- Around line 2087-2109: The CreateBifrostTextCompletionChunkResponse function
has an unused requestType parameter; remove the requestType parameter from the
function signature and body in utils.go (function
CreateBifrostTextCompletionChunkResponse) and update all call sites that pass
schemas.TextCompletionStreamRequest (e.g., the OpenAI and Replicate provider
call sites) to stop passing that extra argument; ensure compile by adjusting any
imports or tests referencing the old signature.

In `@core/schemas/bifrost.go`:
- Around line 801-1006: PopulateExtraFields duplicates the response-type
dispatch from GetExtraFields; extract the common selector into a single private
helper (e.g., selectExtraFieldsFromResponse or responseExtraFields) that returns
a pointer/reference to the appropriate ExtraFields struct for the first non-nil
concrete response (or nil if none), then update PopulateExtraFields to call that
helper and set RequestType/Provider/OriginalModelRequested/ResolvedModelUsed
(compute resolvedModel same as before). Also replace the duplicated switch in
GetExtraFields to call the same helper so both code paths use the single
selector; keep nil-safety and preserve existing behavior for all response types
(use the existing type identifiers like BifrostResponse, PopulateExtraFields,
GetExtraFields, and the various XxxResponse fields to locate code).

In `@docs/providers/supported-providers/azure.mdx`:
- Around line 171-179: The "Configuration Details" list mixes fields that belong
to the top-level provider key and fields that belong to per-key credentials
(azure_key_config), causing config-shape confusion; split this single list into
two sublists labeled "Top-level key fields" and "azure_key_config fields" and
move each symbol into the correct list (e.g., keep provider-level settings like
aliases, allowed_models and api_version under top-level key fields if
applicable, and place credential-specific entries like endpoint, client_id,
client_secret, tenant_id, scopes under azure_key_config), update the section
headings and brief notes to clarify which fields are required vs optional and
which object each field belongs to.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 80addae7-19c2-41c4-b068-ea5273eda736

📥 Commits

Reviewing files that changed from the base of the PR and between 327339d and 612a1f1.

📒 Files selected for processing (177)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/image_edit.go
  • core/internal/llmtests/image_generation.go
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/speech_synthesis.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/internal/llmtests/transcription_stream.go
  • core/internal/llmtests/video.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/batch.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/anthropic/models.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/text.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/utils.go
  • core/providers/azure/azure.go
  • core/providers/azure/files.go
  • core/providers/azure/models.go
  • core/providers/azure/utils.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/invoke.go
  • core/providers/bedrock/models.go
  • core/providers/bedrock/rerank_test.go
  • core/providers/bedrock/s3.go
  • core/providers/bedrock/signer.go
  • core/providers/bedrock/text.go
  • core/providers/cerebras/cerebras.go
  • core/providers/cohere/chat.go
  • core/providers/cohere/cohere.go
  • core/providers/cohere/errors.go
  • core/providers/cohere/models.go
  • core/providers/elevenlabs/elevenlabs.go
  • core/providers/elevenlabs/errors.go
  • core/providers/elevenlabs/models.go
  • core/providers/gemini/batch.go
  • core/providers/gemini/errors.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/models.go
  • core/providers/gemini/videos.go
  • core/providers/groq/groq.go
  • core/providers/huggingface/errors.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/responses.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/huggingface/utils.go
  • core/providers/mistral/mistral.go
  • core/providers/mistral/models.go
  • core/providers/mistral/transcription.go
  • core/providers/nebius/errors.go
  • core/providers/nebius/nebius.go
  • core/providers/ollama/ollama.go
  • core/providers/openai/batch.go
  • core/providers/openai/errors.go
  • core/providers/openai/errors_test.go
  • core/providers/openai/files.go
  • core/providers/openai/images.go
  • core/providers/openai/large_payload.go
  • core/providers/openai/models.go
  • core/providers/openai/openai.go
  • core/providers/openai/transcription.go
  • core/providers/openai/videos.go
  • core/providers/openrouter/openrouter.go
  • core/providers/parasail/parasail.go
  • core/providers/perplexity/chat.go
  • core/providers/perplexity/perplexity.go
  • core/providers/replicate/errors.go
  • core/providers/replicate/files.go
  • core/providers/replicate/images.go
  • core/providers/replicate/models.go
  • core/providers/replicate/replicate.go
  • core/providers/replicate/utils.go
  • core/providers/replicate/videos.go
  • core/providers/runway/errors.go
  • core/providers/runway/runway.go
  • core/providers/runway/videos.go
  • core/providers/sgl/sgl.go
  • core/providers/utils/large_response.go
  • core/providers/utils/make_request_test.go
  • core/providers/utils/models.go
  • core/providers/utils/utils.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/errors.go
  • core/providers/vertex/models.go
  • core/providers/vertex/rerank.go
  • core/providers/vertex/rerank_test.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/utils.go
  • core/providers/vllm/vllm.go
  • core/providers/xai/errors.go
  • core/providers/xai/xai.go
  • core/schemas/account.go
  • core/schemas/bifrost.go
  • core/schemas/chatcompletions.go
  • core/schemas/models.go
  • core/schemas/models_test.go
  • core/schemas/tracer.go
  • core/utils.go
  • docs/features/litellm-compat.mdx
  • docs/openapi/schemas/management/providers.yaml
  • docs/providers/provider-routing.mdx
  • docs/providers/supported-providers/azure.mdx
  • docs/providers/supported-providers/bedrock.mdx
  • docs/providers/supported-providers/replicate.mdx
  • docs/providers/supported-providers/vertex.mdx
  • docs/quickstart/gateway/provider-configuration.mdx
  • docs/quickstart/go-sdk/provider-configuration.mdx
  • framework/configstore/clientconfig.go
  • framework/configstore/encryption_test.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/encryption_test.go
  • framework/configstore/tables/key.go
  • framework/configstore/tables/virtualkey.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/sync.go
  • framework/streaming/accumulator_test.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/streaming/types.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/governance/model_provider_governance_test.go
  • plugins/governance/store.go
  • plugins/litellmcompat/texttochat.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/logging/utils.go
  • plugins/logging/writer.go
  • plugins/maxim/main.go
  • plugins/mocker/main.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/telemetry/main.go
  • tests/governance/e2e_test.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/provider_keys.go
  • transports/bifrost-http/handlers/wsresponses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/cursor.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/config_test.go
  • transports/config.schema.json
  • transports/schema_test/config_schema_test.go
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/columns.tsx
  • ui/app/workspace/providers/dialogs/addNewKeySheet.tsx
  • ui/app/workspace/providers/fragments/apiKeysFormFragment.tsx
  • ui/components/filters/filterPopover.tsx
  • ui/components/ui/headersTable.tsx
  • ui/lib/constants/config.ts
  • ui/lib/schemas/providerForm.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
  • ui/lib/utils/validation.ts
💤 Files with no reviewable changes (19)
  • core/providers/replicate/videos.go
  • core/providers/bedrock/chat.go
  • framework/modelcatalog/sync.go
  • core/providers/anthropic/text.go
  • core/providers/cerebras/cerebras.go
  • core/providers/huggingface/transcription.go
  • core/providers/vllm/utils.go
  • core/providers/replicate/errors.go
  • core/providers/huggingface/speech.go
  • core/providers/groq/groq.go
  • core/providers/replicate/images.go
  • core/providers/vertex/rerank_test.go
  • core/providers/cohere/chat.go
  • framework/configstore/tables/virtualkey.go
  • core/providers/replicate/files.go
  • core/providers/vertex/embedding.go
  • core/providers/perplexity/chat.go
  • core/providers/huggingface/responses.go
  • core/providers/parasail/parasail.go
✅ Files skipped from review due to trivial changes (25)
  • docs/providers/supported-providers/replicate.mdx
  • core/internal/llmtests/speech_synthesis.go
  • framework/streaming/accumulator_test.go
  • core/schemas/models_test.go
  • core/internal/llmtests/transcription_stream.go
  • docs/features/litellm-compat.mdx
  • framework/modelcatalog/pricing_test.go
  • plugins/mocker/main.go
  • core/providers/huggingface/utils.go
  • core/providers/mistral/transcription.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/providers/openai/images.go
  • plugins/governance/store.go
  • docs/providers/supported-providers/bedrock.mdx
  • docs/providers/provider-routing.mdx
  • core/providers/anthropic/responses.go
  • core/providers/perplexity/perplexity.go
  • docs/providers/supported-providers/vertex.mdx
  • core/providers/anthropic/utils.go
  • docs/quickstart/go-sdk/provider-configuration.mdx
  • core/providers/cohere/cohere.go
  • core/providers/openai/models.go
  • docs/quickstart/gateway/provider-configuration.mdx
  • core/providers/gemini/gemini.go
  • core/bifrost.go
🚧 Files skipped from review as they are similar to previous changes (57)
  • core/internal/llmtests/image_edit.go
  • core/schemas/tracer.go
  • core/internal/llmtests/image_variation.go
  • core/providers/bedrock/signer.go
  • core/providers/anthropic/chat.go
  • core/providers/xai/xai.go
  • core/providers/sgl/sgl.go
  • plugins/semanticcache/plugin_integration_test.go
  • core/internal/llmtests/response_validation.go
  • framework/logstore/store.go
  • core/providers/openai/videos.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/ollama/ollama.go
  • core/providers/anthropic/errors.go
  • framework/streaming/chat.go
  • core/providers/openai/errors_test.go
  • core/providers/azure/utils.go
  • core/providers/azure/files.go
  • framework/configstore/rdb.go
  • core/providers/utils/large_response.go
  • core/providers/mistral/models.go
  • core/providers/gemini/errors.go
  • core/providers/runway/errors.go
  • core/providers/utils/make_request_test.go
  • core/providers/bedrock/text.go
  • core/providers/elevenlabs/errors.go
  • plugins/logging/writer.go
  • core/providers/nebius/errors.go
  • framework/configstore/encryption_test.go
  • core/providers/xai/errors.go
  • plugins/governance/model_provider_governance_test.go
  • core/providers/openrouter/openrouter.go
  • core/providers/cohere/errors.go
  • core/providers/bedrock/rerank_test.go
  • framework/logstore/migrations.go
  • core/providers/huggingface/errors.go
  • core/providers/openai/large_payload.go
  • framework/configstore/tables/encryption_test.go
  • plugins/governance/main.go
  • core/providers/openai/files.go
  • core/providers/bedrock/models.go
  • core/providers/openai/errors.go
  • core/providers/mistral/mistral.go
  • core/providers/vertex/errors.go
  • core/providers/gemini/models.go
  • core/providers/cohere/models.go
  • core/providers/anthropic/types.go
  • core/internal/llmtests/account.go
  • core/providers/anthropic/models.go
  • core/providers/nebius/nebius.go
  • core/schemas/account.go
  • framework/logstore/rdb.go
  • core/providers/vertex/models.go
  • plugins/logging/main.go
  • framework/configstore/migrations.go
  • core/providers/anthropic/anthropic.go
  • core/providers/utils/models.go

Comment thread core/providers/azure/azure.go
Comment thread core/providers/elevenlabs/models.go
Comment thread core/providers/runway/runway.go Outdated
Comment thread framework/configstore/tables/key.go
Comment thread framework/streaming/transcription.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from 612a1f1 to 15283ab on April 2, 2026 07:14
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
core/providers/utils/make_request_test.go (1)

297-314: ⚠️ Potential issue | 🟠 Major

Remove the impossible provider assertion from this constructor test.

NewBifrostTimeoutError no longer populates ExtraFields.Provider; that happens later in the core enrichment path. The "openai" expectation now asserts a state this helper never sets, so the test will fail even when the constructor is correct. If you still want coverage for provider enrichment, add a separate test around the enrichment path instead.

✂️ Suggested test update
 func TestNewBifrostTimeoutError(t *testing.T) {
 	err := NewBifrostTimeoutError("test timeout", context.DeadlineExceeded)
@@
-	if err.ExtraFields.Provider != "openai" {
-		t.Fatalf("expected provider openai, got %s", err.ExtraFields.Provider)
-	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/make_request_test.go` around lines 297 - 314, The test
TestNewBifrostTimeoutError is asserting that NewBifrostTimeoutError populates
ExtraFields.Provider with "openai", but NewBifrostTimeoutError no longer sets
provider; remove the impossible assertion (the check comparing
err.ExtraFields.Provider to "openai") from the test so it only verifies fields
the constructor actually sets, or alternatively add a new separate test that
exercises the enrichment path which populates ExtraFields.Provider; locate this
change in the TestNewBifrostTimeoutError function and remove the provider
expectation assertion.
core/schemas/chatcompletions.go (1)

62-72: ⚠️ Potential issue | 🟡 Minor

Preserve RawRequest in chat→text conversions.

ToTextCompletionResponse() still drops cr.ExtraFields.RawRequest in every branch, so send_back_raw_request disappears whenever a chat-backed response is surfaced through the text-completions compatibility path.

Suggested fix
 ExtraFields: BifrostResponseExtraFields{
 	RequestType:             TextCompletionRequest,
 	ChunkIndex:              cr.ExtraFields.ChunkIndex,
 	Provider:                cr.ExtraFields.Provider,
 	OriginalModelRequested:  cr.ExtraFields.OriginalModelRequested,
 	ResolvedModelUsed:       cr.ExtraFields.ResolvedModelUsed,
 	Latency:                 cr.ExtraFields.Latency,
+	RawRequest:              cr.ExtraFields.RawRequest,
 	RawResponse:             cr.ExtraFields.RawResponse,
 	CacheDebug:              cr.ExtraFields.CacheDebug,
 	ProviderResponseHeaders: cr.ExtraFields.ProviderResponseHeaders,
 },

Also applies to: 96-106, 133-143, 154-163

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/chatcompletions.go` around lines 62 - 72, The conversion
function ToTextCompletionResponse is dropping cr.ExtraFields.RawRequest; update
every branch where BifrostResponseExtraFields is constructed (inside
ToTextCompletionResponse) to copy RawRequest: cr.ExtraFields.RawRequest into the
new ExtraFields so send_back_raw_request is preserved; locate the
BifrostResponseExtraFields constructions in ToTextCompletionResponse and add the
RawRequest field alongside RawResponse/CacheDebug/ProviderResponseHeaders (same
change also required in the other similar blocks noted around lines 96-106,
133-143, 154-163).
framework/configstore/clientconfig.go (1)

585-608: ⚠️ Potential issue | 🟠 Major

Canonicalize Aliases before hashing to ensure deterministic config_hash.

sonic.Marshal uses non-deterministic map iteration order by default, making config_hash unstable across runs. Additionally, the condition key.Aliases != nil treats empty maps {} differently from nil, which conflicts with the migration logic that considers them equivalent. Use len(key.Aliases) > 0 to match the slice pattern and migration semantics, and sort keys before marshaling.

Suggested fix
 	// Hash Aliases
-	if key.Aliases != nil {
-		data, err := sonic.Marshal(key.Aliases)
+	if len(key.Aliases) > 0 {
+		aliasKeys := make([]string, 0, len(key.Aliases))
+		for k := range key.Aliases {
+			aliasKeys = append(aliasKeys, k)
+		}
+		sort.Strings(aliasKeys)
+
+		aliases := make([][2]string, 0, len(aliasKeys))
+		for _, k := range aliasKeys {
+			aliases = append(aliases, [2]string{k, key.Aliases[k]})
+		}
+
+		data, err := sonic.Marshal(aliases)
 		if err != nil {
 			return "", err
 		}
 		hash.Write(data)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/clientconfig.go` around lines 585 - 608, The hashing
code uses sonic.Marshal on key.Aliases which yields non-deterministic output and
treats nil vs empty map differently; change the check from key.Aliases != nil to
len(key.Aliases) > 0, and canonicalize Aliases before hashing by extracting
keys, sorting them, building a deterministically ordered representation (e.g.,
slice of {key,value} pairs or an ordered map), then marshal that deterministic
structure (instead of raw key.Aliases) with sonic.Marshal so config_hash is
stable across runs.
framework/streaming/types.go (1)

399-409: ⚠️ Potential issue | 🟡 Minor

Backfill streamed image Model when the accumulator omits it.

ImageGenerationOutput can be non-nil while still carrying Model == "", because the accumulator only copies model metadata when it was present on the chunks. This branch currently falls back to RequestedModel only when imageResp == nil, so the final streamed image response can lose its top-level model.

🛠️ Suggested fix
 		if imageResp == nil {
 			imageResp = &schemas.BifrostImageGenerationResponse{
 				Data: make([]schemas.ImageData, 0),
 			}
 			if p.RequestID != "" {
 				imageResp.ID = p.RequestID
 			}
-			if p.RequestedModel != "" {
-				imageResp.Model = p.RequestedModel
-			}
+		}
+		if imageResp.Model == "" && p.RequestedModel != "" {
+			imageResp.Model = p.RequestedModel
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/streaming/types.go` around lines 399 - 409, The current logic only
backfills image metadata when p.Data.ImageGenerationOutput is nil, so when
imageResp exists but imageResp.Model == "" the final response can lose the
top-level model; update the handling for p.Data.ImageGenerationOutput (variable
imageResp) to check and populate missing fields: if imageResp.Model == "" set
imageResp.Model = p.RequestedModel, and if imageResp.ID == "" set imageResp.ID =
p.RequestID (in addition to the existing nil-branch behavior), so both existing
and newly created imageResp get the requested model and request ID filled in.
core/providers/azure/azure.go (1)

223-269: ⚠️ Potential issue | 🔴 Critical

Don't infer Azure Anthropic routing from the resolved deployment ID.

After this refactor, request.Model / model in the Azure provider is already the alias-resolved deployment name. For Anthropic-on-Azure configs with custom deployment names or aliases, schemas.IsAnthropicModel(request.Model) will return false, so chat/responses requests fall onto the OpenAI URL/auth path instead of /anthropic/v1/messages with Anthropic headers. Please preserve the original model family (or pass an explicit isAnthropic flag from pre-resolution logic) and branch on that instead of the resolved deployment string. This regresses the alias flow this PR is adding.

Based on learnings: model alias resolution occurs centrally before provider dispatch, so request.Model in Azure handlers is already the resolved deployment identifier.

Also applies to: 519-542, 609-662, 692-711, 777-812

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/azure/azure.go` around lines 223 - 269, The Azure provider is
incorrectly detecting Anthropic by inspecting the resolved `model`
(alias/deployment name) via `schemas.IsAnthropicModel(model)`, which breaks
alias flows; change the logic to rely on an explicit Anthropic indicator carried
from pre-resolution (e.g., a boolean parameter named `isAnthropic` passed into
the Azure request path) instead of calling `schemas.IsAnthropicModel(model)`
here, update uses in the request build/auth flow (the block that currently calls
`schemas.IsAnthropicModel(model)`, the `getAzureAuthHeaders` call usage, and the
Anthropic-specific header/url branches) to use that explicit flag, and propagate
that flag from the caller sites that perform model-family resolution so all
references (including the other similar blocks you noted) branch on the original
model family rather than the resolved deployment string.
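
A minimal, runnable sketch of routing on an explicit flag rather than the resolved deployment name; the non-Anthropic path below is illustrative and not Azure's exact route.

package main

import (
	"fmt"
	"net/url"
)

// chatPath picks the Azure route from a flag computed from the original
// model family before alias resolution, not from the resolved deployment.
func chatPath(deployment string, isAnthropic bool) string {
	if isAnthropic {
		return "/anthropic/v1/messages"
	}
	return "/openai/deployments/" + url.PathEscape(deployment) + "/chat/completions"
}

func main() {
	fmt.Println(chatPath("my-claude-deployment", true)) // /anthropic/v1/messages
	fmt.Println(chatPath("my-gpt-deployment", false))
}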
♻️ Duplicate comments (4)
core/providers/utils/models.go (1)

327-348: ⚠️ Potential issue | 🟡 Minor

Sort wildcard alias backfills before appending them.

Case B still iterates p.Aliases directly, so the backfilled tail of /models can reshuffle between identical calls whenever wildcard allowlists rely on aliases. Extract the alias keys, sort them, then iterate the sorted slice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/models.go` around lines 327 - 348, The loop over
p.Aliases in the wildcard allowlist branch can produce non-deterministic
ordering; extract the alias keys into a slice, sort that slice (e.g., using
sort.Strings), then iterate the sorted keys instead of ranging p.Aliases
directly so backfilled entries appended to result are deterministic; preserve
the same blacklist check using matches(aliasKey, bl, p.MatchFns) and append the
same schemas.Model (ID: string(p.ProviderKey)+"/"+aliasKey, Name:
schemas.Ptr(ToDisplayName(aliasKey)), Alias: schemas.Ptr(providerID)).
core/providers/replicate/models.go (1)

66-72: ⚠️ Potential issue | 🟠 Major

Don't backfill from a partial Replicate page.

NextPageToken is preserved here, so pipeline.BackfillModels(included) is still running on a single provider page. That can inject allowlisted/aliased models from later pages into the current response and then duplicate them when the real page arrives. Only backfill once the provider listing is complete.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/replicate/models.go` around lines 66 - 72, The code calls
pipeline.BackfillModels(included) even when deploymentsResponse.Next is non-nil,
causing backfilled models from future pages to be inserted into a partial page;
change the logic so BackfillModels is only executed when the provider page
sequence is complete (i.e., when deploymentsResponse.Next is nil /
bifrostResponse.NextPageToken is empty). Concretely, wrap or move the call to
pipeline.BackfillModels(included) so it runs only if deploymentsResponse.Next ==
nil (or bifrostResponse.NextPageToken == ""), keeping the existing assignments
to bifrostResponse.NextPageToken and bifrostResponse.Data otherwise.
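
A toy, self-contained sketch of the ordering the prompt describes: accumulate every page first and backfill exactly once, only when there is no next page token.

package main

import "fmt"

// backfill appends alias-backed entries; running it per page would insert
// them into a partial listing and duplicate them when later pages arrive.
func backfill(models []string) []string { return append(models, "aliased-model") }

// listPage is a stand-in for one provider page; an empty next token means
// the listing is complete.
func listPage(token string) (page []string, next string) {
	if token == "" {
		return []string{"model-a"}, "page-2"
	}
	return []string{"model-b"}, ""
}

func main() {
	var all []string
	token := ""
	for {
		page, next := listPage(token)
		all = append(all, page...)
		if next == "" { // listing drained: safe to backfill now
			all = backfill(all)
			break
		}
		token = next
	}
	fmt.Println(all) // [model-a model-b aliased-model]
}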
core/providers/bedrock/bedrock.go (1)

3424-3431: ⚠️ Potential issue | 🟠 Major

Escape the plain-model branch in getModelPath too.

Only the key.BedrockKeyConfig.ARN branch escapes the model with url.PathEscape. With the alias-resolution work in this stack, request.Model can already be a resolved ARN or other slash-containing identifier, so fmt.Sprintf("%s/%s", model, basePath) still produces malformed Bedrock routes for valid resolved models. Based on learnings: alias resolution happens centrally in core before provider dispatch, so request.Model may already be the resolved Bedrock identifier here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 3424 - 3431, The getModelPath
function builds path := fmt.Sprintf("%s/%s", model, basePath) but only escapes
the ARN branch; escape the plain-model branch as well to handle cases where
request.Model is already a resolved ARN or contains slashes. Update getModelPath
to call url.PathEscape on the model when constructing the default path (same way
as in the key.BedrockKeyConfig.ARN branch) so the returned path variable is
always properly URL-encoded regardless of whether key.BedrockKeyConfig.ARN is
present.
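
A runnable illustration of the escaping problem; the ARN below is a made-up example.

package main

import (
	"fmt"
	"net/url"
)

func main() {
	model := "arn:aws:bedrock:us-east-1:123456789012:inference-profile/example" // hypothetical resolved ARN
	basePath := "converse"

	// Unescaped, the slash inside the identifier adds an extra path segment.
	fmt.Println(model + "/" + basePath)
	// Escaped, the identifier stays a single segment of the route.
	fmt.Println(url.PathEscape(model) + "/" + basePath)
}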
framework/configstore/migrations.go (1)

3588-3615: ⚠️ Potential issue | 🟠 Major

Keep the alias copy and hash-recompute predicates aligned.

The UPDATE at Lines 3588-3590 still copies legacy JSON equal to '{}', but the recompute query at Lines 3613-3615 skips aliases_json = '{}'. Keys migrated from an empty-object deployment map can retain a stale config_hash after upgrade.

🩹 Proposed fix
 				if err := tx.Exec(
 					"UPDATE config_keys SET aliases_json = " + col +
-						" WHERE aliases_json IS NULL AND " + col + " IS NOT NULL AND " + col + " != ''",
+						" WHERE aliases_json IS NULL AND " + col + " IS NOT NULL AND " + col + " != '' AND " + col + " != '{}'",
 				).Error; err != nil {
 					return err
 				}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 3588 - 3615, The recompute
step filters out aliases_json == '{}' while the earlier UPDATE can copy '{}'
into aliases_json, leaving those rows with stale config_hash; update the
tx.Where clause used to populate affectedKeys in migrations.go (the query that
currently reads "aliases_json IS NOT NULL AND aliases_json != ? AND aliases_json
!= ?") to only exclude NULL and empty string (remove the != '{}' predicate) so
the recompute includes rows where aliases_json == '{}', keeping the alias copy
and hash-recompute predicates aligned.
🧹 Nitpick comments (2)
core/providers/openai/files.go (2)

86-104: Avoid setting ExtraFields.RawRequest for retrieve (GET) responses.

On lines 102-104, retrieve responses still attach raw request payloads. OpenAI retrieve/GET endpoints should skip RawRequest even when sendBackRawRequest is enabled.

Suggested change
 func (r *OpenAIFileResponse) ToBifrostFileRetrieveResponse(providerName schemas.ModelProvider, latency time.Duration, sendBackRawRequest bool, sendBackRawResponse bool, rawRequest interface{}, rawResponse interface{}) *schemas.BifrostFileRetrieveResponse {
@@
-	if sendBackRawRequest {
-		resp.ExtraFields.RawRequest = rawRequest
-	}
+	// Intentionally do not populate RawRequest for retrieve/GET endpoints.

Based on learnings: for all GET/retrieve endpoints in core/providers/openai, do not populate ExtraFields.RawRequest when sendBackRawRequest is enabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/files.go` around lines 86 - 104, The
ToBifrostFileRetrieveResponse method is incorrectly attaching
ExtraFields.RawRequest for GET/retrieve responses; remove or disable the block
that sets resp.ExtraFields.RawRequest (the if sendBackRawRequest {
resp.ExtraFields.RawRequest = rawRequest } logic) so that RawRequest is never
populated in OpenAI retrieve responses while leaving other fields (e.g.,
RawResponse, Latency) intact; update the function
OpenAIFileResponse.ToBifrostFileRetrieveResponse accordingly.

86-99: Remove stale providerName parameter from retrieve converter.

providerName is no longer used after Provider was removed from ExtraFields; dropping it will keep this API consistent with ToBifrostFileUploadResponse and reduce call-site noise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/files.go` around lines 86 - 99, The
ToBifrostFileRetrieveResponse function still accepts an unused providerName
parameter; remove providerName from the function signature of
ToBifrostFileRetrieveResponse in core/providers/openai/files.go, delete any
references to that parameter inside the function body (none expected), and
update all call sites to stop passing providerName so the signature matches
ToBifrostFileUploadResponse and avoids unused parameter warnings; ensure the
function name ToBifrostFileRetrieveResponse and its return type remain
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/anthropic/models.go`:
- Around line 23-27: The converter currently sets bifrostResponse.NextPageToken
when response.HasMore is true, which causes pipeline.BackfillModels(included) to
run on a partial Anthropic listing; change the logic so you only emit
NextPageToken after the provider listing is complete (i.e., set
bifrostResponse.NextPageToken only when response.HasMore is false and
response.LastID != nil), and apply the same fix to the other occurrence (lines
referenced around the second occurrence) so BackfillModels(included) is deferred
until the full listing is drained.

In `@core/providers/bedrock/bedrock.go`:
- Around line 2657-2660: The role_arn validation is checking/using the wrong key
field: change the code that assigns and validates roleArn to read from
key.BedrockKeyConfig.RoleARN (not key.BedrockKeyConfig.ARN) so the required-role
check and any value sent to Bedrock are the IAM role ARN, and update the
fallback assignment that currently uses ARN to use RoleARN; keep the same error
branch (provider.logger.Error / NewBifrostOperationError) but ensure it
complains about missing RoleARN and stays consistent with the signing callsites
(which reference key.BedrockKeyConfig.RoleARN).
- Around line 2069-2071: The code path currently lets ensureBedrockKeyConfig
create an empty BedrockKeyConfig and then calls signAWSRequest which falls back
to config.LoadDefaultConfig, implicitly switching auth to ambient AWS
credentials; update the logic in ensureBedrockKeyConfig and all
FileList/Retrieve/Delete/Content and batch helper usages so that if key.Value is
present but key.BedrockKeyConfig is nil/empty you return a configuration error
instead of constructing an empty config, and modify signAWSRequest callers to
explicitly require either AWS credentials (access key/secret) or an explicit
role config before attempting any call; in short, refuse to auto-fallback to
LoadDefaultConfig—validate key.BedrockKeyConfig and error out when missing so
auth mode remains the selected provider key.
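
A minimal sketch of the guard, with stub types standing in for the real key structs; the credential field names here are assumptions.

package main

import (
	"errors"
	"fmt"
)

// Stub shapes only; the real structs live in core/schemas.
type BedrockKeyConfig struct {
	AccessKey string
	SecretKey string
	RoleARN   string
}

type Key struct {
	Value            string
	BedrockKeyConfig *BedrockKeyConfig
}

// validateBedrockKey refuses to fall back to ambient AWS credentials when a
// provider key is supplied without a usable Bedrock config.
func validateBedrockKey(key Key) error {
	if key.BedrockKeyConfig == nil {
		return errors.New("bedrock key is missing its key config; refusing to use ambient AWS credentials")
	}
	cfg := key.BedrockKeyConfig
	if cfg.AccessKey == "" && cfg.RoleARN == "" {
		return errors.New("bedrock key config needs explicit AWS credentials or a role ARN")
	}
	return nil
}

func main() {
	fmt.Println(validateBedrockKey(Key{Value: "api-key"}))
}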

In `@framework/streaming/types.go`:
- Around line 272-280: The ExtraFields currently store the pointer p.RawRequest
itself, causing a *interface{} to be carried and making pointer-to-nil appear as
present; change each assignment to dereference the pointer after a nil check
(e.g., if p.RawRequest != nil {
resp.TextCompletionResponse.ExtraFields.RawRequest = *p.RawRequest } else {
resp.TextCompletionResponse.ExtraFields.RawRequest = nil }) so the actual
payload or nil is stored. Apply the same pattern wherever ExtraFields.RawRequest
is set (e.g., resp.TextCompletionResponse.ExtraFields,
resp.ChatResponse.ExtraFields, resp.ResponsesResponse.ExtraFields,
resp.SpeechResponse.ExtraFields, resp.TranscriptionResponse.ExtraFields,
resp.ImageResponse.ExtraFields) and ensure you reference p.RawRequest and
schemas.BifrostResponseExtraFields when making the change.
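
A runnable illustration of why storing the pointer itself misbehaves: a *interface{} wrapped in an interface is never nil, even when the payload is absent (as with p.RawRequest above).

package main

import "fmt"

func main() {
	var rawRequest *interface{} // pointer to the raw payload, nil here

	// Storing the pointer directly makes an absent payload look present.
	var wrong interface{} = rawRequest
	fmt.Println(wrong == nil) // false: it holds a typed nil pointer

	// Dereference after a nil check so the field carries the payload or nil.
	var right interface{}
	if rawRequest != nil {
		right = *rawRequest
	}
	fmt.Println(right == nil) // true
}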

---

Outside diff comments:
In `@core/providers/azure/azure.go`:
- Around line 223-269: The Azure provider is incorrectly detecting Anthropic by
inspecting the resolved `model` (alias/deployment name) via
`schemas.IsAnthropicModel(model)`, which breaks alias flows; change the logic to
rely on an explicit Anthropic indicator carried from pre-resolution (e.g., a
boolean parameter named `isAnthropic` passed into the Azure request path)
instead of calling `schemas.IsAnthropicModel(model)` here, update uses in the
request build/auth flow (the block that currently calls
`schemas.IsAnthropicModel(model)`, the `getAzureAuthHeaders` call usage, and the
Anthropic-specific header/url branches) to use that explicit flag, and propagate
that flag from the caller sites that perform model-family resolution so all
references (including the other similar blocks you noted) branch on the original
model family rather than the resolved deployment string.

In `@core/providers/utils/make_request_test.go`:
- Around line 297-314: The test TestNewBifrostTimeoutError is asserting that
NewBifrostTimeoutError populates ExtraFields.Provider with "openai", but
NewBifrostTimeoutError no longer sets provider; remove the impossible assertion
(the check comparing err.ExtraFields.Provider to "openai") from the test so it
only verifies fields the constructor actually sets, or alternatively add a new
separate test that exercises the enrichment path which populates
ExtraFields.Provider; locate this change in the TestNewBifrostTimeoutError
function and remove the provider expectation assertion.

In `@core/schemas/chatcompletions.go`:
- Around line 62-72: The conversion function ToTextCompletionResponse is
dropping cr.ExtraFields.RawRequest; update every branch where
BifrostResponseExtraFields is constructed (inside ToTextCompletionResponse) to
copy RawRequest: cr.ExtraFields.RawRequest into the new ExtraFields so
send_back_raw_request is preserved; locate the BifrostResponseExtraFields
constructions in ToTextCompletionResponse and add the RawRequest field alongside
RawResponse/CacheDebug/ProviderResponseHeaders (same change also required in the
other similar blocks noted around lines 96-106, 133-143, 154-163).

In `@framework/configstore/clientconfig.go`:
- Around line 585-608: The hashing code uses sonic.Marshal on key.Aliases which
yields non-deterministic output and treats nil vs empty map differently; change
the check from key.Aliases != nil to len(key.Aliases) > 0, and canonicalize
Aliases before hashing by extracting keys, sorting them, building a
deterministically ordered representation (e.g., slice of {key,value} pairs or an
ordered map), then marshal that deterministic structure (instead of raw
key.Aliases) with sonic.Marshal so config_hash is stable across runs.

In `@framework/streaming/types.go`:
- Around line 399-409: The current logic only backfills image metadata when
p.Data.ImageGenerationOutput is nil, so when imageResp exists but
imageResp.Model == "" the final response can lose the top-level model; update
the handling for p.Data.ImageGenerationOutput (variable imageResp) to check and
populate missing fields: if imageResp.Model == "" set imageResp.Model =
p.RequestedModel, and if imageResp.ID == "" set imageResp.ID = p.RequestID (in
addition to the existing nil-branch behavior), so both existing and newly
created imageResp get the requested model and request ID filled in.

---

Duplicate comments:
In `@core/providers/bedrock/bedrock.go`:
- Around line 3424-3431: The getModelPath function builds path :=
fmt.Sprintf("%s/%s", model, basePath) but only escapes the ARN branch; escape
the plain-model branch as well to handle cases where request.Model is already a
resolved ARN or contains slashes. Update getModelPath to call url.PathEscape on
the model when constructing the default path (same way as in the
key.BedrockKeyConfig.ARN branch) so the returned path variable is always
properly URL-encoded regardless of whether key.BedrockKeyConfig.ARN is present.

In `@core/providers/replicate/models.go`:
- Around line 66-72: The code calls pipeline.BackfillModels(included) even when
deploymentsResponse.Next is non-nil, causing backfilled models from future pages
to be inserted into a partial page; change the logic so BackfillModels is only
executed when the provider page sequence is complete (i.e., when
deploymentsResponse.Next is nil / bifrostResponse.NextPageToken is empty).
Concretely, wrap or move the call to pipeline.BackfillModels(included) so it
runs only if deploymentsResponse.Next == nil (or bifrostResponse.NextPageToken
== ""), keeping the existing assignments to bifrostResponse.NextPageToken and
bifrostResponse.Data otherwise.

In `@core/providers/utils/models.go`:
- Around line 327-348: The loop over p.Aliases in the wildcard allowlist branch
can produce non-deterministic ordering; extract the alias keys into a slice,
sort that slice (e.g., using sort.Strings), then iterate the sorted keys instead
of ranging p.Aliases directly so backfilled entries appended to result are
deterministic; preserve the same blacklist check using matches(aliasKey, bl,
p.MatchFns) and append the same schemas.Model (ID:
string(p.ProviderKey)+"/"+aliasKey, Name: schemas.Ptr(ToDisplayName(aliasKey)),
Alias: schemas.Ptr(providerID)).

In `@framework/configstore/migrations.go`:
- Around line 3588-3615: The recompute step filters out aliases_json == '{}'
while the earlier UPDATE can copy '{}' into aliases_json, leaving those rows
with stale config_hash; update the tx.Where clause used to populate affectedKeys
in migrations.go (the query that currently reads "aliases_json IS NOT NULL AND
aliases_json != ? AND aliases_json != ?") to only exclude NULL and empty string
(remove the != '{}' predicate) so the recompute includes rows where aliases_json
== '{}', keeping the alias copy and hash-recompute predicates aligned.

---

Nitpick comments:
In `@core/providers/openai/files.go`:
- Around line 86-104: The ToBifrostFileRetrieveResponse method is incorrectly
attaching ExtraFields.RawRequest for GET/retrieve responses; remove or disable
the block that sets resp.ExtraFields.RawRequest (the if sendBackRawRequest {
resp.ExtraFields.RawRequest = rawRequest } logic) so that RawRequest is never
populated in OpenAI retrieve responses while leaving other fields (e.g.,
RawResponse, Latency) intact; update the function
OpenAIFileResponse.ToBifrostFileRetrieveResponse accordingly.
- Around line 86-99: The ToBifrostFileRetrieveResponse function still accepts an
unused providerName parameter; remove providerName from the function signature
of ToBifrostFileRetrieveResponse in core/providers/openai/files.go, delete any
references to that parameter inside the function body (none expected), and
update all call sites to stop passing providerName so the signature matches
ToBifrostFileUploadResponse and avoids unused parameter warnings; ensure the
function name ToBifrostFileRetrieveResponse and its return type remain
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 12e1f177-788b-4328-8374-4af14f42cfd4

📥 Commits

Reviewing files that changed from the base of the PR and between 612a1f1 and 15283ab.

📒 Files selected for processing (177)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/image_edit.go
  • core/internal/llmtests/image_generation.go
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/speech_synthesis.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/internal/llmtests/transcription_stream.go
  • core/internal/llmtests/video.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/batch.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/anthropic/models.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/text.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/utils.go
  • core/providers/azure/azure.go
  • core/providers/azure/files.go
  • core/providers/azure/models.go
  • core/providers/azure/utils.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/invoke.go
  • core/providers/bedrock/models.go
  • core/providers/bedrock/rerank_test.go
  • core/providers/bedrock/s3.go
  • core/providers/bedrock/signer.go
  • core/providers/bedrock/text.go
  • core/providers/cerebras/cerebras.go
  • core/providers/cohere/chat.go
  • core/providers/cohere/cohere.go
  • core/providers/cohere/errors.go
  • core/providers/cohere/models.go
  • core/providers/elevenlabs/elevenlabs.go
  • core/providers/elevenlabs/errors.go
  • core/providers/elevenlabs/models.go
  • core/providers/gemini/batch.go
  • core/providers/gemini/errors.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/models.go
  • core/providers/gemini/videos.go
  • core/providers/groq/groq.go
  • core/providers/huggingface/errors.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/responses.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/huggingface/utils.go
  • core/providers/mistral/mistral.go
  • core/providers/mistral/models.go
  • core/providers/mistral/transcription.go
  • core/providers/nebius/errors.go
  • core/providers/nebius/nebius.go
  • core/providers/ollama/ollama.go
  • core/providers/openai/batch.go
  • core/providers/openai/errors.go
  • core/providers/openai/errors_test.go
  • core/providers/openai/files.go
  • core/providers/openai/images.go
  • core/providers/openai/large_payload.go
  • core/providers/openai/models.go
  • core/providers/openai/openai.go
  • core/providers/openai/transcription.go
  • core/providers/openai/videos.go
  • core/providers/openrouter/openrouter.go
  • core/providers/parasail/parasail.go
  • core/providers/perplexity/chat.go
  • core/providers/perplexity/perplexity.go
  • core/providers/replicate/errors.go
  • core/providers/replicate/files.go
  • core/providers/replicate/images.go
  • core/providers/replicate/models.go
  • core/providers/replicate/replicate.go
  • core/providers/replicate/utils.go
  • core/providers/replicate/videos.go
  • core/providers/runway/errors.go
  • core/providers/runway/runway.go
  • core/providers/runway/videos.go
  • core/providers/sgl/sgl.go
  • core/providers/utils/large_response.go
  • core/providers/utils/make_request_test.go
  • core/providers/utils/models.go
  • core/providers/utils/utils.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/errors.go
  • core/providers/vertex/models.go
  • core/providers/vertex/rerank.go
  • core/providers/vertex/rerank_test.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/utils.go
  • core/providers/vllm/vllm.go
  • core/providers/xai/errors.go
  • core/providers/xai/xai.go
  • core/schemas/account.go
  • core/schemas/bifrost.go
  • core/schemas/chatcompletions.go
  • core/schemas/models.go
  • core/schemas/models_test.go
  • core/schemas/tracer.go
  • core/utils.go
  • docs/features/litellm-compat.mdx
  • docs/openapi/schemas/management/providers.yaml
  • docs/providers/provider-routing.mdx
  • docs/providers/supported-providers/azure.mdx
  • docs/providers/supported-providers/bedrock.mdx
  • docs/providers/supported-providers/replicate.mdx
  • docs/providers/supported-providers/vertex.mdx
  • docs/quickstart/gateway/provider-configuration.mdx
  • docs/quickstart/go-sdk/provider-configuration.mdx
  • framework/configstore/clientconfig.go
  • framework/configstore/encryption_test.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/encryption_test.go
  • framework/configstore/tables/key.go
  • framework/configstore/tables/virtualkey.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/sync.go
  • framework/streaming/accumulator_test.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/streaming/types.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/governance/model_provider_governance_test.go
  • plugins/governance/store.go
  • plugins/litellmcompat/texttochat.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/logging/utils.go
  • plugins/logging/writer.go
  • plugins/maxim/main.go
  • plugins/mocker/main.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/telemetry/main.go
  • tests/governance/e2e_test.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/provider_keys.go
  • transports/bifrost-http/handlers/wsresponses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/cursor.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/config_test.go
  • transports/config.schema.json
  • transports/schema_test/config_schema_test.go
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/columns.tsx
  • ui/app/workspace/providers/dialogs/addNewKeySheet.tsx
  • ui/app/workspace/providers/fragments/apiKeysFormFragment.tsx
  • ui/components/filters/filterPopover.tsx
  • ui/components/ui/headersTable.tsx
  • ui/lib/constants/config.ts
  • ui/lib/schemas/providerForm.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
  • ui/lib/utils/validation.ts
💤 Files with no reviewable changes (19)
  • framework/configstore/tables/virtualkey.go
  • core/providers/vertex/rerank_test.go
  • core/providers/bedrock/chat.go
  • core/providers/replicate/files.go
  • core/providers/replicate/videos.go
  • core/providers/groq/groq.go
  • core/providers/replicate/images.go
  • core/providers/huggingface/responses.go
  • core/providers/parasail/parasail.go
  • framework/modelcatalog/sync.go
  • core/providers/cerebras/cerebras.go
  • core/providers/replicate/errors.go
  • core/providers/cohere/chat.go
  • core/providers/anthropic/text.go
  • core/providers/vllm/utils.go
  • core/providers/perplexity/chat.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/vertex/embedding.go
✅ Files skipped from review due to trivial changes (21)
  • core/providers/xai/xai.go
  • core/schemas/models_test.go
  • docs/features/litellm-compat.mdx
  • docs/providers/supported-providers/replicate.mdx
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/transcription_stream.go
  • core/providers/bedrock/bedrock_test.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • core/internal/llmtests/speech_synthesis.go
  • plugins/governance/store.go
  • framework/logstore/migrations.go
  • core/providers/openai/videos.go
  • framework/streaming/accumulator_test.go
  • framework/modelcatalog/pricing_test.go
  • docs/providers/supported-providers/vertex.mdx
  • docs/providers/provider-routing.mdx
  • plugins/semanticcache/plugin_integration_test.go
  • core/providers/openai/errors.go
  • docs/providers/supported-providers/bedrock.mdx
  • core/providers/vertex/models.go
  • core/providers/vllm/vllm.go
🚧 Files skipped from review as they are similar to previous changes (61)
  • core/providers/huggingface/utils.go
  • core/internal/llmtests/video.go
  • core/internal/llmtests/image_edit.go
  • core/internal/llmtests/image_generation.go
  • core/providers/elevenlabs/errors.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/providers/gemini/videos.go
  • framework/streaming/transcription.go
  • docs/providers/supported-providers/azure.mdx
  • core/providers/runway/videos.go
  • plugins/litellmcompat/texttochat.go
  • core/providers/sgl/sgl.go
  • core/schemas/tracer.go
  • plugins/mocker/main.go
  • plugins/maxim/main.go
  • core/providers/gemini/errors.go
  • core/providers/bedrock/signer.go
  • framework/streaming/chat.go
  • core/providers/azure/utils.go
  • core/schemas/models.go
  • core/utils.go
  • framework/tracing/tracer.go
  • core/providers/openai/errors_test.go
  • core/providers/openai/transcription.go
  • plugins/logging/writer.go
  • core/providers/anthropic/errors.go
  • core/providers/openai/images.go
  • core/providers/bedrock/text.go
  • core/providers/xai/errors.go
  • core/providers/cohere/errors.go
  • core/providers/bedrock/rerank_test.go
  • core/providers/utils/large_response.go
  • docs/openapi/schemas/management/providers.yaml
  • core/providers/huggingface/errors.go
  • framework/configstore/encryption_test.go
  • core/providers/bedrock/invoke.go
  • core/providers/ollama/ollama.go
  • framework/modelcatalog/pricing.go
  • core/providers/cohere/models.go
  • core/internal/llmtests/response_validation.go
  • core/providers/mistral/mistral.go
  • plugins/governance/model_provider_governance_test.go
  • plugins/logging/operations.go
  • core/providers/anthropic/utils.go
  • docs/quickstart/gateway/provider-configuration.mdx
  • core/providers/gemini/models.go
  • framework/configstore/rdb.go
  • core/providers/vertex/errors.go
  • framework/configstore/tables/encryption_test.go
  • core/providers/openrouter/openrouter.go
  • core/providers/huggingface/huggingface.go
  • core/providers/nebius/nebius.go
  • core/internal/llmtests/account.go
  • core/providers/gemini/gemini.go
  • core/providers/anthropic/types.go
  • core/providers/runway/runway.go
  • core/providers/anthropic/anthropic.go
  • core/providers/mistral/models.go
  • framework/configstore/tables/key.go
  • core/bifrost.go
  • core/providers/replicate/replicate.go

Comment thread core/providers/anthropic/models.go
Comment thread core/providers/bedrock/bedrock.go Outdated
Comment thread core/providers/bedrock/bedrock.go
Comment thread framework/streaming/types.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from 15283ab to 52c1367 on April 2, 2026 08:13
Collaborator Author

@greptile

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
plugins/mocker/main.go (2)

1058-1087: ⚠️ Potential issue | 🟠 Major

Default success should respect the incoming request type.

PreLLMHook() sends schemas.ResponsesRequest and schemas.ResponsesStreamRequest through handleDefaultBehavior(), but this branch always short-circuits with a chat-completions payload and RequestType: schemas.ChatCompletionRequest. That returns the wrong schema for responses callers.

Reuse the same request-type branching as generateSuccessShortCircuit(), or delegate to that helper, so default-success mocking returns a schemas.BifrostResponsesResponse / schemas.BifrostResponsesStreamResponse for responses requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/mocker/main.go` around lines 1058 - 1087, The DefaultBehaviorSuccess
branch in handleDefaultBehavior() always returns a chat-completions
short-circuit payload even when PreLLMHook() passed a schemas.ResponsesRequest
or schemas.ResponsesStreamRequest; update this branch to honor the incoming
request type by reusing the same request-type branching as
generateSuccessShortCircuit() (or call/delegate to
generateSuccessShortCircuit()) so that when the input is
schemas.ResponsesRequest you return a schemas.BifrostResponsesResponse and when
it is schemas.ResponsesStreamRequest you return a
schemas.BifrostResponsesStreamResponse, and set the corresponding
ExtraFields.RequestType accordingly instead of always using
schemas.ChatCompletionRequest.

905-917: ⚠️ Potential issue | 🔴 Critical

Don't unconditionally write through mockResponse.ChatResponse.

When this helper builds a schemas.BifrostResponsesResponse or schemas.BifrostResponsesStreamResponse, mockResponse.ChatResponse stays nil, so a configured content.Model triggers a nil-pointer panic here instead of returning the mocked response.

Minimal crash fix
-	if content.Model != nil {
-		mockResponse.ChatResponse.Model = *content.Model
-	}
+	if content.Model != nil && mockResponse.ChatResponse != nil {
+		mockResponse.ChatResponse.Model = *content.Model
+	}

At minimum, guard the nil dereference. Ideally, mirror the override onto the active responses variant too.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/mocker/main.go` around lines 905 - 917, The code unconditionally
writes to mockResponse.ChatResponse.Model which can be nil for
schemas.BifrostResponsesResponse or schemas.BifrostResponsesStreamResponse and
causes a panic when content.Model is set; change the assignment to first check
content.Model and then guard the nil dereference by updating the active response
variant—e.g., if mockResponse.ChatResponse != nil set
mockResponse.ChatResponse.Model = *content.Model, else if
mockResponse.ChatStreamResponse (or the stream variant used in this struct) !=
nil set its Model = *content.Model; if neither exists, do nothing. Ensure you
reference content.Model, mockResponse.ChatResponse, and the stream response
field (schemas.BifrostResponsesStreamResponse) when making this change.
framework/configstore/rdb.go (1)

538-612: ⚠️ Potential issue | 🟠 Major

Don't overwrite the recomputed key hash on update.

This block computes a fresh keyHash at Lines 539-543, but Line 607 replaces it with the old database value. Because GenerateKeyHash now includes Aliases and ReplicateKeyConfig, editing either field on an existing key leaves config_hash stale and breaks hash-based reconciliation for that row.

💡 Suggested fix
 		// Check if this key already exists
 		if existingKey, exists := existingKeysMap[key.ID]; exists {
 			dbKey.ID = existingKey.ID                             // Keep the same database ID
-			dbKey.ConfigHash = existingKey.ConfigHash             // Preserve config hash
+			dbKey.ConfigHash = keyHash                            // Reflect the current key shape
 			dbKey.Status = existingKey.Status                     // Preserve status (UI-managed)
 			dbKey.Description = existingKey.Description           // Preserve description (UI-managed)
 			dbKey.EncryptionStatus = existingKey.EncryptionStatus // Preserve encryption status
 			dbKey.CreatedAt = existingKey.CreatedAt               // Preserve original creation timestamp
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/rdb.go` around lines 538 - 612, The update loop
currently recomputes a fresh keyHash via GenerateKeyHash but then overwrites it
with existingKey.ConfigHash when an existing key is found; change the merge
logic in the block that handles existingKeysMap[key.ID] so that dbKey.ConfigHash
retains the newly computed keyHash (from GenerateKeyHash) instead of assigning
existingKey.ConfigHash, while continuing to preserve other existing fields (ID,
Status, Description, EncryptionStatus, CreatedAt) and then saving with
txDB.WithContext(ctx).Save(&dbKey). Ensure no other code path later replaces
dbKey.ConfigHash with the old value.
framework/configstore/clientconfig.go (1)

585-592: ⚠️ Potential issue | 🟠 Major

Apply deterministic hashing to aliases map.

GenerateKeyHash sorts all other collections (Models, BlacklistedModels, Labels, Origins, Headers, etc.) before marshaling, but key.Aliases is a map[string]string marshaled directly. Sonic does not sort map keys by default, resulting in non-deterministic output based on Go's randomized map iteration. This breaks the reconciliation hash for identical alias sets. Convert to a sorted slice before hashing:

🛠️ Sort aliases before hashing
 	// Hash Aliases
 	if key.Aliases != nil {
-		data, err := sonic.Marshal(key.Aliases)
+		aliasKeys := make([]string, 0, len(key.Aliases))
+		for alias := range key.Aliases {
+			aliasKeys = append(aliasKeys, alias)
+		}
+		sort.Strings(aliasKeys)
+		type aliasEntry struct {
+			Alias  string `json:"alias"`
+			Target string `json:"target"`
+		}
+		entries := make([]aliasEntry, 0, len(aliasKeys))
+		for _, alias := range aliasKeys {
+			entries = append(entries, aliasEntry{
+				Alias:  alias,
+				Target: key.Aliases[alias],
+			})
+		}
+		data, err := sonic.Marshal(entries)
 		if err != nil {
 			return "", err
 		}
 		hash.Write(data)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/clientconfig.go` around lines 585 - 592,
GenerateKeyHash currently marshals key.Aliases (a map) directly which yields
nondeterministic output; change the Aliases handling in the block that checks
key.Aliases != nil to produce a deterministic, sorted representation before
hashing (e.g., build a slice of key/value pairs or "k=v" entries from
key.Aliases, sort by key, then marshal that sorted slice and write it to hash)
so the reconciliation hash is stable across runs.
♻️ Duplicate comments (5)
core/providers/bedrock/invoke.go (1)

633-643: ⚠️ Potential issue | 🔴 Critical

Restore chunk-level model lookup for invoke stream routing.

Lines 633-643 only read resp.Response, but early stream events usually do not have that populated. That leaves model == "" and sends Nova chunks through the Anthropic serializer until the completed event. A small regression test around a Created chunk for a Nova alias would lock this down.

🐛 Suggested fix
 	model := ""
-	if resp.Response != nil {
-		if resp.Response.Model != "" {
-			model = resp.Response.Model
-		} else {
-			extraFields := resp.Response.ExtraFields
-			if extraFields.ResolvedModelUsed != "" {
-				model = extraFields.ResolvedModelUsed
-			} else if extraFields.OriginalModelRequested != "" {
-				model = extraFields.OriginalModelRequested
-			}
-		}
+	if resp.ExtraFields.ResolvedModelUsed != "" {
+		model = resp.ExtraFields.ResolvedModelUsed
+	} else if resp.Response != nil && resp.Response.ExtraFields.ResolvedModelUsed != "" {
+		model = resp.Response.ExtraFields.ResolvedModelUsed
+	} else if resp.Response != nil && resp.Response.Model != "" {
+		model = resp.Response.Model
+	} else if resp.ExtraFields.OriginalModelRequested != "" {
+		model = resp.ExtraFields.OriginalModelRequested
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/invoke.go` around lines 633 - 643, The current
model-selection only inspects resp.Response, which is often nil for early stream
events; update the logic in invoke stream handling to also inspect chunk-level
fields (e.g., resp.Chunk and resp.Chunk.ExtraFields) for model hints
(ResolvedModelUsed, OriginalModelRequested) and for a model string on the chunk
itself, falling back to resp.Response only if chunk info is absent—this restores
correct routing of early "Created"/chunk events (so Nova alias chunks are
serialized correctly).
core/providers/gemini/gemini.go (1)

1217-1220: ⚠️ Potential issue | 🟠 Major

Stop logging raw Gemini error bodies.

Line 1219 still dumps the full upstream payload into logs. Gemini error bodies can echo prompt or file content, so this reopens the same log-leak path we already cleaned up elsewhere in this file.

🩹 Proposed fix
-		provider.logger.Debug(fmt.Sprintf("error from %s provider: %s", providerName, string(resp.Body())))
+		provider.logger.Debug(fmt.Sprintf("error from %s provider: status=%d", providerName, resp.StatusCode()))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 1217 - 1220, The debug log in
the error path (provider.logger.Debug(fmt.Sprintf(...))) is printing the raw
Gemini response body and must be removed or replaced with a non-sensitive
summary; update the error-handling branch where resp.StatusCode() !=
fasthttp.StatusOK (after calling providerUtils.MaterializeStreamErrorBody and
before/after parseGeminiError) to stop logging string(resp.Body()) — either
remove the provider.logger.Debug call entirely or log a sanitized message (e.g.,
status code and a short error code or truncated/hashed summary) so that
prompt/file content from Gemini is never written to logs; keep the surrounding
calls to parseGeminiError, providerUtils.EnrichError, and the existing
sendBackRaw* flags unchanged.
core/providers/elevenlabs/elevenlabs.go (1)

248-250: ⚠️ Potential issue | 🟡 Minor

Re-enrich these post-response parse failures.

These branches run after a provider response has already been buffered, but they now return bare NewBifrostOperationError(...) values or enrich with responseBody=nil. That drops raw response data for timestamp/transcription decode and parse failures. Wrap them with providerUtils.EnrichError(...) while resp.Body()/responseBody is still in scope.

Based on learnings, provider-local fields like RawResponse remain the provider layer's responsibility, while identity/model fields are populated centrally via PopulateExtraFields(...).

Also applies to: 267-268, 536-539, 552-559, 567-570
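
A minimal example of the pattern being asked for — keep the buffered body in a local variable and attach it to the error before the branch returns. The types below are simplified stand-ins, not the real providerUtils/schemas API.

🧩 Illustrative sketch (stand-in types)
package main

import (
	"encoding/json"
	"fmt"
)

// opError is a stand-in for schemas.BifrostError; RawResponse mirrors the
// provider-local field the review wants preserved.
type opError struct {
	Message     string
	Cause       error
	RawResponse []byte
}

func (e *opError) Error() string { return fmt.Sprintf("%s: %v", e.Message, e.Cause) }

// parseWithTimestamps mirrors the failing decode path: the body is captured
// while still in scope and carried on the returned error.
func parseWithTimestamps(body []byte) (map[string]any, error) {
	var out map[string]any
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, &opError{
			Message:     "failed to parse with-timestamps response",
			Cause:       err,
			RawResponse: body,
		}
	}
	return out, nil
}

func main() {
	_, err := parseWithTimestamps([]byte("not-json"))
	fmt.Println(err)
}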

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/elevenlabs/elevenlabs.go` around lines 248 - 250, The error
branches that currently return NewBifrostOperationError or call EnrichError with
responseBody=nil should instead call providerUtils.EnrichError while the
buffered response body (resp.Body() / responseBody returned by
CheckAndDecodeBody) is still in scope so the RawResponse is preserved; update
the code paths around CheckAndDecodeBody (and the other listed locations at
lines near 267, 536-539, 552-559, 567-570) to capture the decoded body into a
local variable and pass that body into providerUtils.EnrichError along with
providerUtils.NewBifrostOperationError, and leave provider-local fields like
RawResponse to be filled here while leaving identity/model population to
PopulateExtraFields.
core/providers/replicate/replicate.go (1)

252-255: ⚠️ Potential issue | 🟠 Major

Differentiate polling cancellation from an actual timeout.

Line 252 handles every pollCtx.Done() as ErrProviderRequestTimedOut and still wraps it as a generic operation error, so a client-cancelled async prediction is misreported as a timeout and timeout-specific handling/metrics won't fire. Return the normal cancellation error on context.Canceled, and use NewBifrostTimeoutError(...) only for the deadline case.

♻️ Suggested fix
 	case <-pollCtx.Done():
-		return nil, nil, providerResponseHeaders, providerUtils.NewBifrostOperationError(
-			schemas.ErrProviderRequestTimedOut,
-			fmt.Errorf("prediction polling timed out after %d seconds", timeoutSeconds))
+		if errors.Is(pollCtx.Err(), context.Canceled) {
+			return nil, nil, providerResponseHeaders, &schemas.BifrostError{
+				IsBifrostError: false,
+				Error: &schemas.ErrorField{
+					Type:    schemas.Ptr(schemas.RequestCancelled),
+					Message: schemas.ErrRequestCancelled,
+					Error:   pollCtx.Err(),
+				},
+			}
+		}
+		return nil, nil, providerResponseHeaders, providerUtils.NewBifrostTimeoutError(
+			schemas.ErrProviderRequestTimedOut,
+			fmt.Errorf("prediction polling timed out after %d seconds", timeoutSeconds),
+		)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/replicate/replicate.go` around lines 252 - 255, The current
poll loop treats any pollCtx.Done() as a timeout and returns
providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut),
which misreports client cancellations; update the pollCtx.Done() branch in the
function that returns (nil, nil, providerResponseHeaders, ...) to inspect the
context error (pollCtx.Err()) and: if it equals context.Canceled return the
standard cancellation error (propagate context.Canceled or a non-timeout
provider error), if it equals context.DeadlineExceeded or otherwise truly timed
out use providerUtils.NewBifrostTimeoutError(...) (or wrap
schemas.ErrProviderRequestTimedOut) so timeout-specific handling/metrics run;
reference pollCtx, providerUtils.NewBifrostOperationError,
providerUtils.NewBifrostTimeoutError, and schemas.ErrProviderRequestTimedOut
when making the change.
core/providers/bedrock/bedrock.go (1)

3319-3326: ⚠️ Potential issue | 🟠 Major

Escape raw ARN / slash-containing model IDs in the default path branch.

request.Model is now the centrally resolved identifier, so aliases can legitimately resolve to full ARNs. The plain branch still formats /model/%s/... with raw / characters, which breaks invoke/count-tokens/chat/image routes for ARN-like IDs and any other slash-containing model string.

Possible fix
 func (provider *BedrockProvider) getModelPath(basePath string, model string, key schemas.Key) string {
-	path := fmt.Sprintf("%s/%s", model, basePath)
+	escapedModel := url.PathEscape(model)
+	path := fmt.Sprintf("%s/%s", escapedModel, basePath)
 	// If ARN is present, Bedrock expects the ARN-scoped identifier
 	if key.BedrockKeyConfig != nil && key.BedrockKeyConfig.ARN != nil && key.BedrockKeyConfig.ARN.GetValue() != "" {
 		encodedModelIdentifier := url.PathEscape(fmt.Sprintf("%s/%s", key.BedrockKeyConfig.ARN.GetValue(), model))
 		path = fmt.Sprintf("%s/%s", encodedModelIdentifier, basePath)
 	}
 	return path
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 3319 - 3326, The function
getModelPath currently only URL-escapes ARN-scoped identifiers but leaves the
default branch using raw model strings (e.g., request.Model) which can contain
slashes; update getModelPath to always PathEscape the model segment in the
default path construction (use url.PathEscape(model) when building path :=
fmt.Sprintf("%s/%s", model, basePath)) while keeping the existing ARN branch
(key.BedrockKeyConfig.ARN) logic intact so any model or ARN containing slashes
is safely encoded.
🧹 Nitpick comments (4)
core/providers/openai/large_payload.go (1)

98-98: Drop the now-unused providerName parameter from finalizeOpenAIResponse.

After FinalizeResponseWithLargeDetection stopped taking it, this argument is dead weight and keeps a provider-specific detail in an otherwise provider-agnostic helper.

♻️ Small cleanup
-	body, result, respErr := finalizeOpenAIResponse(ctx, resp, latency, providerName, logger)
+	body, result, respErr := finalizeOpenAIResponse(ctx, resp, latency, logger)
@@
 func finalizeOpenAIResponse(
 	ctx *schemas.BifrostContext,
 	resp *fasthttp.Response,
 	latency time.Duration,
-	providerName schemas.ModelProvider,
 	logger schemas.Logger,
 ) ([]byte, *largePayloadResult, *schemas.BifrostError) {

Also applies to: 120-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/large_payload.go` at line 98, finalizeOpenAIResponse
currently accepts an unused providerName parameter; remove that parameter from
its signature and from all call sites in this file (e.g., the call at body,
result, respErr := finalizeOpenAIResponse(ctx, resp, latency, providerName,
logger) and the other occurrences around lines referenced) so the helper is
provider-agnostic—update the function definition of finalizeOpenAIResponse to
accept (ctx, resp, latency, logger) only and update every caller in
core/providers/openai/large_payload.go to pass the remaining arguments (ctx,
resp, latency, logger) without providerName.
framework/configstore/migrations.go (1)

3619-3633: Prefer schemaKeyFromTableKey in these new hash backfills.

These two new recompute paths hand-roll the tables.TableKey → schemas.Key mapping again. Reusing the shared helper here will keep future hash-input changes from drifting across migration blocks.

♻️ Suggested simplification
-				schemaKey := schemas.Key{
-					Name:               key.Name,
-					Value:              key.Value,
-					Models:             key.Models,
-					BlacklistedModels:  key.BlacklistedModels,
-					Weight:             getWeight(key.Weight),
-					AzureKeyConfig:     key.AzureKeyConfig,
-					VertexKeyConfig:    key.VertexKeyConfig,
-					BedrockKeyConfig:   key.BedrockKeyConfig,
-					Aliases:            key.Aliases,
-					VLLMKeyConfig:      key.VLLMKeyConfig,
-					ReplicateKeyConfig: key.ReplicateKeyConfig,
-					Enabled:            key.Enabled,
-					UseForBatchAPI:     key.UseForBatchAPI,
-				}
+				schemaKey := schemaKeyFromTableKey(key)

Apply the same replacement in both recompute loops.

Also applies to: 5520-5533

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 3619 - 3633, The new
recompute loops manually construct schemas.Key from tables.TableKey; replace
that hand-rolled mapping with the shared helper schemaKeyFromTableKey so both
backfill paths call schemaKeyFromTableKey(key) instead of rebuilding fields
(references: schemas.Key, tables.TableKey, schemaKeyFromTableKey), and apply the
same change for the other recompute block at the cited location (also around the
block at 5520-5533) to keep mapping logic centralized.
core/providers/mistral/models.go (1)

19-52: Add a small regression test for the alias/backfill output shape.

This path now depends on FilterModel, alias propagation, ResolvedID deduping, and BackfillModels. A focused test covering a direct model hit, an alias hit, and a backfilled alias would make future pipeline changes much safer here.
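
A sketch of what such a test could look like; resolveModels below is a simplified stand-in for the real FilterModel/BackfillModels pipeline, so only the shape of the three cases carries over.

🧪 Illustrative test skeleton (stand-in pipeline)
package mistral

import "testing"

type listedModel struct {
	ID    string
	Alias string
}

// resolveModels is a stand-in: direct hits are kept, aliases of listed models
// are emitted as extra entries, and aliases of missing models are backfilled.
func resolveModels(upstream []string, aliases map[string]string) []listedModel {
	out := make([]listedModel, 0, len(upstream)+len(aliases))
	for _, id := range upstream {
		out = append(out, listedModel{ID: "mistral/" + id})
	}
	for alias, target := range aliases {
		out = append(out, listedModel{ID: "mistral/" + target, Alias: alias})
	}
	return out
}

func TestAliasBackfillShape(t *testing.T) {
	// direct model hit
	got := resolveModels([]string{"mistral-small"}, nil)
	if len(got) != 1 || got[0].ID != "mistral/mistral-small" {
		t.Fatalf("direct hit: unexpected %+v", got)
	}
	// alias hit on a listed model
	got = resolveModels([]string{"mistral-large"}, map[string]string{"prod": "mistral-large"})
	if len(got) != 2 || got[1].Alias != "prod" {
		t.Fatalf("alias hit: unexpected %+v", got)
	}
	// backfilled alias for a model absent from the page
	got = resolveModels(nil, map[string]string{"prod": "mistral-large"})
	if len(got) != 1 || got[0].ID != "mistral/mistral-large" || got[0].Alias != "prod" {
		t.Fatalf("backfill: unexpected %+v", got)
	}
}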

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/mistral/models.go` around lines 19 - 52, Add a regression test
that exercises the ListModelsPipeline path in models.go to ensure alias/backfill
output shape is stable: instantiate a ListModelsPipeline (or call the enclosing
ListModels function) with test data covering (1) a direct model hit where
model.ID maps to ResolvedID, (2) an alias hit where FilterModel returns an
AliasValue for a different ResolvedID, and (3) absence of a model so
BackfillModels must add a backfilled alias; assert the resulting
bifrostResponse.Data contains entries with the expected ID format
(schemas.Mistral + "/" + ResolvedID), Alias when AliasValue is set, that
ResolvedID deduping occurs (no duplicate ResolvedID entries), and that
BackfillModels entries are appended when included map lacks the resolved id. Use
the ListModelsPipeline, FilterModel, ResolvedID, AliasValue, BackfillModels, and
ShouldEarlyExit symbols to locate the logic to exercise.
core/providers/bedrock/bedrock.go (1)

3257-3266: Preserve the fallback failure cause.

Both failed attempts are discarded here: the returned NewBifrostOperationError carries a generic message with a nil cause even though FileList and FileContent both produced concrete errors. Please wrap at least the direct-access failure here, ideally both.
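
For illustration, a small self-contained sketch of wrapping both causes with %w (Go 1.20+ accepts multiple %w verbs); the real code would pass the combined error as the cause argument of NewBifrostOperationError, and the helper names here are stand-ins.

🧩 Illustrative sketch (stand-in helpers)
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the two failing attempts described above.
func listOutputObjects() error  { return errors.New("FileList: access denied") }
func fetchOutputDirectly() error { return errors.New("FileContent: object not found") }

func fetchBatchOutput(outputS3URI string) error {
	listErr := listOutputObjects()
	if listErr == nil {
		return nil
	}
	directErr := fetchOutputDirectly()
	if directErr == nil {
		return nil
	}
	// Both causes stay inspectable via errors.Is / errors.As instead of being
	// replaced by a generic message with a nil cause.
	return fmt.Errorf("failed to fetch batch output from %s: list: %w; direct access: %w",
		outputS3URI, listErr, directErr)
}

func main() {
	fmt.Println(fetchBatchOutput("s3://bucket/prefix"))
}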

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 3257 - 3266, The current
fallback swallows both listing and direct-access errors (bifrostErr and
directErr) when returning providerUtils.NewBifrostOperationError; modify the
return to include the concrete error cause(s) instead of nil by wrapping them
(e.g., create a combined error that includes bifrostErr and directErr with
fmt.Errorf and %w) and pass that wrapped error as the cause to
NewBifrostOperationError so the caller retains the original failure details for
provider.FileList/FileContent and the outputS3URI context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/bedrock/bedrock.go`:
- Around line 1695-1703: The code rejects non-ARN model IDs but callers may pass
plain foundation-model IDs listed by ListModels; resolve the model alias first
and use the resolved value when validating and building the rerank request. Call
key.Aliases.Resolve(request.Model) (or the existing alias resolver used by
requestWorker) to produce resolvedModel, then check
strings.HasPrefix(resolvedModel, "arn:") and pass resolvedModel into
ToBedrockRerankRequest (instead of request.Model) so plain ModelIDs are
normalized to ARNs before validation and request construction.

In `@core/providers/cohere/cohere.go`:
- Around line 200-207: The error paths currently return bare provider errors and
drop Cohere's response body; update the branches around
providerUtils.MaterializeStreamErrorBody / parseCohereError and the
FinalizeResponseWithLargeDetection error path so the returned error carries the
provider payload (e.g., attach resp.Body() or the decoded body to the error as
RawResponse) instead of returning a bare error; ensure the functions that call
completeRequest and listModelsByKey can read RawResponse and that provider-local
fields stay in the provider layer (leave identity/model enrichment to
PopulateExtraFields) so sendBackRawResponse can still capture 4xx/5xx and decode
payloads.

In `@core/providers/elevenlabs/elevenlabs.go`:
- Around line 102-103: The HTTP error branches are discarding the response body
before building the provider error; update the error-returning paths (e.g., in
listModelsByKey, Transcription, Speech, and SpeechStream) to read resp.Body()
and pass it into parseElevenlabsError / EnrichError so the returned ElevenLabs
error includes RawResponse; specifically, capture the raw response bytes
(resp.Body()) and set the provider-local RawResponse field (or pass it as an
argument) when calling parseElevenlabsError or EnrichError, leaving
identity/model population to PopulateExtraFields and ensuring
sendBackRawResponse can access the upstream 4xx/5xx payload.

In `@core/providers/gemini/gemini.go`:
- Around line 440-444: The error handlers call parseGeminiError and copy
resp.Body() while the response is still streaming, which can lose the upstream
raw body; update the handlers (HandleGeminiResponsesStream, SpeechStream,
TranscriptionStream and the non-stream branch shown) to first materialize the
streamed error body by invoking providerUtils.MaterializeStreamErrorBody(ctx,
resp) (or the equivalent helper used elsewhere) before calling parseGeminiError
or copying resp.Body(), then continue with
providerUtils.ReleaseStreamingResponse(resp) and the existing EnrichError call
so the parsed error and jsonBody contain the full buffered payload.

In `@core/utils.go`:
- Around line 135-169: The validateKey function currently mis-validates
URL-backed and env-var-backed configs; update validateKey to use
EnvVar.IsDefined() instead of GetValue() for Bedrock value checks and for Azure
endpoint checks, and add explicit cases for schemas.Ollama and schemas.SGL that
require their per-key config and that their .Url field IsDefined() (e.g., check
key.OllamaKeyConfig != nil && key.OllamaKeyConfig.Url.IsDefined()); similarly
ensure Azure validates key.AzureKeyConfig != nil and
key.AzureKeyConfig.Endpoint.IsDefined(). Keep the existing checks for Vertex,
Replicate, VLLM but replace any GetValue() string comparisons with IsDefined()
where env var fields represent URLs.

In `@framework/configstore/migrations.go`:
- Around line 5502-5516: The backfill in
migrationDropDeploymentColumnsAndAddAliases only updates rows where
TableKey.Provider == "replicate", missing custom providers whose
custom_provider_config_json.base_provider_type == "replicate"; change the query
that updates ReplicateUseDeploymentsEndpoint (and the subsequent selection of
affectedKeys) to include keys whose provider is either the literal "replicate"
OR whose custom_provider_config_json->>'base_provider_type' = 'replicate' (e.g.,
via a JOIN on the custom provider config table or a JSON field predicate) so any
Replicate-backed TableKey (not just the canonical name) with non-empty
aliases_json gets ReplicateUseDeploymentsEndpoint set and has its config_hash
recomputed.

In `@framework/configstore/tables/encryption_test.go`:
- Around line 200-208: The test currently silently succeeds when
raw["aliases_json"] is nil or an unexpected type because rawAliasesStr remains
empty; change the test to fail fast by asserting that raw["aliases_json"] is not
nil and is either a string or []byte before converting it: check rawAliasesVal
(from raw["aliases_json"]) and call t.Fatalf or assert.FailNow with a clear
message if it's nil or not one of the expected types, then convert into
rawAliasesStr and proceed with assert.NotContains; apply the same guard to the
other occurrences noted (lines ~1189-1197) referencing
rawAliasesVal/rawAliasesStr and assert.NotContains.

In `@framework/logstore/migrations.go`:
- Around line 2205-2212: The Rollback function drops the alias column without
removing its index first, which breaks on SQLite; update the Rollback in
migrations.go to check for and remove the index "idx_logs_alias" (e.g. using
mig.HasIndex(&Log{}, "idx_logs_alias") and mig.DropIndex(&Log{},
"idx_logs_alias")) before calling mig.DropColumn(&Log{}, "alias") so the index
is removed prior to dropping the column.
- Around line 2187-2201: The migration "logs_add_alias_column" currently sets
UseTransaction = true and runs a blocking CREATE INDEX; change it to set
UseTransaction = false so the migration acts as a version marker only (keep the
column-add logic in the Migration Migrate function but remove/avoid executing
the CREATE INDEX statement there), and ensure the concurrent index creation for
idx_logs_alias is performed by the background index builder in
ensurePerformanceIndexes() (follow the pattern used by
migrationAddProviderHistogramIndex, migrationAddMetadataGINIndex, and
migrationAddLogsAndDashboardPerformanceIndexes) so the index is created with
CREATE INDEX CONCURRENTLY after startup without blocking writes.
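
For the first framework/logstore item above, a minimal sketch of the rollback ordering with gorm's Migrator; the Log struct here is a stand-in for the real model, and only the drop-index-before-drop-column ordering is the point.

🧩 Illustrative sketch (stand-in Log model)
package logstore

import "gorm.io/gorm"

// Log is a stand-in for the real framework/logstore Log model; only the
// alias column matters for this sketch.
type Log struct {
	ID    uint
	Alias string `gorm:"column:alias;index:idx_logs_alias"`
}

// rollbackAliasColumn removes the index first, then drops the column, so the
// SQLite table rebuild does not fail on a dangling index.
func rollbackAliasColumn(db *gorm.DB) error {
	mig := db.Migrator()
	if mig.HasIndex(&Log{}, "idx_logs_alias") {
		if err := mig.DropIndex(&Log{}, "idx_logs_alias"); err != nil {
			return err
		}
	}
	if mig.HasColumn(&Log{}, "alias") {
		return mig.DropColumn(&Log{}, "alias")
	}
	return nil
}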

---

Outside diff comments:
In `@framework/configstore/clientconfig.go`:
- Around line 585-592: GenerateKeyHash currently marshals key.Aliases (a map)
directly which yields nondeterministic output; change the Aliases handling in
the block that checks key.Aliases != nil to produce a deterministic, sorted
representation before hashing (e.g., build a slice of key/value pairs or "k=v"
entries from key.Aliases, sort by key, then marshal that sorted slice and write
it to hash) so the reconciliation hash is stable across runs.

In `@framework/configstore/rdb.go`:
- Around line 538-612: The update loop currently recomputes a fresh keyHash via
GenerateKeyHash but then overwrites it with existingKey.ConfigHash when an
existing key is found; change the merge logic in the block that handles
existingKeysMap[key.ID] so that dbKey.ConfigHash retains the newly computed
keyHash (from GenerateKeyHash) instead of assigning existingKey.ConfigHash,
while continuing to preserve other existing fields (ID, Status, Description,
EncryptionStatus, CreatedAt) and then saving with
txDB.WithContext(ctx).Save(&dbKey). Ensure no other code path later replaces
dbKey.ConfigHash with the old value.

In `@plugins/mocker/main.go`:
- Around line 1058-1087: The DefaultBehaviorSuccess branch in
handleDefaultBehavior() always returns a chat-completions short-circuit payload
even when PreLLMHook() passed a schemas.ResponsesRequest or
schemas.ResponsesStreamRequest; update this branch to honor the incoming request
type by reusing the same request-type branching as generateSuccessShortCircuit()
(or call/delegate to generateSuccessShortCircuit()) so that when the input is
schemas.ResponsesRequest you return a schemas.BifrostResponsesResponse and when
it is schemas.ResponsesStreamRequest you return a
schemas.BifrostResponsesStreamResponse, and set the corresponding
ExtraFields.RequestType accordingly instead of always using
schemas.ChatCompletionRequest.
- Around line 905-917: The code unconditionally writes to
mockResponse.ChatResponse.Model which can be nil for
schemas.BifrostResponsesResponse or schemas.BifrostResponsesStreamResponse and
causes a panic when content.Model is set; change the assignment to first check
content.Model and then guard the nil dereference by updating the active response
variant—e.g., if mockResponse.ChatResponse != nil set
mockResponse.ChatResponse.Model = *content.Model, else if
mockResponse.ChatStreamResponse (or the stream variant used in this struct) !=
nil set its Model = *content.Model; if neither exists, do nothing. Ensure you
reference content.Model, mockResponse.ChatResponse, and the stream response
field (schemas.BifrostResponsesStreamResponse) when making this change.
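
For the second mocker item, a sketch of the nil-guarded assignment; the struct and field names are simplified stand-ins for the real mockResponse variants, so only the guard pattern carries over.

🧩 Illustrative sketch (stand-in response variants)
package main

import "fmt"

// Simplified stand-ins for the response variants carried by mockResponse.
type chatResponse struct{ Model string }
type responsesStreamResponse struct{ Model string }

type mockResponse struct {
	ChatResponse            *chatResponse
	ResponsesStreamResponse *responsesStreamResponse
}

// applyModelOverride only dereferences the variant that is actually set, so a
// Responses-style mock no longer panics when content.Model is configured.
func applyModelOverride(m *mockResponse, model *string) {
	if model == nil {
		return
	}
	switch {
	case m.ChatResponse != nil:
		m.ChatResponse.Model = *model
	case m.ResponsesStreamResponse != nil:
		m.ResponsesStreamResponse.Model = *model
	}
}

func main() {
	model := "gpt-4o"
	m := &mockResponse{ResponsesStreamResponse: &responsesStreamResponse{}}
	applyModelOverride(m, &model)
	fmt.Println(m.ResponsesStreamResponse.Model)
}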

---

Duplicate comments:
In `@core/providers/bedrock/bedrock.go`:
- Around line 3319-3326: The function getModelPath currently only URL-escapes
ARN-scoped identifiers but leaves the default branch using raw model strings
(e.g., request.Model) which can contain slashes; update getModelPath to always
PathEscape the model segment in the default path construction (use
url.PathEscape(model) when building path := fmt.Sprintf("%s/%s", model,
basePath)) while keeping the existing ARN branch (key.BedrockKeyConfig.ARN)
logic intact so any model or ARN containing slashes is safely encoded.

In `@core/providers/bedrock/invoke.go`:
- Around line 633-643: The current model-selection only inspects resp.Response,
which is often nil for early stream events; update the logic in invoke stream
handling to also inspect chunk-level fields (e.g., resp.Chunk and
resp.Chunk.ExtraFields) for model hints (ResolvedModelUsed,
OriginalModelRequested) and for a model string on the chunk itself, falling back
to resp.Response only if chunk info is absent—this restores correct routing of
early "Created"/chunk events (so Nova alias chunks are serialized correctly).

In `@core/providers/elevenlabs/elevenlabs.go`:
- Around line 248-250: The error branches that currently return
NewBifrostOperationError or call EnrichError with responseBody=nil should
instead call providerUtils.EnrichError while the buffered response body
(resp.Body() / responseBody returned by CheckAndDecodeBody) is still in scope so
the RawResponse is preserved; update the code paths around CheckAndDecodeBody
(and the other listed locations at lines near 267, 536-539, 552-559, 567-570) to
capture the decoded body into a local variable and pass that body into
providerUtils.EnrichError along with providerUtils.NewBifrostOperationError, and
leave provider-local fields like RawResponse to be filled here while leaving
identity/model population to PopulateExtraFields.

In `@core/providers/gemini/gemini.go`:
- Around line 1217-1220: The debug log in the error path
(provider.logger.Debug(fmt.Sprintf(...))) is printing the raw Gemini response
body and must be removed or replaced with a non-sensitive summary; update the
error-handling branch where resp.StatusCode() != fasthttp.StatusOK (after
calling providerUtils.MaterializeStreamErrorBody and before/after
parseGeminiError) to stop logging string(resp.Body()) — either remove the
provider.logger.Debug call entirely or log a sanitized message (e.g., status
code and a short error code or truncated/hashed summary) so that prompt/file
content from Gemini is never written to logs; keep the surrounding calls to
parseGeminiError, providerUtils.EnrichError, and the existing sendBackRaw* flags
unchanged.

In `@core/providers/replicate/replicate.go`:
- Around line 252-255: The current poll loop treats any pollCtx.Done() as a
timeout and returns
providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut),
which misreports client cancellations; update the pollCtx.Done() branch in the
function that returns (nil, nil, providerResponseHeaders, ...) to inspect the
context error (pollCtx.Err()) and: if it equals context.Canceled return the
standard cancellation error (propagate context.Canceled or a non-timeout
provider error), if it equals context.DeadlineExceeded or otherwise truly timed
out use providerUtils.NewBifrostTimeoutError(...) (or wrap
schemas.ErrProviderRequestTimedOut) so timeout-specific handling/metrics run;
reference pollCtx, providerUtils.NewBifrostOperationError,
providerUtils.NewBifrostTimeoutError, and schemas.ErrProviderRequestTimedOut
when making the change.

---

Nitpick comments:
In `@core/providers/bedrock/bedrock.go`:
- Around line 3257-3266: The current fallback swallows both listing and
direct-access errors (bifrostErr and directErr) when returning
providerUtils.NewBifrostOperationError; modify the return to include the
concrete error cause(s) instead of nil by wrapping them (e.g., create a combined
error that includes bifrostErr and directErr with fmt.Errorf and %w) and pass
that wrapped error as the cause to NewBifrostOperationError so the caller
retains the original failure details for provider.FileList/FileContent and the
outputS3URI context.

In `@core/providers/mistral/models.go`:
- Around line 19-52: Add a regression test that exercises the ListModelsPipeline
path in models.go to ensure alias/backfill output shape is stable: instantiate a
ListModelsPipeline (or call the enclosing ListModels function) with test data
covering (1) a direct model hit where model.ID maps to ResolvedID, (2) an alias
hit where FilterModel returns an AliasValue for a different ResolvedID, and (3)
absence of a model so BackfillModels must add a backfilled alias; assert the
resulting bifrostResponse.Data contains entries with the expected ID format
(schemas.Mistral + "/" + ResolvedID), Alias when AliasValue is set, that
ResolvedID deduping occurs (no duplicate ResolvedID entries), and that
BackfillModels entries are appended when included map lacks the resolved id. Use
the ListModelsPipeline, FilterModel, ResolvedID, AliasValue, BackfillModels, and
ShouldEarlyExit symbols to locate the logic to exercise.

In `@core/providers/openai/large_payload.go`:
- Line 98: finalizeOpenAIResponse currently accepts an unused providerName
parameter; remove that parameter from its signature and from all call sites in
this file (e.g., the call at body, result, respErr :=
finalizeOpenAIResponse(ctx, resp, latency, providerName, logger) and the other
occurrences around lines referenced) so the helper is provider-agnostic—update
the function definition of finalizeOpenAIResponse to accept (ctx, resp, latency,
logger) only and update every caller in core/providers/openai/large_payload.go
to pass the remaining arguments (ctx, resp, latency, logger) without
providerName.

In `@framework/configstore/migrations.go`:
- Around line 3619-3633: The new recompute loops manually construct schemas.Key
from tables.TableKey; replace that hand-rolled mapping with the shared helper
schemaKeyFromTableKey so both backfill paths call schemaKeyFromTableKey(key)
instead of rebuilding fields (references: schemas.Key, tables.TableKey,
schemaKeyFromTableKey), and apply the same change for the other recompute block
at the cited location (also around the block at 5520-5533) to keep mapping logic
centralized.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 339c36cd-4117-4f35-9c3c-f14946e873e8

📥 Commits

Reviewing files that changed from the base of the PR and between 15283ab and 52c1367.

📒 Files selected for processing (177)
  • core/bifrost.go
  • core/internal/llmtests/account.go
  • core/internal/llmtests/image_edit.go
  • core/internal/llmtests/image_generation.go
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/response_validation.go
  • core/internal/llmtests/speech_synthesis.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/internal/llmtests/transcription_stream.go
  • core/internal/llmtests/video.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/batch.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/anthropic/models.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/text.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/utils.go
  • core/providers/azure/azure.go
  • core/providers/azure/files.go
  • core/providers/azure/models.go
  • core/providers/azure/utils.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/invoke.go
  • core/providers/bedrock/models.go
  • core/providers/bedrock/rerank_test.go
  • core/providers/bedrock/s3.go
  • core/providers/bedrock/signer.go
  • core/providers/bedrock/text.go
  • core/providers/cerebras/cerebras.go
  • core/providers/cohere/chat.go
  • core/providers/cohere/cohere.go
  • core/providers/cohere/errors.go
  • core/providers/cohere/models.go
  • core/providers/elevenlabs/elevenlabs.go
  • core/providers/elevenlabs/errors.go
  • core/providers/elevenlabs/models.go
  • core/providers/gemini/batch.go
  • core/providers/gemini/errors.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/models.go
  • core/providers/gemini/videos.go
  • core/providers/groq/groq.go
  • core/providers/huggingface/errors.go
  • core/providers/huggingface/huggingface.go
  • core/providers/huggingface/models.go
  • core/providers/huggingface/responses.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/huggingface/utils.go
  • core/providers/mistral/mistral.go
  • core/providers/mistral/models.go
  • core/providers/mistral/transcription.go
  • core/providers/nebius/errors.go
  • core/providers/nebius/nebius.go
  • core/providers/ollama/ollama.go
  • core/providers/openai/batch.go
  • core/providers/openai/errors.go
  • core/providers/openai/errors_test.go
  • core/providers/openai/files.go
  • core/providers/openai/images.go
  • core/providers/openai/large_payload.go
  • core/providers/openai/models.go
  • core/providers/openai/openai.go
  • core/providers/openai/transcription.go
  • core/providers/openai/videos.go
  • core/providers/openrouter/openrouter.go
  • core/providers/parasail/parasail.go
  • core/providers/perplexity/chat.go
  • core/providers/perplexity/perplexity.go
  • core/providers/replicate/errors.go
  • core/providers/replicate/files.go
  • core/providers/replicate/images.go
  • core/providers/replicate/models.go
  • core/providers/replicate/replicate.go
  • core/providers/replicate/utils.go
  • core/providers/replicate/videos.go
  • core/providers/runway/errors.go
  • core/providers/runway/runway.go
  • core/providers/runway/videos.go
  • core/providers/sgl/sgl.go
  • core/providers/utils/large_response.go
  • core/providers/utils/make_request_test.go
  • core/providers/utils/models.go
  • core/providers/utils/utils.go
  • core/providers/vertex/embedding.go
  • core/providers/vertex/errors.go
  • core/providers/vertex/models.go
  • core/providers/vertex/rerank.go
  • core/providers/vertex/rerank_test.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/utils.go
  • core/providers/vllm/vllm.go
  • core/providers/xai/errors.go
  • core/providers/xai/xai.go
  • core/schemas/account.go
  • core/schemas/bifrost.go
  • core/schemas/chatcompletions.go
  • core/schemas/models.go
  • core/schemas/models_test.go
  • core/schemas/tracer.go
  • core/utils.go
  • docs/features/litellm-compat.mdx
  • docs/openapi/schemas/management/providers.yaml
  • docs/providers/provider-routing.mdx
  • docs/providers/supported-providers/azure.mdx
  • docs/providers/supported-providers/bedrock.mdx
  • docs/providers/supported-providers/replicate.mdx
  • docs/providers/supported-providers/vertex.mdx
  • docs/quickstart/gateway/provider-configuration.mdx
  • docs/quickstart/go-sdk/provider-configuration.mdx
  • framework/configstore/clientconfig.go
  • framework/configstore/encryption_test.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/encryption_test.go
  • framework/configstore/tables/key.go
  • framework/configstore/tables/virtualkey.go
  • framework/logstore/migrations.go
  • framework/logstore/rdb.go
  • framework/logstore/store.go
  • framework/logstore/tables.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/sync.go
  • framework/streaming/accumulator_test.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/streaming/types.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/governance/model_provider_governance_test.go
  • plugins/governance/store.go
  • plugins/litellmcompat/texttochat.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/logging/utils.go
  • plugins/logging/writer.go
  • plugins/maxim/main.go
  • plugins/mocker/main.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • plugins/semanticcache/plugin_integration_test.go
  • plugins/semanticcache/plugin_vectorstore_test.go
  • plugins/telemetry/main.go
  • tests/governance/e2e_test.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/logging.go
  • transports/bifrost-http/handlers/provider_keys.go
  • transports/bifrost-http/handlers/wsresponses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/cursor.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/config_test.go
  • transports/config.schema.json
  • transports/schema_test/config_schema_test.go
  • ui/app/workspace/logs/page.tsx
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • ui/app/workspace/logs/views/columns.tsx
  • ui/app/workspace/providers/dialogs/addNewKeySheet.tsx
  • ui/app/workspace/providers/fragments/apiKeysFormFragment.tsx
  • ui/components/filters/filterPopover.tsx
  • ui/components/ui/headersTable.tsx
  • ui/lib/constants/config.ts
  • ui/lib/schemas/providerForm.ts
  • ui/lib/store/apis/logsApi.ts
  • ui/lib/types/config.ts
  • ui/lib/types/logs.ts
  • ui/lib/types/schemas.ts
  • ui/lib/utils/validation.ts
💤 Files with no reviewable changes (19)
  • core/providers/vertex/rerank_test.go
  • core/providers/cohere/chat.go
  • core/providers/perplexity/chat.go
  • core/providers/vertex/embedding.go
  • core/providers/vllm/utils.go
  • core/providers/bedrock/chat.go
  • core/providers/groq/groq.go
  • core/providers/replicate/videos.go
  • core/providers/cerebras/cerebras.go
  • core/providers/anthropic/text.go
  • core/providers/huggingface/responses.go
  • core/providers/replicate/errors.go
  • core/providers/huggingface/speech.go
  • core/providers/huggingface/transcription.go
  • core/providers/replicate/images.go
  • framework/modelcatalog/sync.go
  • core/providers/parasail/parasail.go
  • framework/configstore/tables/virtualkey.go
  • core/providers/replicate/files.go
✅ Files skipped from review due to trivial changes (26)
  • core/internal/llmtests/image_variation.go
  • core/internal/llmtests/speech_synthesis_stream.go
  • core/internal/llmtests/video.go
  • docs/providers/supported-providers/replicate.mdx
  • plugins/governance/model_provider_governance_test.go
  • core/schemas/models_test.go
  • framework/streaming/transcription.go
  • core/providers/xai/xai.go
  • core/internal/llmtests/transcription_stream.go
  • plugins/governance/store.go
  • plugins/semanticcache/plugin_cache_type_test.go
  • docs/providers/supported-providers/bedrock.mdx
  • framework/streaming/accumulator_test.go
  • core/providers/azure/files.go
  • framework/streaming/types.go
  • core/providers/anthropic/utils.go
  • docs/providers/provider-routing.mdx
  • docs/quickstart/gateway/provider-configuration.mdx
  • core/providers/bedrock/rerank_test.go
  • plugins/logging/main.go
  • core/providers/anthropic/models.go
  • core/providers/openai/batch.go
  • core/providers/runway/runway.go
  • framework/configstore/tables/key.go
  • core/providers/mistral/mistral.go
  • core/providers/vertex/models.go
🚧 Files skipped from review as they are similar to previous changes (57)
  • core/internal/llmtests/image_generation.go
  • core/internal/llmtests/speech_synthesis.go
  • core/providers/gemini/videos.go
  • core/providers/openai/videos.go
  • core/providers/runway/errors.go
  • core/providers/bedrock/signer.go
  • core/providers/utils/make_request_test.go
  • framework/streaming/chat.go
  • framework/streaming/audio.go
  • framework/logstore/store.go
  • core/providers/openai/transcription.go
  • core/providers/bedrock/s3.go
  • core/providers/anthropic/errors.go
  • plugins/maxim/main.go
  • core/providers/nebius/errors.go
  • core/internal/llmtests/response_validation.go
  • core/providers/gemini/errors.go
  • framework/modelcatalog/pricing_test.go
  • core/internal/llmtests/image_edit.go
  • core/schemas/tracer.go
  • core/providers/huggingface/utils.go
  • docs/features/litellm-compat.mdx
  • framework/streaming/responses.go
  • core/providers/gemini/batch.go
  • core/providers/openai/images.go
  • framework/tracing/tracer.go
  • core/providers/vertex/rerank.go
  • framework/logstore/tables.go
  • core/providers/cohere/errors.go
  • core/providers/elevenlabs/errors.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/bedrock_test.go
  • plugins/logging/operations.go
  • framework/configstore/encryption_test.go
  • plugins/logging/writer.go
  • core/providers/huggingface/errors.go
  • core/providers/bedrock/text.go
  • core/providers/openrouter/openrouter.go
  • core/providers/azure/utils.go
  • core/providers/nebius/nebius.go
  • core/providers/runway/videos.go
  • core/providers/openai/files.go
  • docs/openapi/schemas/management/providers.yaml
  • core/providers/perplexity/perplexity.go
  • plugins/governance/main.go
  • core/providers/mistral/transcription.go
  • core/providers/huggingface/huggingface.go
  • core/providers/cohere/models.go
  • core/providers/openai/errors.go
  • docs/providers/supported-providers/vertex.mdx
  • core/internal/llmtests/account.go
  • core/providers/elevenlabs/models.go
  • core/bifrost.go
  • core/providers/azure/azure.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/anthropic.go
  • core/providers/vllm/vllm.go

Comment thread core/providers/bedrock/bedrock.go
Comment thread core/providers/cohere/cohere.go
Comment thread core/providers/elevenlabs/elevenlabs.go
Comment thread core/providers/gemini/gemini.go
Comment thread core/utils.go
Comment thread framework/configstore/migrations.go
Comment thread framework/configstore/tables/encryption_test.go
Comment thread framework/logstore/migrations.go Outdated
Comment thread framework/logstore/migrations.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch 2 times, most recently from e6a71df to 99bcebe on April 2, 2026 09:17
akshaydeo previously approved these changes Apr 2, 2026
Copy link
Copy Markdown
Contributor

akshaydeo commented Apr 2, 2026

Merge activity

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
core/providers/ollama/ollama.go (2)

83-85: ⚠️ Potential issue | 🟡 Minor

Stale doc comment references per-key URL routing that no longer exists.

The comment mentions "each backend is queried with its own URL (from ollama_key_config)" but all keys now use the same provider.networkConfig.BaseURL.

Suggested fix
 // ListModels performs a list models request to Ollama's API.
-// Requests are made concurrently per key so that each backend is queried
-// with its own URL (from ollama_key_config).
+// Requests are made concurrently per key to aggregate results.
 func (provider *OllamaProvider) ListModels(ctx *schemas.BifrostContext, keys []schemas.Key, request *schemas.BifrostListModelsRequest) (*schemas.BifrostListModelsResponse, *schemas.BifrostError) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/ollama/ollama.go` around lines 83 - 85, Update the stale doc
comment for ListModels in ollama.go to remove the per-key URL routing mention
and instead state that all keys use the shared provider.networkConfig.BaseURL;
reference the ListModels function and the provider.networkConfig.BaseURL symbol
so reviewers can find the change, and keep the comment concise about concurrent
requests per key but using the same base URL.

51-52: ⚠️ Potential issue | 🟡 Minor

Stale comment: Per-key URL support removed but comment still references it.

The comment states "BaseURL is optional when keys have ollama_key_config with per-key URLs", but the code no longer reads ollama_key_config.url. Either restore per-key URL support or update this comment to reflect the new behavior.

Suggested fix if per-key URLs are intentionally removed
-	// BaseURL is optional when keys have ollama_key_config with per-key URLs
+	// BaseURL is required and used for all Ollama requests
 	return &OllamaProvider{
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/ollama/ollama.go` around lines 51 - 52, Update the stale
comment that mentions per-key URLs: remove or replace "BaseURL is optional when
keys have ollama_key_config with per-key URLs" with a statement that reflects
current behavior (e.g., "BaseURL must be provided; per-key URL support via
ollama_key_config has been removed"). Edit the comment in ollama.go near the
BaseURL declaration so it no longer references ollama_key_config or per-key URLs
and accurately describes how BaseURL is used by the code.
core/providers/sgl/sgl.go (1)

69-80: ⚠️ Potential issue | 🔴 Critical

Restore per-key SGL URL resolution before dispatch.

In the provided stack context, a single provider.networkConfig.BaseURL cannot represent the per-key SGLang endpoints that SGLKeyConfig.URL is still validated and documented for in core/utils.go:172-177 and core/schemas/account.go:262-274. With these call sites hardcoded to the provider-level URL, multi-instance SGL setups collapse onto one backend, ListModels now fans out one request per key to the same host instead of querying each configured instance, and key-only configs produce invalid relative URLs when BaseURL is empty. Please resolve the base URL from key.SGLKeyConfig.URL here, with networkConfig.BaseURL only as an explicit fallback if that contract is being changed in the same stack.

Also applies to: 97-110, 117-132, 137-150, 159-176, 204-216
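
A sketch of the per-key resolution being asked for, with stand-in types; the real code would read key.SGLKeyConfig.URL and fall back to provider.networkConfig.BaseURL, and the field names below are assumed from the comment rather than the actual schemas package.

🧩 Illustrative sketch (stand-in key config)
package main

import "fmt"

// Stand-ins for schemas.Key / SGLKeyConfig.
type sglKeyConfig struct{ URL string }
type key struct{ SGLKeyConfig *sglKeyConfig }

// resolveBaseURL prefers the per-key SGL endpoint and only falls back to the
// provider-level BaseURL when the key does not carry one.
func resolveBaseURL(k key, providerBaseURL string) string {
	if k.SGLKeyConfig != nil && k.SGLKeyConfig.URL != "" {
		return k.SGLKeyConfig.URL
	}
	return providerBaseURL
}

func main() {
	k := key{SGLKeyConfig: &sglKeyConfig{URL: "http://sgl-backend-2:30000"}}
	fmt.Println(resolveBaseURL(k, "http://sgl-backend-1:30000") + "/v1/models")
}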

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/sgl/sgl.go` around lines 69 - 80, The code is using
provider.networkConfig.BaseURL for SGL requests but must resolve the per-key
endpoint from key.SGLKeyConfig.URL first; update listModelsByKey to compute
baseURL := key.SGLKeyConfig.URL and fallback to provider.networkConfig.BaseURL
if empty, then pass baseURL+providerUtils.GetPathFromContext(ctx, "/v1/models")
to openai.ListModelsByKey (and apply the same change in the other SGL call sites
mentioned: the blocks covering lines 97-110, 117-132, 137-150, 159-176, 204-216)
so each request uses the key-specific URL when present and only uses
provider.networkConfig.BaseURL as an explicit fallback.
core/providers/utils/make_request_test.go (1)

297-314: ⚠️ Potential issue | 🟠 Major

Stop asserting Provider on the bare timeout constructor.

NewBifrostTimeoutError() no longer initializes ExtraFields.Provider, so this test now fails even though the constructor change is intentional. If you still want provider coverage, move that assertion to a higher-level request-path test after extra fields are populated.

🧪 Suggested fix
 	if err.Error.Message != "test timeout" {
 		t.Fatalf("expected 'test timeout', got %s", err.Error.Message)
 	}
-	if err.ExtraFields.Provider != "openai" {
-		t.Fatalf("expected provider openai, got %s", err.ExtraFields.Provider)
-	}
 }

Based on learnings, in maximhq/bifrost the error fields Provider, OriginalModelRequested, and ResolvedModelUsed are populated centrally by (*BifrostError).PopulateExtraFields in core/bifrost.go.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/make_request_test.go` around lines 297 - 314, The test
asserts ExtraFields.Provider on NewBifrostTimeoutError, but
NewBifrostTimeoutError no longer sets Provider; remove the provider assertion
from TestNewBifrostTimeoutError (or move it to a higher-level test that calls
BifrostError.PopulateExtraFields) so the unit test only verifies fields that the
constructor actually initializes (e.g., IsBifrostError, StatusCode, Error.Type,
Error.Message); if you need provider coverage, add a separate test that
constructs the error and then calls (*BifrostError).PopulateExtraFields to
assert ExtraFields.Provider, OriginalModelRequested, and ResolvedModelUsed.
docs/providers/supported-providers/replicate.mdx (1)

72-82: ⚠️ Potential issue | 🟡 Minor

Update the surrounding prose to refer to aliases.

The JSON example now uses top-level aliases, but the text above it still tells users to configure deployments in Replicate key config. That points readers at the old shape.

✏️ Suggested wording
-Configure deployed models in the Replicate key configuration. Deployments map custom model identifiers to actual deployment paths.
+Configure deployment aliases at the key level. Aliases map custom model identifiers to the actual deployment paths.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/supported-providers/replicate.mdx` around lines 72 - 82, The
surrounding explanatory prose above the JSON example still refers to configuring
"deployments" in the Replicate key configuration; update that text to refer to
the top-level "aliases" mapping instead, e.g., change wording to explain that
users should add an "aliases" object that maps local identifiers (like
"my-model") to deployment paths (like "owner/my-deployment-name"), and ensure
the prose mentions the JSON key "aliases" so it matches the example and
clarifies where to put those mappings.
docs/providers/supported-providers/azure.mdx (1)

171-179: ⚠️ Potential issue | 🟠 Major

Document that allowed_models uses the alias name, not the resolved deployment ID.

With aliases configured, key selection still happens against the originally requested model. A config like allowed_models: ["my-gpt4-deployment"] looks reasonable from this section, but the key will never match because requests are filtered as gpt-4 before alias resolution. Please call that out where allowed_models is described.

💡 Suggested wording
- `allowed_models` - List of allowed models to use from this key (optional)
+ `allowed_models` - List of allowed requested model names / aliases to use from this key (optional). When `aliases` is configured, list the alias names here, not the resolved Azure deployment IDs.
Based on learnings, `selectKeyFromProviderForModel` intentionally filters keys using the original requested model string, so key `Models` / `BlacklistedModels` must list alias names and alias resolution happens later in `requestWorker`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/supported-providers/azure.mdx` around lines 171 - 179, Update
the docs to clarify that allowed_models must list alias names (the keys from
aliases) not resolved deployment IDs: state that selectKeyFromProviderForModel
filters keys against the original requested model string (so
allowed_models/Models/BlacklistedModels should use alias names) and that alias
resolution occurs later in requestWorker when the deployment ID is substituted;
mention aliases, allowed_models, selectKeyFromProviderForModel, and
requestWorker by name to guide readers and avoid confusion.
framework/streaming/images.go (1)

308-319: ⚠️ Potential issue | 🟡 Minor

Preserve RawRequest nil-pointer semantics.

RawRequest is always non-nil here, even when the provider never attached one. That collapses “not set” into “set to nil”, which changes downstream serialization behavior.

💡 Suggested fix
-			var rawRequest interface{}
+			var rawRequest *interface{}
 			if result != nil && result.ImageGenerationStreamResponse != nil && result.ImageGenerationStreamResponse.ExtraFields.RawRequest != nil {
-				rawRequest = result.ImageGenerationStreamResponse.ExtraFields.RawRequest
+				rawRequest = bifrost.Ptr(result.ImageGenerationStreamResponse.ExtraFields.RawRequest)
 			}
 			return &ProcessedStreamResponse{
 				RequestID:      requestID,
 				StreamType:     StreamTypeImage,
 				Provider:       provider,
 				RequestedModel: requestedModel,
 				ResolvedModel:  resolvedModel,
 				Data:           data,
-				RawRequest:     &rawRequest,
+				RawRequest:     rawRequest,
 			}, nil

Based on learnings, in maximhq/bifrost streaming code ProcessedStreamResponse.RawRequest is kept as *interface{} so a nil pointer means “not set” while a non-nil pointer wrapping a nil interface means “set to a nil value.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/streaming/images.go` around lines 308 - 319,
ProcessedStreamResponse.RawRequest is always set to a non-nil *interface{}
because the code takes the address of a local zero-value (var rawRequest
interface{}) and always assigns &rawRequest; change this so RawRequest stays nil
when the provider never attached one. Only create a pointer and assign
RawRequest when result.ImageGenerationStreamResponse != nil &&
result.ImageGenerationStreamResponse.ExtraFields.RawRequest != nil (e.g. copy
that ExtraFields.RawRequest into a local variable rr and set RawRequest = &rr);
otherwise leave RawRequest as nil. Ensure you update the block that builds the
ProcessedStreamResponse (the variables result,
ImageGenerationStreamResponse.ExtraFields.RawRequest and the
ProcessedStreamResponse.RawRequest assignment) accordingly.
plugins/governance/main.go (1)

702-705: ⚠️ Potential issue | 🟠 Major

Use typed context keys instead of raw strings for model and modelId.

Lines 702, 705, 860, and 868 use plain string keys "model" and "modelId", which violates the coding guideline requiring dedicated named types. These keys are read in transports/bifrost-http/integrations/genai.go (lines 846, 952, 1219, 1268) and transports/bifrost-http/integrations/bedrock.go (line 1166). Define typed key constants in schemas and migrate all reads and writes together:

  • ctx.SetValue("model", ...) → typed key
  • ctx.UserValue("model") → same typed key

Apply to both "model" and "modelId" across both files.
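
A sketch of the typed-key pattern using plain context.Context semantics; the constant names and the actual BifrostContext SetValue/UserValue API are assumptions here and should follow whatever the schemas package defines.

🧩 Illustrative sketch (stand-in key constants)
package main

import (
	"context"
	"fmt"
)

// ContextKey is a dedicated type so "model"/"modelId" values cannot collide
// with other string-keyed context entries; the real constants would live in
// the schemas package.
type ContextKey string

const (
	ModelKey   ContextKey = "model"
	ModelIDKey ContextKey = "modelId"
)

func main() {
	ctx := context.WithValue(context.Background(), ModelKey, "gpt-4o")
	ctx = context.WithValue(ctx, ModelIDKey, "my-alias")

	// Reads use the same typed constants, mirroring ctx.UserValue(schemas.ModelKey).
	if model, ok := ctx.Value(ModelKey).(string); ok {
		fmt.Println("model:", model)
	}
	fmt.Println("modelId:", ctx.Value(ModelIDKey))
}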

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/governance/main.go` around lines 702 - 705, Replace raw string
context keys "model" and "modelId" with dedicated typed key constants: add typed
key types/constants (e.g., type ContextKey string; const ModelKey ContextKey =
"model"; const ModelIDKey ContextKey = "modelId") in the schemas package, then
update all usages where ctx.SetValue("model", ...), ctx.SetValue("modelId",
...), and ctx.UserValue("model") / ctx.UserValue("modelId") are used (including
in the code that sets values in the middleware and the reads in functions
referenced as transports/bifrost-http/integrations/genai.go and
transports/bifrost-http/integrations/bedrock.go) to use the new typed keys
(ctx.SetValue(schemas.ModelKey, ...) and ctx.UserValue(schemas.ModelKey), etc.),
and ensure imports reference the schemas package where keys are defined.
core/bifrost.go (1)

4297-4372: ⚠️ Potential issue | 🟠 Major

Recompute provider/model after RunLLMPreHooks before populating errors.

All of these PopulateExtraFields(...) calls still use provider and model captured before the pre-hooks ran. A hook that rewrites routing metadata will therefore produce short-circuit, shutdown, and queue-wait errors that point at the pre-hook provider/model instead of the request that actually moved through the pipeline.

♻️ Suggested direction
effectiveProvider, effectiveModel, _ := preReq.GetRequestFields()
// use effectiveProvider/effectiveModel for PopulateExtraFields calls
// after pre-hooks have returned a non-nil request

Based on learnings, provider switching from a PreLLMHook uses the updated request as the single source of truth for downstream routing and attribution.

Also applies to: 4481-4620

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4297 - 4372, After RunLLMPreHooks returns a
non-nil preReq, recompute the effective routing attribution from the updated
request and use those values when populating errors: call
preReq.GetRequestFields() (e.g. effectiveProvider, effectiveModel, _) and
replace uses of the original provider/model in PopulateExtraFields(...) in all
error-return paths following the pre-hook (shortCircuit error branch, nil preReq
check, pq.isClosing() branches, queue send/select default and blocking select,
and ctx.Done() branches). Ensure every bifrostErr.PopulateExtraFields(...) after
the pre-hooks uses effectiveProvider/effectiveModel so error metadata reflects
the request after pre-hook routing changes.
core/providers/gemini/gemini.go (1)

737-760: ⚠️ Potential issue | 🟠 Major

Drain the large-payload request reader in this path.

GeminiProvider.responsesWithLargeResponseDetection is the only large-payload request path here that never drains the ingress reader after providerUtils.MakeRequestWithContext returns. If Gemini responds before consuming the full upload, this can still surface broken-pipe/502 errors through the proxy chain.

🩹 Minimal fix
latency, bifrostErr, wait := providerUtils.MakeRequestWithContext(ctx, streamingClient, req, resp)
if bifrostErr != nil {
    wait()
    fasthttp.ReleaseResponse(resp)
    return nil, bifrostErr
}
+if bodyReader != nil {
+   providerUtils.DrainLargePayloadRemainder(ctx)
+}

 // Handle error response — materialize stream body for error parsing
 if resp.StatusCode() != fasthttp.StatusOK {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 737 - 760, The error paths
after providerUtils.MakeRequestWithContext (the bifrostErr branch), the non-200
response branch (after providerUtils.MaterializeStreamErrorBody /
parseGeminiError), and the respErr branch (after
providerUtils.FinalizeResponseWithLargeDetection) must drain the incoming
request's upload reader before calling wait()/return to avoid broken-pipe/502s;
add a call to the shared drain helper (e.g., providerUtils.DrainRequestBody or
analogous drain function) passing ctx and req in each of these branches, or if
no helper exists, synchronously read-and-discard the request body/reader until
EOF before releasing the response and returning.
♻️ Duplicate comments (6)
core/providers/utils/models.go (1)

327-348: ⚠️ Potential issue | 🟡 Minor

Keep wildcard alias backfills in a stable order.

Line 329 still ranges over p.Aliases directly, so the alias-only tail of /models can reshuffle between otherwise identical calls.

🔁 Suggested fix
 	// Case B: wildcard allowlist — backfill only explicitly configured aliases.
 	if !p.Unfiltered && len(p.Aliases) > 0 {
-		for aliasKey, providerID := range p.Aliases {
+		aliasKeys := make([]string, 0, len(p.Aliases))
+		for aliasKey := range p.Aliases {
+			aliasKeys = append(aliasKeys, aliasKey)
+		}
+		sort.Slice(aliasKeys, func(i, j int) bool {
+			ai, aj := strings.ToLower(aliasKeys[i]), strings.ToLower(aliasKeys[j])
+			if ai == aj {
+				return aliasKeys[i] < aliasKeys[j]
+			}
+			return ai < aj
+		})
+		for _, aliasKey := range aliasKeys {
+			providerID := p.Aliases[aliasKey]
 			if included[strings.ToLower(aliasKey)] {
 				continue
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/utils/models.go` around lines 327 - 348, The loop over
p.Aliases is non-deterministic because map iteration order can change; before
iterating (where p.Aliases is used to build alias-only models) collect the keys
of p.Aliases into a slice, sort that slice (e.g., lexicographically, using
strings.ToLower if you want case-insensitive stability), then iterate the sorted
keys and use p.Aliases[key] for providerID; keep the existing blacklist check
(matches, p.BlacklistedModels) and append to result as before so the generated
schemas.Model order is stable across identical calls.
core/providers/anthropic/models.go (1)

23-27: ⚠️ Potential issue | 🟠 Major

Don't backfill from page-scoped Anthropic data.

This method still returns HasMore/NextPageToken for the current Anthropic page. BackfillModels therefore runs without full-list context and can synthesize allowlisted/aliased entries that actually belong to earlier or later pages, which leads to duplicates and mixed metadata once clients paginate. Backfill needs to happen after the full provider listing is accumulated, or this converter needs cumulative state from the caller.

Also applies to: 61-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/models.go` around lines 23 - 27, The converter is
exposing page-scoped Anthropic pagination (response.HasMore/response.LastID ->
bifrostResponse.NextPageToken) which lets BackfillModels run on partial pages
and produce duplicates; stop emitting NextPageToken for page-scoped responses
and only set bifrostResponse.NextPageToken when the converter has the complete
provider listing (or when the caller passes a cumulative/full-list flag).
Specifically, remove or guard the mapping from response.HasMore/response.LastID
to bifrostResponse.NextPageToken in the models conversion code (the block that
reads response.HasMore && response.LastID) and either (a) perform BackfillModels
after the full provider list is accumulated, or (b) add an explicit
parameter/flag to the conversion function indicating “isFullList” and only set
NextPageToken when that flag is true. Ensure BackfillModels is invoked by the
caller after full aggregation rather than per page.
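
For illustration, a self-contained sketch of the "accumulate every page, then backfill" shape; the page struct, backfill helper, and alias map below are stand-ins, not the actual converter or pipeline types.

package main

import (
	"fmt"
	"strings"
)

type page struct {
	IDs           []string
	NextPageToken string
}

// backfill stands in for the pipeline's BackfillModels step: it returns
// configured aliases that never appeared in the provider listing.
func backfill(aliases map[string]string, included map[string]bool) []string {
	var extra []string
	for alias := range aliases {
		if !included[strings.ToLower(alias)] {
			extra = append(extra, alias)
		}
	}
	return extra
}

// listAll accumulates every page first and only backfills once no
// NextPageToken remains, so a partial page can never synthesize entries
// that later pages also return.
func listAll(pages []page, aliases map[string]string) []string {
	var all []string
	included := map[string]bool{}
	for _, p := range pages {
		for _, id := range p.IDs {
			all = append(all, id)
			included[strings.ToLower(id)] = true
		}
		if p.NextPageToken == "" {
			all = append(all, backfill(aliases, included)...)
		}
	}
	return all
}

func main() {
	pages := []page{
		{IDs: []string{"claude-3-haiku"}, NextPageToken: "p2"},
		{IDs: []string{"claude-3-opus"}},
	}
	fmt.Println(listAll(pages, map[string]string{"prod-chat": "claude-3-sonnet"}))
}
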
core/providers/elevenlabs/elevenlabs.go (1)

267-268: ⚠️ Potential issue | 🟡 Minor

Keep with-timestamps parse failures on the enriched error path.

Once the 200 body is available, this bare NewBifrostOperationError drops the provider payload and forwarded headers that the other ElevenLabs error paths preserve.

♻️ Minimal fix
-			return nil, providerUtils.NewBifrostOperationError("failed to parse with-timestamps response", err)
+			return nil, providerUtils.EnrichError(
+				ctx,
+				providerUtils.NewBifrostOperationError("failed to parse with-timestamps response", err),
+				jsonData,
+				body,
+				provider.sendBackRawRequest,
+				provider.sendBackRawResponse,
+			)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/elevenlabs/elevenlabs.go` around lines 267 - 268, The
sonic.Unmarshal failure currently returns providerUtils.NewBifrostOperationError
which loses the provider payload and forwarded headers; change this to use the
same enriched error constructor used by the other ElevenLabs error paths (the
variant that preserves providerPayload and forwardedHeaders) when handling the
timestamp parsing error—replace the NewBifrostOperationError call in the
sonic.Unmarshal -> timestampResponse error branch with the enriched error
builder used elsewhere in this file so the 200 body and forwarded headers are
attached to the returned error.
core/bifrost.go (1)

6237-6237: ⚠️ Potential issue | 🟠 Major

Validate keys against baseProviderType here.

Passing providerKey into validateKey reopens the custom-provider validation gap: custom Azure/Bedrock/Vertex/Replicate/VLLM providers can bypass the base-provider config checks again.

♻️ Minimal fix
- isKeyValid := validateKey(providerKey, &key)
+ isKeyValid := validateKey(baseProviderType, &key)

Based on learnings, selectKeyFromProviderForModel is the centralized key-config validation gate, so custom providers must be validated against CustomProviderConfig.BaseProviderType rather than the custom provider key.

Also applies to: 6253-6253

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` at line 6237, The call to validateKey currently passes the
custom provider key (providerKey) which allows custom providers to bypass
base-provider checks; update the call(s) to validateKey to pass the resolved
base provider type (use the value derived from
CustomProviderConfig.BaseProviderType or the variable named baseProviderType
used by selectKeyFromProviderForModel) instead of providerKey so validation is
performed against the base provider. Locate usages like validateKey(providerKey,
&key) and replace the first argument with the base provider type (the same
identifier used by selectKeyFromProviderForModel), and apply the same change to
the other occurrence(s) of validateKey in this file to ensure all custom
providers are validated against BaseProviderType.
core/providers/gemini/gemini.go (2)

440-443: ⚠️ Potential issue | 🟠 Major

Materialize streamed error bodies before parsing handshake failures.

These branches run with resp.StreamBody = true, but they still call parseGeminiError(resp) before buffering the body. If Gemini returns a streamed JSON error here, the upstream message/raw response is lost right at handshake time.

🩹 Minimal fix
if resp.StatusCode() != fasthttp.StatusOK {
    defer providerUtils.ReleaseStreamingResponse(resp)
+   providerUtils.MaterializeStreamErrorBody(ctx, resp)
    respBody := append([]byte(nil), resp.Body()...)
    return nil, providerUtils.EnrichError(ctx, parseGeminiError(resp), jsonBody, respBody, sendBackRawRequest, sendBackRawResponse)
}

Apply the same providerUtils.MaterializeStreamErrorBody(ctx, resp) call before parseGeminiError(resp) in the HandleGeminiResponsesStream, GeminiProvider.SpeechStream, and GeminiProvider.TranscriptionStream handshake branches as well.

Also applies to: 928-930, 1411-1413, 1700-1702

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 440 - 443, The handshake
branches in HandleGeminiResponsesStream, GeminiProvider.SpeechStream, and
GeminiProvider.TranscriptionStream (and the other listed spots) call
parseGeminiError(resp) while resp.StreamBody is true, which can lose streamed
JSON error bodies; before calling parseGeminiError(resp) or any error-enrichment
that reads resp.Body, call providerUtils.MaterializeStreamErrorBody(ctx, resp)
to buffer the stream (then defer providerUtils.ReleaseStreamingResponse(resp) as
appropriate) so parseGeminiError and providerUtils.EnrichError see the full
error body.

1217-1220: ⚠️ Potential issue | 🟠 Major

Stop logging raw Gemini error bodies here.

resp.Body() can echo prompts, filenames, or other request data. Logging it in this branch reintroduces the same data-leak path already removed from the image handlers.

🩹 Minimal fix
if resp.StatusCode() != fasthttp.StatusOK {
    providerUtils.MaterializeStreamErrorBody(ctx, resp)
-   provider.logger.Debug(fmt.Sprintf("error from %s provider: %s", providerName, string(resp.Body())))
+   provider.logger.Debug("error from %s provider: status=%d", providerName, resp.StatusCode())
    parsedErr := providerUtils.EnrichError(ctx, parseGeminiError(resp), jsonData, nil, provider.sendBackRawRequest, provider.sendBackRawResponse)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini.go` around lines 1217 - 1220, The code logs raw
Gemini response bodies via provider.logger.Debug(string(resp.Body())), which may
leak prompts or filenames; remove that debug call (or replace it with a
non-sensitive message) inside the error branch where resp.StatusCode() !=
fasthttp.StatusOK, leaving providerUtils.MaterializeStreamErrorBody(ctx, resp)
and parsedErr := providerUtils.EnrichError(... ) intact; if you want tracing,
log only non-sensitive metadata such as providerName and resp.StatusCode()
instead of resp.Body(), referencing the provider.logger.Debug call, resp.Body(),
providerUtils.MaterializeStreamErrorBody, parseGeminiError, and
providerUtils.EnrichError to locate the code to change.
🧹 Nitpick comments (3)
core/providers/mistral/transcription.go (1)

112-163: Remove stale providerName parameter now that error helpers no longer use it.

After this change set, providerName is still threaded into parseTranscriptionFormDataBodyFromRequest but is no longer used there. It's safe to simplify both helper signatures.

♻️ Suggested cleanup
-func createMistralTranscriptionMultipartBody(req *MistralTranscriptionRequest, providerName schemas.ModelProvider) (*bytes.Buffer, string, *schemas.BifrostError) {
+func createMistralTranscriptionMultipartBody(req *MistralTranscriptionRequest) (*bytes.Buffer, string, *schemas.BifrostError) {
 	var body bytes.Buffer
 	writer := multipart.NewWriter(&body)

-	if err := parseTranscriptionFormDataBodyFromRequest(writer, req, providerName); err != nil {
+	if err := parseTranscriptionFormDataBodyFromRequest(writer, req); err != nil {
 		return nil, "", err
 	}
 	return &body, writer.FormDataContentType(), nil
 }

-func parseTranscriptionFormDataBodyFromRequest(writer *multipart.Writer, req *MistralTranscriptionRequest, providerName schemas.ModelProvider) *schemas.BifrostError {
+func parseTranscriptionFormDataBodyFromRequest(writer *multipart.Writer, req *MistralTranscriptionRequest) *schemas.BifrostError {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/mistral/transcription.go` around lines 112 - 163, The helper
parameter providerName is no longer used—remove it from the signature of
parseTranscriptionFormDataBodyFromRequest and any helper functions it threads
into (and update calls such as the one in the multipart form construction block
shown) so they no longer accept or pass providerName; update the function
declaration for parseTranscriptionFormDataBodyFromRequest, remove providerName
from all call sites within transcription.go (and any helper functions it
forwards to), and run a build to ensure all remaining references are removed.
core/providers/elevenlabs/elevenlabs.go (1)

457-458: Measure done-chunk latency from stream setup.

startTime is captured before client.Do, while lastChunkTime starts after the stream is ready. That makes the final Latency the only metric here that includes handshake time.

Based on learnings, initialize the streaming startTime inside the streaming goroutine after client.Do and the stream is set up so final-chunk latency reflects post-handshake streaming time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/elevenlabs/elevenlabs.go` around lines 457 - 458, Currently
startTime is captured before client.Do, making Latency include handshake time;
move the startTime initialization into the streaming goroutine immediately after
client.Do (i.e., after the stream is successfully created/ready) so both
lastChunkTime and final Latency measure only post-handshake streaming time.
Update references to startTime used when computing Latency (and any comparisons
with lastChunkTime) to use the new in-goroutine startTime, and remove the
pre-client.Do capture; keep ChunkIndex/Latency assignment logic (ChunkIndex:
chunkIndex + 1, Latency: time.Since(startTime).Milliseconds()) unchanged but
sourced from the relocated startTime. Ensure no race by scoping startTime to the
goroutine where the stream is read.
core/providers/vertex/models.go (1)

57-72: Refresh the function comment to match the new pipeline flow.

The block above this signature still describes the removed three-pass/deployments implementation, so it now misstates how models are filtered and backfilled here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vertex/models.go` around lines 57 - 72, The function comment
above VertexListModelsResponse.ToBifrostListModelsResponse is outdated and still
describes the removed three-pass/deployments logic; update the docblock to
accurately describe the current pipeline: outline that models are produced from
the Vertex API response, then filtered using allowedModels and blacklistedModels
(with unfiltered overriding filters), and that aliases map is used to map model
IDs to display names; remove references to "custom vs non-custom", "digit-only
deployments", and the three-pass backfill/deployments steps and succinctly state
the current filtering/backfill behavior and parameter roles (allowedModels,
blacklistedModels, aliases, unfiltered).
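
As a rough example, the refreshed docblock could read along these lines (wording is illustrative, not the repo's actual comment):

// ToBifrostListModelsResponse converts the Vertex list-models response into
// the Bifrost format: models from the API response are filtered against
// allowedModels and blacklistedModels (unfiltered bypasses both filters),
// the aliases map is used to map model IDs to display names, and any
// configured entries missing from the listing are backfilled at the end.
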
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 4993-4997: The code currently mutates the caller's request by
calling req.SetModel(resolvedModel) (after computing originalModelRequested and
resolvedModel); instead, avoid rewriting the shared request in place — use
BifrostRequest.Clone() to create a shallow copy, call SetModel(resolvedModel) on
the clone (so Provider/Model can be rewritten) and pass the clone to downstream
logic, while keeping originalModelRequested available for key selection and
allow/block list checks; do not change key selection to use resolvedModel.
- Around line 3876-3885: The postHookRunner currently stamps chunk/error extra
fields from the original req instead of the post-prehook request returned by
RunLLMPreHooks; update postHookRunner to use the request returned by the
pre-hook (the preReq/postReq value from RunLLMPreHooks) when calling
GetRequestFields and when passing the requestType/provider/model into
PopulateExtraFields (i.e., replace uses of req and wsProvider/wsModel derived
from req with the fields derived from the pre-hook request) so streamed
chunks/errors reflect the effective routing metadata; ensure both
result.PopulateExtraFields and err.PopulateExtraFields use that post-prehook
request’s values.

In `@core/providers/anthropic/models.go`:
- Around line 41-58: The loop appends entries to bifrostResponse.Data before
marking a ResolvedID as seen, so duplicate ResolvedID (or case variants) can be
emitted; update the loop in the response.Data iteration (where
pipeline.FilterModel(model.ID) yields result and you build entry) to compute key
:= strings.ToLower(result.ResolvedID), check included[key] and skip appending if
true, otherwise set included[key] = true before appending; ensure the same
dedupe logic covers alias shadowing by using the ResolvedID key used for
constructing entry.ID.

In `@core/providers/elevenlabs/elevenlabs.go`:
- Around line 567-570: The unconditional sonic.Unmarshal in the
ShouldSendBackRawResponse branch can fail redundantly because
parseTranscriptionResponse() already parsed the JSON body; update the block in
elevenlabs.go so that when providerUtils.ShouldSendBackRawResponse(ctx,
provider.sendBackRawResponse) is true you try sonic.Unmarshal(responseBody,
&rawResponse) but on error do not return
NewBifrostOperationError(schemas.ErrProviderRawResponseUnmarshal); instead
fallback to using the already-parsed result from parseTranscriptionResponse()
(or a safe nil/default) as the rawResponse so we do not fail on
malformed-but-previously-parsed payloads; keep all existing identifiers
(rawResponse, sonic.Unmarshal, providerUtils.NewBifrostOperationError,
schemas.ErrProviderRawResponseUnmarshal) and only change the error path to use
the parsed fallback.

In `@core/providers/huggingface/models.go`:
- Around line 26-34: The converter constructs providerUtils.ListModelsPipeline
without setting the inferenceProvider, so the backfill rewrites model IDs onto
the current providerKey and can double-prefix or misattribute HuggingFace
models; fix by passing the correct inferenceProvider into the pipeline
(add/assign the inferenceProvider field on the ListModelsPipeline instance used
in this converter and in the analogous block around the other backfill code),
and ensure the backfill logic uses pipeline.InferenceProvider when rewriting IDs
(or alternatively move the backfill step to run after per-inference-provider
results are aggregated) so final IDs are formed as
providerKey/inferenceProvider/modelID and not backfilled globally.

In `@core/providers/ollama/ollama.go`:
- Around line 67-68: The doc comment for listModelsByKey is stale: it references
"resolving the per-key URL" which no longer happens because listModelsByKey now
uses provider.networkConfig.BaseURL for all keys; update the comment on the
listModelsByKey function to accurately describe current behavior (i.e., that it
queries each key against the shared provider.networkConfig.BaseURL rather than
resolving per-key URLs) and remove any mention of per-key URL resolution so the
comment matches the implementation.
- Line 73: The code currently ignores per-key URLs by always using
provider.networkConfig.BaseURL (in ollama.go where
provider.networkConfig.BaseURL+providerUtils.GetPathFromContext(...) is used)
while validation and comments still reference OllamaKeyConfig.URL and missing
helper functions getBaseURL()/baseURLOrError(); fix by restoring per-key URL
resolution: reintroduce/replicate the getBaseURL()/baseURLOrError() logic used
to prefer key.OllamaKeyConfig.URL and fall back to
provider.networkConfig.BaseURL, use that resolved base URL in ollama.go (and
sgl.go) for all requests, and update/remove the validation in core/utils.go that
requires OllamaKeyConfig.URL if you choose to drop per-key support; also update
comments in ollama.go and sgl.go to accurately describe the behavior.

In `@core/providers/utils/models.go`:
- Around line 28-43: ToDisplayName currently only splits the id on one separator
at a time (either "-" or "_"), allowing mixed IDs like "gemini_2.0-pro-exp" to
retain underscores; change ToDisplayName to split on both hyphen and underscore
(use strings.FieldsFunc or an equivalent that treats '-' or '_' as separators),
then lowercase+title-case each non-empty part (the existing loop over parts) and
join with a single space before returning; keep the caser variable and the same
part normalization logic in the ToDisplayName function to ensure consistent
capitalization.

In `@core/schemas/models.go`:
- Line 141: The change from the public JSON field `deployment` to `alias` breaks
wire compatibility; restore backward compatibility by accepting both names:
either add a new exported field `Deployment` with the original
`json:"deployment,omitempty"` tag and map it into the existing `Alias` usage, or
implement custom UnmarshalJSON/MarshalJSON for the model struct so the JSON
accepts and emits both `deployment` and `alias` (ensure internal code uses
`Alias` as the canonical value). Update any constructors/serializers that
previously referenced `Deployment` to normalize into `Alias` and mark
`Deployment` as deprecated in comments.

In `@core/utils.go`:
- Around line 168-175: The validateKey function was altered to pre-filter
schemas.Ollama and schemas.SGL by checking key.OllamaKeyConfig and
key.SGLKeyConfig, which suppresses provider-level errors; revert this by
removing those two case branches so validateKey does not special-case
schemas.Ollama or schemas.SGL and lets the provider implementations handle
nil/unconfigured key configs (references: validateKey, schemas.Ollama,
schemas.SGL, key.OllamaKeyConfig, key.SGLKeyConfig).

In `@docs/providers/supported-providers/azure.mdx`:
- Around line 157-169: The "Direct Authentication (API Key)" example is missing
the API key value so it produces a credential-less config; update the JSON
example under "azure_key_config" to include the top-level API key field (e.g.,
add "value": "<YOUR_AZURE_OPENAI_API_KEY>") alongside "endpoint" and
"api_version" so the keys "aliases" and "azure_key_config" (and their fields
"endpoint" and "api_version") form a valid direct-auth configuration that
actually contains the API key.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 4297-4372: After RunLLMPreHooks returns a non-nil preReq,
recompute the effective routing attribution from the updated request and use
those values when populating errors: call preReq.GetRequestFields() (e.g.
effectiveProvider, effectiveModel, _) and replace uses of the original
provider/model in PopulateExtraFields(...) in all error-return paths following
the pre-hook (shortCircuit error branch, nil preReq check, pq.isClosing()
branches, queue send/select default and blocking select, and ctx.Done()
branches). Ensure every bifrostErr.PopulateExtraFields(...) after the pre-hooks
uses effectiveProvider/effectiveModel so error metadata reflects the request
after pre-hook routing changes.

In `@core/providers/gemini/gemini.go`:
- Around line 737-760: The error paths after
providerUtils.MakeRequestWithContext (the bifrostErr branch), the non-200
response branch (after providerUtils.MaterializeStreamErrorBody /
parseGeminiError), and the respErr branch (after
providerUtils.FinalizeResponseWithLargeDetection) must drain the incoming
request's upload reader before calling wait()/return to avoid broken-pipe/502s;
add a call to the shared drain helper (e.g., providerUtils.DrainRequestBody or
analogous drain function) passing ctx and req in each of these branches, or if
no helper exists, synchronously read-and-discard the request body/reader until
EOF before releasing the response and returning.

In `@core/providers/ollama/ollama.go`:
- Around line 83-85: Update the stale doc comment for ListModels in ollama.go to
remove the per-key URL routing mention and instead state that all keys use the
shared provider.networkConfig.BaseURL; reference the ListModels function and the
provider.networkConfig.BaseURL symbol so reviewers can find the change, and keep
the comment concise about concurrent requests per key but using the same base
URL.
- Around line 51-52: Update the stale comment that mentions per-key URLs: remove
or replace "BaseURL is optional when keys have ollama_key_config with per-key
URLs" with a statement that reflects current behavior (e.g., "BaseURL must be
provided; per-key URL support via ollama_key_config has been removed"). Edit the
comment in ollama.go near the BaseURL declaration so it no longer references
ollama_key_config or per-key URLs and accurately describes how BaseURL is used
by the code.

In `@core/providers/sgl/sgl.go`:
- Around line 69-80: The code is using provider.networkConfig.BaseURL for SGL
requests but must resolve the per-key endpoint from key.SGLKeyConfig.URL first;
update listModelsByKey to compute baseURL := key.SGLKeyConfig.URL and fallback
to provider.networkConfig.BaseURL if empty, then pass
baseURL+providerUtils.GetPathFromContext(ctx, "/v1/models") to
openai.ListModelsByKey (and apply the same change in the other SGL call sites
mentioned: the blocks covering lines 97-110, 117-132, 137-150, 159-176, 204-216)
so each request uses the key-specific URL when present and only uses
provider.networkConfig.BaseURL as an explicit fallback.

In `@core/providers/utils/make_request_test.go`:
- Around line 297-314: The test asserts ExtraFields.Provider on
NewBifrostTimeoutError, but NewBifrostTimeoutError no longer sets Provider;
remove the provider assertion from TestNewBifrostTimeoutError (or move it to a
higher-level test that calls BifrostError.PopulateExtraFields) so the unit test
only verifies fields that the constructor actually initializes (e.g.,
IsBifrostError, StatusCode, Error.Type, Error.Message); if you need provider
coverage, add a separate test that constructs the error and then calls
(*BifrostError).PopulateExtraFields to assert ExtraFields.Provider,
OriginalModelRequested, and ResolvedModelUsed.

In `@docs/providers/supported-providers/azure.mdx`:
- Around line 171-179: Update the docs to clarify that allowed_models must list
alias names (the keys from aliases) not resolved deployment IDs: state that
selectKeyFromProviderForModel filters keys against the original requested model
string (so allowed_models/Models/BlacklistedModels should use alias names) and
that alias resolution occurs later in requestWorker when the deployment ID is
substituted; mention aliases, allowed_models, selectKeyFromProviderForModel, and
requestWorker by name to guide readers and avoid confusion.

In `@docs/providers/supported-providers/replicate.mdx`:
- Around line 72-82: The surrounding explanatory prose above the JSON example
still refers to configuring "deployments" in the Replicate key configuration;
update that text to refer to the top-level "aliases" mapping instead, e.g.,
change wording to explain that users should add an "aliases" object that maps
local identifiers (like "my-model") to deployment paths (like
"owner/my-deployment-name"), and ensure the prose mentions the JSON key
"aliases" so it matches the example and clarifies where to put those mappings.

In `@framework/streaming/images.go`:
- Around line 308-319: ProcessedStreamResponse.RawRequest is always set to a
non-nil *interface{} because the code takes the address of a local zero-value
(var rawRequest interface{}) and always assigns &rawRequest; change this so
RawRequest stays nil when the provider never attached one. Only create a pointer
and assign RawRequest when result.ImageGenerationStreamResponse != nil &&
result.ImageGenerationStreamResponse.ExtraFields.RawRequest != nil (e.g. copy
that ExtraFields.RawRequest into a local variable rr and set RawRequest = &rr);
otherwise leave RawRequest as nil. Ensure you update the block that builds the
ProcessedStreamResponse (the variables result,
ImageGenerationStreamResponse.ExtraFields.RawRequest and the
ProcessedStreamResponse.RawRequest assignment) accordingly.

In `@plugins/governance/main.go`:
- Around line 702-705: Replace raw string context keys "model" and "modelId"
with dedicated typed key constants: add typed key types/constants (e.g., type
ContextKey string; const ModelKey ContextKey = "model"; const ModelIDKey
ContextKey = "modelId") in the schemas package, then update all usages where
ctx.SetValue("model", ...), ctx.SetValue("modelId", ...), and
ctx.UserValue("model") / ctx.UserValue("modelId") are used (including in the
code that sets values in the middleware and the reads in functions referenced as
transports/bifrost-http/integrations/genai.go and
transports/bifrost-http/integrations/bedrock.go) to use the new typed keys
(ctx.SetValue(schemas.ModelKey, ...) and ctx.UserValue(schemas.ModelKey), etc.),
and ensure imports reference the schemas package where keys are defined.

---

Duplicate comments:
In `@core/bifrost.go`:
- Line 6237: The call to validateKey currently passes the custom provider key
(providerKey) which allows custom providers to bypass base-provider checks;
update the call(s) to validateKey to pass the resolved base provider type (use
the value derived from CustomProviderConfig.BaseProviderType or the variable
named baseProviderType used by selectKeyFromProviderForModel) instead of
providerKey so validation is performed against the base provider. Locate usages
like validateKey(providerKey, &key) and replace the first argument with the base
provider type (the same identifier used by selectKeyFromProviderForModel), and
apply the same change to the other occurrence(s) of validateKey in this file to
ensure all custom providers are validated against BaseProviderType.

In `@core/providers/anthropic/models.go`:
- Around line 23-27: The converter is exposing page-scoped Anthropic pagination
(response.HasMore/response.LastID -> bifrostResponse.NextPageToken) which lets
BackfillModels run on partial pages and produce duplicates; stop emitting
NextPageToken for page-scoped responses and only set
bifrostResponse.NextPageToken when the converter has the complete provider
listing (or when the caller passes a cumulative/full-list flag). Specifically,
remove or guard the mapping from response.HasMore/response.LastID to
bifrostResponse.NextPageToken in the models conversion code (the block that
reads response.HasMore && response.LastID) and either (a) perform BackfillModels
after the full provider list is accumulated, or (b) add an explicit
parameter/flag to the conversion function indicating “isFullList” and only set
NextPageToken when that flag is true. Ensure BackfillModels is invoked by the
caller after full aggregation rather than per page.

In `@core/providers/elevenlabs/elevenlabs.go`:
- Around line 267-268: The sonic.Unmarshal failure currently returns
providerUtils.NewBifrostOperationError which loses the provider payload and
forwarded headers; change this to use the same enriched error constructor used
by the other ElevenLabs error paths (the variant that preserves providerPayload
and forwardedHeaders) when handling the timestamp parsing error—replace the
NewBifrostOperationError call in the sonic.Unmarshal -> timestampResponse error
branch with the enriched error builder used elsewhere in this file so the 200
body and forwarded headers are attached to the returned error.

In `@core/providers/gemini/gemini.go`:
- Around line 440-443: The handshake branches in HandleGeminiResponsesStream,
GeminiProvider.SpeechStream, and GeminiProvider.TranscriptionStream (and the
other listed spots) call parseGeminiError(resp) while resp.StreamBody is true,
which can lose streamed JSON error bodies; before calling parseGeminiError(resp)
or any error-enrichment that reads resp.Body, call
providerUtils.MaterializeStreamErrorBody(ctx, resp) to buffer the stream (then
defer providerUtils.ReleaseStreamingResponse(resp) as appropriate) so
parseGeminiError and providerUtils.EnrichError see the full error body.
- Around line 1217-1220: The code logs raw Gemini response bodies via
provider.logger.Debug(string(resp.Body())), which may leak prompts or filenames;
remove that debug call (or replace it with a non-sensitive message) inside the
error branch where resp.StatusCode() != fasthttp.StatusOK, leaving
providerUtils.MaterializeStreamErrorBody(ctx, resp) and parsedErr :=
providerUtils.EnrichError(... ) intact; if you want tracing, log only
non-sensitive metadata such as providerName and resp.StatusCode() instead of
resp.Body(), referencing the provider.logger.Debug call, resp.Body(),
providerUtils.MaterializeStreamErrorBody, parseGeminiError, and
providerUtils.EnrichError to locate the code to change.

In `@core/providers/utils/models.go`:
- Around line 327-348: The loop over p.Aliases is non-deterministic because map
iteration order can change; before iterating (where p.Aliases is used to build
alias-only models) collect the keys of p.Aliases into a slice, sort that slice
(e.g., lexicographically, using strings.ToLower if you want case-insensitive
stability), then iterate the sorted keys and use p.Aliases[key] for providerID;
keep the existing blacklist check (matches, p.BlacklistedModels) and append to
result as before so the generated schemas.Model order is stable across identical
calls.

---

Nitpick comments:
In `@core/providers/elevenlabs/elevenlabs.go`:
- Around line 457-458: Currently startTime is captured before client.Do, making
Latency include handshake time; move the startTime initialization into the
streaming goroutine immediately after client.Do (i.e., after the stream is
successfully created/ready) so both lastChunkTime and final Latency measure only
post-handshake streaming time. Update references to startTime used when
computing Latency (and any comparisons with lastChunkTime) to use the new
in-goroutine startTime, and remove the pre-client.Do capture; keep
ChunkIndex/Latency assignment logic (ChunkIndex: chunkIndex + 1, Latency:
time.Since(startTime).Milliseconds()) unchanged but sourced from the relocated
startTime. Ensure no race by scoping startTime to the goroutine where the stream
is read.

In `@core/providers/mistral/transcription.go`:
- Around line 112-163: The helper parameter providerName is no longer
used—remove it from the signature of parseTranscriptionFormDataBodyFromRequest
and any helper functions it threads into (and update calls such as the one in
the multipart form construction block shown) so they no longer accept or pass
providerName; update the function declaration for
parseTranscriptionFormDataBodyFromRequest, remove providerName from all call
sites within transcription.go (and any helper functions it forwards to), and run
a build to ensure all remaining references are removed.

In `@core/providers/vertex/models.go`:
- Around line 57-72: The function comment above
VertexListModelsResponse.ToBifrostListModelsResponse is outdated and still
describes the removed three-pass/deployments logic; update the docblock to
accurately describe the current pipeline: outline that models are produced from
the Vertex API response, then filtered using allowedModels and blacklistedModels
(with unfiltered overriding filters), and that aliases map is used to map model
IDs to display names; remove references to "custom vs non-custom", "digit-only
deployments", and the three-pass backfill/deployments steps and succinctly state
the current filtering/backfill behavior and parameter roles (allowedModels,
blacklistedModels, aliases, unfiltered).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

Comment thread core/bifrost.go Outdated
Comment thread core/bifrost.go
Comment thread core/providers/anthropic/models.go
Comment thread core/providers/elevenlabs/elevenlabs.go Outdated
Comment thread core/providers/huggingface/models.go
Comment thread core/providers/ollama/ollama.go Outdated
Comment thread core/providers/utils/models.go
Comment thread core/schemas/models.go
Comment thread core/utils.go
Comment thread docs/providers/supported-providers/azure.mdx
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
docs/providers/supported-providers/azure.mdx (1)

171-179: ⚠️ Potential issue | 🟡 Minor

Use models here, not allowed_models.

This section still documents the allowlist as allowed_models, but the examples in this alias refactor use the top-level models key. Keeping the old name here will send readers to a config field that does not match the rest of the docs.

✏️ Suggested wording
-- `allowed_models` - List of allowed models to use from this key (optional)
+- `models` - List of allowed models to use from this key (optional)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/supported-providers/azure.mdx` around lines 171 - 179, The doc
lists the allowlist field as `allowed_models` which is inconsistent with the
alias refactor examples that use the top-level `models` key; update the
configuration details to replace `allowed_models` with `models` and adjust the
brief description to say "List of allowed models to use from this key
(optional)" so the documented field name matches the examples and other
references (look for `allowed_models`, `models`, and `aliases` in the azure
OpenAI config section).
framework/streaming/transcription.go (1)

192-203: ⚠️ Potential issue | 🟡 Minor

Preserve the RawRequest nil-pointer contract.

RawRequest: &rawRequest is always non-nil here, even when no raw request was captured. That collapses the "not set" case (nil pointer) into "explicitly set to a nil value" (non-nil pointer), a distinction that downstream serialization and plugin code intentionally rely on. Only populate ProcessedStreamResponse.RawRequest when ExtraFields.RawRequest is actually present.

Based on learnings, ProcessedStreamResponse.RawRequest is intentionally a *interface{} so downstream code can distinguish “not set” from “set to nil”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/streaming/transcription.go` around lines 192 - 203, The code always
takes the address of rawRequest and assigns it to
ProcessedStreamResponse.RawRequest which makes RawRequest non-nil even when
nothing was captured; change the logic so ProcessedStreamResponse.RawRequest is
only set when result.TranscriptionStreamResponse.ExtraFields.RawRequest is
present: keep the existing check for
result.TranscriptionStreamResponse.ExtraFields.RawRequest, assign the value to a
local rawRequest, create a pointer (e.g. rawRequestPtr) only in that branch, and
set RawRequest: rawRequestPtr; otherwise leave RawRequest nil. Target the block
that constructs the ProcessedStreamResponse and the local rawRequest handling
around result.TranscriptionStreamResponse.ExtraFields.RawRequest.
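
A minimal sketch of the conditional assignment being asked for, with simplified stand-in structs in place of the streaming response types:

package main

import "fmt"

type extraFields struct {
	RawRequest interface{}
}

type processedStreamResponse struct {
	// RawRequest is a *interface{} so callers can tell "not set" (nil
	// pointer) apart from "set to a nil value" (non-nil pointer to nil).
	RawRequest *interface{}
}

func buildProcessed(extra extraFields) processedStreamResponse {
	out := processedStreamResponse{}
	if extra.RawRequest != nil {
		rr := extra.RawRequest
		out.RawRequest = &rr // only point at a value that was actually captured
	}
	return out
}

func main() {
	fmt.Println(buildProcessed(extraFields{}).RawRequest == nil)                  // true: unset stays nil
	fmt.Println(buildProcessed(extraFields{RawRequest: "req"}).RawRequest == nil) // false: captured value is surfaced
}
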
core/providers/sgl/sgl.go (1)

69-80: ⚠️ Potential issue | 🔴 Critical

Restore per-key SGL base URL resolution.

These request builders now ignore sgl_key_config.url and always concatenate against provider.networkConfig.BaseURL. Any deployment that relies on per-key SGL endpoints — especially multi-backend ListModels — will either hit the wrong host or build a relative URL like /v1/... when the provider-level base URL is intentionally empty. Please restore the previous per-key baseURLOrError path across all of these call sites.

Based on learnings, SGLKeyConfig.URL is a supported configuration field and should be treated as configured when defined via a literal or env-var reference.

Also applies to: 96-110, 117-132, 136-176, 203-216

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/sgl/sgl.go` around lines 69 - 80, The call in
SGLProvider.listModelsByKey always concatenates provider.networkConfig.BaseURL
and ignores per-key SGLKeyConfig.URL; restore per-key base URL resolution by
computing a baseURLOrError from the key's SGLKeyConfig.URL (falling back to
provider.networkConfig.BaseURL) and use baseURLOrError +
providerUtils.GetPathFromContext(ctx, "/v1/models") when calling
openai.ListModelsByKey; apply the same fix to the other SGLProvider call sites
referenced (the blocks around lines 96-110, 117-132, 136-176, 203-216) so each
uses the per-key base URL instead of always using
provider.networkConfig.BaseURL, and keep the other parameters (provider.client,
key, request.Unfiltered, provider.networkConfig.ExtraHeaders,
provider.GetProviderKey, providerUtils.ShouldSendBackRawRequest/Response)
unchanged.
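
A small, self-contained sketch of the per-key fallback shape; the structs below are simplified stand-ins for SGLKeyConfig and the provider network config, and the error message is illustrative:

package main

import (
	"errors"
	"fmt"
)

type sglKeyConfig struct {
	URL string
}

type key struct {
	SGLKeyConfig *sglKeyConfig
}

// resolveBaseURL prefers the per-key URL and falls back to the provider-level
// base URL; it errors out when neither is configured rather than building a
// relative request like "/v1/models".
func resolveBaseURL(k key, providerBaseURL string) (string, error) {
	if k.SGLKeyConfig != nil && k.SGLKeyConfig.URL != "" {
		return k.SGLKeyConfig.URL, nil
	}
	if providerBaseURL != "" {
		return providerBaseURL, nil
	}
	return "", errors.New("no base url configured for sgl key")
}

func main() {
	base, _ := resolveBaseURL(key{SGLKeyConfig: &sglKeyConfig{URL: "http://sgl-a:8000"}}, "http://shared:8000")
	fmt.Println(base + "/v1/models") // the per-key endpoint wins when present
}
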
framework/configstore/clientconfig.go (1)

585-592: ⚠️ Potential issue | 🟠 Major

Sort alias keys before hashing to ensure config_hash determinism.

KeyAliases is a plain map[string]string with no custom marshaling. sonic.Marshal does not guarantee deterministic key ordering by default (map iteration in Go is randomized), so identical alias sets will hash differently across process restarts. This breaks config_hash reconciliation in the persistence layer. Sort alias keys lexicographically before marshaling, or convert to a sorted slice of key-value pairs before hashing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/clientconfig.go` around lines 585 - 592, key.Aliases is
a map and sonic.Marshal on it yields non-deterministic ordering, causing
inconsistent config_hash; fix the block that currently calls
sonic.Marshal(key.Aliases) and hash.Write(data) by first extracting the map
entries from key.Aliases, sorting the keys lexicographically, building a
deterministic ordered representation (e.g., a slice of key/value pairs or a
sorted map) and then marshal that ordered structure (or marshal each pair in
sorted key order) before calling hash.Write; ensure this change is applied where
key.Aliases, sonic.Marshal, and hash.Write are used so config_hash becomes
deterministic across restarts.
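
A self-contained sketch of deterministic hashing over sorted key/value pairs; it uses the standard library's sha256 instead of sonic so the example has no external dependencies:

package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// hashAliases writes alias pairs to the hash in sorted key order, so the same
// map always yields the same digest regardless of Go's randomized map
// iteration order.
func hashAliases(aliases map[string]string) string {
	keys := make([]string, 0, len(aliases))
	for k := range aliases {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator to avoid ambiguous concatenations
		h.Write([]byte(aliases[k]))
		h.Write([]byte{0})
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	a := map[string]string{"fast": "gpt-4o-mini", "smart": "gpt-4o"}
	fmt.Println(hashAliases(a)) // stable across runs and restarts
}
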
core/bifrost.go (2)

3876-3890: ⚠️ Potential issue | 🟠 Major

Restamp the objects returned by RunPostLLMHooks.

These branches either stamp metadata before the hook call or only restamp bifrostErr. If a PostLLMHook swaps in a new response/error, the returned object can still leave without OriginalModelRequested / ResolvedModelUsed. Short-circuit responses are especially affected because they never pass through requestWorker, so there is no later chance to fill those fields. Mirror the post-hook restamp you already do in the worker streaming path at Lines 5017-5021.

Representative fix
 resp, bifrostErr := pipeline.RunPostLLMHooks(ctx, result, err, preCount)
 if IsFinalChunk(ctx) {
 	drainAndAttachPluginLogs(ctx)
 }
+if bifrostErr != nil {
+	bifrostErr.PopulateExtraFields(req.RequestType, wsProvider, wsModel, wsModel)
+}
+if resp != nil {
+	resp.PopulateExtraFields(req.RequestType, wsProvider, wsModel, wsModel)
+}
 return resp, bifrostErr

Apply the same post-hook restamp pattern in the non-WS short-circuit branches before returning resp / newBifrostMessageChan(resp).

Also applies to: 4294-4310, 4478-4558

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 3876 - 3890, The post-hook runner currently
calls pipeline.RunPostLLMHooks but only restamps the original result/err in some
branches; ensure any response or error object returned by RunPostLLMHooks is
re-stamped with request metadata before returning. Update the postHookRunner
(and the analogous non-WS short-circuit branches that return resp or
newBifrostMessageChan(resp)) to call PopulateExtraFields on the resp and/or
bifrostErr returned from RunPostLLMHooks using req.RequestType, wsProvider,
wsModel (and wsModel for both OriginalModelRequested and ResolvedModelUsed), and
keep the IsFinalChunk/drainAndAttachPluginLogs behavior unchanged so
short-circuit responses get OriginalModelRequested/ResolvedModelUsed filled like
the worker streaming path.

1192-1203: ⚠️ Potential issue | 🟡 Minor

Use response.ExtraFields when wrapping nil provider responses.

At this point req.Provider / req.Model are no longer the authoritative provenance: fallbacks can change provider, and alias resolution can change the dispatched model. When response != nil, copy Provider, OriginalModelRequested, and ResolvedModelUsed from response.GetExtraFields() and only fall back to req if the top-level response itself is nil.

Representative fix
+extra := schemas.BifrostErrorExtraFields{
+	RequestType:            schemas.ImageGenerationRequest,
+	Provider:               req.Provider,
+	OriginalModelRequested: req.Model,
+	ResolvedModelUsed:      req.Model,
+}
+if response != nil {
+	prev := response.GetExtraFields()
+	extra.Provider = prev.Provider
+	extra.OriginalModelRequested = prev.OriginalModelRequested
+	extra.ResolvedModelUsed = prev.ResolvedModelUsed
+}
+
 return nil, &schemas.BifrostError{
 	IsBifrostError: false,
 	Error: &schemas.ErrorField{
 		Message: "received nil response from provider",
 	},
-	ExtraFields: schemas.BifrostErrorExtraFields{
-		RequestType:            schemas.ImageGenerationRequest,
-		Provider:               req.Provider,
-		OriginalModelRequested: req.Model,
-		ResolvedModelUsed:      req.Model,
-	},
+	ExtraFields: extra,
 }

Apply the same pattern to the sibling nil-response wrappers in this file.

Also applies to: 1303-1314, 1413-1424, 1468-1479

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 1192 - 1203, When constructing the BifrostError
for nil provider responses, use the provenance from response.GetExtraFields()
when response is non-nil and only fall back to req.Provider/req.Model if
response or its ExtraFields are nil; update the block that builds
&schemas.BifrostError to set Provider, OriginalModelRequested, and
ResolvedModelUsed from response.GetExtraFields() (with req as fallback), and
apply the same pattern to the sibling nil-response wrappers in the same file
that build BifrostError for other response types so all nil-response errors
reflect the response's extra fields when available.
core/providers/bedrock/bedrock.go (1)

1312-1313: 🛠️ Refactor suggestion | 🟠 Major

Reset pooled Bedrock responses before returning them.

Responses now reuses bedrockChatResponsePool, but releaseBedrockChatResponse still puts the object back dirty. Please zero it in the release helper so prior payloads/slices do not stay retained in the pool.

♻️ Suggested fix
 func releaseBedrockChatResponse(resp *BedrockConverseResponse) {
 	if resp != nil {
+		*resp = BedrockConverseResponse{}
 		bedrockChatResponsePool.Put(resp)
 	}
 }

As per coding guidelines: **/*.go: Always reset ALL fields of pooled objects before calling pool.Put() to prevent stale data from leaking to subsequent users.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 1312 - 1313, The pooled
bedrock response is returned with retained data; update
releaseBedrockChatResponse to clear all fields of the struct (e.g., set
Responses slice to nil/zero length/capacity, reset any strings, maps, numeric
fields, nested structs and pointers) before calling
bedrockChatResponsePool.Put(), so acquireBedrockChatResponse always receives a
clean object; locate the releaseBedrockChatResponse function and explicitly zero
each field present on the response (including Responses) prior to pool.Put().
core/providers/azure/azure.go (1)

223-270: ⚠️ Potential issue | 🔴 Critical

Don't infer Azure Anthropic routing from the alias-resolved model string.

After the core alias step, request.Model can already be the Azure deployment identifier. Re-checking schemas.IsAnthropicModel(request.Model) here means Anthropic deployments with arbitrary names like prod-chat fall onto the OpenAI path, so we build /openai/deployments/..., api-key, and api-version instead of /anthropic/v1/messages, x-api-key, and anthropic-version. Please carry an explicit pre-alias model-family signal into these branches (and into completeRequest) instead of re-deriving it from the resolved deployment name. Based on learnings, Azure receives alias-resolved request.Model from core before provider dispatch, and with aliases enabled that value may already be the resolved deployment identifier.

Also applies to: 519-542, 610-663, 692-712, 777-811

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/azure/azure.go` around lines 223 - 270, The code incorrectly
re-determines Anthropic routing by calling schemas.IsAnthropicModel on the
alias-resolved request.Model; change the logic to accept and use an explicit
pre-alias model-family flag (e.g., isAnthropicPreAlias) instead of re-checking
the resolved model string. Update the Azure request builder functions in
azure.go (the block that currently sets isAnthropicModel and the branches that
build URLs/headers) and any call sites (including completeRequest) to take and
propagate this boolean so routing, header selection (anthropic-version vs
api-version, x-api-key vs api-key), and path construction always use the
original model-family signal passed from core rather than re-inferring from the
resolved deployment name.
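
A minimal sketch of threading a pre-alias model-family flag into routing instead of re-inferring it from the resolved deployment name; the "claude" prefix check and path strings below are illustrative stand-ins for schemas.IsAnthropicModel and the real Azure URL construction:

package main

import (
	"fmt"
	"strings"
)

// isAnthropicFamily is evaluated against the model the caller originally
// requested, before any alias/deployment resolution. The prefix check is a
// placeholder heuristic for illustration only.
func isAnthropicFamily(originalModel string) bool {
	return strings.HasPrefix(strings.ToLower(originalModel), "claude")
}

// buildAzurePath picks the route from the pre-alias flag, so a deployment
// named "prod-chat" still lands on the Anthropic endpoint when the original
// request was for a Claude model.
func buildAzurePath(deployment string, anthropic bool) string {
	if anthropic {
		return "/anthropic/v1/messages" // paired with x-api-key / anthropic-version headers
	}
	return "/openai/deployments/" + deployment + "/chat/completions" // api-key / api-version headers
}

func main() {
	original := "claude-sonnet-4"
	resolvedDeployment := "prod-chat" // what alias resolution produced
	fmt.Println(buildAzurePath(resolvedDeployment, isAnthropicFamily(original)))
}
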
♻️ Duplicate comments (6)
core/providers/replicate/models.go (1)

66-72: ⚠️ Potential issue | 🟠 Major

Delay backfill until the final Replicate page.

NextPageToken is still being forwarded here, so this converter can emit a partial provider page. Calling pipeline.BackfillModels(included) before the full listing is drained can synthesize allowlisted/aliased models from later pages and then re-emit them when those pages arrive.

Suggested fix
-	bifrostResponse.Data = append(bifrostResponse.Data,
-		pipeline.BackfillModels(included)...)
+	if deploymentsResponse == nil || deploymentsResponse.Next == nil {
+		bifrostResponse.Data = append(bifrostResponse.Data,
+			pipeline.BackfillModels(included)...)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/replicate/models.go` around lines 66 - 72, The converter
currently calls pipeline.BackfillModels(included) and appends to
bifrostResponse.Data even when deploymentsResponse.Next is set, causing
backfilled/aliased models to be emitted prematurely; change the logic so
BackfillModels(included) is only invoked when this is the final Replicate page
(i.e., when deploymentsResponse.Next is nil / no NextPageToken will be
forwarded). Specifically, move or guard the pipeline.BackfillModels(included)
call behind the same check used for deploymentsResponse.Next (or check
bifrostResponse.NextPageToken empty) so backfill runs only after the listing is
fully drained and prevents duplicate/partial emissions.
core/providers/openai/models.go (1)

34-46: ⚠️ Potential issue | 🟠 Major

Deduplicate ResolvedID before appending.

Aliases can collapse two upstream entries onto the same public model ID. This loop appends first and only then marks included, so a colliding alias/raw-model pair will emit duplicate IDs and make the later backfill bookkeeping unreliable. The custom-model path in core/providers/vertex/models.go already uses the right resolvedKey guard; the same fix is needed here and in the sibling converters changed in this stack.

🛠️ Suggested fix
 	for _, model := range response.Data {
 		for _, result := range pipeline.FilterModel(model.ID) {
+			resolvedKey := strings.ToLower(result.ResolvedID)
+			if included[resolvedKey] {
+				continue
+			}
 			entry := schemas.Model{
 				ID:            string(providerKey) + "/" + result.ResolvedID,
 				Created:       model.Created,
 				OwnedBy:       schemas.Ptr(model.OwnedBy),
 				ContextLength: model.ContextWindow,
 			}
 			if result.AliasValue != "" {
 				entry.Alias = schemas.Ptr(result.AliasValue)
 			}
 			bifrostResponse.Data = append(bifrostResponse.Data, entry)
-			included[strings.ToLower(result.ResolvedID)] = true
+			included[resolvedKey] = true
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/models.go` around lines 34 - 46, The loop appends
models before checking for duplicates by ResolvedID, causing duplicate public
IDs when aliases collide; inside the loop over response.Data and
pipeline.FilterModel(model.ID) compute a resolvedKey (e.g.,
strings.ToLower(result.ResolvedID) or the same canonical key used elsewhere),
check the included map for that key and continue if already present, and only
then create the schemas.Model entry, append to bifrostResponse.Data and set
included[resolvedKey] = true; update the block around response.Data,
pipeline.FilterModel, schemas.Model, bifrostResponse.Data, included and
result.ResolvedID accordingly.
core/providers/nebius/nebius.go (1)

324-326: ⚠️ Potential issue | 🟡 Minor

Use the effective raw-payload flags on the decode-error path too.

This branch still passes provider.sendBackRawRequest / provider.sendBackRawResponse directly, so request-scoped overrides are ignored when body decode fails. Reuse the same ShouldSendBackRaw* values as the nearby HTTP-error branch.

🔧 Minimal fix
-		return nil, providerUtils.EnrichError(ctx, providerUtils.NewBifrostOperationError(schemas.ErrProviderResponseDecode, err), jsonData, nil, provider.sendBackRawRequest, provider.sendBackRawResponse)
+		return nil, providerUtils.EnrichError(
+			ctx,
+			providerUtils.NewBifrostOperationError(schemas.ErrProviderResponseDecode, err),
+			jsonData,
+			nil,
+			providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+			providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+		)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/nebius/nebius.go` around lines 324 - 326, The decode-error
branch calling providerUtils.EnrichError when CheckAndDecodeBody fails should
use the effective per-request raw-payload flags instead of passing
provider.sendBackRawRequest / provider.sendBackRawResponse directly; update the
EnrichError call in the CheckAndDecodeBody error path (around
CheckAndDecodeBody, NewBifrostOperationError, schemas.ErrProviderResponseDecode)
to pass the results of ShouldSendBackRawRequest(ctx,
provider.sendBackRawRequest) and ShouldSendBackRawResponse(ctx,
provider.sendBackRawResponse) (the same values used in the nearby HTTP-error
branch) so request-scoped overrides are honored.
core/bifrost.go (1)

6232-6244: ⚠️ Potential issue | 🟠 Major

Use baseProviderType for validateKey too.

Passing providerKey into validateKey here reopens the custom-provider bypass that the earlier inline base-provider switch fixed. A custom Azure/Bedrock/Vertex/Replicate/VLLM provider will be validated as its custom name instead of its underlying base type, so the provider-specific key-config checks can be skipped.

Representative fix
-			isKeyValid := validateKey(providerKey, &key)
+			isKeyValid := validateKey(baseProviderType, &key)

Apply the same replacement in both loops.

Based on learnings, SelectKeyForProvider must resolve ProviderConfig.CustomProviderConfig.BaseProviderType before key selection/validation so custom providers use the centralized base-provider checks.

Also applies to: 6248-6257

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 6232 - 6244, The validateKey call is using
providerKey and therefore skips base-provider validations for custom providers;
update both loops in SelectKeyForProvider to pass baseProviderType (resolved
from ProviderConfig.CustomProviderConfig.BaseProviderType) into validateKey
instead of providerKey so key validation uses the centralized base-provider
checks (change validateKey(providerKey, &key) to validateKey(baseProviderType,
&key) in the loop shown and the corresponding second loop around lines
6248-6257).
core/providers/bedrock/bedrock.go (2)

3319-3326: ⚠️ Potential issue | 🟠 Major

Path-escape raw ARN model IDs in the plain branch too.

Only the key.BedrockKeyConfig.ARN branch is escaped here. If model itself is already the alias-resolved ARN / inference-profile ID, its / segments are formatted directly into /model/%s/..., so the Bedrock route is wrong for every caller that goes through this helper.

🐛 Suggested fix
 func (provider *BedrockProvider) getModelPath(basePath string, model string, key schemas.Key) string {
-	path := fmt.Sprintf("%s/%s", model, basePath)
+	escapedModel := model
+	if strings.HasPrefix(model, "arn:") || strings.Contains(model, "/") {
+		escapedModel = url.PathEscape(model)
+	}
+	path := fmt.Sprintf("%s/%s", escapedModel, basePath)
 	// If ARN is present, Bedrock expects the ARN-scoped identifier
 	if key.BedrockKeyConfig != nil && key.BedrockKeyConfig.ARN != nil && key.BedrockKeyConfig.ARN.GetValue() != "" {
 		encodedModelIdentifier := url.PathEscape(fmt.Sprintf("%s/%s", key.BedrockKeyConfig.ARN.GetValue(), model))
 		path = fmt.Sprintf("%s/%s", encodedModelIdentifier, basePath)
 	}
 	return path
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 3319 - 3326, In getModelPath,
the branch that builds path from the raw model string does not URL-escape
slash-containing ARNs or inference-profile IDs, causing incorrect Bedrock
routes; update the plain branch in getModelPath to apply url.PathEscape to the
model (same as the ARN branch does for the ARN+model combo) before formatting
the path so any '/' in model are percent-encoded and the resulting path matches
Bedrock's expected identifier format.

2095-2095: ⚠️ Potential issue | 🟡 Minor

Lowercase the new S3 / S3 URI error strings.

These messages are still surfaced as S3 ... / S3 URI ..., which breaks the repo's Go error-string convention. Please switch them to s3 / s3 uri consistently.

As per coding guidelines: **/*.go: Error strings should be lowercase with no trailing punctuation (Go convention).

Also applies to: 2227-2227, 2282-2288, 2335-2335, 2380-2386, 2435-2435, 2461-2467, 2515-2515, 2644-2644, 3224-3224

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` at line 2095, Update the error strings in
core/providers/bedrock/bedrock.go to follow Go error-string convention by
lowercasing "S3" and "S3 URI" to "s3" and "s3 uri" respectively wherever
providerUtils.NewProviderAPIError (and similar error constructors) is invoked;
locate the offending messages (for example the S3 upload error "S3 upload
failed: %s" and other occurrences called out around the given ranges) and
replace the capitalized tokens with the lowercase forms, ensuring no trailing
punctuation and that formatting verbs (%s) and status codes remain unchanged so
the semantics of the NewProviderAPIError calls are preserved.
🧹 Nitpick comments (5)
plugins/governance/store.go (1)

2253-2253: Consider including both rule Name and ID in warning logs for easier debugging.

Using rule.Name improves readability, but since names may not be unique across routing rules, including the ID alongside the name would help operators quickly locate the specific rule in the database when troubleshooting compilation failures.

🔧 Suggested improvement
-				gs.logger.Warn("Failed to pre-compile routing program for rule %s: %v", rule.Name, err)
+				gs.logger.Warn("Failed to pre-compile routing program for rule %s (id=%s): %v", rule.Name, rule.ID, err)
-	gs.logger.Warn("Failed to recompile routing program for rule %s: %v", rule.Name, err)
+	gs.logger.Warn("Failed to recompile routing program for rule %s (id=%s): %v", rule.Name, rule.ID, err)

Also applies to: 3734-3734

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/governance/store.go` at line 2253, The warning log in gs.logger.Warn
currently only includes rule.Name; update the call where routing program
pre-compilation fails (referenced by gs.logger.Warn) to include both rule.Name
and rule.ID so operators can unambiguously identify the rule; locate the warning
sites that reference rule.Name (e.g., the failing pre-compile log at the routing
compilation path and the similar occurrence around the 3734 area) and modify the
log message to include rule.ID alongside rule.Name and the error variable.
core/providers/openai/files.go (1)

85-99: Drop the stale providerName parameter from the retrieve helper too.

After centralizing extra-field population, this helper no longer uses providerName, so leaving it in the signature only keeps a dead parameter alive and makes the upload/retrieve helpers inconsistent.

✂️ Suggested cleanup
-func (r *OpenAIFileResponse) ToBifrostFileRetrieveResponse(providerName schemas.ModelProvider, latency time.Duration, sendBackRawRequest bool, sendBackRawResponse bool, rawRequest interface{}, rawResponse interface{}) *schemas.BifrostFileRetrieveResponse {
+func (r *OpenAIFileResponse) ToBifrostFileRetrieveResponse(latency time.Duration, sendBackRawRequest bool, sendBackRawResponse bool, rawRequest interface{}, rawResponse interface{}) *schemas.BifrostFileRetrieveResponse {

Based on learnings: provider-level response converters under core/providers/ should omit Provider / OriginalModelRequested / ResolvedModelUsed; requestWorker populates them centrally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/files.go` around lines 85 - 99, The
ToBifrostFileRetrieveResponse function signature still includes a now-unused
providerName parameter; remove providerName from the signature and from any
internal references in OpenAIFileResponse.ToBifrostFileRetrieveResponse, and
update all call sites to stop passing providerName (they should continue passing
latency, sendBackRawRequest, sendBackRawResponse, rawRequest, rawResponse).
Ensure the returned schemas.BifrostFileRetrieveResponse no longer relies on
providerName and that requestWorker (where extra fields are centralized)
continues to populate Provider/OriginalModelRequested/ResolvedModelUsed.
core/schemas/bifrost.go (1)

801-1005: PopulateExtraFields now duplicates GetExtraFields() 1:1.

This adds a second exhaustive response-type switch that has to stay in lockstep with GetExtraFields(). Writing through the existing helper would make future response additions one-touch instead of two-touch.
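
A sketch of the one-switch shape this suggests; the parameter list and type names are illustrative rather than the exact signature in core/schemas/bifrost.go:

```go
// Delegate to the existing switch instead of duplicating it, so adding a new
// response type only touches GetExtraFields().
func (r *BifrostResponse) PopulateExtraFields(requestType RequestType, provider ModelProvider, originalModel, resolvedModel string) {
	if r == nil {
		return // keep the early nil-receiver guard
	}
	ef := r.GetExtraFields()
	if ef == nil {
		return // response variant without extra fields: nothing to populate
	}
	ef.RequestType = requestType
	ef.Provider = provider
	ef.OriginalModelRequested = originalModel
	ef.ResolvedModelUsed = resolvedModel
}
```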

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/bifrost.go` around lines 801 - 1005, PopulateExtraFields
duplicates the big switch in GetExtraFields; replace the exhaustive per-response
switch in BifrostResponse.PopulateExtraFields with a single call to
r.GetExtraFields(), check for nil, compute resolvedModel like today, then set
the returned ExtraFields.RequestType, .Provider, .OriginalModelRequested and
.ResolvedModelUsed; keep the early nil receiver guard and ensure you handle the
case where GetExtraFields() returns nil (no-op) to avoid panics.
core/providers/nebius/nebius.go (1)

258-265: Keep image-generation validation in the core layer.

These guards duplicate the central request/input validation and can drift from the shared error contract. If this method needs a defensive check for direct callers, keep it narrowly scoped and documented instead of re-validating the normal provider path.

Based on learnings: Validate image generation inputs in core/bifrost.go before invoking any provider handler, and provider-layer methods can assume the request has been validated.
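
If a defensive check is kept at all, something this small would do; the message and the two-argument NewBifrostOperationError shape mirror other snippets in this review rather than the actual Nebius code, and the errors import is assumed:

```go
// core/bifrost.go validates image-generation requests before any provider
// handler runs; this nil-check only protects direct callers of the provider.
if request == nil {
	return nil, providerUtils.NewBifrostOperationError(
		"image generation request is nil",
		errors.New("nil request"))
}
// No prompt/input re-validation here; that stays in the core layer.
```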

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/nebius/nebius.go` around lines 258 - 265, Remove the
duplicated prompt/input validation from the Nebius provider method: assume the
core layer (core/bifrost.go) performs full image-generation request validation
before calling provider handlers, so delete the checks that call
providerUtils.NewBifrostOperationError for empty request or prompt in nebius.go.
If you want one defensive check in the provider method, keep only a minimal
nil-check for request with a short comment documenting the invariant ("core
guarantees validated request") and reference that core/bifrost.go performs full
validation; do not re-validate prompt or input content in functions like the
Nebius image-generation handler.
core/providers/replicate/utils.go (1)

32-41: Error handling changes align with centralized population pattern.

The removal of the schemas.Replicate provider argument from NewBifrostOperationError calls is correct—provider/model identity fields are now populated centrally by PopulateExtraFields in core/bifrost.go.

Minor nit: fmt.Errorf("%s", errorMsg) and fmt.Errorf("prediction was canceled") could be simplified to errors.New() since no formatting is needed.

♻️ Optional simplification
 	if prediction.Status == ReplicatePredictionStatusFailed {
 		errorMsg := "prediction failed"
 		if prediction.Error != nil && *prediction.Error != "" {
 			errorMsg = *prediction.Error
 		}
 		return providerUtils.NewBifrostOperationError(
 			"prediction failed",
-			fmt.Errorf("%s", errorMsg))
+			errors.New(errorMsg))
 	}

 	if prediction.Status == ReplicatePredictionStatusCanceled {
 		return providerUtils.NewBifrostOperationError(
 			"prediction was canceled",
-			fmt.Errorf("prediction was canceled"))
+			errors.New("prediction was canceled"))
 	}

Based on learnings: "In maximhq/bifrost, the error fields Provider/OriginalModelRequested/ResolvedModelUsed should be populated centrally by (*BifrostError).PopulateExtraFields in core/bifrost.go across the core request paths."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/replicate/utils.go` around lines 32 - 41, Replace the
unnecessary fmt.Errorf usages with errors.New in the NewBifrostOperationError
calls in the replicate provider code: change fmt.Errorf("%s", errorMsg) to
errors.New(errorMsg) and fmt.Errorf("prediction was canceled") to
errors.New("prediction was canceled"); keep using NewBifrostOperationError
(provider/model fields are populated centrally by
(*BifrostError).PopulateExtraFields) so only the error value construction needs
simplification.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 4993-4997: getChannelMessage currently calls
req.SetModel(resolvedModel) which mutates the caller-owned BifrostRequest
because the shallow copy doesn't protect the inner request; clone the request
before mutating to avoid side effects by replacing the local req with a
deep/shallow clone (e.g., call req.Clone() or construct a new copy of the inner
typed request) and then call SetModel(resolvedModel) on that cloned request so
originalModelRequested stays preserved and the caller's request is not modified.

In `@core/internal/llmtests/account.go`:
- Around line 210-215: The problem is wildcard Models: []string{"*"} combined
with disjoint Aliases maps lets the weighted key selector pick a key that cannot
actually resolve the requested model, causing an unsupported model ID to be sent
to the provider. Fix by either narrowing each key's Models to the exact models
listed in its Aliases map (replace []string{"*"} with the concrete model list
entries for the Bedrock/Vertex keys defined near the Aliases maps) or, if
wildcard selection is needed, add a validation step after selection: call
Aliases.Resolve(requestedModel) on the chosen key and if it returns the same,
unresolved model (or the key lacks the alias), treat that key as ineligible and
reselect or return an explicit error; ensure this check is applied wherever the
selection/weighted routing code resolves models before sending to the provider.

In `@core/providers/azure/files.go`:
- Around line 112-114: The converter function in core/providers/azure/files.go
declares an unused parameter `providerName schemas.ModelProvider`—remove this
parameter from the converter's signature so it matches the OpenAI converter
(e.g., ToBifrostFileUploadResponse) and the call site in FileUpload() that
invokes the converter without that argument; ensure the converter only sets
Latency on ExtraFields (schemas.BifrostResponseExtraFields{Latency:
latency.Milliseconds()}) and rely on PopulateExtraFields() to fill Provider and
RequestType.

In `@core/providers/bedrock/bedrock.go`:
- Line 176: The code dereferences key.BedrockKeyConfig in multiple places (e.g.,
before calling signAWSRequest in bedrock.go) and can panic when it is nil;
ensure you normalize once per request by assigning a local config variable: if
key.BedrockKeyConfig == nil { config = &schemas.BedrockKeyConfig{} } else {
config = key.BedrockKeyConfig } and then use that local config everywhere
(Region, RoleARN, SessionToken, AccessKey, SecretKey, etc.) including before
calls to signAWSRequest and in the other mentioned sites (lines referenced:
around the signAWSRequest call and the other listed locations) so
IAM/default-chain fallbacks that rely on Value only do not cause panics (a minimal sketch of this normalization follows after this list).

In `@core/providers/ollama/ollama.go`:
- Around line 73-74: The code currently concatenates
provider.networkConfig.BaseURL directly when building Ollama endpoints; restore
per-key URL resolution by resolving the base URL for the given key first (e.g.,
call a helper like resolveKeyBaseURL(ctx, key) or provider.getBaseURLForKey(ctx,
key) that checks OllamaKeyConfig.URL.IsDefined()) and use that resolvedBase +
providerUtils.GetPathFromContext(ctx, "/v1/models") (and equivalent paths)
instead of provider.networkConfig.BaseURL; apply the same change to list-models,
text completion (+stream), chat completion (+stream), and embedding request
builders (the sites around the existing concatenations at the shown diffs and at
the other referenced spots).

In `@core/providers/runway/runway.go`:
- Around line 207-208: parseRunwayError(resp) already populates
BifrostError.ExtraFields.RawResponse via HandleProviderAPIError, so avoid
overwriting that structured payload: call parseRunwayError(resp) into a variable
(e.g., parsedErr) and if parsedErr != nil pass responseBody=nil into
providerUtils.EnrichError; only copy resp.Body() into rawErrBody and pass it to
providerUtils.EnrichError on branches where parseRunwayError returned nil
(decode-failure/unknown error). Apply the same change for the other occurrences
referenced (around the locations handling EnrichError at the two other spots).

In `@core/providers/vertex/errors.go`:
- Around line 13-20: The parseVertexError helper must preserve the provider
response payload so upstream enrichment can attach it to
BifrostError.ExtraFields.RawResponse; modify parseVertexError (and the other
error branches around lines referenced) to include the decodedBody in the
returned schemas.BifrostError (e.g., set ExtraFields.RawResponse = decodedBody)
or return a structure that allows
providerUtils.EnrichError/HandleProviderAPIError to receive decodedBody,
ensuring callers like providerUtils.EnrichError(ctx, parseVertexError(resp),
jsonBody, ...) get the original decodedBody attached; update all return paths in
parseVertexError to carry decodedBody into the BifrostError or pass it through
to the enrichment step.

In `@docs/providers/supported-providers/vertex.mdx`:
- Around line 539-542: The documented filtering rules for the "models" parameter
are inconsistent with the example caption: update either the bullet list near
the current "models" rules (the four bullets describing empty, aliases, ["*"],
and non-empty cases) or the example caption at line ~591 so both state the same
behavior for the no-`models` case; specifically decide whether "no models and
aliases configured" returns only aliased models or also returns custom models,
then edit the bullets and the example caption to match that chosen behavior and
use the same terminology ("models", "aliases", "custom models") to avoid
ambiguity.

In `@docs/quickstart/go-sdk/provider-configuration.mdx`:
- Around line 525-529: The example is inconsistent: the Models slice only
allowlists "fine-tuned-gemini-2.5-pro" but the note claims callers can also send
"vertex/<your-fine-tuned-model-id>". Fix by either adding the raw tuned-model ID
string to the Models slice (so Models includes both "fine-tuned-gemini-2.5-pro"
and "vertex/<your-fine-tuned-model-id>") or by narrowing the note to state that
only alias-based requests via Aliases/KeyAliases are allowed; update the Models,
Aliases (schemas.KeyAliases) usage or the surrounding note accordingly so the
example is internally consistent.

In `@framework/streaming/types.go`:
- Around line 388-396: The current assignment to
resp.TranscriptionResponse.ExtraFields replaces the previous ExtraFields and
only reattaches RawRequest, dropping p.Data.RawResponse and p.Data.CacheDebug;
update the fix in the block that sets resp.TranscriptionResponse.ExtraFields
(and/or immediately after) so you preserve or copy p.Data.RawResponse and
p.Data.CacheDebug into the new schemas.BifrostResponseExtraFields when present
(in addition to RequestType, Provider, OriginalModelRequested,
ResolvedModelUsed, Latency and RawRequest), or initialize the struct from the
existing resp.TranscriptionResponse.ExtraFields and then set/overwrite the other
fields to avoid losing RawResponse and CacheDebug.
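
Picking up the bedrock.go normalization point above, a minimal sketch of the normalize-once pattern; the field names are those quoted in that comment and may not match the current struct exactly:

```go
// Normalize once per request so later dereferences can never hit a nil
// key.BedrockKeyConfig; the zero-value config represents the IAM /
// default-credential-chain fallback.
config := key.BedrockKeyConfig
if config == nil {
	config = &schemas.BedrockKeyConfig{}
}
// From here on, read config.Region, config.AccessKey, config.SecretKey,
// config.SessionToken, config.RoleARN, etc. (including before signAWSRequest)
// instead of dereferencing key.BedrockKeyConfig directly.
```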

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 3876-3890: The post-hook runner currently calls
pipeline.RunPostLLMHooks but only restamps the original result/err in some
branches; ensure any response or error object returned by RunPostLLMHooks is
re-stamped with request metadata before returning. Update the postHookRunner
(and the analogous non-WS short-circuit branches that return resp or
newBifrostMessageChan(resp)) to call PopulateExtraFields on the resp and/or
bifrostErr returned from RunPostLLMHooks using req.RequestType, wsProvider,
wsModel (and wsModel for both OriginalModelRequested and ResolvedModelUsed), and
keep the IsFinalChunk/drainAndAttachPluginLogs behavior unchanged so
short-circuit responses get OriginalModelRequested/ResolvedModelUsed filled like
the worker streaming path.
- Around line 1192-1203: When constructing the BifrostError for nil provider
responses, use the provenance from response.GetExtraFields() when response is
non-nil and only fall back to req.Provider/req.Model if response or its
ExtraFields are nil; update the block that builds &schemas.BifrostError to set
Provider, OriginalModelRequested, and ResolvedModelUsed from
response.GetExtraFields() (with req as fallback), and apply the same pattern to
the sibling nil-response wrappers in the same file that build BifrostError for
other response types so all nil-response errors reflect the response's extra
fields when available.

In `@core/providers/azure/azure.go`:
- Around line 223-270: The code incorrectly re-determines Anthropic routing by
calling schemas.IsAnthropicModel on the alias-resolved request.Model; change the
logic to accept and use an explicit pre-alias model-family flag (e.g.,
isAnthropicPreAlias) instead of re-checking the resolved model string. Update
the Azure request builder functions in azure.go (the block that currently sets
isAnthropicModel and the branches that build URLs/headers) and any call sites
(including completeRequest) to take and propagate this boolean so routing,
header selection (anthropic-version vs api-version, x-api-key vs api-key), and
path construction always use the original model-family signal passed from core
rather than re-inferring from the resolved deployment name.

In `@core/providers/bedrock/bedrock.go`:
- Around line 1312-1313: The pooled bedrock response is returned with retained
data; update releaseBedrockChatResponse to clear all fields of the struct (e.g.,
set Responses slice to nil/zero length/capacity, reset any strings, maps,
numeric fields, nested structs and pointers) before calling
bedrockChatResponsePool.Put(), so acquireBedrockChatResponse always receives a
clean object; locate the releaseBedrockChatResponse function and explicitly zero
each field present on the response (including Responses) prior to pool.Put().

In `@core/providers/sgl/sgl.go`:
- Around line 69-80: The call in SGLProvider.listModelsByKey always concatenates
provider.networkConfig.BaseURL and ignores per-key SGLKeyConfig.URL; restore
per-key base URL resolution by computing a baseURLOrError from the key's
SGLKeyConfig.URL (falling back to provider.networkConfig.BaseURL) and use
baseURLOrError + providerUtils.GetPathFromContext(ctx, "/v1/models") when
calling openai.ListModelsByKey; apply the same fix to the other SGLProvider call
sites referenced (the blocks around lines 96-110, 117-132, 136-176, 203-216) so
each uses the per-key base URL instead of always using
provider.networkConfig.BaseURL, and keep the other parameters (provider.client,
key, request.Unfiltered, provider.networkConfig.ExtraHeaders,
provider.GetProviderKey, providerUtils.ShouldSendBackRawRequest/Response)
unchanged.

In `@docs/providers/supported-providers/azure.mdx`:
- Around line 171-179: The doc lists the allowlist field as `allowed_models`
which is inconsistent with the alias refactor examples that use the top-level
`models` key; update the configuration details to replace `allowed_models` with
`models` and adjust the brief description to say "List of allowed models to use
from this key (optional)" so the documented field name matches the examples and
other references (look for `allowed_models`, `models`, and `aliases` in the
azure OpenAI config section).

In `@framework/configstore/clientconfig.go`:
- Around line 585-592: key.Aliases is a map and sonic.Marshal on it yields
non-deterministic ordering, causing inconsistent config_hash; fix the block that
currently calls sonic.Marshal(key.Aliases) and hash.Write(data) by first
extracting the map entries from key.Aliases, sorting the keys lexicographically,
building a deterministic ordered representation (e.g., a slice of key/value
pairs or a sorted map) and then marshal that ordered structure (or marshal each
pair in sorted key order) before calling hash.Write; ensure this change is
applied where key.Aliases, sonic.Marshal, and hash.Write are used so config_hash
becomes deterministic across restarts (a self-contained sketch of sorted-key hashing follows after this list).

In `@framework/streaming/transcription.go`:
- Around line 192-203: The code always takes the address of rawRequest and
assigns it to ProcessedStreamResponse.RawRequest which makes RawRequest non-nil
even when nothing was captured; change the logic so
ProcessedStreamResponse.RawRequest is only set when
result.TranscriptionStreamResponse.ExtraFields.RawRequest is present: keep the
existing check for result.TranscriptionStreamResponse.ExtraFields.RawRequest,
assign the value to a local rawRequest, create a pointer (e.g. rawRequestPtr)
only in that branch, and set RawRequest: rawRequestPtr; otherwise leave
RawRequest nil. Target the block that constructs the ProcessedStreamResponse and
the local rawRequest handling around
result.TranscriptionStreamResponse.ExtraFields.RawRequest.
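
To make the config_hash point above concrete, here is a self-contained sketch of hashing a map with sorted keys; the alias values are invented, and the real code hashes key.Aliases with sonic.Marshal rather than encoding/json:

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"sort"
)

func hashAliases(aliases map[string]string) string {
	keys := make([]string, 0, len(aliases))
	for k := range aliases {
		keys = append(keys, k)
	}
	sort.Strings(keys) // fixed key order => identical bytes => identical hash across restarts

	h := sha256.New()
	for _, k := range keys {
		pair, _ := json.Marshal([2]string{k, aliases[k]})
		h.Write(pair)
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	aliases := map[string]string{"fast": "gpt-4o-mini", "smart": "gpt-4o"}
	fmt.Println(hashAliases(aliases)) // prints the same digest on every run
}
```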

---

Duplicate comments:
In `@core/bifrost.go`:
- Around line 6232-6244: The validateKey call is using providerKey and therefore
skips base-provider validations for custom providers; update both loops in
SelectKeyForProvider to pass baseProviderType (resolved from
ProviderConfig.CustomProviderConfig.BaseProviderType) into validateKey instead
of providerKey so key validation uses the centralized base-provider checks
(change validateKey(providerKey, &key) to validateKey(baseProviderType, &key) in
the loop shown and the corresponding second loop around lines 6248-6257).

In `@core/providers/bedrock/bedrock.go`:
- Around line 3319-3326: In getModelPath, the branch that builds path from the
raw model string does not URL-escape slash-containing ARNs or inference-profile
IDs, causing incorrect Bedrock routes; update the plain branch in getModelPath
to apply url.PathEscape to the model (same as the ARN branch does for the
ARN+model combo) before formatting the path so any '/' in model are
percent-encoded and the resulting path matches Bedrock's expected identifier
format.
- Line 2095: Update the error strings in core/providers/bedrock/bedrock.go to
follow Go error-string convention by lowercasing "S3" and "S3 URI" to "s3" and
"s3 uri" respectively wherever providerUtils.NewProviderAPIError (and similar
error constructors) is invoked; locate the offending messages (for example the
S3 upload error "S3 upload failed: %s" and other occurrences called out around
the given ranges) and replace the capitalized tokens with the lowercase forms,
ensuring no trailing punctuation and that formatting verbs (%s) and status codes
remain unchanged so the semantics of the NewProviderAPIError calls are
preserved.

In `@core/providers/nebius/nebius.go`:
- Around line 324-326: The decode-error branch calling providerUtils.EnrichError
when CheckAndDecodeBody fails should use the effective per-request raw-payload
flags instead of passing provider.sendBackRawRequest /
provider.sendBackRawResponse directly; update the EnrichError call in the
CheckAndDecodeBody error path (around CheckAndDecodeBody,
NewBifrostOperationError, schemas.ErrProviderResponseDecode) to pass the results
of ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest) and
ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) (the same values
used in the nearby HTTP-error branch) so request-scoped overrides are honored.

In `@core/providers/openai/models.go`:
- Around line 34-46: The loop appends models before checking for duplicates by
ResolvedID, causing duplicate public IDs when aliases collide; inside the loop
over response.Data and pipeline.FilterModel(model.ID) compute a resolvedKey
(e.g., strings.ToLower(result.ResolvedID) or the same canonical key used
elsewhere), check the included map for that key and continue if already present,
and only then create the schemas.Model entry, append to bifrostResponse.Data and
set included[resolvedKey] = true; update the block around response.Data,
pipeline.FilterModel, schemas.Model, bifrostResponse.Data, included and
result.ResolvedID accordingly.

In `@core/providers/replicate/models.go`:
- Around line 66-72: The converter currently calls
pipeline.BackfillModels(included) and appends to bifrostResponse.Data even when
deploymentsResponse.Next is set, causing backfilled/aliased models to be emitted
prematurely; change the logic so BackfillModels(included) is only invoked when
this is the final Replicate page (i.e., when deploymentsResponse.Next is nil /
no NextPageToken will be forwarded). Specifically, move or guard the
pipeline.BackfillModels(included) call behind the same check used for
deploymentsResponse.Next (or check bifrostResponse.NextPageToken empty) so
backfill runs only after the listing is fully drained and prevents
duplicate/partial emissions.

---

Nitpick comments:
In `@core/providers/nebius/nebius.go`:
- Around line 258-265: Remove the duplicated prompt/input validation from the
Nebius provider method: assume the core layer (core/bifrost.go) performs full
image-generation request validation before calling provider handlers, so delete
the checks that call providerUtils.NewBifrostOperationError for empty request or
prompt in nebius.go. If you want one defensive check in the provider method,
keep only a minimal nil-check for request with a short comment documenting the
invariant ("core guarantees validated request") and reference that
core/bifrost.go performs full validation; do not re-validate prompt or input
content in functions like the Nebius image-generation handler.

In `@core/providers/openai/files.go`:
- Around line 85-99: The ToBifrostFileRetrieveResponse function signature still
includes a now-unused providerName parameter; remove providerName from the
signature and from any internal references in
OpenAIFileResponse.ToBifrostFileRetrieveResponse, and update all call sites to
stop passing providerName (they should continue passing latency,
sendBackRawRequest, sendBackRawResponse, rawRequest, rawResponse). Ensure the
returned schemas.BifrostFileRetrieveResponse no longer relies on providerName
and that requestWorker (where extra fields are centralized) continues to
populate Provider/OriginalModelRequested/ResolvedModelUsed.

In `@core/providers/replicate/utils.go`:
- Around line 32-41: Replace the unnecessary fmt.Errorf usages with errors.New
in the NewBifrostOperationError calls in the replicate provider code: change
fmt.Errorf("%s", errorMsg) to errors.New(errorMsg) and fmt.Errorf("prediction
was canceled") to errors.New("prediction was canceled"); keep using
NewBifrostOperationError (provider/model fields are populated centrally by
(*BifrostError).PopulateExtraFields) so only the error value construction needs
simplification.

In `@core/schemas/bifrost.go`:
- Around line 801-1005: PopulateExtraFields duplicates the big switch in
GetExtraFields; replace the exhaustive per-response switch in
BifrostResponse.PopulateExtraFields with a single call to r.GetExtraFields(),
check for nil, compute resolvedModel like today, then set the returned
ExtraFields.RequestType, .Provider, .OriginalModelRequested and
.ResolvedModelUsed; keep the early nil receiver guard and ensure you handle the
case where GetExtraFields() returns nil (no-op) to avoid panics.

In `@plugins/governance/store.go`:
- Line 2253: The warning log in gs.logger.Warn currently only includes
rule.Name; update the call where routing program pre-compilation fails
(referenced by gs.logger.Warn) to include both rule.Name and rule.ID so
operators can unambiguously identify the rule; locate the warning sites that
reference rule.Name (e.g., the failing pre-compile log at the routing
compilation path and the similar occurrence around the 3734 area) and modify the
log message to include rule.ID alongside rule.Name and the error variable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

Comment thread core/bifrost.go
Comment thread core/internal/llmtests/account.go
Comment thread core/providers/azure/files.go
Comment thread core/providers/bedrock/bedrock.go
Comment thread core/providers/ollama/ollama.go Outdated
Comment thread core/providers/runway/runway.go Outdated
Comment thread core/providers/vertex/errors.go
Comment thread docs/providers/supported-providers/vertex.mdx
Comment thread docs/quickstart/go-sdk/provider-configuration.mdx
Comment thread framework/streaming/types.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from 99bcebe to 7235fad on April 2, 2026 at 10:37
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-26-feat_adds_routing_tree_view branch from b023e6a to 1b012fc on April 2, 2026 at 10:37
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 03-27-feat_add_model_alias branch from 7235fad to 5fa7f30 on April 2, 2026 at 10:47
@akshaydeo akshaydeo changed the base branch from 03-26-feat_adds_routing_tree_view to graphite-base/2355 April 2, 2026 10:58
@akshaydeo akshaydeo changed the base branch from graphite-base/2355 to v1.5.0 April 2, 2026 10:58
@akshaydeo akshaydeo dismissed their stale review April 2, 2026 10:58

The base branch was changed.

@akshaydeo akshaydeo merged commit e15ec70 into v1.5.0 Apr 2, 2026
12 checks passed
@akshaydeo akshaydeo deleted the 03-27-feat_add_model_alias branch April 2, 2026 10:59
This was referenced Apr 6, 2026
@coderabbitai coderabbitai Bot mentioned this pull request Apr 16, 2026


Development

Successfully merging this pull request may close these issues.

[Feature]: Support renaming or aliasing models

3 participants