Skip to content

feat: auto-resolve provider when no provider prefix is given on inference routes#3067

Merged
akshaydeo merged 1 commit intomainfrom
04-26-feat_added_default_provider_selection_on_inference_routes
Apr 28, 2026
Merged

feat: auto-resolve provider when no provider prefix is given on inference routes#3067
akshaydeo merged 1 commit intomainfrom
04-26-feat_added_default_provider_selection_on_inference_routes

Conversation

@Pratham-Mishra04
Copy link
Copy Markdown
Collaborator

Summary

This PR introduces automatic provider resolution via the model catalog when a request's model field contains no provider prefix (e.g., gpt-4o instead of openai/gpt-4o). It also adds a new x-bf-disable-content-logging per-request header, fixes a response extra fields type corruption issue under high concurrency streaming, and consolidates duplicated request preparation logic across all inference endpoints into a single generic prepareRequest function.

Changes

  • Model catalog auto-resolution: When a request arrives with a bare model name (no provider/model prefix), resolveModelAndProvider queries the model catalog for matching providers. The first matching provider is selected and the resolution is stored on the fasthttp context. ConvertToBifrostContext picks this up centrally and emits a model-catalog routing engine log entry, including all available providers and the one selected. A new RoutingEngineModelCatalog = "model-catalog" constant is added to the schema.
  • Generic prepareRequest[T] function: All per-endpoint prepare*Request functions previously duplicated JSON unmarshaling, model/provider parsing, fallback parsing, and extra param extraction. These are now consolidated into a single generic function, with each endpoint only handling its own validation logic.
  • x-bf-disable-content-logging header support: ConvertToBifrostContext now recognizes this header and sets BifrostContextKeyDisableContentLogging on the bifrost context, enabling per-request content logging suppression.
  • ModelCatalogResolution struct and FastHTTPUserValueModelCatalogResolution context key: Introduced to pass resolution metadata from request preparation through to context conversion without coupling the two layers directly.
  • Removed redundant handlerStore field from CompletionHandler; all calls now go directly through h.config.
  • Bug fix: Corrected response extra fields request type corruption that occurred during streaming under high concurrency.

Type of change

  • Bug fix
  • Feature
  • Refactor

Affected areas

  • Core (Go)
  • Transports (HTTP)

How to test

go test ./...

Model catalog auto-resolution: Configure a model catalog with at least one model entry. Send a chat completion request with "model": "gpt-4o" (no provider prefix). The request should succeed, routing to the first provider listed for that model in the catalog. The routing engine log should include a model-catalog entry showing the available providers and the selected one.

Per-request content logging toggle: Send any inference request with the header x-bf-disable-content-logging: true. Verify that content is excluded from logs for that request.

No model catalog configured: Send a request with a bare model name when no model catalog is configured. Expect a 400 error: provider is required in model field (format: provider/model) and model catalog is not available.

No providers found: Send a request with a bare model name that has no matching entry in the catalog. Expect a 400 error indicating no providers were found for that model.

Breaking changes

  • No

Security considerations

Provider resolution from the model catalog uses only the first matching provider. Callers without a provider prefix will have their provider selected automatically, which should be considered when configuring model catalog entries to avoid unintended provider routing.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 26, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e66c9eeb-dc64-4905-bea8-8658612ed159

📥 Commits

Reviewing files that changed from the base of the PR and between 960f72c and 35e9e2c.

📒 Files selected for processing (7)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Auto-resolve model providers from catalog when model string lacks provider prefix.
    • Added x-bf-disable-content-logging header to disable per-request content logging.
    • Introduced model-catalog routing engine for tracking provider resolution metrics.
  • Bug Fixes

    • Fixed response extra fields corruption in streaming requests under high concurrency.

Walkthrough

Adds model-catalog provider-resolution and logging, threads config into async request preparation, centralizes JSON prepare logic with provider/model resolution, extends context metadata for model-catalog resolution and disable-content-logging header, and minor router/import cleanups and changelog entries.

Changes

Cohort / File(s) Summary
Schema constant
core/schemas/bifrost.go
Adds exported constant RoutingEngineModelCatalog = "model-catalog".
Async handlers
transports/bifrost-http/handlers/asyncinference.go
Passes h.config into each async POST handler's request-preparation functions (signature updates only).
Sync handlers & prepare refactor
transports/bifrost-http/handlers/inference.go
Introduces prepareRequest[T] generic, resolveModelAndProvider, removes handlerStore usage, normalizes provider/model resolution across JSON and multipart endpoints, maps legacy max_tokens -> max_completion_tokens, and consolidates responses/count-tokens preparation.
Context & routing metadata
transports/bifrost-http/lib/ctx.go
Adds FastHTTPUserValueModelCatalogResolution constant and ModelCatalogResolution type; ConvertToBifrostContext now appends RoutingEngineModelCatalog routing entries and supports x-bf-disable-content-logging.
Config surface
transports/bifrost-http/lib/config.go
Adds GetProvidersForModel(model string) []schemas.ModelProvider, returning deterministic sorted providers from ModelCatalog.
Router minor fixes
transports/bifrost-http/integrations/router.go
Cleans up imports (remove duplicate errors import) and simplifies handleAsyncRetrieve control flow by removing a redundant return.
Changelog
transports/changelog.md
Adds entries for streaming extra-fields fix, x-bf-disable-content-logging header, and auto-resolve model provider via model-catalog with model-catalog routing engine.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant FastHTTP as FastHTTP Handler
    participant Prepare as prepareRequest / resolveModelAndProvider
    participant Catalog as ModelCatalog (via Config)
    participant Context as ConvertToBifrostContext / Router

    Client->>FastHTTP: POST inference request (possibly model without provider)
    FastHTTP->>Prepare: Unmarshal JSON / collect unknown fields
    Prepare->>Catalog: GetProvidersForModel(model)
    Catalog-->>Prepare: provider options / resolved provider
    Prepare->>Context: attach ModelCatalogResolution to RequestCtx
    Context->>Context: append RoutingEngineModelCatalog routing log
    FastHTTP->>Context: submit job / respond
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • akshaydeo
  • danpiths

Poem

"I hopped through code in morning light,
A catalog of models in my sight,
I nibbled prefixes, fetched a guide,
Logged the journey, tucked extras inside,
🐇🥕 — model resolved, the pipeline's bright."

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main feature: automatic provider resolution from the model catalog when no provider prefix is given on inference routes.
Description check ✅ Passed The description covers all major sections: summary, changes with design rationale, type of change, affected areas, testing instructions, and security considerations. Most checklist items are addressed.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch 04-26-feat_added_default_provider_selection_on_inference_routes

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 26, 2026

Confidence Score: 4/5

Safe to merge after addressing open thread concerns; no new critical issues found in this review pass.

No new P0/P1 issues found in this review. The generic prepareRequest[T] refactoring is behaviorally equivalent to the old per-function code, the catalog sort is appropriate given the underlying map storage, and the max_tokens deletion behaviour was confirmed intentional. Score is 4 rather than 5 because three concerns from prior review threads remain unaddressed: the transport-internal __bifrost_model_catalog_resolution struct leaking into bifrostCtx via VisitUserValuesAll, the silent no-op when x-bf-disable-content-logging receives a non-standard boolean string, and the identical error message for nil catalog vs. missing model (operator debuggability).

transports/bifrost-http/lib/ctx.go (open thread concerns), transports/bifrost-http/lib/config.go (error message distinction)

Important Files Changed

Filename Overview
core/schemas/bifrost.go Adds RoutingEngineModelCatalog = "model-catalog" constant to the routing engine set — minimal, safe change.
transports/bifrost-http/lib/config.go Adds GetProvidersForModel wrapper that alphabetically sorts results from the underlying catalog (which iterates a map, so non-deterministic order without sorting). Correct rationale, but alphabetical selection is surprising when users expect catalog-defined ordering; documented in comment.
transports/bifrost-http/lib/ctx.go Adds ModelCatalogResolution struct, FastHTTPUserValueModelCatalogResolution key, routing-engine log emission, and x-bf-disable-content-logging header handling. Previously-flagged concerns (struct leaking into bifrostCtx via VisitUserValuesAll, silent bool parse no-op) still open.
transports/bifrost-http/handlers/inference.go Consolidates all prepare*Request functions into a generic prepareRequest[T]; removes handlerStore field; adds resolveModelAndProvider for catalog-backed provider auto-resolution. Refactoring is clean and functionally equivalent to prior code.
transports/bifrost-http/handlers/asyncinference.go All prepare*Request calls updated to pass h.config — mechanical update, no logic changes.
transports/bifrost-http/integrations/router.go Removes bare return at end of handleAsyncRetrieve and reorders an import — trivial cleanup, no behavioral change.

Reviews (8): Last reviewed commit: "feat: added default provider selection o..." | Re-trigger Greptile

Comment thread transports/bifrost-http/lib/ctx.go
Comment thread transports/bifrost-http/lib/ctx.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
transports/bifrost-http/handlers/inference.go (1)

1149-1158: ⚠️ Potential issue | 🟡 Minor

Avoid trimming rerank inputs in the transport layer.

strings.TrimSpace here rejects whitespace-only queries/documents before the provider sees them, which is stricter than the passthrough policy used in this package. Check only for empty strings and let the provider reject unsupported content.

Suggested fix
-	if strings.TrimSpace(req.Query) == "" {
+	if req.Query == "" {
 		return nil, nil, fmt.Errorf("query is required for rerank")
 	}
@@
-		if strings.TrimSpace(doc.Text) == "" {
+		if doc.Text == "" {
 			return nil, nil, fmt.Errorf("document text is required for rerank at index %d", i)
 		}
 	}

Based on learnings, "In the Bifrost HTTP handlers under transports/bifrost-http/handlers, implement minimal input validation for user-provided fields (e.g., prompts) and avoid trimming whitespace. Treat Bifrost as a passthrough gateway: forward inputs as-is to providers if accepted, and return provider rejections to the caller."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 1149 - 1158, The
current validation in the rerank handler uses strings.TrimSpace on req.Query and
doc.Text which rejects whitespace-only inputs; change these checks to only test
for empty string (e.g., req.Query == "" and doc.Text == "") so inputs are
forwarded verbatim to providers; update the three validations that reference
strings.TrimSpace(req.Query), len(req.Documents) (keep this check), and
strings.TrimSpace(doc.Text) to use plain empty-string checks and keep the same
error messages and returned fmt.Errorf values so provider rejections are
preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 957-971: The code currently only consumes and deletes the legacy
"max_tokens" alias when req.ChatParameters.MaxCompletionTokens is nil, leaving
the alias in base.ExtraParams if the canonical field is already set; change the
logic in the block around base.ExtraParams handling so you always detect and
delete "max_tokens" from base.ExtraParams (using delete(base.ExtraParams,
"max_tokens")) regardless of whether req.ChatParameters.MaxCompletionTokens is
already set, and add an explicit conflict check in that same block comparing the
parsed max_tokens value to *req.ChatParameters.MaxCompletionTokens and
return/reject or log an error if they differ; ensure
req.ChatParameters.ExtraParams is still assigned to base.ExtraParams after this
removal.

In `@transports/bifrost-http/lib/ctx.go`:
- Around line 191-204: The block in ConvertToBifrostContext reads
FastHTTPUserValueModelCatalogResolution from the fasthttp context and
unconditionally appends logs/markers, causing duplicates on repeated
conversions; after you append the routing log via
bifrostCtx.AppendRoutingEngineLog and add the marker via
schemas.AppendToContextList (and only if res != nil), clear the stored
resolution from the fasthttp context (e.g., call
ctx.SetUserValue(FastHTTPUserValueModelCatalogResolution, nil) or otherwise
remove it) so subsequent ConvertToBifrostContext calls won't reprocess the same
ModelCatalogResolution and won't create duplicate entries; alternatively, you
can guard by checking BifrostContextKeyRoutingEnginesUsed in bifrostCtx before
appending, but prefer clearing FastHTTPUserValueModelCatalogResolution
immediately after use to keep behavior local to ModelCatalogResolution handling.

---

Outside diff comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 1149-1158: The current validation in the rerank handler uses
strings.TrimSpace on req.Query and doc.Text which rejects whitespace-only
inputs; change these checks to only test for empty string (e.g., req.Query == ""
and doc.Text == "") so inputs are forwarded verbatim to providers; update the
three validations that reference strings.TrimSpace(req.Query),
len(req.Documents) (keep this check), and strings.TrimSpace(doc.Text) to use
plain empty-string checks and keep the same error messages and returned
fmt.Errorf values so provider rejections are preserved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 566d65bf-9ed2-4310-a0ce-9db336474502

📥 Commits

Reviewing files that changed from the base of the PR and between 5bf2a03 and bad7b10.

📒 Files selected for processing (6)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md

Comment thread transports/bifrost-http/handlers/inference.go
Comment thread transports/bifrost-http/lib/ctx.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_add_request_level_support_for_disable_content_storage_in_loggin_plugin branch from 5bf2a03 to da454c3 Compare April 27, 2026 08:49
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_added_default_provider_selection_on_inference_routes branch from bad7b10 to 75a20e2 Compare April 27, 2026 08:49
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
transports/bifrost-http/handlers/inference.go (1)

1362-1421: ⚠️ Potential issue | 🟠 Major

Multipart transcription requests lost fallbacks handling.

This function now resolves the primary model through resolveModelAndProvider, but it never reads form.Value["fallbacks"] or propagates parsed fallbacks into the returned transcription request. JSON routes still preserve fallbacks via prepareRequest, and the other multipart image paths keep their fallback parsing, so transcriptions now silently lose failover behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 1362 - 1421,
prepareTranscriptionRequest is parsing multipart form data but never reads or
propagates form.Value["fallbacks"], so transcriptions lose configured failovers;
update prepareTranscriptionRequest to read the "fallbacks" form value (if
present), JSON-unmarshal it into the same fallback type used elsewhere (the
Bifrost request's fallback type), and assign it to the returned
*schemas.BifrostTranscriptionRequest (the Fallbacks field) before returning;
reuse the same parsing/validation logic as prepareRequest/other multipart
handlers and ensure any unmarshal error is returned (reference:
prepareTranscriptionRequest, resolveModelAndProvider, and
schemas.BifrostTranscriptionRequest.Fallbacks).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 68-82: The current auto-resolution selects providers[0] from
mc.GetProvidersForModel(modelName) which can pick providers not enabled in this
deployment; fix by filtering that providers slice against the deployment's
enabled providers from config.GetAvailableProviders() before choosing a match.
In the block using config.ModelCatalog and mc.GetProvidersForModel, call
config.GetAvailableProviders(), compute the intersection (keep only providers
present/allowed by GetAvailableProviders()), replace AllProviders with the
filtered list, set ResolvedProvider and provider to the first element of the
filtered list, and return the same "no providers found" error if the filtered
list is empty.

---

Outside diff comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 1362-1421: prepareTranscriptionRequest is parsing multipart form
data but never reads or propagates form.Value["fallbacks"], so transcriptions
lose configured failovers; update prepareTranscriptionRequest to read the
"fallbacks" form value (if present), JSON-unmarshal it into the same fallback
type used elsewhere (the Bifrost request's fallback type), and assign it to the
returned *schemas.BifrostTranscriptionRequest (the Fallbacks field) before
returning; reuse the same parsing/validation logic as prepareRequest/other
multipart handlers and ensure any unmarshal error is returned (reference:
prepareTranscriptionRequest, resolveModelAndProvider, and
schemas.BifrostTranscriptionRequest.Fallbacks).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07eae189-fe4f-4d89-bc58-a3cdb2965456

📥 Commits

Reviewing files that changed from the base of the PR and between bad7b10 and 75a20e2.

📒 Files selected for processing (6)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (4)
  • core/schemas/bifrost.go
  • transports/bifrost-http/integrations/router.go
  • transports/changelog.md
  • transports/bifrost-http/handlers/asyncinference.go

Comment thread transports/bifrost-http/handlers/inference.go Outdated
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_add_request_level_support_for_disable_content_storage_in_loggin_plugin branch from da454c3 to c2b6cc9 Compare April 27, 2026 09:57
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_added_default_provider_selection_on_inference_routes branch from 75a20e2 to ab521e0 Compare April 27, 2026 09:57
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
transports/bifrost-http/lib/ctx.go (1)

491-496: Remove the duplicated x-bf-disable-content-logging branch.

The same header handler already exists at Lines 511-516, so this copy is dead once the first match returns. Keeping both branches makes later changes easy to miss.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/lib/ctx.go` around lines 491 - 496, Remove the
duplicated header branch that checks keyStr == "x-bf-disable-content-logging" by
deleting the earlier block (the one that calls strconv.ParseBool(string(value))
and then bifrostCtx.SetValue(schemas.BifrostContextKeyDisableContentLogging, b)
and returns true), leaving only the single canonical handler (the later branch
at lines ~511-516) to avoid dead code and future maintenance mistakes; ensure
you keep the remaining handler that performs strconv.ParseBool and sets
schemas.BifrostContextKeyDisableContentLogging via bifrostCtx.SetValue.
transports/bifrost-http/handlers/inference.go (1)

72-81: Make bare-model auto-resolution choose a provider explicitly.

Picking providers[0] turns the catalog's returned slice order into routing policy. If that order ever changes, identical bare-model requests can silently switch providers. Please normalize the list first here—stable sort or config-backed priority—before persisting ResolvedProvider and logging “selecting first”.

Based on learnings: (*ModelCatalog).GetProvidersForModel(model) uses the modelPool map it iterates over.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 72 - 81, The
current bare-model auto-resolution in inference.go uses the catalog-returned
slice order (providers := mc.GetProvidersForModel(modelName)) and then persists
providers[0] into ModelCatalogResolution and provider, which can silently change
routing; fix this by normalizing/sorting the providers slice deterministically
before selecting a winner (e.g., stable sort by a unique provider field such as
Provider.Name or ID or apply a config-backed priority mapping), then set
ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution,
&lib.ModelCatalogResolution{ResolvedProvider: providers[0], AllProviders:
providers}) and log the selected provider explicitly so selection is
reproducible and auditable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 72-81: The current bare-model auto-resolution in inference.go uses
the catalog-returned slice order (providers :=
mc.GetProvidersForModel(modelName)) and then persists providers[0] into
ModelCatalogResolution and provider, which can silently change routing; fix this
by normalizing/sorting the providers slice deterministically before selecting a
winner (e.g., stable sort by a unique provider field such as Provider.Name or ID
or apply a config-backed priority mapping), then set
ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution,
&lib.ModelCatalogResolution{ResolvedProvider: providers[0], AllProviders:
providers}) and log the selected provider explicitly so selection is
reproducible and auditable.

In `@transports/bifrost-http/lib/ctx.go`:
- Around line 491-496: Remove the duplicated header branch that checks keyStr ==
"x-bf-disable-content-logging" by deleting the earlier block (the one that calls
strconv.ParseBool(string(value)) and then
bifrostCtx.SetValue(schemas.BifrostContextKeyDisableContentLogging, b) and
returns true), leaving only the single canonical handler (the later branch at
lines ~511-516) to avoid dead code and future maintenance mistakes; ensure you
keep the remaining handler that performs strconv.ParseBool and sets
schemas.BifrostContextKeyDisableContentLogging via bifrostCtx.SetValue.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 58e63db4-c8a1-4e2e-a721-62a7f2432dcc

📥 Commits

Reviewing files that changed from the base of the PR and between 75a20e2 and ab521e0.

📒 Files selected for processing (6)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (4)
  • transports/bifrost-http/integrations/router.go
  • core/schemas/bifrost.go
  • transports/changelog.md
  • transports/bifrost-http/handlers/asyncinference.go

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_add_request_level_support_for_disable_content_storage_in_loggin_plugin branch from c2b6cc9 to 6e1fb4a Compare April 27, 2026 11:33
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_added_default_provider_selection_on_inference_routes branch from ab521e0 to 25b640f Compare April 27, 2026 11:33
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
transports/bifrost-http/lib/ctx.go (1)

491-496: Remove the duplicate x-bf-disable-content-logging branch.

This block duplicates the existing handler at Line 511, so the later branch is now unreachable. Keeping a single branch here avoids drift the next time this header logic changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/lib/ctx.go` around lines 491 - 496, The duplicate
header-handling branch for "x-bf-disable-content-logging" should be removed:
locate the branch that checks if keyStr == "x-bf-disable-content-logging" (the
block that parses strconv.ParseBool(value) and calls bifrostCtx.SetValue with
schemas.BifrostContextKeyDisableContentLogging) and delete this
earlier/duplicate occurrence so only the remaining handler (the one at the later
location) remains; ensure no other logic in that branch is needed before
deletion and run tests to confirm behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 959-968: The code currently truncates fractional JSON numbers for
the legacy "max_tokens" alias; update the logic in extractExtraParams handling
base.ExtraParams["max_tokens"] so that if the value is a float64 you verify it
has no fractional part (e.g., value == math.Trunc(value) or value%1 == 0) and
return a validation error instead of converting/truncating when it does; for
integer types keep the existing conversion to set
req.ChatParameters.MaxCompletionTokens and still delete the "max_tokens" key
from base.ExtraParams. Ensure the function returns an appropriate error on
fractional values rather than silently assigning a truncated int (references:
base.ExtraParams, "max_tokens", req.ChatParameters.MaxCompletionTokens,
extractExtraParams).
- Around line 72-81: GetProvidersForModel returns providers from a map-backed
pool so choosing providers[0] is non-deterministic; before setting
ctx.ModelCatalogResolution.ResolvedProvider and local variable provider in
inference.go, sort or deterministically rank the providers slice (e.g., by
provider identifier string or Provider.Name) so the same candidate is always
chosen; update the resolution code that calls GetProvidersForModel (and sets
ResolvedProvider, AllProviders, and provider) to pick the first element after
applying that stable sort/ranking, referencing GetProvidersForModel,
ModelCatalog, ModelCatalogResolution, and the local provider variable.

---

Nitpick comments:
In `@transports/bifrost-http/lib/ctx.go`:
- Around line 491-496: The duplicate header-handling branch for
"x-bf-disable-content-logging" should be removed: locate the branch that checks
if keyStr == "x-bf-disable-content-logging" (the block that parses
strconv.ParseBool(value) and calls bifrostCtx.SetValue with
schemas.BifrostContextKeyDisableContentLogging) and delete this
earlier/duplicate occurrence so only the remaining handler (the one at the later
location) remains; ensure no other logic in that branch is needed before
deletion and run tests to confirm behavior unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 84752e29-793f-4749-9ba7-b11431106c74

📥 Commits

Reviewing files that changed from the base of the PR and between ab521e0 and 25b640f.

📒 Files selected for processing (6)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (3)
  • core/schemas/bifrost.go
  • transports/changelog.md
  • transports/bifrost-http/integrations/router.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • transports/bifrost-http/handlers/asyncinference.go

Comment thread transports/bifrost-http/handlers/inference.go Outdated
Comment thread transports/bifrost-http/handlers/inference.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_added_default_provider_selection_on_inference_routes branch from 25b640f to d9c76a7 Compare April 27, 2026 12:18
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_add_request_level_support_for_disable_content_storage_in_loggin_plugin branch from 6e1fb4a to cb62c54 Compare April 27, 2026 12:18
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
transports/bifrost-http/handlers/inference.go (2)

955-964: ⚠️ Potential issue | 🟡 Minor

Silent truncation of fractional max_tokens values.

The legacy max_tokens alias handling converts float64 values to int via truncation (line 960: int(maxTokensFloat)). A fractional value like 1.5 silently becomes 1. Since max_tokens should only accept whole numbers, consider validating that the value has no fractional part and returning an error for invalid inputs.

🔧 Suggested validation
 			if req.ChatParameters.MaxCompletionTokens == nil {
 				if maxTokensFloat, ok := maxTokensVal.(float64); ok {
+					if maxTokensFloat != float64(int(maxTokensFloat)) {
+						return nil, nil, fmt.Errorf("max_tokens must be a whole number, got %v", maxTokensFloat)
+					}
 					maxTokens := int(maxTokensFloat)
 					req.ChatParameters.MaxCompletionTokens = &maxTokens
 				} else if maxTokensInt, ok := maxTokensVal.(int); ok {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 955 - 964, The
handler currently silently truncates fractional max_tokens when reading
base.ExtraParams["max_tokens"]; update the logic in the block handling
base.ExtraParams -> max_tokens to validate that numeric values are whole
integers: if the value is a float64, check that math.Modf or equivalent shows
zero fractional part and return a clear error (e.g., "max_tokens must be an
integer") if not; only then convert to int and assign to
req.ChatParameters.MaxCompletionTokens (also preserve existing int handling and
still delete the key from base.ExtraParams).

67-78: ⚠️ Potential issue | 🟡 Minor

Non-deterministic provider selection from map-backed catalog.

GetProvidersForModel iterates over a map-backed modelPool, so providers[0] is not stable across requests or process restarts. The same bare model can resolve to different providers, causing flapping in routing logs and potentially inconsistent request routing.

Consider sorting or deterministically ranking candidates before selecting the first element.

🔧 Suggested fix
 	if provider == "" {
 		providers := config.GetProvidersForModel(modelName)
 		if len(providers) == 0 {
 			return "", "", fmt.Errorf("provider is required in model field (format: provider/model) — no providers found for model %q in model catalog", modelName)
 		}
+		// Sort providers for deterministic selection
+		sort.Slice(providers, func(i, j int) bool {
+			return string(providers[i]) < string(providers[j])
+		})
 		ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution, &lib.ModelCatalogResolution{
 			Model:            modelName,
 			ResolvedProvider: providers[0],
 			AllProviders:     providers,
 		})
 		provider = providers[0]
 	}

Note: You'll need to add "sort" to imports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 67 - 78,
GetProvidersForModel currently returns a slice built from a map-backed modelPool
so providers[0] is non-deterministic; change the selection to a deterministic
one by sorting or applying a stable ranking before assigning provider.
Specifically, in the block that calls GetProvidersForModel(modelName) and uses
providers[0] (and sets
ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution,
&lib.ModelCatalogResolution{...})), sort the providers slice (or apply a
deterministic comparator) and then use the first element as the resolved
provider and for provider = providers[0]; update imports to include "sort" (or
the comparator helper) accordingly.
🧹 Nitpick comments (1)
transports/bifrost-http/lib/ctx.go (1)

491-496: Remove duplicated x-bf-disable-content-logging handling block.

Line [491]-[496] already handles this header and returns early, so the second branch at Line [511]-[516] is unreachable and should be removed.

♻️ Proposed cleanup
@@
-		if keyStr == "x-bf-disable-content-logging" {
-			if b, err := strconv.ParseBool(string(value)); err == nil {
-				bifrostCtx.SetValue(schemas.BifrostContextKeyDisableContentLogging, b)
-			}
-			return true
-		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/lib/ctx.go` around lines 491 - 496, A duplicate
branch handling the "x-bf-disable-content-logging" header exists — remove the
redundant second block that parses the header and sets
schemas.BifrostContextKeyDisableContentLogging (the branch that checks keyStr ==
"x-bf-disable-content-logging", parses with strconv.ParseBool and calls
bifrostCtx.SetValue) so only the original early-return handling remains; after
removal ensure the surrounding header-processing flow and return behavior
unaffected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 955-964: The handler currently silently truncates fractional
max_tokens when reading base.ExtraParams["max_tokens"]; update the logic in the
block handling base.ExtraParams -> max_tokens to validate that numeric values
are whole integers: if the value is a float64, check that math.Modf or
equivalent shows zero fractional part and return a clear error (e.g.,
"max_tokens must be an integer") if not; only then convert to int and assign to
req.ChatParameters.MaxCompletionTokens (also preserve existing int handling and
still delete the key from base.ExtraParams).
- Around line 67-78: GetProvidersForModel currently returns a slice built from a
map-backed modelPool so providers[0] is non-deterministic; change the selection
to a deterministic one by sorting or applying a stable ranking before assigning
provider. Specifically, in the block that calls GetProvidersForModel(modelName)
and uses providers[0] (and sets
ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution,
&lib.ModelCatalogResolution{...})), sort the providers slice (or apply a
deterministic comparator) and then use the first element as the resolved
provider and for provider = providers[0]; update imports to include "sort" (or
the comparator helper) accordingly.

---

Nitpick comments:
In `@transports/bifrost-http/lib/ctx.go`:
- Around line 491-496: A duplicate branch handling the
"x-bf-disable-content-logging" header exists — remove the redundant second block
that parses the header and sets schemas.BifrostContextKeyDisableContentLogging
(the branch that checks keyStr == "x-bf-disable-content-logging", parses with
strconv.ParseBool and calls bifrostCtx.SetValue) so only the original
early-return handling remains; after removal ensure the surrounding
header-processing flow and return behavior unaffected.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4cf7af02-3735-4a1c-b55c-b1144d44a72d

📥 Commits

Reviewing files that changed from the base of the PR and between 25b640f and d9c76a7.

📒 Files selected for processing (7)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (3)
  • core/schemas/bifrost.go
  • transports/bifrost-http/integrations/router.go
  • transports/changelog.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • transports/bifrost-http/handlers/asyncinference.go

coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 27, 2026
Comment thread transports/bifrost-http/lib/config.go
Comment thread transports/bifrost-http/handlers/inference.go
Copy link
Copy Markdown
Contributor

akshaydeo commented Apr 28, 2026

Merge activity

@akshaydeo akshaydeo changed the base branch from 04-26-feat_add_request_level_support_for_disable_content_storage_in_loggin_plugin to graphite-base/3067 April 28, 2026 07:03
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 04-26-feat_added_default_provider_selection_on_inference_routes branch from d9c76a7 to 960f72c Compare April 28, 2026 07:07
@akshaydeo akshaydeo changed the base branch from graphite-base/3067 to main April 28, 2026 07:11
@akshaydeo akshaydeo dismissed coderabbitai[bot]’s stale review April 28, 2026 07:11

The base branch was changed.

@akshaydeo akshaydeo force-pushed the 04-26-feat_added_default_provider_selection_on_inference_routes branch from 960f72c to 35e9e2c Compare April 28, 2026 07:13
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
transports/bifrost-http/handlers/inference.go (2)

67-77: ⚠️ Potential issue | 🟠 Major

Make bare-model auto-resolution deterministic.

Picking providers[0] here makes the resolved provider depend on whatever order GetProvidersForModel returns. If that slice is assembled from map iteration, the selected provider—and the model-catalog routing log—can flap between equivalent candidates.

#!/bin/bash
set -euo pipefail

# Verify how GetProvidersForModel builds its result and whether map iteration
# influences the returned provider order.
rg -n -C3 'func .*GetProvidersForModel|for .*range .*modelPool|modelPool' framework/modelcatalog

Expected result: if the returned slice is populated from a range over a map-backed structure, sort or explicitly rank providers before storing ResolvedProvider and assigning provider = providers[0].

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 67 - 77,
GetProvidersForModel can return providers in nondeterministic order, so before
setting ModelCatalogResolution.ResolvedProvider and assigning provider =
providers[0] you must deterministically choose a provider: sort or apply an
explicit ranking to the returned providers slice (the one assigned to the local
variable providers from GetProvidersForModel) and then set
ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution,
&lib.ModelCatalogResolution{Model: modelName, ResolvedProvider: providers[0],
AllProviders: providers}) and provider = providers[0] using that sorted/ranked
slice; ensure the sorting/ranking is deterministic (e.g., alphabetical or
configured priority) so ResolvedProvider and the routing log are stable.

955-961: ⚠️ Potential issue | 🟡 Minor

Reject fractional max_tokens aliases instead of truncating them.

Line 960 silently turns values like 1.5 into 1. The legacy alias should only accept whole numbers and return a validation error otherwise.

Suggested fix
+import "math"
@@
 			delete(base.ExtraParams, "max_tokens")
 			if req.ChatParameters.MaxCompletionTokens == nil {
 				if maxTokensFloat, ok := maxTokensVal.(float64); ok {
+					if maxTokensFloat != math.Trunc(maxTokensFloat) {
+						return nil, nil, fmt.Errorf("max_tokens must be an integer")
+					}
 					maxTokens := int(maxTokensFloat)
 					req.ChatParameters.MaxCompletionTokens = &maxTokens
 				} else if maxTokensInt, ok := maxTokensVal.(int); ok {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/handlers/inference.go` around lines 955 - 961, The
code handling the legacy alias base.ExtraParams["max_tokens"] currently
truncates fractional values (e.g., 1.5 -> 1); update this in the inference
handler so fractional values are rejected with a validation error instead of
being silently truncated. Locate the block that inspects maxTokensVal (inference
handler around base.ExtraParams, maxTokensVal and
req.ChatParameters.MaxCompletionTokens), and change the logic to: only accept
whole integers (int types or float64 whose fractional part is zero), and if
maxTokensVal is a float64 with non-zero fractional part (or any non-integer
type), return/propagate a validation error indicating max_tokens must be a whole
number; do not delete the alias from base.ExtraParams or set MaxCompletionTokens
when the value is invalid. Ensure the error uses the same validation/error flow
as other parameter checks in this handler.
🧹 Nitpick comments (1)
transports/bifrost-http/lib/ctx.go (1)

505-510: Remove the second x-bf-disable-content-logging branch.

This adds a new handler for the header, but the same key is still handled again at Line 525. The lower branch is now unreachable, so future edits can diverge unless there is a single parse path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/lib/ctx.go` around lines 505 - 510, There is a
duplicate header handler for "x-bf-disable-content-logging"; remove the second
branch so the header is parsed in exactly one place. Locate the duplicated
conditional that checks keyStr == "x-bf-disable-content-logging" (the branch
that calls strconv.ParseBool on value and invokes bifrostCtx.SetValue with
schemas.BifrostContextKeyDisableContentLogging) and delete the redundant block,
keeping the original parse-and-set logic and its return true behavior intact so
only one code path handles that header.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@transports/bifrost-http/handlers/inference.go`:
- Around line 67-77: GetProvidersForModel can return providers in
nondeterministic order, so before setting
ModelCatalogResolution.ResolvedProvider and assigning provider = providers[0]
you must deterministically choose a provider: sort or apply an explicit ranking
to the returned providers slice (the one assigned to the local variable
providers from GetProvidersForModel) and then set
ctx.SetUserValue(lib.FastHTTPUserValueModelCatalogResolution,
&lib.ModelCatalogResolution{Model: modelName, ResolvedProvider: providers[0],
AllProviders: providers}) and provider = providers[0] using that sorted/ranked
slice; ensure the sorting/ranking is deterministic (e.g., alphabetical or
configured priority) so ResolvedProvider and the routing log are stable.
- Around line 955-961: The code handling the legacy alias
base.ExtraParams["max_tokens"] currently truncates fractional values (e.g., 1.5
-> 1); update this in the inference handler so fractional values are rejected
with a validation error instead of being silently truncated. Locate the block
that inspects maxTokensVal (inference handler around base.ExtraParams,
maxTokensVal and req.ChatParameters.MaxCompletionTokens), and change the logic
to: only accept whole integers (int types or float64 whose fractional part is
zero), and if maxTokensVal is a float64 with non-zero fractional part (or any
non-integer type), return/propagate a validation error indicating max_tokens
must be a whole number; do not delete the alias from base.ExtraParams or set
MaxCompletionTokens when the value is invalid. Ensure the error uses the same
validation/error flow as other parameter checks in this handler.

---

Nitpick comments:
In `@transports/bifrost-http/lib/ctx.go`:
- Around line 505-510: There is a duplicate header handler for
"x-bf-disable-content-logging"; remove the second branch so the header is parsed
in exactly one place. Locate the duplicated conditional that checks keyStr ==
"x-bf-disable-content-logging" (the branch that calls strconv.ParseBool on value
and invokes bifrostCtx.SetValue with
schemas.BifrostContextKeyDisableContentLogging) and delete the redundant block,
keeping the original parse-and-set logic and its return true behavior intact so
only one code path handles that header.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a9c78543-0872-4d17-8487-81e1f75b1642

📥 Commits

Reviewing files that changed from the base of the PR and between d9c76a7 and 960f72c.

📒 Files selected for processing (7)
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/asyncinference.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/ctx.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (5)
  • core/schemas/bifrost.go
  • transports/bifrost-http/integrations/router.go
  • transports/changelog.md
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/handlers/asyncinference.go

@akshaydeo akshaydeo merged commit be7e011 into main Apr 28, 2026
13 of 17 checks passed
@akshaydeo akshaydeo deleted the 04-26-feat_added_default_provider_selection_on_inference_routes branch April 28, 2026 07:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants