
feat: refactor model catalog pricing engine with unified cost calculation#1800

Merged
Pratham-Mishra04 merged 1 commit into main from 02-26-refactor_pricing_module_refactor
Mar 13, 2026

Conversation

@Pratham-Mishra04
Collaborator

Summary

Refactors the Model Catalog pricing system to consolidate cost calculation into a single unified entry point (CalculateCost) that handles all request types, semantic cache billing, and tiered pricing automatically. This replaces the previous multi-method approach with a cleaner, more maintainable architecture.

Changes

  • Unified cost calculation: Replaced CalculateCostFromUsage and CalculateCostWithCacheDebug with a single CalculateCost method that handles all scenarios
  • Restructured pricing schema: Reorganized TableModelPricing and PricingEntry structs with logical groupings (Text, Cache, Image, Audio/Video, Other costs) and updated field names to match datasheet schema
  • Enhanced pricing support: Added support for video processing, reranking, priority pricing, pixel-based image costs, and search query costs
  • Improved tiered pricing: Updated threshold from 128k to 200k tokens and simplified tiered rate logic
  • Better fallback handling: Enhanced pricing resolution with deployment fallback and improved provider-specific fallbacks
  • Comprehensive test coverage: Added extensive unit tests for all cost computation functions and integration scenarios
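The tiered-pricing change above can be sketched as follows. This is a minimal illustration of threshold-based rate selection; the struct and function names are assumptions, not the actual fields of framework/modelcatalog:

```go
package main

import "fmt"

// Illustrative pricing fields; the real TableModelPricing carries many more.
type pricing struct {
	InputCostPerToken          float64
	InputCostPerTokenAbove200k *float64 // tiered rate, nil if the model is not tiered
}

// tieredInputRate returns the per-token input rate, switching to the
// above-200k rate once the request crosses the threshold.
func tieredInputRate(p pricing, totalTokens int) float64 {
	const tierThreshold = 200_000 // raised from 128k in this refactor
	if totalTokens > tierThreshold && p.InputCostPerTokenAbove200k != nil {
		return *p.InputCostPerTokenAbove200k
	}
	return p.InputCostPerToken
}

func main() {
	high := 0.00002
	p := pricing{InputCostPerToken: 0.00001, InputCostPerTokenAbove200k: &high}
	fmt.Println(tieredInputRate(p, 150_000)) // base rate below the threshold
	fmt.Println(tieredInputRate(p, 250_000)) // tiered rate above the threshold
}
```

Keeping the threshold in one helper means call sites never branch on tier boundaries themselves.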

Key design decisions:

  • Per-request-type compute functions for cleaner separation of concerns
  • Usage extraction layer to normalize different response types
  • Preserved thread-safety and graceful error handling
  • Maintained backward compatibility for existing pricing data
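The per-request-type dispatch described above can be sketched as a single entry point that normalizes usage and delegates to modality-specific calculators. The types and rates below are hypothetical stand-ins for the real Bifrost schemas:

```go
package main

import "fmt"

// Hypothetical request/usage types standing in for the real Bifrost schemas.
type requestType string

const (
	chatRequest  requestType = "chat"
	imageRequest requestType = "image"
	audioRequest requestType = "audio"
)

type usage struct {
	PromptTokens, CompletionTokens int
}

type response struct {
	Type  requestType
	Usage usage
}

// calculateCost is the single entry point: it reads normalized usage from
// the response, then dispatches to a per-request-type compute function.
func calculateCost(r response) float64 {
	switch r.Type {
	case chatRequest:
		return computeTextCost(r.Usage)
	case imageRequest:
		return computeImageCost(r.Usage)
	case audioRequest:
		return computeAudioCost(r.Usage)
	default:
		return 0 // unknown type: charge nothing rather than guess
	}
}

// Per-type calculators keep each modality's pricing rules separate.
func computeTextCost(u usage) float64 {
	return float64(u.PromptTokens)*0.00001 + float64(u.CompletionTokens)*0.00003
}
func computeImageCost(u usage) float64 { return float64(u.CompletionTokens) * 0.00004 }
func computeAudioCost(u usage) float64 { return float64(u.PromptTokens) * 0.000006 }

func main() {
	fmt.Println(calculateCost(response{Type: chatRequest, Usage: usage{1000, 500}}))
}
```

The design keeps graceful degradation for unknown types (zero cost) while each calculator stays small enough to test in isolation.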

Type of change

  • Refactor
  • Feature
  • Documentation

Affected areas

  • Core (Go)
  • Plugins
  • Docs

How to test

Validate the refactored pricing system:

# Core pricing tests
go test ./framework/modelcatalog/... -v

# Integration tests with plugins
go test ./plugins/governance/... -v
go test ./plugins/logging/... -v
go test ./plugins/telemetry/... -v

# Full test suite
go test ./...

# Verify documentation builds
cd docs && npm run build

Test scenarios to validate:

  • Basic chat completion pricing with GPT-4o
  • Claude 3.5 Sonnet with cache tokens and 200k tier
  • Semantic cache hits/misses with embedding costs
  • Image generation with per-image and token-based pricing
  • Audio/video processing with duration-based pricing
  • Provider fallbacks (Gemini→Vertex, Bedrock anthropic. prefix)
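The provider-fallback scenario in the last bullet can be sketched as a resolution chain. The table keys and helper name below are illustrative; they are not the actual resolvePricing implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// table simulates a pricing datasheet keyed by "provider/model".
var table = map[string]float64{
	"vertex/gemini-2.0-flash":     0.0000001,
	"anthropic/claude-3-5-sonnet": 0.000003,
}

// resolvePricing tries the exact provider/model pair first, then applies
// provider-specific fallbacks: gemini retries under vertex, and bedrock
// strips the "anthropic." model prefix and retries under anthropic.
func resolvePricing(provider, model string) (float64, bool) {
	if rate, ok := table[provider+"/"+model]; ok {
		return rate, true
	}
	switch provider {
	case "gemini":
		return resolvePricing("vertex", model)
	case "bedrock":
		if rest, ok := strings.CutPrefix(model, "anthropic."); ok {
			return resolvePricing("anthropic", rest)
		}
	}
	return 0, false
}

func main() {
	rate, ok := resolvePricing("gemini", "gemini-2.0-flash")
	fmt.Println(rate, ok) // resolved via the vertex fallback
}
```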

Breaking changes

  • No

The refactor maintains API compatibility. Existing CalculateCost calls continue to work unchanged, while the deprecated methods (CalculateCostFromUsage and CalculateCostWithCacheDebug) are removed internally.

Related issues

Addresses pricing system maintainability and adds support for new AI operation types including video processing and enhanced image generation pricing models.

Security considerations

No security implications. Changes are limited to internal cost calculation logic without affecting authentication, secrets, or data handling.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

Collaborator Author

Pratham-Mishra04 commented Feb 26, 2026

This stack of pull requests is managed by Graphite.

@github-actions
Contributor

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1800

Pratham-Mishra04 marked this pull request as draft on February 26, 2026 11:01
@coderabbitai
Contributor

coderabbitai Bot commented Feb 26, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07a9223f-cb84-4b40-ba6c-9718f47b3691

📥 Commits

Reviewing files that changed from the base of the PR and between 1df9e09 and 58e3f4a.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Logs now display request cost in the UI.
  • Improvements

    • Unified, cache-aware cost calculation with tiered pricing and richer multi‑modal rates (text, image, audio, video).
    • Improved pricing resolution with multi-step fallbacks and clearer cache semantics.
    • Request backfill for media/speech to improve media cost accuracy; streaming final-chunk cost now uses the unified calculation.
  • Tests

    • Comprehensive pricing test suite added.
  • Documentation

    • Consolidated Quick Start local build guidance.

Walkthrough

Replaces a flat pricing model with a detailed, categorized pricing schema; adds deployment-aware pricing resolution (resolvePricing) and a unified, cache-aware CalculateCost pipeline; updates DB migrations, response backfills, providers, streaming/tracing/logging call sites, tests, and docs.

Changes

Cohort / File(s) — Summary

  • Pricing Schema & DB (framework/configstore/tables/modelpricing.go, framework/configstore/migrations.go): Expanded TableModelPricing with many new public tiered/cache/media fields, added TableName() and a migration add_pricing_refactor_columns to add/drop the new columns.
  • Model Catalog API & Entry (framework/modelcatalog/main.go, framework/modelcatalog/utils.go, docs/architecture/framework/model-catalog.mdx): Reworked PricingEntry into segmented categories (Text, 128k/200k tiers, Cache, Image, Audio/Video, Other) with UnmarshalJSON; added resolvePricing(...) and documented unified CalculateCost and multi-step fallbacks.
  • Pricing Engine & Resolution (framework/modelcatalog/pricing.go, framework/modelcatalog/overrides.go, framework/modelcatalog/overrides_test.go): Introduced costInput, unified CalculateCost with a cache-aware branch (calculateCostWithCache) and calculateBaseCost; many per-request-type calculators and tiered-rate helpers; added resolvePricing and changed getPricing to return pricing+bool; removed many 128k override mappings.
  • Tests (framework/modelcatalog/pricing_test.go, framework/modelcatalog/overrides_test.go): Added extensive pricing tests across modalities, tiering, cache scenarios and fallbacks; some override tests now skipped or adjusted for new schema/fields.
  • Call Sites / Plugins (framework/streaming/*.go, framework/tracing/tracer.go, plugins/logging/operations.go, plugins/logging/main.go, plugins/telemetry/main.go, plugins/governance/main.go): Replaced CalculateCostWithCacheDebug usages with unified CalculateCost; logging plugin centralizes cost computation by building a BifrostResponse and delegating to pricing; removed some cache-specific helpers.
  • Schemas & Backfills (core/schemas/images.go, core/schemas/speech.go, core/bifrost.go): Added BackfillParams helpers and fields (NumInputImages, Size, InputChars) and helper funcs to derive request-derived values so responses/streams include cost-relevant parameters.
  • Providers (core/providers/...): Injected BackfillParams into final streaming chunks for speech/image/video; inlined/refactored HuggingFace streaming (removed helper), standardized logging, and (re)introduced Rerank stubs in some providers.
  • Schemas & Transport Validation (core/schemas/provider.go, transports/bifrost-http/handlers/providers.go): Removed several character-based override fields and their non-negativity validation entries.
  • UI (ui/app/workspace/logs/sheets/logDetailsSheet.tsx): Added Cost field to the Log Details Tokens section showing $<cost> or -.
  • Utilities & Conversions (framework/modelcatalog/utils.go, assorted files): Added normalizeStreamRequestType, expanded convertTableModelPricing↔PricingEntry conversions to populate the full pricing surface, and removed older safe-float fallbacks.

Sequence Diagram(s)

sequenceDiagram
    participant Handler as Request Handler
    participant Pricing as Pricing Manager
    participant Cache as Semantic Cache
    participant DB as Pricing DB/Resolver
    participant Calc as Cost Calculator

    Handler->>Pricing: CalculateCost(bifrostResult)
    Pricing->>Cache: inspect CacheDebug / hit state
    alt direct cache hit
        Cache-->>Pricing: direct hit (no cost)
        Pricing-->>Handler: cost = 0
    else semantic cache hit
        Cache-->>Pricing: semantic hit (embedding)
        Pricing->>DB: resolvePricing(provider, model, deployment, requestType)
        DB-->>Pricing: TableModelPricing (with fallbacks)
        Pricing->>Calc: computeCacheEmbeddingCost(...)
        Calc-->>Pricing: embedding_cost
        Pricing-->>Handler: cost = embedding_cost
    else cache miss / no cache debug
        Pricing->>DB: resolvePricing(provider, model, deployment, requestType)
        DB-->>Pricing: TableModelPricing (with fallbacks)
        Pricing->>Calc: dispatch per-request-type calculator (text/image/audio/video/embedding/etc.)
        Calc-->>Pricing: computed_cost
        Pricing-->>Handler: computed_cost
    end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐰 I hopped through structs and tidy fields,
Tiers and caches counted in careful yields,
Backfills whisper request details small,
One CalculateCost now answers all,
A rabbit nods — pricing stands tall!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 37.75%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check — ✅ Passed: The title clearly and concisely summarizes the main change: refactoring the model catalog pricing engine with unified cost calculation, which aligns with the changeset.
  • Description check — ✅ Passed: The PR description comprehensively covers all required sections: summary, detailed changes, type of change, affected areas, testing instructions, breaking changes, related issues, security considerations, and a complete checklist.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/governance/main.go (1)

555-555: ⚠️ Potential issue | 🟠 Major

Replace raw context key "model" with a typed key constant.

ctx.SetValue("model", ...) uses a plain string key at lines 555 and 694. Define a typed context key constant (e.g., governanceModelContextKey schemas.BifrostContextKey = "bf-governance-model") at the top of the file alongside other governance keys, then use it consistently in both set/get operations. This follows the pattern already used for governanceRejectedContextKey, governanceIsCacheReadContextKey, and governanceIsBatchContextKey.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/governance/main.go` at line 555, Replace the raw string key "model"
in ctx.SetValue and any ctx.Get/Lookup usages with a typed context key constant:
declare governanceModelContextKey of type schemas.BifrostContextKey (e.g.,
governanceModelContextKey := schemas.BifrostContextKey("bf-governance-model")
next to the other governance* keys), then update ctx.SetValue("model",
newModelWithRequestSuffix) to ctx.SetValue(governanceModelContextKey,
newModelWithRequestSuffix) and any corresponding retrievals to use
governanceModelContextKey so both set/get use the typed constant (also update
the other occurrence around where ctx is read at the second spot referenced).
🧹 Nitpick comments (2)
plugins/logging/operations.go (1)

772-780: Avoid silently treating unknown request types as chat.

Line 773’s fallback can produce incorrect costs for newly introduced modalities. Prefer returning an explicit unsupported-type error and skipping recalculation for that row.

♻️ Suggested refactor
-func buildResponseForRequestType(requestType schemas.RequestType, usage *schemas.BifrostLLMUsage, extra schemas.BifrostResponseExtraFields) *schemas.BifrostResponse {
+func buildResponseForRequestType(requestType schemas.RequestType, usage *schemas.BifrostLLMUsage, extra schemas.BifrostResponseExtraFields) (*schemas.BifrostResponse, error) {
 	switch requestType {
 	...
 	default:
-		// Default to chat response for unknown or chat request types
-		return &schemas.BifrostResponse{
-			ChatResponse: &schemas.BifrostChatResponse{
-				Usage:       usage,
-				ExtraFields: extra,
-			},
-		}
+		return nil, fmt.Errorf("unsupported request type: %s", requestType)
 	}
 }
-resp := buildResponseForRequestType(requestType, usage, extraFields)
-return p.pricingManager.CalculateCost(resp), nil
+resp, err := buildResponseForRequestType(requestType, usage, extraFields)
+if err != nil {
+	return 0, err
+}
+return p.pricingManager.CalculateCost(resp), nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 772 - 780, The default branch
currently maps unknown request types to schemas.BifrostResponse with a
BifrostChatResponse, which can produce incorrect cost calculations; instead,
modify the switch default in the function returning *schemas.BifrostResponse to
return an explicit unsupported-type error (or a nil response plus an error) so
the caller can skip recalculation for that row and log the unsupported modality;
update callers that expect a non-nil response to handle the error/empty result
accordingly and ensure any code that uses BifrostChatResponse (e.g., usage
recalculation) is not invoked for unsupported types.
framework/modelcatalog/pricing_test.go (1)

16-17: Use bifrost.Ptr() for pointer literals to match repo convention.

The local ptr/intPtr helpers reintroduce &value usage in tests; this repo prefers bifrost.Ptr(...) consistently across codepaths.

Based on learnings: In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically.
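The convention can be sketched with a generic helper; this is an assumed shape of bifrost.Ptr, not its actual definition:

```go
package main

import "fmt"

// Ptr returns a pointer to any value, so tests can write Ptr(0.5)
// instead of declaring a variable just to take its address.
func Ptr[T any](v T) *T { return &v }

func main() {
	rate := Ptr(0.0000025) // *float64 without a throwaway variable
	n := Ptr(42)           // *int
	fmt.Println(*rate, *n)
}
```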

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Replace the
local pointer helper functions ptr and intPtr with calls to the repository
utility bifrost.Ptr to match project conventions; locate the functions named ptr
and intPtr and remove them, then update test usages to call
bifrost.Ptr(float64Value) and bifrost.Ptr(intValue) (or the appropriate
generic/typed bifrost.Ptr variants) so all pointer literals use bifrost.Ptr(...)
consistently across the tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 199-200: The doc currently uses two different phrases for the same
behavior: "// Returns 0.0 if pricing data is not found (logs a warning)" and
another occurrence stating "logs a debug message"; pick one level and make them
consistent—update the phrase in model-catalog.mdx so both occurrences use the
same wording (e.g., replace "(logs a warning)" or "(logs a debug message)" so
both instances match), ensuring the change applies to the text fragments that
mention logging when pricing data is missing.

In `@framework/modelcatalog/overrides_test.go`:
- Line 26: Remove the unconditional t.Skip() calls that disable the
override-precedence test suite (located in overrides_test.go); instead enable
the tests by deleting those t.Skip() lines or, if skipping is required
conditionally, replace them with a conditional check and t.Skipf including a
clear reason (e.g., check for required env/config) so that functions like the
override-precedence test cases run in CI and only skip when prerequisites are
missing.

In `@framework/modelcatalog/pricing.go`:
- Around line 321-335: The audio-duration branch currently only uses
InputCostPerAudioPerSecond/InputCostPerSecond for input and always computes
outputCost from CompletionTokens, which misses models billed by generated audio
duration; in the function handling audioSeconds, when computing outputCost check
if pricing.OutputCostPerSecond is non-nil and, if so, compute outputCost =
float64(*audioSeconds) * *pricing.OutputCostPerSecond (falling back to
float64(usage.CompletionTokens) * pricing.OutputCostPerToken only if
OutputCostPerSecond is nil and usage is non-nil), ensure nil checks for usage
and pricing.OutputCostPerSecond, then return inputCost + outputCost.
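The duration-based output billing described in that entry can be sketched as below; the field names follow the comment's wording and are assumptions about the real pricing.go:

```go
package main

import "fmt"

type audioPricing struct {
	InputCostPerSecond  float64
	OutputCostPerSecond *float64 // nil when the model bills output by tokens
	OutputCostPerToken  float64
}

// computeAudioCost prefers duration-based output billing when the model
// defines OutputCostPerSecond, falling back to completion tokens otherwise.
func computeAudioCost(p audioPricing, audioSeconds, completionTokens int) float64 {
	inputCost := float64(audioSeconds) * p.InputCostPerSecond
	var outputCost float64
	if p.OutputCostPerSecond != nil {
		outputCost = float64(audioSeconds) * *p.OutputCostPerSecond
	} else {
		outputCost = float64(completionTokens) * p.OutputCostPerToken
	}
	return inputCost + outputCost
}

func main() {
	perSec := 0.0002
	p := audioPricing{InputCostPerSecond: 0.0001, OutputCostPerSecond: &perSec, OutputCostPerToken: 0.00003}
	fmt.Println(computeAudioCost(p, 10, 500)) // billed by duration on both sides
}
```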
- Around line 400-450: computeImageCost currently omits pixel-based pricing
fields; update the function to apply InputCostPerPixel and OutputCostPerPixel
(and any size-tier image rates) when present: in the per-image branch (where it
checks pricing.InputCostPerImage / pricing.OutputCostPerImage and computes
inputCost/outputCost) add logic to compute pixel-based costs using
imageUsage.InputTokensDetails/OutputTokensDetails image dimensions or n-pixel
counts and multiply by pricing.InputCostPerPixel / pricing.OutputCostPerPixel
(falling back to per-image costs when pixel fields are nil); likewise, when
doing token-based pricing (after computing
inputTextTokens/inputImageTokens/outputTextTokens/outputImageTokens and calling
tieredInputRate/tieredOutputRate) include pixel-cost components by converting
image token details to pixel counts or applying image-size-tier rates from
pricing before summing inputCost + outputCost; use the existing symbols
imageUsage, pricing, InputCostPerImage, OutputCostPerImage, InputCostPerPixel,
OutputCostPerPixel, InputTokensDetails, OutputTokensDetails, tieredInputRate and
tieredOutputRate to locate where to insert the pixel-pricing calculations.
- Around line 267-282: The subtraction that computes nonCachedPrompt and
nonCachedCompletion can produce negative values and underbill; change the
calculations in this block so nonCachedPrompt and nonCachedCompletion are
clamped to a minimum of zero (e.g., replace direct subtraction with max(0,
promptTokens - cachedPromptTokens) and max(0, completionTokens -
cachedCompletionTokens)), then use those clamped values when computing inputCost
and outputCost and when adding cached-token costs (still guard cached-cost
additions with pricing.CacheReadInputTokenCost and
pricing.CacheCreationInputTokenCost checks); update any references to
nonCachedPrompt/nonCachedCompletion so negative values are never used in billing
math.
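The clamping fix described in that entry can be illustrated with a minimal sketch (names are illustrative):

```go
package main

import "fmt"

// maxInt clamps token subtraction at zero so cached segments larger than
// the reported totals can never produce negative billable counts.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

func main() {
	promptTokens, cachedPromptTokens := 100, 130 // cache reports more than the total
	nonCachedPrompt := maxInt(0, promptTokens-cachedPromptTokens)
	fmt.Println(nonCachedPrompt) // 0, not -30
}
```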

---

Outside diff comments:
In `@plugins/governance/main.go`:
- Line 555: Replace the raw string key "model" in ctx.SetValue and any
ctx.Get/Lookup usages with a typed context key constant: declare
governanceModelContextKey of type schemas.BifrostContextKey (e.g.,
governanceModelContextKey := schemas.BifrostContextKey("bf-governance-model")
next to the other governance* keys), then update ctx.SetValue("model",
newModelWithRequestSuffix) to ctx.SetValue(governanceModelContextKey,
newModelWithRequestSuffix) and any corresponding retrievals to use
governanceModelContextKey so both set/get use the typed constant (also update
the other occurrence around where ctx is read at the second spot referenced).

---

Nitpick comments:
In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: Replace the local pointer helper functions ptr and intPtr
with calls to the repository utility bifrost.Ptr to match project conventions;
locate the functions named ptr and intPtr and remove them, then update test
usages to call bifrost.Ptr(float64Value) and bifrost.Ptr(intValue) (or the
appropriate generic/typed bifrost.Ptr variants) so all pointer literals use
bifrost.Ptr(...) consistently across the tests.

In `@plugins/logging/operations.go`:
- Around line 772-780: The default branch currently maps unknown request types
to schemas.BifrostResponse with a BifrostChatResponse, which can produce
incorrect cost calculations; instead, modify the switch default in the function
returning *schemas.BifrostResponse to return an explicit unsupported-type error
(or a nil response plus an error) so the caller can skip recalculation for that
row and log the unsupported modality; update callers that expect a non-nil
response to handle the error/empty result accordingly and ensure any code that
uses BifrostChatResponse (e.g., usage recalculation) is not invoked for
unsupported types.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0529c90 and 9681d2b.

📒 Files selected for processing (19)
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
💤 Files with no reviewable changes (1)
  • framework/modelcatalog/overrides.go

Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch 2 times, most recently from e9264ab to 47d7bbc on March 3, 2026 11:23
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 47d7bbc to 651fcc7 on March 10, 2026 13:39
Pratham-Mishra04 marked this pull request as ready for review on March 10, 2026 13:40
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

♻️ Duplicate comments (1)
docs/architecture/framework/model-catalog.mdx (1)

195-200: ⚠️ Potential issue | 🟡 Minor

Keep the missing-pricing log level wording consistent.

Line 200 still says “logs a warning”, but Lines 369-370 say the fallback is a debug log. Pick one level and use it in both spots.

Also applies to: 369-370

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/framework/model-catalog.mdx` around lines 195 - 200, The
docs have inconsistent wording about the missing-pricing log level: update both
occurrences so they match (either "logs a warning" or "logs a debug"). Edit the
comment around modelCatalog.CalculateCost (the block mentioning cache
hits/misses and pricing fallback) and the later fallback wording at lines
corresponding to the other mention (around the fallback note at lines 369-370)
so both use the same log level phrase; ensure the chosen level is used verbatim
in both places for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/speech.go`:
- Around line 29-35: Add a nil-receiver guard and use rune counting in
BackfillParams: return immediately if r == nil; only allocate r.Usage when
request.Input is non-nil (to avoid creating empty usage for raw-body requests),
and set r.Usage.InputChars = utf8.RuneCountInString(request.Input.Input) instead
of len(...); reference BifrostSpeechResponse.BackfillParams,
BifrostSpeechRequest, and SpeechUsage and ensure you import or reference
utf8.RuneCountInString for correct Unicode character counting.

In `@framework/configstore/migrations.go`:
- Around line 4125-4214: The migration migrationAddPricingRefactorColumns is
missing three pricing fields (output_cost_per_character,
input_cost_per_character_above_128k_tokens,
output_cost_per_character_above_128k_tokens); add these names to both the
Migrate and Rollback columns slices in that function, then add matching struct
fields to tables.TableModelPricing in
framework/configstore/tables/modelpricing.go and wire them through the
conversion functions in framework/modelcatalog/utils.go (ensure the converters
map PricingEntry <-> TableModelPricing for these three fields so values survive
upgrades and rollbacks).

In `@framework/configstore/tables/modelpricing.go`:
- Around line 11-18: TableModelPricing is missing new character-pricing fields
so values from framework/modelcatalog/main.go (output_cost_per_character,
input_cost_per_character_above_128k_tokens,
output_cost_per_character_above_128k_tokens) are dropped; add matching pointer
float64 fields to the TableModelPricing struct (e.g., OutputCostPerCharacter,
InputCostPerCharacterAbove128kTokens, OutputCostPerCharacterAbove128kTokens)
with appropriate gorm tags and json names, then update the DB migration that
creates/updates the model_pricings table and the pricing conversion functions
(the code that maps between TableModelPricing and the pricing engine structs) so
the new columns are persisted and converted end-to-end.

In `@framework/modelcatalog/pricing_test.go`:
- Around line 124-145: The test assumes cached-write tokens are charged on the
completion side and that missing cache-specific rates make cached reads free,
but computeTextCost treats CacheCreationInputTokenCost as an input-side rate and
falls back to the base input rate for missing cache rates; to fix, adjust the
test so CachedWriteTokens are treated as input-side (keep them in
PromptTokensDetails but move those tokens from CompletionTokens into
PromptTokens so prompt/completion totals reflect input-side cache creation), and
update the expected assertion for computeTextCost to the corrected total (use
0.0096 for this case). Reference computeTextCost, schemas.BifrostLLMUsage,
schemas.ChatPromptTokensDetails, CachedReadTokens, CachedWriteTokens,
CacheCreationInputTokenCost when making the change.

In `@framework/modelcatalog/pricing.go`:
- Around line 467-499: computeImageOutputCost currently ignores the new
per-image size/premium pricing columns (e.g., OutputCostPerImagePremiumImage,
OutputCostPerImageAbove512x512, OutputCostPerImageAbove1024x1024) so models
using those columns are billed incorrectly; update computeImageOutputCost to
select the correct per-image rate when falling back to per-image pricing by
inspecting imageUsage and the image size (use imageUsage.ImageSize or the
imageSize value already available) and any premium flag in imageUsage (or
ImageUsage.OutputTokensDetails) to choose between OutputCostPerImage,
OutputCostPerImagePremiumImage, OutputCostPerImageAbove512x512, and
OutputCostPerImageAbove1024x1024, then multiply that selected rate by the number
of output images (use NImages if present, default 1); ensure this selection
logic is applied in both the per-pixel fallback branch (when calculating
per-image counts for pixels) and the final per-image branch so the correct
size-tier/premium column is used.
- Around line 279-282: The cache-tier selection functions are being passed the
cached segment sizes instead of the full request size; change the calls in
pricing.go so tieredCacheReadInputTokenRate and
tieredCacheCreationInputTokenRate receive totalTokens (the full request token
count) rather than cachedReadTokens / cachedWriteTokens, keeping inputRate :=
tieredInputRate(pricing, totalTokens) and outputRate :=
tieredOutputRate(pricing, totalTokens) unchanged.
- Around line 152-158: The per-character TTS pricing is using tieredInputRate()
(per-token rates) for audioTextInputChars; implement a new helper
tieredCharacterRate(pricing, totalTokens) that mirrors tieredInputRate logic but
returns the appropriate InputCostPerCharacter /
InputCostPerCharacterAbove128kCharacters /
InputCostPerCharacterAbove200kCharacters from the Pricing schema, then replace
the multiplication at the cost calculation (where audioTextInputChars is used)
to use float64(audioTextInputChars) * tieredCharacterRate(pricing, totalTokens);
ensure both branches that set input.audioTextInputChars (in the SpeechResponse
and SpeechStreamResponse handling and the subsequent cost computation) use this
new helper and that InputCostPerCharacter fields are referenced instead of
token-rate fields.

---

Duplicate comments:
In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 195-200: The docs have inconsistent wording about the
missing-pricing log level: update both occurrences so they match (either "logs a
warning" or "logs a debug"). Edit the comment around modelCatalog.CalculateCost
(the block mentioning cache hits/misses and pricing fallback) and the later
fallback wording at lines corresponding to the other mention (around the
fallback note at lines 369-370) so both use the same log level phrase; ensure
the chosen level is used verbatim in both places for consistency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9d67c958-c63f-4662-af53-f1568c2a570b

📥 Commits

Reviewing files that changed from the base of the PR and between 9681d2b and 651fcc7.

📒 Files selected for processing (18)
  • core/bifrost.go
  • core/schemas/images.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
💤 Files with no reviewable changes (1)
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • framework/streaming/audio.go
  • framework/modelcatalog/overrides_test.go
  • docs/contributing/setting-up-repo.mdx

Comment thread core/schemas/speech.go Outdated
Comment thread framework/configstore/migrations.go
Comment thread framework/configstore/tables/modelpricing.go
Comment thread framework/modelcatalog/pricing_test.go Outdated
Comment thread framework/modelcatalog/pricing.go
Comment thread framework/modelcatalog/pricing.go Outdated
Comment thread framework/modelcatalog/pricing.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 651fcc7 to 3ead6e1 Compare March 11, 2026 14:06
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (1)
framework/modelcatalog/overrides_test.go (1)

25-26: ⚠️ Potential issue | 🟠 Major

Re-enable the override-precedence tests.

These unconditional t.Skip() calls still disable the entire override suite, so precedence and patching regressions won't be caught in CI.

Also applies to: 37-38, 77-78, 110-111, 144-145, 171-172, 198-199, 230-231, 264-265, 297-298, 330-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` around lines 25 - 26, Remove the
unconditional t.Skip() calls that are disabling the override-precedence test
suite; specifically edit the test functions such as
TestSetProviderPricingOverrides_InvalidRegex,
TestSetProviderPricingOverrides_... (and the other tests flagged at the same
positions) to either delete the t.Skip() line or replace it with a conditional
skip (e.g., t.Skipf(...) behind a clearly documented condition) so the tests run
in CI; ensure you only remove the unconditional skips and run the test file
(overrides_test.go) locally to verify the suite now executes and fails or passes
as expected.
🧹 Nitpick comments (2)
ui/app/workspace/logs/sheets/logDetailsSheet.tsx (1)

287-296: Move cost out of the token-only section.

This stack adds priced request types that do not have token usage. In those cases this will render a "Tokens" block with three "-" values and a cost, which is misleading. Consider surfacing Cost in Request Details or a separate Usage/Pricing section whenever log.cost is present.

As per coding guidelines, "always check the stack if there is one for the current PR."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx` around lines 287 - 296, The
"Cost" field is currently rendered inside the token-only block causing
misleading output for non-token priced requests; update the render logic in
logDetailsSheet.tsx so the Tokens section (the DottedSeparator + BlockHeader
"Tokens" and the three LogEntryDetailsView entries for Input/Output/Total) only
appears when log.token_usage exists (e.g., check log.token_usage?.total_tokens
or similar) and remove the LogEntryDetailsView for "Cost" from that block;
instead render Cost separately (either inside the Request Details area or a new
Usage/Pricing block) when log.cost != null so cost is always shown but never
alongside empty token values.
framework/modelcatalog/pricing_test.go (1)

16-17: Use bifrost.Ptr() in the local pointer helpers.

This file reintroduces an &value helper even though the repo has standardized on bifrost.Ptr(...). Keeping the tests on the same convention avoids another pointer helper to maintain.

♻️ Suggested cleanup
+import bifrost "github.com/maximhq/bifrost/core"
+
-func ptr(v float64) *float64 { return &v }
-func intPtr(v int) *int      { return &v }
+func ptr(v float64) *float64 { return bifrost.Ptr(v) }
+func intPtr(v int) *int      { return bifrost.Ptr(v) }
Based on learnings, in the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, The local
pointer helpers ptr and intPtr reintroduce &-based helpers; replace their
implementations to call bifrost.Ptr for consistency (i.e., change func ptr(v
float64) *float64 to return bifrost.Ptr(v) and func intPtr(v int) *int to return
bifrost.Ptr(v)), and add/import the bifrost package if it’s not already imported
so the tests compile and follow the repo-wide pointer helper convention.
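A minimal sketch of what the shared helper presumably looks like; the exact `bifrost.Ptr` signature is an assumption, but a single generic constructor is the pattern the review points at, replacing per-type helpers like `ptr(float64)` and `intPtr(int)`.

```go
package main

import "fmt"

// Ptr returns a pointer to v. One generic constructor replaces the
// per-type helpers ptr(float64) and intPtr(int) flagged in the review.
func Ptr[T any](v T) *T {
	return &v
}

func main() {
	rate := Ptr(0.001) // *float64, instead of ptr(0.001)
	count := Ptr(42)   // *int, instead of intPtr(42)
	fmt.Println(*rate, *count)
}
```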
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/provider.go`:
- Around line 368-376: The change removed legacy character-override fields from
the ProviderPricingOverride struct which breaks Go callers and existing JSON
configs; restore the removed character-override fields (re-add the
InputCostPerCharacter and any previously-removed character override fields with
their original names and json tags) into ProviderPricingOverride, mark them with
a deprecation comment if desired, and ensure JSON unmarshalling still accepts
those keys so existing configs continue to work without changing the new pricing
logic (keep current new fields like InputCostPerTokenAbove128kTokens alongside
the restored character fields).

In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 108-154: The PricingEntry struct shown is incomplete and omits
several schema fields (e.g., InputCostPerCharacter, InputCostPerImageToken,
OutputCostPerImageToken, CacheReadInputImageTokenCost); update the docs either
by including those missing fields in the PricingEntry definition or clearly
indicate this is an excerpt and link to/append the full schema, and if adding
fields, add the corresponding struct fields with proper json tags matching the
rest of the file (e.g., InputCostPerCharacter float64
`json:"input_cost_per_character,omitempty"`, InputCostPerImageToken *float64
`json:"input_cost_per_image_token,omitempty"`, OutputCostPerImageToken *float64
`json:"output_cost_per_image_token,omitempty"`, CacheReadInputImageTokenCost
*float64 `json:"cache_read_input_image_token_cost,omitempty"`) so readers can
see all knobs.

In `@framework/configstore/migrations.go`:
- Around line 4136-4161: The migration is missing the base video-input pricing
column present on TableModelPricing; update the columns slice in the migration
(the list that currently contains
"input_cost_per_video_per_second_above_128k_tokens" variants) to also include
"input_cost_per_video_per_second" so upgraded DBs get the base video per-second
pricing column; apply the same addition in the second identical columns block
referenced around the subsequent range (the other slice near lines 4176-4201) so
both migrations add the base column for video pricing.
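The fixed columns slice might look like the sketch below. Only the first two names are quoted in the review; the `above_200k` variant is an assumed sibling following the same naming pattern, and the real migration likely wraps these in its own AddColumn helper.

```go
package main

import "fmt"

// videoPricingColumns sketches the migration's columns slice with the base
// per-second rate restored. Names beyond the two quoted in the review are
// assumptions following the same pattern.
var videoPricingColumns = []string{
	"input_cost_per_video_per_second", // base rate, previously missing
	"input_cost_per_video_per_second_above_128k_tokens",
	"input_cost_per_video_per_second_above_200k_tokens",
}

func main() {
	for _, c := range videoPricingColumns {
		fmt.Println(c)
	}
}
```

The same base column would be appended to the second, identical columns block the comment mentions.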

In `@plugins/logging/operations.go`:
- Around line 948-1045: buildResponseForRequestType currently falls through
video_generation to the default chat branch so video duration never reaches
pricing; add explicit cases for schemas.VideoGenerationRequest and
schemas.VideoGenerationStreamRequest that construct a schemas.BifrostResponse
with VideoGenerationResponse set to a &schemas.BifrostVideoGenerationResponse
containing a properly converted *schemas.VideoUsage (map
InputTokens/OutputTokens/TotalTokens from usage and include the duration field
from the logged usage) and pass through extra; update the conversion logic
similar to ImageGenerationRequest/ImageUsage but populate duration so
CalculateCost's video path receives the duration.
- Around line 923-926: The early-return currently in CalculateCost that errors
when usage == nil and not a cache hit prevents computing costs for modalities
billed by output count/duration (e.g., images/videos); change the guard so it
only errors when token usage is required: check the log entry's modality or
billing unit (use the same enum/field your billing logic uses) and only return
fmt.Errorf("token usage not available for log %s", logEntry.ID) when that
modality/billing type requires token counts; otherwise proceed to compute cost
from output count/duration. Ensure you update the condition referencing usage,
cacheDebug.CacheHit, and logEntry.ID inside CalculateCost so RecalculateCosts
can handle non-token modalities.
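The narrowed guard can be sketched like this. `BillingUnit` is a stand-in for however the billing logic actually classifies a log entry (the real code may key off a mode string or request type instead):

```go
package main

import (
	"errors"
	"fmt"
)

// BillingUnit is a stand-in for the billing classification the real code
// uses; the enum values here are illustrative.
type BillingUnit int

const (
	BillByTokens      BillingUnit = iota
	BillByOutputCount             // e.g. images
	BillByDuration                // e.g. video or audio seconds
)

// requireUsage reproduces the guard's intent: fail fast only when the
// modality is actually billed by token counts.
func requireUsage(unit BillingUnit, usage *int, cacheHit bool) error {
	if usage == nil && !cacheHit && unit == BillByTokens {
		return errors.New("token usage not available")
	}
	return nil
}

func main() {
	// Image logs without token usage no longer error out.
	fmt.Println(requireUsage(BillByOutputCount, nil, false))
	// Token-billed logs still require usage.
	fmt.Println(requireUsage(BillByTokens, nil, false))
}
```

With this shape, `RecalculateCosts` can fall through to per-unit pricing for image and video entries that legitimately carry no `token_usage`.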

---

Duplicate comments:
In `@framework/modelcatalog/overrides_test.go`:
- Around line 25-26: Remove the unconditional t.Skip() calls that are disabling
the override-precedence test suite; specifically edit the test functions such as
TestSetProviderPricingOverrides_InvalidRegex,
TestSetProviderPricingOverrides_... (and the other tests flagged at the same
positions) to either delete the t.Skip() line or replace it with a conditional
skip (e.g., t.Skipf(...) behind a clearly documented condition) so the tests run
in CI; ensure you only remove the unconditional skips and run the test file
(overrides_test.go) locally to verify the suite now executes and fails or passes
as expected.

---

Nitpick comments:
In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: The local pointer helpers ptr and intPtr reintroduce
&-based helpers; replace their implementations to call bifrost.Ptr for
consistency (i.e., change func ptr(v float64) *float64 to return bifrost.Ptr(v)
and func intPtr(v int) *int to return bifrost.Ptr(v)), and add/import the
bifrost package if it’s not already imported so the tests compile and follow the
repo-wide pointer helper convention.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx`:
- Around line 287-296: The "Cost" field is currently rendered inside the
token-only block causing misleading output for non-token priced requests; update
the render logic in logDetailsSheet.tsx so the Tokens section (the
DottedSeparator + BlockHeader "Tokens" and the three LogEntryDetailsView entries
for Input/Output/Total) only appears when log.token_usage exists (e.g., check
log.token_usage?.total_tokens or similar) and remove the LogEntryDetailsView for
"Cost" from that block; instead render Cost separately (either inside the
Request Details area or a new Usage/Pricing block) when log.cost != null so cost
is always shown but never alongside empty token values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cb4c3f15-88c4-4c1d-a3a9-3f307267e63b

📥 Commits

Reviewing files that changed from the base of the PR and between 651fcc7 and 3ead6e1.

📒 Files selected for processing (26)
  • core/bifrost.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • framework/streaming/images.go
  • framework/tracing/tracer.go
  • framework/streaming/audio.go
  • plugins/logging/main.go
  • framework/streaming/chat.go
  • core/schemas/images.go
  • framework/streaming/responses.go
  • core/bifrost.go
  • docs/contributing/setting-up-repo.mdx
  • framework/streaming/transcription.go

Comment thread core/schemas/provider.go
Comment thread docs/architecture/framework/model-catalog.mdx
Comment thread framework/configstore/migrations.go
Comment thread plugins/logging/operations.go
Comment thread plugins/logging/operations.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 3ead6e1 to 47dde62 Compare March 11, 2026 16:08
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
framework/modelcatalog/overrides_test.go (1)

26-26: ⚠️ Potential issue | 🟠 Major

Please re-enable these override regression tests.

These unconditional skips still turn off the override-precedence and partial-patch suite, so this refactor loses coverage for one of its riskiest paths. If any case really needs to skip, make it conditional with a concrete prerequisite instead of disabling the whole set in CI.

Also applies to: 38-38, 78-78, 111-111, 145-145, 172-172, 199-199, 231-231, 265-265, 298-298, 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` at line 26, The tests in
framework/modelcatalog/overrides_test.go were unconditionally skipped via
t.Skip() which disables the override-precedence and partial-patch suites; remove
the unconditional skips and instead guard each test with a concrete prerequisite
check (for example, check an explicit environment variable like
os.Getenv("RUN_OVERRIDES_TESTS") == "1", a feature flag, or a TestMain/setup
function that verifies required capabilities) and call t.Skipf with a clear
reason only when that prerequisite is not met; apply this change to every
t.Skip() occurrence in overrides_test.go so the suites run in CI unless the
explicit prerequisite is absent.
plugins/logging/operations.go (2)

948-1045: ⚠️ Potential issue | 🟠 Major

Rehydrate modality-specific billing inputs here, not just generic token usage.

buildResponseForRequestType still reconstructs only usage + extra, but the pricing engine's image/video paths depend on response fields beyond BifrostLLMUsage — e.g. image count from ImageGenerationResponse.Data and video duration from VideoGenerationResponse.Seconds (framework/modelcatalog/pricing_test.go:1435-1464 and framework/modelcatalog/pricing_test.go:638-672). As written, multi-image backfills collapse to the single-image fallback, and video_generation still falls through to the chat branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 948 - 1045,
buildResponseForRequestType currently only restores generic BifrostLLMUsage and
loses modality-specific fields (e.g. image count and video duration), causing
image/video pricing to fallback incorrectly; update buildResponseForRequestType
to also rehydrate modality-specific response fields when usage is non-nil by
populating BifrostImageGenerationResponse.Data (preserving original image
count/metadata) and BifrostVideoGenerationResponse.Seconds (and any other
modality-specific fields used by pricing) from the source log/extra payload
before returning, ensuring cases
ImageGenerationRequest/ImageGenerationStreamRequest and
VideoGenerationRequest/VideoGenerationStreamRequest create full
BifrostImageGenerationResponse and BifrostVideoGenerationResponse structs (not
just ImageUsage or generic Usage) so the pricing engine (which reads
ImageGenerationResponse.Data and VideoGenerationResponse.Seconds) receives the
required fields.
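The missing rehydration can be sketched with local stand-in types. The type and field names (`BifrostVideoGenerationResponse`, `Seconds`, `VideoUsage`) follow the review text; the real schemas package may differ.

```go
package main

import "fmt"

// Local stand-ins for the schemas types named in the review.
type VideoUsage struct {
	InputTokens, OutputTokens, TotalTokens int
}

type BifrostVideoGenerationResponse struct {
	Seconds float64 // the duration field the pricing engine's video path reads
	Usage   *VideoUsage
}

type BifrostResponse struct {
	VideoGenerationResponse *BifrostVideoGenerationResponse
}

type loggedUsage struct {
	inputTokens, outputTokens, totalTokens int
	durationSeconds                        float64
}

// buildVideoResponse sketches the missing switch arm: it rebuilds a response
// shape that carries the duration, so pricing is no longer fed the generic
// chat-branch fallback.
func buildVideoResponse(u loggedUsage) *BifrostResponse {
	return &BifrostResponse{
		VideoGenerationResponse: &BifrostVideoGenerationResponse{
			Seconds: u.durationSeconds,
			Usage: &VideoUsage{
				InputTokens:  u.inputTokens,
				OutputTokens: u.outputTokens,
				TotalTokens:  u.totalTokens,
			},
		},
	}
}

func main() {
	resp := buildVideoResponse(loggedUsage{totalTokens: 10, durationSeconds: 8})
	fmt.Println(resp.VideoGenerationResponse.Seconds)
}
```

An analogous arm for image generation would repopulate the response's image list so multi-image backfills stop collapsing to the single-image fallback.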

923-925: ⚠️ Potential issue | 🟠 Major

Don't require token_usage before per-unit pricing gets a chance to run.

This guard still drops any non-cache log whose billable unit is not token-based. CalculateCost now supports request types priced from response shape, so RecalculateCosts will keep skipping image/video entries that legitimately have no token_usage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 923 - 925, The early-return in
RecalculateCosts/operations.go incorrectly blocks logs with nil `usage` even
when their pricing isn't token-based; update the guard so we only error when
token usage is actually required: replace the unconditional `if usage == nil &&
(cacheDebug == nil || !cacheDebug.CacheHit) { return 0, fmt.Errorf("token usage
not available for log %s", logEntry.ID) }` with logic that checks whether
CalculateCost (or the pricing metadata for this log) expects token-based billing
— if the billable unit is tokens then return the error, otherwise allow
processing (so other per-unit pricing by response shape or media types can run)
and call CalculateCost/RecalculateCosts as usual.
🧹 Nitpick comments (1)
framework/modelcatalog/pricing_test.go (1)

16-17: Prefer the shared pointer helper over local ptr helpers.

This file is introducing a second pointer-construction pattern for the test suite. Using bifrost.Ptr() here keeps the tests aligned with the rest of the repo and lets you drop both helpers entirely.

Based on learnings, "In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Replace the
local pointer helper functions ptr and intPtr in pricing_test.go with the shared
helper by removing those functions and using bifrost.Ptr() for float64 values
and bifrost.PtrInt() (or bifrost.Ptr with type inference if available) for int
values wherever ptr(...) and intPtr(...) are used; update all test call sites to
call the repository's shared pointer helpers (referencing the removed local
functions ptr and intPtr and the shared symbol bifrost.Ptr) so the file relies
on the central utility instead of defining duplicate helpers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 30-33: Update the "Image Processing" bullet in model-catalog.mdx
to mention per-pixel pricing alongside per-image and token-based pricing:
reference the new schema fields input_cost_per_pixel and output_cost_per_pixel
and state that pricing can be per-image, per-pixel
(input_cost_per_pixel/output_cost_per_pixel), or token-based with text/image
token breakdown. Ensure the phrasing aligns with the other modality bullets and
explicitly calls out input/output per-pixel costs so the feature is
discoverable.

In `@framework/modelcatalog/main.go`:
- Around line 62-123: GetPricingEntryForModel currently only recognizes
text/chat/response/embedding/rerank/speech/transcription modes so requests for
image or video models return nil even when PricingEntry rows exist; update the
lookup in GetPricingEntryForModel to include the image and video-related modes
(e.g., "image", "image-generation", "vision", "video", "video-generation" or
whatever mode strings you use) and treat them like other modes when matching
BaseModel/Provider/Mode. Ensure the match logic returns the populated
PricingEntry (which now contains fields such as InputCostPerImage,
OutputCostPerImage, InputCostPerImageToken, InputCostPerVideoPerSecond,
InputCostPerImageAbove128kTokens, etc.) so callers get non-nil pricing for
image/video models.
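A widened mode lookup could be sketched as below. The mode strings are assumptions; as the review itself notes, they depend on what the catalog datasheet actually uses.

```go
package main

import "fmt"

// pricedModes sketches the mode set GetPricingEntryForModel should accept.
// The image/video entries are the additions the review asks for; the exact
// strings are illustrative.
var pricedModes = map[string]bool{
	"chat":             true,
	"embedding":        true,
	"rerank":           true,
	"speech":           true,
	"transcription":    true,
	"image-generation": true, // previously unmatched
	"video-generation": true, // previously unmatched
}

func modeIsPriced(mode string) bool { return pricedModes[mode] }

func main() {
	fmt.Println(modeIsPriced("video-generation"), modeIsPriced("unknown"))
}
```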

---

Duplicate comments:
In `@framework/modelcatalog/overrides_test.go`:
- Line 26: The tests in framework/modelcatalog/overrides_test.go were
unconditionally skipped via t.Skip() which disables the override-precedence and
partial-patch suites; remove the unconditional skips and instead guard each test
with a concrete prerequisite check (for example, check an explicit environment
variable like os.Getenv("RUN_OVERRIDES_TESTS") == "1", a feature flag, or a
TestMain/setup function that verifies required capabilities) and call t.Skipf
with a clear reason only when that prerequisite is not met; apply this change to
every t.Skip() occurrence in overrides_test.go so the suites run in CI unless
the explicit prerequisite is absent.

In `@plugins/logging/operations.go`:
- Around line 948-1045: buildResponseForRequestType currently only restores
generic BifrostLLMUsage and loses modality-specific fields (e.g. image count and
video duration), causing image/video pricing to fallback incorrectly; update
buildResponseForRequestType to also rehydrate modality-specific response fields
when usage is non-nil by populating BifrostImageGenerationResponse.Data
(preserving original image count/metadata) and
BifrostVideoGenerationResponse.Seconds (and any other modality-specific fields
used by pricing) from the source log/extra payload before returning, ensuring
cases ImageGenerationRequest/ImageGenerationStreamRequest and
VideoGenerationRequest/VideoGenerationStreamRequest create full
BifrostImageGenerationResponse and BifrostVideoGenerationResponse structs (not
just ImageUsage or generic Usage) so the pricing engine (which reads
ImageGenerationResponse.Data and VideoGenerationResponse.Seconds) receives the
required fields.
- Around line 923-925: The early-return in RecalculateCosts/operations.go
incorrectly blocks logs with nil `usage` even when their pricing isn't
token-based; update the guard so we only error when token usage is actually
required: replace the unconditional `if usage == nil && (cacheDebug == nil ||
!cacheDebug.CacheHit) { return 0, fmt.Errorf("token usage not available for log
%s", logEntry.ID) }` with logic that checks whether CalculateCost (or the
pricing metadata for this log) expects token-based billing — if the billable
unit is tokens then return the error, otherwise allow processing (so other
per-unit pricing by response shape or media types can run) and call
CalculateCost/RecalculateCosts as usual.

---

Nitpick comments:
In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: Replace the local pointer helper functions ptr and intPtr
in pricing_test.go with the shared helper by removing those functions and using
bifrost.Ptr() for float64 values and bifrost.PtrInt() (or bifrost.Ptr with type
inference if available) for int values wherever ptr(...) and intPtr(...) are
used; update all test call sites to call the repository's shared pointer helpers
(referencing the removed local functions ptr and intPtr and the shared symbol
bifrost.Ptr) so the file relies on the central utility instead of defining
duplicate helpers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 46a25046-5801-4869-9e85-f925dab4c5a9

📥 Commits

Reviewing files that changed from the base of the PR and between 3ead6e1 and 47dde62.

📒 Files selected for processing (26)
  • core/bifrost.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • framework/modelcatalog/overrides.go
  • transports/bifrost-http/handlers/providers.go
🚧 Files skipped from review as they are similar to previous changes (11)
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • docs/contributing/setting-up-repo.mdx
  • core/bifrost.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • plugins/telemetry/main.go
  • core/schemas/speech.go
  • framework/streaming/chat.go
  • framework/streaming/transcription.go

Comment thread docs/architecture/framework/model-catalog.mdx Outdated
Comment thread framework/modelcatalog/main.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 47dde62 to b6d51a5 Compare March 12, 2026 08:09
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/telemetry/main.go (1)

428-540: ⚠️ Potential issue | 🔴 Critical

Snapshot response data before launching the goroutine.

This closure still reads result after PostLLMHook returns (CalculateCost, usage extraction, cache-hit inspection). In streaming paths those responses can be pooled/reused, so the metrics worker can race with pool reuse and panic or publish corrupted cost/token values. Pull the cost/tokens/cache snapshot out synchronously and pass only plain values into the goroutine.

Based on learnings: In Go streaming handlers that reuse pooled response objects via ProcessAndSendResponse, do not release them back to their pools while asynchronous readers (PostLLMHook goroutines) may still access them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/telemetry/main.go` around lines 428 - 540, Snapshot all data read
from the pooled response synchronously before launching the goroutine: call
p.pricingManager.CalculateCost(result) (if p.pricingManager != nil && result !=
nil) and extract plain values for inputTokens, outputTokens, cacheHit (bool) and
cacheType (string) from result.GetExtraFields(), plus any
extraFields.ChunkIndex/Latency used for streaming; store them in local primitive
variables (ints, floats, bools, strings) and then launch the goroutine which
must only reference these primitives (and not the original result or its nested
response structs like TextCompletionResponse, ChatResponse, ResponsesResponse,
ResponsesStreamResponse, EmbeddingResponse, SpeechStreamResponse,
TranscriptionResponse, TranscriptionStreamResponse) so you avoid races with
pooled response reuse.
♻️ Duplicate comments (2)
framework/modelcatalog/overrides_test.go (1)

26-26: ⚠️ Potential issue | 🟠 Major

Re-enable the override regression suite.

These unconditional t.Skip() calls still disable the precedence/patch tests entirely, so the pricing override refactor can ship without CI coverage for the exact behaviors this file is supposed to lock down. If a prerequisite is missing, gate the skip on that condition and use t.Skipf with the reason instead of skipping unconditionally.

Also applies to: 38-38, 78-78, 111-111, 145-145, 172-172, 199-199, 231-231, 265-265, 298-298, 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` at line 26, Replace each
unconditional t.Skip() in overrides_test.go with a conditional check that
verifies the prerequisite and calls t.Skipf(reason) when that prerequisite is
not present; locate the t.Skip() usages and either call a helper (e.g.,
requirePrerequisiteOrSkip(t, "reason")) or inline the condition (if !prereq {
t.Skipf("skipping: %s", "describe missing prerequisite") }) so the
precedence/patch tests run in CI unless the specific precondition truly fails.
docs/architecture/framework/model-catalog.mdx (1)

49-49: ⚠️ Potential issue | 🟡 Minor

Document the 128k tier too, or narrow this claim.

The code and schema still honor *_above_128k_tokens rates, so this sentence currently reads as if tiered pricing only exists above 200k. Either mention both thresholds or scope the statement to the providers that only use the 200k tier.

📝 Suggested wording
-The system automatically applies different pricing rates for high-token contexts (above 200k tokens), reflecting real provider pricing models.
+The system automatically applies different pricing rates for high-token contexts (for example, above 128k or 200k tokens, depending on the provider pricing row), reflecting real provider pricing models.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/framework/model-catalog.mdx` at line 49, Update the
sentence claiming tiered pricing only applies "above 200k tokens" to either list
both thresholds or scope it to providers using 200k; specifically reference and
include the existing *_above_128k_tokens rate tier used in the code/schema
(e.g., mention both 128k and 200k tiers or explicitly state "for providers that
use the 200k tier only") so the doc aligns with the *_above_128k_tokens and
*_above_200k_tokens entries used elsewhere.
🧹 Nitpick comments (3)
core/providers/huggingface/huggingface.go (1)

1282-1282: Consider caching raw request/response flags for consistency.

ImageEditStream (lines 1487-1488) caches sendBackRawRequest and sendBackRawResponse upfront, while ImageGenerationStream calls the helper functions inline multiple times. Consider extracting these to local variables at the start of the goroutine for consistency and minor efficiency gains.

♻️ Suggested refactor
 go func() {
   defer providerUtils.ReleaseStreamingResponse(resp)
   defer close(responseChan)
+
+  sendBackRawRequest := providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest)
+  sendBackRawResponse := providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse)

   if resp.BodyStream() == nil {

Then replace the inline calls with the cached variables.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` at line 1282, Cache the
raw-request/response flags at the start of the ImageGenerationStream goroutine
(similar to ImageEditStream) instead of calling
providerUtils.ShouldSendBackRawRequest and
providerUtils.ShouldSendBackRawResponse inline repeatedly; create local
variables like sendBackRawRequest and sendBackRawResponse in the
ImageGenerationStream function/goroutine and replace all inline calls with those
locals to ensure consistency and minor efficiency gains across
ImageGenerationStream and ImageEditStream.
ui/app/workspace/logs/sheets/logDetailsSheet.tsx (1)

341-341: Use the existing formatCost utility for consistent currency formatting.

The codebase has an established formatCost function in ui/app/workspace/dashboard/utils/chartUtils.ts that intelligently formats costs with 4 decimal places for values < $0.01 and 2 decimal places otherwise. Use this instead of displaying raw cost values to ensure consistent and professional currency presentation across the application.

💡 Suggested fix
+import { formatCost } from "@/app/workspace/dashboard/utils/chartUtils";
+
-<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost}` : "-"} />
+<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? formatCost(log.cost) : "-"} />
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx` at line 341, Replace the
raw cost display with the shared formatter: call the existing formatCost utility
from ui/app/workspace/dashboard/utils/chartUtils.ts when rendering the Cost
LogEntryDetailsView (replace value={log.cost != null ? `$${log.cost}` : "-"}).
Ensure you handle null/undefined the same way (render "-" when log.cost is
null/undefined) and pass the formatted string returned by formatCost to the
value prop of LogEntryDetailsView so cost presentation matches the rest of the
app.
core/providers/openai/openai.go (1)

3363-3365: Consider centralizing the synthetic image backfill request.

Both streaming paths hand-build a partial schemas.BifrostRequest just to call BackfillParams. A tiny helper would keep that contract in one place if BackfillParams ever needs more top-level fields.

Also applies to: 4610-4612

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 3363 - 3365, The diff shows
repeated construction of a partial schemas.BifrostRequest just to call
chunk.BackfillParams in multiple places (e.g., the snippet at openai.go around
chunk.BackfillParams and the similar block at lines ~4610-4612); centralize this
by adding a small helper (e.g., BuildImageBackfillRequest or
MakeBifrostImageRequest) that takes the ImageGenerationRequest (or the current
request variable) and returns a fully formed schemas.BifrostRequest, then
replace the inline constructions with calls to that helper before invoking
chunk.BackfillParams to keep the contract in one place and avoid duplication.
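A sketch of the kind of helper this comment suggests. The structs below are stand-ins for the real core/schemas types (only the fields the sketch needs are modeled), and buildImageBackfillRequest is a hypothetical name, not an existing function.

```go
package main

import "fmt"

// Stand-ins for the real core/schemas types.
type ImageGenerationRequest struct{ Model string }

type BifrostRequest struct {
	ImageGenerationRequest *ImageGenerationRequest
}

// buildImageBackfillRequest centralizes construction of the partial
// BifrostRequest that both streaming paths currently hand-build before
// calling BackfillParams. If BackfillParams ever needs more top-level
// fields, only this function changes.
func buildImageBackfillRequest(req *ImageGenerationRequest) *BifrostRequest {
	return &BifrostRequest{ImageGenerationRequest: req}
}

func main() {
	req := &ImageGenerationRequest{Model: "gpt-image-1"}
	br := buildImageBackfillRequest(req)
	fmt.Println(br.ImageGenerationRequest.Model)
}
```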
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@framework/modelcatalog/pricing.go`:
- Around line 105-123: The switch on requestType currently only routes
schemas.ImageGenerationRequest to computeImageCost so ImageEditRequest and
ImageVariationRequest (and the edit stream variant if present after
normalizeStreamRequestType) fall through to default and return 0; update the
dispatcher in the switch to include schemas.ImageEditRequest,
schemas.ImageVariationRequest (and schemas.ImageEditStreamRequest or the
specific edit-stream constant your normalizeStreamRequestType produces) as cases
that call computeImageCost(pricing, input.imageUsage, input.imageSize) so all
image response types use the same image-cost path.
- Around line 619-627: tieredVideoInputPerSecondRate currently returns
pricing.InputCostPerVideoPerSecond (or 0) and never falls back to the generic
input-per-second columns; update it to mirror the output-side fallback by
returning pricing.InputCostPerSecond when InputCostPerVideoPerSecond is nil, and
likewise for the >128k tier return pricing.InputCostPerSecondAbove128kTokens if
InputCostPerVideoPerSecondAbove128kTokens is nil; modify the function
tieredVideoInputPerSecondRate to check those generic fields (InputCostPerSecond,
InputCostPerSecondAbove128kTokens) as fallbacks instead of returning 0.

In `@framework/modelcatalog/utils.go`:
- Around line 45-48: normalizeRequestType and normalizeStreamRequestType are
inconsistent: ImageEditStreamRequest is classified as "image_edit" in one path
but not collapsed to ImageEditRequest in the other, causing pricing to miss
shared image-edit logic; update the normalization so ImageEditStreamRequest maps
to the same base type and canonical request as ImageEditRequest (e.g., treat
schemas.ImageEditStreamRequest equivalent to schemas.ImageEditRequest in both
normalizeRequestType and normalizeStreamRequestType), ensuring references to
baseType "image_edit" and any downstream cost/branching logic use the unified
symbol(s) (schemas.ImageEditRequest, "image_edit") wherever
ImageEditStreamRequest appears.

In `@plugins/logging/operations.go`:
- Around line 1008-1069: The synthetic-response branches currently rebuild
modality usages from only aggregate BifrostLLMUsage totals (in the
Responses/Speech/Transcription/ImageGeneration cases), which drops
billing-critical details and causes CalculateCost/RecalculateCosts underbilling;
fix by reconstructing each modality-specific usage/output from the original
parsed modality structs (not just token_usage): for Responses populate
cached/search token details and any InputTokenDetails into
schemas.ResponsesResponseUsage; for Transcription attach Seconds and
InputTokenDetails into schemas.TranscriptionUsage; for ImageGeneration set
OutputCount/OutputBytes and per-image size fields into schemas.ImageUsage;
ensure these fields are pulled from the parsed modality-specific objects you
already have available and passed through to the returned BifrostResponse so
downstream CalculateCost/RecalculateCosts receive full detail.

---

Outside diff comments:
In `@plugins/telemetry/main.go`:
- Around line 428-540: Snapshot all data read from the pooled response
synchronously before launching the goroutine: call
p.pricingManager.CalculateCost(result) (if p.pricingManager != nil && result !=
nil) and extract plain values for inputTokens, outputTokens, cacheHit (bool) and
cacheType (string) from result.GetExtraFields(), plus any
extraFields.ChunkIndex/Latency used for streaming; store them in local primitive
variables (ints, floats, bools, strings) and then launch the goroutine which
must only reference these primitives (and not the original result or its nested
response structs like TextCompletionResponse, ChatResponse, ResponsesResponse,
ResponsesStreamResponse, EmbeddingResponse, SpeechStreamResponse,
TranscriptionResponse, TranscriptionStreamResponse) so you avoid races with
pooled response reuse.

---

Duplicate comments:

In `@framework/modelcatalog/overrides_test.go`:
- Line 26: Replace each unconditional t.Skip() in overrides_test.go with a
conditional check that verifies the prerequisite and calls t.Skipf(reason) when
that prerequisite is not present; locate the t.Skip() usages and either call a
helper (e.g., requirePrerequisiteOrSkip(t, "reason")) or inline the condition
(if !prereq { t.Skipf("skipping: %s", "describe missing prerequisite") }) so the
precedence/patch tests run in CI unless the specific precondition truly fails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 06919461-87c3-4b69-8b0b-42ba3eb4ba35

📥 Commits

Reviewing files that changed from the base of the PR and between 47dde62 and b6d51a5.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • framework/modelcatalog/overrides.go
  • transports/bifrost-http/handlers/providers.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • framework/streaming/responses.go
  • framework/streaming/images.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • framework/configstore/migrations.go
  • plugins/logging/main.go
  • framework/streaming/chat.go
  • core/bifrost.go
  • docs/contributing/setting-up-repo.mdx

Comment thread framework/modelcatalog/pricing.go
Comment thread framework/modelcatalog/pricing.go
Comment thread framework/modelcatalog/utils.go
Comment thread plugins/logging/operations.go Outdated
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from b6d51a5 to e0cecca on March 12, 2026 09:08
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/huggingface/huggingface.go (1)

1115-1133: ⚠️ Potential issue | 🟠 Major

Use MakeRequestWithContext for context-aware SSE connection.

Line 1116 calls provider.client.Do(req, resp) directly, but fasthttp.Client.Do does not observe ctx. This means the context cancellation and deadline handling at lines 1120–1124 is ineffective during stream establishment—canceled or timed-out requests can hang until the socket-level timeout fires. Replace the direct client.Do call with providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp), which is the context-aware pattern already used throughout the codebase (including elsewhere in this file) and properly handles context cancellation via a goroutine+select pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` around lines 1115 - 1133, Replace
the direct fasthttp call provider.client.Do(req, resp) with the context-aware
helper providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp) so
that context cancellation/deadlines are observed during SSE/stream
establishment; keep the existing error handling and ensure you still defer
providerUtils.ReleaseStreamingResponse(resp) on errors and pass providerName
into providerUtils.NewBifrostOperationError as before.
core/providers/openai/openai.go (1)

4597-4613: ⚠️ Potential issue | 🟡 Minor

Use end-to-end latency for the terminal image-edit chunk.

Line 4609 currently measures from lastChunkTime, so the final chunk reports only the delta from the previous chunk instead of the full stream duration. That diverges from the adjacent comment and from the image-generation path, and it will skew any downstream tracing/metrics that read the completion chunk.

Suggested fix
-				// For completed chunk, use total latency from start
-				chunk.ExtraFields.Latency = time.Since(lastChunkTime).Milliseconds()
+				// For completed chunk, use total latency from start
+				chunk.ExtraFields.Latency = time.Since(startTime).Milliseconds()
 				chunk.BackfillParams(&schemas.BifrostRequest{
 					ImageEditRequest: request,
 				})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 4597 - 4613, The terminal
image-edit chunk is measuring latency with time.Since(lastChunkTime) which gives
only the delta from the previous chunk; change it to measure end-to-end latency
from the stream start (e.g., use the overall stream start timestamp variable
instead of lastChunkTime) and assign that value to chunk.ExtraFields.Latency so
the completion chunk reflects full duration; update the time.Since(...) call and
ensure you reference the same stream start variable used elsewhere in this flow
(replace lastChunkTime with the stream start variable) in the block that sets
chunk.ExtraFields.Latency.
framework/modelcatalog/main.go (1)

303-326: ⚠️ Potential issue | 🟠 Major

Use the canonical pricing resolver in this public lookup.

GetPricingEntryForModel still probes pricingData directly, so it bypasses the provider/model fallbacks that getPricing now applies for pricing resolution. Callers can still get nil here for models that CalculateCost can price successfully, e.g. provider aliases or request-mode fallbacks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 303 - 326,
GetPricingEntryForModel currently reads mc.pricingData directly (using makeKey
and pricingData) which bypasses the canonical resolver getPricing and its
provider/model fallbacks; change the implementation to iterate the same modes
but call mc.getPricing(model, provider, mode) (or the appropriate getPricing
signature) for each normalized request type, and return
convertTableModelPricingToPricingData(...) on the result when non-nil/ok; remove
direct makeKey/pricingData access so provider aliases and request-mode fallbacks
are honored (refer to GetPricingEntryForModel, getPricing, makeKey,
convertTableModelPricingToPricingData).
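The difference between a raw pricingData probe and a canonical resolver can be shown with a toy catalog. The key format and alias handling below are assumptions for illustration, not the real makeKey/getPricing implementation.

```go
package main

import "fmt"

// Toy pricing store; the "provider/model/mode" key loosely follows the
// makeKey scheme the review mentions, but everything here is a sketch.
type catalog struct {
	pricing map[string]float64
	aliases map[string]string // provider alias -> canonical provider
}

// getPricing is the canonical resolver: a direct hit first, then the
// provider-alias fallback that a raw map probe would miss.
func (c *catalog) getPricing(model, provider, mode string) (float64, bool) {
	if v, ok := c.pricing[provider+"/"+model+"/"+mode]; ok {
		return v, true
	}
	if canon, ok := c.aliases[provider]; ok {
		if v, ok := c.pricing[canon+"/"+model+"/"+mode]; ok {
			return v, true
		}
	}
	return 0, false
}

func main() {
	c := &catalog{
		pricing: map[string]float64{"openai/gpt-4o/chat": 2.5},
		aliases: map[string]string{"azure": "openai"},
	}
	// A direct map probe under the "azure" key would find nothing;
	// the resolver follows the alias and finds the row.
	v, ok := c.getPricing("gpt-4o", "azure", "chat")
	fmt.Println(v, ok)
}
```

This is why a public lookup that bypasses the resolver can return nil for a model the cost path prices successfully: the fallbacks live in the resolver, not in the map.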
♻️ Duplicate comments (4)
plugins/logging/operations.go (2)

1113-1119: ⚠️ Potential issue | 🟠 Major

Preserve video token totals when rebuilding the synthetic response.

This branch patches Seconds later but drops usage entirely. That means token-threshold video pricing never sees TotalTokens, so large video backfills will always use the base per-second rate.

♻️ Proposed fix
 	case schemas.VideoGenerationRequest:
-		// Seconds is not stored in BifrostLLMUsage; the caller must patch it in from
-		// the stored VideoGenerationOutputParsed after this function returns.
+		var videoUsage *schemas.VideoUsage
+		if usage != nil {
+			videoUsage = &schemas.VideoUsage{
+				InputTokens:  usage.PromptTokens,
+				OutputTokens: usage.CompletionTokens,
+				TotalTokens:  usage.TotalTokens,
+			}
+		}
 		return &schemas.BifrostResponse{
 			VideoGenerationResponse: &schemas.BifrostVideoGenerationResponse{
+				Usage:       videoUsage,
 				ExtraFields: extra,
 			},
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 1113 - 1119, In the
VideoGenerationRequest branch inside the function handling
schemas.VideoGenerationRequest, the code builds a schemas.BifrostResponse but
omits the Usage field (dropping TotalTokens), so token-based video pricing can't
see TotalTokens; modify the construction of schemas.BifrostResponse /
schemas.BifrostVideoGenerationResponse to include and propagate the existing
usage (e.g., the original usage or usage.TotalTokens from the incoming parsed
object such as VideoGenerationOutputParsed) while keeping Seconds patched later,
ensuring Usage.TotalTokens is preserved and returned.

949-958: ⚠️ Potential issue | 🟠 Major

Don't require token_usage before rebuilding per-unit modalities.

This still skips image/video-style backfills when usage is nil, even though this stack can bill some requests from stored output count/duration. The earlier deserialization check only keys off token/cache fields, so these rows return before the later patch block can restore Seconds or image usage from the serialized outputs. Derive requestType first and only require usage for request types that are actually token-billed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 949 - 958, Move the requestType
derivation (requestType := schemas.RequestType(logEntry.Object)) above the
early-return that checks usage/cacheDebug, and change the error/skip logic so
that you only require usage when requestType corresponds to token-billed
modalities (use the schemas.RequestType enum/values for text/token-based
requests); for image/video or output-count/duration modalities allow usage==nil
so the later patch block can rebuild Seconds/image usage from serialized outputs
(keep references to logEntry, cacheDebug and usage in the adjusted condition).
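The suggested ordering, derive the request type first and require usage only for token-billed modalities, can be sketched as below. The request-type strings echo the schemas.RequestType values, but the gating helper and its case list are assumptions for illustration.

```go
package main

import "fmt"

// requiresTokenUsage reports whether a request type can only be billed from
// token usage. Modalities billable from output count, bytes, or duration
// should not be skipped just because token_usage was never stored.
// The case list here is illustrative, not the plugin's actual code.
func requiresTokenUsage(requestType string) bool {
	switch requestType {
	case "image_generation", "image_edit", "image_variation",
		"speech", "transcription", "video_generation":
		return false
	default:
		return true
	}
}

func main() {
	var usage *int // stands in for the parsed token_usage, absent here
	for _, rt := range []string{"chat", "video_generation"} {
		if usage == nil && requiresTokenUsage(rt) {
			fmt.Println(rt, "skipped: no usage")
			continue
		}
		fmt.Println(rt, "rebuilt from stored outputs")
	}
}
```

With this shape, a video backfill with nil token usage still reaches the patch block that restores Seconds from the serialized output, while token-billed rows keep the early return.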
framework/modelcatalog/overrides_test.go (1)

26-26: ⚠️ Potential issue | 🟠 Major

Re-enable the override-precedence tests.

These unconditional skips still remove the regression coverage for override matching and patching while this stack is actively changing that logic. If a case genuinely needs gating, make the skip conditional and state the prerequisite instead of disabling the suite wholesale.

Also applies to: 38-38, 78-78, 111-111, 145-145, 172-172, 199-199, 231-231, 265-265, 298-298, 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` at line 26, Remove the
unconditional t.Skip() calls in framework/modelcatalog/overrides_test.go (search
for all occurrences of t.Skip()) and re-enable the tests; if a test truly needs
gating, replace t.Skip() with a conditional skip that checks a clear
prerequisite (e.g., an environment variable, feature flag, or helper function
like requireOverrideFeatureEnabled()) and call t.Skipf with a descriptive
message stating that prerequisite, so override-precedence tests run by default
and only skip when the explicit condition is unmet.
framework/modelcatalog/pricing.go (1)

619-627: ⚠️ Potential issue | 🟠 Major

Mirror the generic per-second fallback on the video input side.

The output leg already falls back to OutputCostPerSecond, but this helper still returns 0 when only InputCostPerSecond is populated. That underbills video rows that use the shared per-second input column.

Suggested fix
func tieredVideoInputPerSecondRate(pricing *configstoreTables.TableModelPricing, totalTokens int) float64 {
	if totalTokens > TokenTierAbove128K && pricing.InputCostPerVideoPerSecondAbove128kTokens != nil {
		return *pricing.InputCostPerVideoPerSecondAbove128kTokens
	}
	if pricing.InputCostPerVideoPerSecond != nil {
		return *pricing.InputCostPerVideoPerSecond
	}
+	if pricing.InputCostPerSecond != nil {
+		return *pricing.InputCostPerSecond
+	}
	return 0
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 619 - 627,
tieredVideoInputPerSecondRate currently returns 0 when only the shared
per-second input column is populated; update it to mirror the output-side
fallback by checking pricing.InputCostPerSecond and returning its value if
non-nil (i.e., after checking InputCostPerVideoPerSecond and
InputCostPerVideoPerSecondAbove128kTokens, return *pricing.InputCostPerSecond
when present) so video input rows use the shared per-second input rate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/images.go`:
- Around line 74-80: Backfill only when the usage field is missing: in the
BackfillParams methods, change the logic that sets r.Usage.NumInputImages so it
does not overwrite a provider-supplied value — only assign when r.Usage is nil
or r.Usage.NumInputImages is zero/unset. Locate the assignments referencing
r.Usage.NumInputImages and numInputImages in both BackfillParams implementations
and add the guard (check existing r.Usage and r.Usage.NumInputImages) before
writing the value.

In `@framework/configstore/migrations.go`:
- Around line 4174-4214: The Rollback currently iterates over columns and will
drop them if present (using mg.DropColumn on tables.TableModelPricing), which
can destructively remove columns on pristine installs; make the rollback
non-destructive by replacing the DropColumn logic with a no-op (simply return
nil) or otherwise only undo changes that this migration explicitly added (e.g.,
check a migration-specific marker added in Up) — update the Rollback function
(the anonymous Rollback in this migration) to avoid dropping those columns
unconditionally.

In `@framework/modelcatalog/pricing.go`:
- Around line 59-68: computeCacheEmbeddingCost (and sibling functions
computeEmbeddingCost and computeRerankCost) currently multiply the base
per-token columns directly and therefore ignore tiered rates (the
*_above_128k_tokens and *_above_200k_tokens fields); change each function to
compute cost using the existing tiering helper on the pricing row instead of
straight multiplication — i.e., after you fetch pricing via
mc.getPricing(*Model, *Provider, ...), ask the pricing/tier helper (the function
that applies the *_above_128k_tokens and *_above_200k_tokens rules) for the
effective per-token cost or total token cost for the given token count and use
that value to compute the return; update computeCacheEmbeddingCost,
computeEmbeddingCost, and computeRerankCost to call that tier helper (or a small
wrapper on the pricing object) so billing switches rates once thresholds are
crossed.
- Around line 815-823: The getPricing lookup needs to also fallback to
image_generation for variations: when requestType equals
schemas.ImageVariationRequest, add the same retry logic that checks
mc.pricingData[makeKey(model, provider,
normalizeRequestType(schemas.ImageGenerationRequest))] and return that pricing
if found; mirror the existing block used for
ImageEditRequest/ImageEditStreamRequest, including the mc.logger.Debug message,
so calculateBaseCost -> computeImageCost will receive the image_generation
pricing instead of nil.
- Around line 25-36: In CalculateCost, avoid unguarded dereference of
result.GetExtraFields(): first fetch extra := result.GetExtraFields() and check
extra != nil before reading extra.CacheDebug; if extra is nil fall back to
mc.calculateBaseCost(result), otherwise pass the extracted cacheDebug into
mc.calculateCostWithCache(result, cacheDebug) so you no longer risk a nil
pointer when accessing CacheDebug.
- Around line 352-359: The current TTS branch gives up tiered rates when
InputCostPerCharacter is nil by multiplying audioTextInputChars with
pricing.InputCostPerToken; instead keep the TTS fallback on the tiered path—when
pricing.InputCostPerCharacter == nil set inputCost = tieredInputRate(pricing,
totalTokens) (preserving the existing check that per-character rate wins if
non-nil), using the same totalTokens value used elsewhere so TTS 128k/200k tiers
are applied.

In `@plugins/logging/main.go`:
- Around line 758-760: The code currently calls
p.pricingManager.CalculateCost(result) and always sets entry.Cost to a non-nil
pointer, collapsing “unknown” and real zero-cost; change the pricing API and
call site so missing pricing is preserved: update CalculateCost to return
(float64, bool) or an *float64 (e.g., CalculateCost(result) (float64, bool) or
CalculateCostMaybe) and in the logging code (where p.pricingManager is used and
entry.Cost is set) only assign entry.Cost = &cost when the bool is true (or the
pointer is non-nil); leave entry.Cost nil when the lookup missed. Ensure
references to p.pricingManager, CalculateCost, and entry.Cost are updated
accordingly.

In `@plugins/logging/operations.go`:
- Around line 1097-1112: The switch handling request types currently maps only
schemas.ImageGenerationRequest and ImageGenerationStreamRequest to the image
response path, causing ImageEditRequest, ImageEditStreamRequest and
ImageVariationRequest to fall through to the chat/default branch and miss
image-specific usage extraction; update the switch to include
schemas.ImageEditRequest, schemas.ImageEditStreamRequest, and
schemas.ImageVariationRequest so they return a schemas.BifrostResponse with
ImageGenerationResponse (populating schemas.ImageUsage from usage as done for
generation) and ensure you do NOT overwrite extra.RequestType so downstream
pricing can still read values like "image_edit"; apply the same inclusion/fix to
the similar block referenced around lines 1121-1128.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6fedc341-3cf9-4e1b-9fc3-65391a1d8d5e

📥 Commits

Reviewing files that changed from the base of the PR and between b6d51a5 and e0cecca.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (12)
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/bifrost.go
  • framework/streaming/images.go
  • framework/streaming/chat.go
  • docs/contributing/setting-up-repo.mdx
  • core/providers/anthropic/anthropic.go
  • plugins/governance/main.go
  • framework/streaming/responses.go
  • framework/streaming/audio.go
  • core/schemas/provider.go
  • core/schemas/speech.go

Comment thread core/schemas/images.go
Comment thread framework/configstore/migrations.go
Comment thread framework/modelcatalog/pricing.go
Comment thread framework/modelcatalog/pricing.go
Comment thread framework/modelcatalog/pricing.go Outdated
Comment thread framework/modelcatalog/pricing.go
Comment thread framework/modelcatalog/pricing.go
Comment thread plugins/logging/main.go Outdated
Comment thread plugins/logging/operations.go Outdated
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from e0cecca to de6493a on March 12, 2026 at 09:56
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
plugins/governance/main.go (2)

1275-1315: ⚠️ Potential issue | 🟠 Major

Treat non-token cost as usage data.

With the unified CalculateCost path, image/audio/video/search-style requests can now produce a non-zero cost while tokensUsed stays 0. HasUsageData: tokensUsed > 0 means those requests will be treated as if they had no billable usage, so governance cost/budget tracking undercounts the new pricing modes.

Suggested fix
 		usageUpdate := &UsageUpdate{
 			VirtualKey:   virtualKey,
 			Provider:     provider,
 			Model:        model,
 			Success:      success,
 			TokensUsed:   int64(tokensUsed),
 			Cost:         cost,
 			RequestID:    requestID,
 			UserID:       userID,
 			IsStreaming:  isStreaming,
 			IsFinalChunk: isFinalChunk,
-			HasUsageData: tokensUsed > 0,
+			HasUsageData: tokensUsed > 0 || cost > 0,
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/governance/main.go` around lines 1275 - 1315, The HasUsageData flag
is only set from tokensUsed, so requests that produce non-zero cost but zero
tokens (e.g., images/audio/video/search) are treated as having no usage; update
the UsageUpdate construction (around the UsageUpdate literal where variables
cost and tokensUsed are set) to set HasUsageData to true when either tokensUsed
> 0 or cost > 0.0 (i.e., HasUsageData: tokensUsed > 0 || cost > 0.0) so
non-token billable requests are counted by governance/budget tracking.

1112-1116: ⚠️ Potential issue | 🔴 Critical

Avoid reading result from the background worker.

This goroutine still dereferences the live *schemas.BifrostResponse, and the new CalculateCost(result) call walks even more of that object. In streaming paths, that can race with pooled-response release and produce nondeterministic reads or panics. Snapshot the primitive usage/cost fields before spawning the goroutine, then pass only the derived values into postHookWorker.

Based on learnings: In Go streaming handlers that reuse pooled response objects, do not release them while asynchronous readers (PostLLMHook goroutines) may still access them.

Also applies to: 1267-1278

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/governance/main.go` around lines 1112 - 1116, The goroutine is
capturing and dereferencing the live *schemas.BifrostResponse (result) which can
race with pooled-response reuse; before calling p.wg.Add/starting the goroutine,
snapshot the primitive fields and derived values you need (e.g., usage fields,
cost from CalculateCost(result), any booleans/IDs like requestID, userID,
isCacheRead, isBatch, isFinalChunk, provider, model, requestType, effectiveVK)
into local variables, then pass only those primitives into postHookWorker
instead of the live result pointer; do the same refactor for the other
occurrence around lines 1267-1278 to avoid asynchronous reads of pooled objects.
plugins/telemetry/main.go (1)

428-450: ⚠️ Potential issue | 🔴 Critical

Snapshot response fields before the metrics goroutine.

You already copy the context labels before go func(), but the goroutine still reads the live result for cost, tokens, cache hits, and stream latency. In streaming paths that response may come from a pool, so this async read can race with release. Please compute the derived primitives synchronously, then hand only those values to the goroutine.

Based on learnings: In Go streaming handlers that reuse pooled response objects, do not release them while asynchronous readers (PostLLMHook goroutines) may still access them.

Also applies to: 481-539

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/telemetry/main.go` around lines 428 - 450, The goroutine is reading
live fields off result (and calling p.pricingManager.CalculateCost(result))
which can race with pooled response reuse; before launching the anonymous
goroutine in PostLLMHook, synchronously snapshot the derived primitives you need
(e.g. cost from pricingManager.CalculateCost(result), token counts, cache hit
booleans, extraFields.Latency/ChunkIndex from result.GetExtraFields(), and any
flags like hasFinalChunkIndicator/isFinalChunk and promLabelValues) into local
variables, then pass those primitives into the goroutine closure and stop
referencing result or calling methods on it inside the goroutine; this ensures
p.StreamFirstTokenLatencySeconds, p.StreamInterTokenLatencySeconds and the
cost/tokens/cache metrics are recorded from safe, copied values.
plugins/logging/operations.go (1)

941-945: ⚠️ Potential issue | 🟠 Major

Deserialize the stored modality payloads before reading them back.

The later patches read logEntry.TranscriptionOutputParsed, ImageGenerationOutputParsed, and VideoGenerationOutputParsed, but this branch only calls DeserializeFields() when token_usage or cache_debug are missing. For persisted logs where those two fields are already loaded, the modality-specific parsed structs stay nil and recalculation loses seconds/image counts/token details.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 941 - 945, The current condition
only calls logEntry.DeserializeFields() when TokenUsageParsed or
CacheDebugParsed are missing, causing modality payloads to remain nil; update
the branch that checks parsed fields so it also calls DeserializeFields()
whenever any of TranscriptionOutputParsed, ImageGenerationOutputParsed, or
VideoGenerationOutputParsed is nil while their corresponding raw fields
(TranscriptionOutput, ImageGenerationOutput, VideoGenerationOutput) are
non-empty; specifically, extend the boolean condition around the call to include
checks for these parsed fields (referencing logEntry.DeserializeFields,
logEntry.TranscriptionOutputParsed, logEntry.ImageGenerationOutputParsed,
logEntry.VideoGenerationOutputParsed and the raw fields) so deserialization runs
whenever any persisted modality payload exists but its parsed struct is nil.
♻️ Duplicate comments (2)
plugins/logging/operations.go (1)

951-953: ⚠️ Potential issue | 🟠 Major

Don't require token_usage for per-unit modalities.

This guard still drops any recalculation path whose billable input comes from modality payloads instead of token_usage. VideoGeneration logs, for example, can reach Line 1002 with Seconds available but never get there because TokenUsageParsed is normally nil for that flow, so those backfills will keep being skipped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 951 - 953, The current guard in
the cost calculation early-return requires token usage (usage/TokenUsageParsed)
even for per-unit modalities like VideoGeneration that bill by Seconds; update
the check around usage and cacheDebug (the if using usage == nil && (cacheDebug
== nil || !cacheDebug.CacheHit)) to allow continuing when the log's modality
payload contains per-unit billing fields (e.g., Seconds for VideoGeneration) by
adding a condition that treats those modalities as valid even if
TokenUsageParsed/usage is nil; locate the cost calculation function that
references usage, cacheDebug and logEntry.ID and modify the guard to only return
an error when neither token usage nor any modality-specific billable field
(Seconds, etc.) is present.
framework/modelcatalog/overrides_test.go (1)

25-26: ⚠️ Potential issue | 🟠 Major

Re-enable the override-precedence suite.

These unconditional t.Skip() calls remove the regression coverage for override matching, precedence, deletion, and patching on the current branch. If this is being handled in a follow-up stacked PR, please keep the skips out of this PR or gate them behind a real prerequisite check instead of disabling the tests outright.

Also applies to: 37-38, 77-78, 110-111, 144-145, 171-172, 198-199, 230-231, 264-265, 297-298, 330-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` around lines 25 - 26,
Unconditionally skipping tests in overrides_test.go (e.g.,
TestSetProviderPricingOverrides_InvalidRegex) removes important regression
coverage; remove the t.Skip() calls and either re-enable the tests or replace
the unconditional skip with a gated check that only skips when a real
prerequisite is missing (for example, a runtime feature flag or environment
variable). Locate each test function in overrides_test.go that calls t.Skip(),
delete the t.Skip() line, and if a genuine precondition exists, add a clear
conditional (e.g., if !featureEnabled { t.Skip("reason") }) so the test suite
runs by default while still allowing intentional gating.
🧹 Nitpick comments (2)
ui/app/workspace/logs/sheets/logDetailsSheet.tsx (1)

341-341: Consider formatting the cost for better readability.

API costs are often very small decimals (e.g., 0.000123456789). Displaying the raw value might be hard to read. Consider applying some formatting for consistency:

💡 Suggested formatting options

Option 1 — Fixed precision for small values:

-<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost}` : "-"} />
+<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost.toFixed(6)}` : "-"} />

Option 2 — Dynamic formatting based on magnitude:

-<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost}` : "-"} />
+<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost < 0.01 ? log.cost.toPrecision(4) : log.cost.toFixed(2)}` : "-"} />
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx` at line 341, The
LogEntryDetailsView is rendering raw log.cost which can be tiny decimals and
hard to read; add a small formatting helper (e.g., formatCost) and use it where
LogEntryDetailsView is passed value (the spot using log.cost) to produce
consistent output: for example, use Intl.NumberFormat with significant digits or
a conditional that shows fixed 2-6 decimals for values >= 0.0001 and switches to
scientific or trimmed significant-figure formatting for very small values, and
include the dollar sign there (`$${formatCost(log.cost)}`) so all cost displays
use the helper.
framework/modelcatalog/main.go (1)

307-324: Consider routing this helper through getPricing.

Adding more modes here helps, but the method still bypasses the provider/model fallbacks that CalculateCost now relies on. That means GetPricingEntryForModel can still return nil for entries that the runtime resolver would find via the Gemini→Vertex fallback, the Bedrock `anthropic.` model prefix, or provider/model normalization.

♻️ Possible simplification
func (mc *ModelCatalog) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *PricingEntry {
-	mc.mu.RLock()
-	defer mc.mu.RUnlock()
 	// Check all modes
 	for _, mode := range []schemas.RequestType{
 		schemas.TextCompletionRequest,
 		schemas.ChatCompletionRequest,
 		schemas.ResponsesRequest,
 		schemas.EmbeddingRequest,
 		schemas.RerankRequest,
 		schemas.SpeechRequest,
 		schemas.TranscriptionRequest,
 		schemas.ImageGenerationRequest,
 		schemas.ImageEditRequest,
 		schemas.ImageVariationRequest,
 		schemas.VideoGenerationRequest,
 	} {
-		key := makeKey(model, string(provider), normalizeRequestType(mode))
-		pricing, ok := mc.pricingData[key]
+		pricing, ok := mc.getPricing(model, string(provider), mode)
 		if ok {
-			return convertTableModelPricingToPricingData(&pricing)
+			return convertTableModelPricingToPricingData(pricing)
 		}
 	}
 	return nil
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 307 - 324,
GetPricingEntryForModel currently reads mc.pricingData directly (using makeKey +
normalizeRequestType) which bypasses the resolver and fallbacks used by
CalculateCost; update GetPricingEntryForModel to call the existing getPricing
function (or the equivalent resolver used by CalculateCost) for each mode
instead of direct map lookups so provider/model normalization and fallback rules
(e.g., Gemini→Vertex, Bedrock prefixes) are honored; specifically replace the
mc.pricingData[key] lookup in the loop with a call to getPricing (or the
resolver function) and then return convertTableModelPricingToPricingData on the
returned pricing entry when non-nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/huggingface/huggingface.go`:
- Around line 1091-1093: The streaming URL is built from the raw modelName which
can bypass the resolved provider-specific model ID mapping used by
ImageGeneration via completeRequestWithModelAliasCache, causing stream 404s;
update the code to use the resolved provider model ID (the value returned/used
by completeRequestWithModelAliasCache) when constructing the fal-ai stream path
instead of the original modelName — locate where modelName is interpolated into
defaultPath (used with provider.buildRequestURL and
schemas.ImageGenerationStreamRequest) and replace it with the resolved
provider-specific model identifier that ImageGeneration uses so alias/backfilled
catalog entries resolve consistently for both streaming and non-streaming paths.
- Around line 1115-1117: The current call to provider.client.Do(req, resp) in
core/providers/huggingface/huggingface.go does not honor context cancellation
during the initial HTTP handshake; replace it with
providerUtils.MakeRequestWithContext (passing the request's context,
provider.client, req, resp) so the context deadline/cancellation is respected
before the body stream is available, and keep the existing
SetupStreamCancellation usage for post-handshake stream cancellation (ensure you
import/resolve providerUtils and preserve the same req/resp handling).

In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 48-49: The doc text currently says rates change “above 200k
tokens” but the pricing engine consults both *_above_128k_tokens and
*_above_200k_tokens; update the paragraph to document both active token tiers
and clarify that pricing may use the 128k threshold as well as the 200k
threshold (i.e., the engine checks *_above_128k_tokens and *_above_200k_tokens
and applies the appropriate tiered rate based on which threshold the context
length crosses), referencing the tier keys *_above_128k_tokens and
*_above_200k_tokens so readers know both are consulted.
- Around line 337-344: The documentation for getPricing is missing the image
edit/variation fallback step; update the comment above function getPricing
(signature: func (mc *ModelCatalog) getPricing(model, provider string,
requestType schemas.RequestType)) to include the additional fallback where, for
edit and variation request types, the function retries with requestType
image_generation (i.e., "image_generation" fallback for edit/variation) as part
of the multi-step chain so the doc matches the implementation.
- Around line 30-34: The video pricing bullet omits duration-based input
pricing; update the model catalog text to state that video supports
duration-based input pricing as well as duration-based output pricing by
mentioning that computeVideoCost can apply InputCostPerVideoPerSecond when
prompt-token usage is absent; reference the computeVideoCost logic and the
InputCostPerVideoPerSecond configuration so readers know the input leg can be
priced per-second even without prompt tokens.

In `@framework/modelcatalog/pricing_test.go`:
- Around line 1227-1244: Add a test case to TestNormalizeStreamRequestType that
asserts normalizeStreamRequestType maps schemas.ImageEditStreamRequest to
schemas.ImageEditRequest; locate the table in TestNormalizeStreamRequestType and
add the pair {schemas.ImageEditStreamRequest, schemas.ImageEditRequest}
alongside the other stream→non-stream expectations so the ImageEdit stream
branch in normalizeStreamRequestType is covered.

In `@framework/modelcatalog/pricing.go`:
- Around line 648-666: tieredCacheReadInputTokenRate and
tieredCacheCreationInputTokenRate currently fall back directly to
pricing.InputCostPerToken when cache-specific rates are nil, skipping the
128k/200k tier; change both helpers so that when their cache-specific fields are
nil they defer to the same tiered input-token logic (e.g. call a shared
tieredInputTokenRate(pricing, totalTokens) or replicate the TokenTierAbove200K
check against pricing.InputCostPerTokenAbove200kTokens) so cached tokens use the
active text tier; update tieredCacheReadInputTokenRate and
tieredCacheCreationInputTokenRate to reference TokenTierAbove200K and the
input-tier fallback instead of directly returning InputCostPerToken.

In `@plugins/logging/operations.go`:
- Around line 1067-1080: The current construction of schemas.SpeechUsage in the
SpeechResponse branch overwrites richer request-derived usage data; instead,
prefer the already-parsed usage when present: if logEntry.SpeechOutputParsed !=
nil && logEntry.SpeechOutputParsed.Usage != nil, set speechUsage =
logEntry.SpeechOutputParsed.Usage, otherwise fall back to building a
schemas.SpeechUsage from usage.PromptTokens/CompletionTokens/TotalTokens as
before; apply this change where the BifrostResponse.SpeechResponse is created so
BifrostSpeechResponse.Usage reuses the original parsed usage instead of losing
character-count or other request-derived fields.

---

Outside diff comments:
In `@plugins/governance/main.go`:
- Around line 1275-1315: The HasUsageData flag is only set from tokensUsed, so
requests that produce non-zero cost but zero tokens (e.g.,
images/audio/video/search) are treated as having no usage; update the
UsageUpdate construction (around the UsageUpdate literal where variables cost
and tokensUsed are set) to set HasUsageData to true when either tokensUsed > 0
or cost > 0.0 (i.e., HasUsageData: tokensUsed > 0 || cost > 0.0) so non-token
billable requests are counted by governance/budget tracking.
- Around line 1112-1116: The goroutine is capturing and dereferencing the live
*schemas.BifrostResponse (result) which can race with pooled-response reuse;
before calling p.wg.Add/starting the goroutine, snapshot the primitive fields
and derived values you need (e.g., usage fields, cost from
CalculateCost(result), any booleans/IDs like requestID, userID, isCacheRead,
isBatch, isFinalChunk, provider, model, requestType, effectiveVK) into local
variables, then pass only those primitives into postHookWorker instead of the
live result pointer; do the same refactor for the other occurrence around lines
1267-1278 to avoid asynchronous reads of pooled objects.

In `@plugins/logging/operations.go`:
- Around line 941-945: The current condition only calls
logEntry.DeserializeFields() when TokenUsageParsed or CacheDebugParsed are
missing, causing modality payloads to remain nil; update the branch that checks
parsed fields so it also calls DeserializeFields() whenever any of
TranscriptionOutputParsed, ImageGenerationOutputParsed, or
VideoGenerationOutputParsed is nil while their corresponding raw fields
(TranscriptionOutput, ImageGenerationOutput, VideoGenerationOutput) are
non-empty; specifically, extend the boolean condition around the call to include
checks for these parsed fields (referencing logEntry.DeserializeFields,
logEntry.TranscriptionOutputParsed, logEntry.ImageGenerationOutputParsed,
logEntry.VideoGenerationOutputParsed and the raw fields) so deserialization runs
whenever any persisted modality payload exists but its parsed struct is nil.

In `@plugins/telemetry/main.go`:
- Around line 428-450: The goroutine is reading live fields off result (and
calling p.pricingManager.CalculateCost(result)) which can race with pooled
response reuse; before launching the anonymous goroutine in PostLLMHook,
synchronously snapshot the derived primitives you need (e.g. cost from
pricingManager.CalculateCost(result), token counts, cache hit booleans,
extraFields.Latency/ChunkIndex from result.GetExtraFields(), and any flags like
hasFinalChunkIndicator/isFinalChunk and promLabelValues) into local variables,
then pass those primitives into the goroutine closure and stop referencing
result or calling methods on it inside the goroutine; this ensures
p.StreamFirstTokenLatencySeconds, p.StreamInterTokenLatencySeconds and the
cost/tokens/cache metrics are recorded from safe, copied values.

---

Duplicate comments:
In `@framework/modelcatalog/overrides_test.go`:
- Around line 25-26: Unconditionally skipping tests in overrides_test.go (e.g.,
TestSetProviderPricingOverrides_InvalidRegex) removes important regression
coverage; remove the t.Skip() calls and either re-enable the tests or replace
the unconditional skip with a gated check that only skips when a real
prerequisite is missing (for example, a runtime feature flag or environment
variable). Locate each test function in overrides_test.go that calls t.Skip(),
delete the t.Skip() line, and if a genuine precondition exists, add a clear
conditional (e.g., if !featureEnabled { t.Skip("reason") }) so the test suite
runs by default while still allowing intentional gating.

In `@plugins/logging/operations.go`:
- Around line 951-953: The current guard in the cost calculation early-return
requires token usage (usage/TokenUsageParsed) even for per-unit modalities like
VideoGeneration that bill by Seconds; update the check around usage and
cacheDebug (the if using usage == nil && (cacheDebug == nil ||
!cacheDebug.CacheHit)) to allow continuing when the log's modality payload
contains per-unit billing fields (e.g., Seconds for VideoGeneration) by adding a
condition that treats those modalities as valid even if TokenUsageParsed/usage
is nil; locate the cost calculation function that references usage, cacheDebug
and logEntry.ID and modify the guard to only return an error when neither token
usage nor any modality-specific billable field (Seconds, etc.) is present.

---

Nitpick comments:
In `@framework/modelcatalog/main.go`:
- Around line 307-324: GetPricingEntryForModel currently reads mc.pricingData
directly (using makeKey + normalizeRequestType) which bypasses the resolver and
fallbacks used by CalculateCost; update GetPricingEntryForModel to call the
existing getPricing function (or the equivalent resolver used by CalculateCost)
for each mode instead of direct map lookups so provider/model normalization and
fallback rules (e.g., Gemini→Vertex, Bedrock prefixes) are honored; specifically
replace the mc.pricingData[key] lookup in the loop with a call to getPricing (or
the resolver function) and then return convertTableModelPricingToPricingData on
the returned pricing entry when non-nil.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx`:
- Line 341: The LogEntryDetailsView is rendering raw log.cost which can be tiny
decimals and hard to read; add a small formatting helper (e.g., formatCost) and
use it where LogEntryDetailsView is passed value (the spot using log.cost) to
produce consistent output: for example, use Intl.NumberFormat with significant
digits or a conditional that shows fixed 2-6 decimals for values >= 0.0001 and
switches to scientific or trimmed significant-figure formatting for very small
values, and include the dollar sign there (`$${formatCost(log.cost)}`) so all
cost displays use the helper.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 705c2dd6-22c0-455a-a845-c8c2c284686e

📥 Commits

Reviewing files that changed from the base of the PR and between e0cecca and de6493a.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • framework/streaming/chat.go
  • core/providers/gemini/gemini.go
  • framework/streaming/transcription.go
  • framework/streaming/audio.go
  • core/bifrost.go
  • core/schemas/images.go
  • core/providers/anthropic/anthropic.go
  • core/providers/openai/openai.go
  • docs/contributing/setting-up-repo.mdx
  • framework/tracing/tracer.go

Comment thread core/providers/huggingface/huggingface.go
Comment thread core/providers/huggingface/huggingface.go
Comment thread docs/architecture/framework/model-catalog.mdx Outdated
Comment thread docs/architecture/framework/model-catalog.mdx
Comment thread framework/modelcatalog/pricing_test.go
Comment thread framework/modelcatalog/pricing.go
Comment thread plugins/logging/operations.go
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
framework/modelcatalog/main.go (1)

303-323: ⚠️ Potential issue | 🟠 Major

Reuse the shared fallback chain in GetPricingEntryForModel.

framework/modelcatalog/pricing.go:getPricing now resolves the Gemini→Vertex fallback, Bedrock `anthropic.` aliases, and the request-type fallbacks, but this public helper still scans pricingData by exact key only. That means GetPricingEntryForModel can return nil for models that CalculateCost can already price successfully.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 303 - 323,
GetPricingEntryForModel currently scans mc.pricingData by exact key and misses
the shared fallback logic in getPricing; update GetPricingEntryForModel to call
the shared getPricing fallback for each mode instead of directly indexing
mc.pricingData. Specifically, inside the loop over schemas.RequestType values
call mc.getPricing(model, provider, normalizeRequestType(mode)) (or the
appropriate getPricing signature), and if it returns a table pricing result use
convertTableModelPricingToPricingData(&pricing) and return that; keep the
existing mutex usage and return nil if no mode yields a price.
♻️ Duplicate comments (2)
framework/modelcatalog/overrides_test.go (1)

26-26: ⚠️ Potential issue | 🟠 Major

Re-enable the pricing override test suite.

These unconditional t.Skip() calls still remove regression coverage for the override precedence and patching rules that this pricing stack is changing. If a subset truly needs to stay disabled temporarily, gate it on an explicit prerequisite instead of skipping unconditionally.

Also applies to: 38-38, 78-78, 111-111, 145-145, 172-172, 199-199, 231-231, 265-265, 298-298, 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` at line 26, Unskip the
unconditional t.Skip() calls in overrides_test.go so the pricing override test
suite runs again: remove each t.Skip() (or replace it with a conditional skip)
at the listed locations and restore the tests that assert override precedence
and patching rules; if a subset truly must remain disabled, gate them on an
explicit prerequisite (e.g., a named env var check like
os.Getenv("PRICING_OVERRIDE_TESTS") or a helper function prerequisitesMet(t)
that calls t.Skipf with a clear reason) rather than skipping unconditionally,
and update any test helper names (e.g., prerequisitesMet, ensureEnvironment)
referenced in the tests accordingly.
plugins/logging/operations.go (1)

941-953: ⚠️ Potential issue | 🟠 Major

Per-unit log backfills still bail out before the new pricing reconstruction can run.

This path only deserializes token_usage/cache_debug, then immediately returns when usage == nil. That means image/video/transcription/speech logs without stored token usage never reach the modality-specific patching below, even though their serialized output payloads can still carry the billable units needed by CalculateCost. Deserialize the modality outputs first and only require token usage for request types that actually depend on it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 941 - 953, The current check
deserializes only token/cache fields and bails when TokenUsageParsed == nil,
preventing modality-specific logs from reaching cost reconstruction; change the
flow in the same block that calls logEntry.DeserializeFields() so you first
deserialize the modality outputs (e.g., any method on the logEntry that loads
serialized outputs or response payloads) before deciding to return, and tighten
the error condition so you only error when a request type that needs tokens
actually lacks TokenUsageParsed (leave CacheDebugParsed/CacheDebug.CacheHit
logic intact); ensure CalculateCost is invoked for
image/video/transcription/speech logs by allowing those paths through after
outputs are deserialized even if TokenUsageParsed is nil.
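One way to make the guard request-type aware is a small predicate that names which request types genuinely need token usage. A minimal sketch, assuming illustrative request-type strings rather than the repo's actual constants:

```go
package main

import "fmt"

// requiresTokenUsage reports whether a request type can only be billed
// from token counts. Modality request types can instead be priced from
// their serialized output payloads (seconds, images, characters), so a
// missing token_usage should not abort cost reconstruction for them.
// The request-type strings here are illustrative, not repo constants.
func requiresTokenUsage(requestType string) bool {
	switch requestType {
	case "image_generation", "video_generation", "speech", "transcription":
		return false
	default:
		return true
	}
}

func main() {
	for _, rt := range []string{"chat_completion", "transcription"} {
		fmt.Printf("%s needs token usage: %v\n", rt, requiresTokenUsage(rt))
	}
}
```

The early return would then only fire when `requiresTokenUsage(logEntry.RequestType)` is true and usage is still nil after all payloads are deserialized.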
🧹 Nitpick comments (4)
ui/app/workspace/logs/sheets/logDetailsSheet.tsx (1)

341-341: Format the cost before rendering.

Line 341 interpolates the raw float directly, so tiny prices can show up as scientific notation or precision noise (for example, $1e-7 or $0.30000000000000004). A small formatter will make this field much more readable.

Possible cleanup
+const formatCost = (cost: number) =>
+	cost.toLocaleString("en-US", {
+		useGrouping: false,
+		minimumFractionDigits: 0,
+		maximumFractionDigits: 8,
+	});
+
 ...
-									<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost}` : "-"} />
+									<LogEntryDetailsView
+										className="w-full"
+										label="Cost"
+										value={log.cost != null ? `$${formatCost(log.cost)}` : "-"}
+									/>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx` at line 341, The Cost field
is rendering raw floats (log.cost) which can display scientific notation or
precision noise; before passing to LogEntryDetailsView, format log.cost (e.g.,
use Intl.NumberFormat or toFixed with appropriate decimals) and prepend the
dollar sign, while preserving the existing null check (keep "-" when cost is
null/undefined). Update the usage where LogEntryDetailsView is called (the line
with label="Cost" and value={log.cost ...}) to compute a formattedCost string
(or inline formatted expression) that reliably displays currency like "$0.30"
instead of raw floats.
framework/configstore/migrations.go (1)

4136-4205: Share the pricing-column list between Migrate and Rollback.

This list is maintained twice in the same function, so the next pricing field addition can easily land in only one path and silently skew rollback behavior. Pull the slice up once and iterate over the same source in both branches.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 4136 - 4205, The pricing
column slice is duplicated inside the Migrate and Rollback closures; extract the
slice (e.g., columns := []string{...}) into a single variable shared by both
Migrate and Rollback (declare it once in the surrounding scope of the migration
entry), then have both Migrate and Rollback iterate over that single columns
variable when calling mg.HasColumn/mg.AddColumn and tx.Migrator().DropColumn (or
similar) for tables.TableModelPricing so additions/removals stay consistent
between the two paths.
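The shape of the fix is to hoist the slice so both closures iterate one source of truth. A minimal sketch with stand-in callbacks for `mg.AddColumn` and `tx.Migrator().DropColumn`; the column names are a small illustrative subset, not the migration's full list:

```go
package main

import "fmt"

// pricingColumns is declared once so Migrate and Rollback always agree;
// adding a field in only one path becomes impossible by construction.
var pricingColumns = []string{
	"input_cost_per_video_per_second",
	"output_cost_per_video_per_second",
	"search_cost_per_query",
}

// migrate and rollback stand in for the GORM Migrate/Rollback closures.
func migrate(addColumn func(string)) {
	for _, col := range pricingColumns {
		addColumn(col)
	}
}

func rollback(dropColumn func(string)) {
	for _, col := range pricingColumns {
		dropColumn(col)
	}
}

func main() {
	var added, dropped []string
	migrate(func(c string) { added = append(added, c) })
	rollback(func(c string) { dropped = append(dropped, c) })
	fmt.Println("columns stay in sync:", len(added) == len(dropped))
}
```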
framework/modelcatalog/pricing_test.go (1)

16-17: Use the shared pointer helper instead of local & wrappers.

These helpers just repackage &v. Switching the call sites to bifrost.Ptr(...) keeps the tests aligned with the repo-wide pointer convention.

Based on learnings, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Remove the
local pointer helper functions ptr and intPtr and switch their call sites to the
shared repository helper bifrost.Ptr(...); replace all uses of ptr(x) and
intPtr(x) with bifrost.Ptr(x) (using the appropriate typed literal where needed)
and delete the ptr and intPtr functions so the test follows the repo-wide
pointer convention.
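For context, a shared pointer helper of this kind is typically a one-line generic. This is a sketch of the shape such a helper usually has; the actual `bifrost.Ptr` signature in the repo may differ:

```go
package main

import "fmt"

// Ptr returns a pointer to v, replacing per-test helpers like
// ptr(x) and intPtr(x) with one generic function.
func Ptr[T any](v T) *T {
	return &v
}

func main() {
	// Illustrative call sites standing in for the test fixtures.
	inputCost := Ptr(0.000002)
	contextWindow := Ptr(128)
	fmt.Println(*inputCost, *contextWindow)
}
```

Because type inference picks `T` from the argument, a single helper covers the float, int, and string pointers the pricing tests need.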
framework/configstore/tables/modelpricing.go (1)

12-13: Consider adding explicit column: tags for consistency.

Lines 12-13 rely on GORM's auto-generated column names, while all other cost fields (lines 14+) have explicit column: tags. Adding explicit tags here would improve consistency and make the schema more self-documenting.

♻️ Suggested change
 	// Costs - Text
-	InputCostPerToken          float64  `gorm:"not null" json:"input_cost_per_token"`
-	OutputCostPerToken         float64  `gorm:"not null" json:"output_cost_per_token"`
+	InputCostPerToken          float64  `gorm:"not null;column:input_cost_per_token" json:"input_cost_per_token"`
+	OutputCostPerToken         float64  `gorm:"not null;column:output_cost_per_token" json:"output_cost_per_token"`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/tables/modelpricing.go` around lines 12 - 13, Add
explicit GORM `column:` tags to the InputCostPerToken and OutputCostPerToken
struct fields to match the other cost fields for consistency and clarity; update
the struct fields named InputCostPerToken and OutputCostPerToken in
modelpricing.go to include `gorm:"column:input_cost_per_token;not null"` and
`gorm:"column:output_cost_per_token;not null"` respectively so the schema and
JSON tags remain clear and consistent with the rest of the model.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/openai/openai.go`:
- Around line 4608-4612: The final image-edit chunk is using
time.Since(lastChunkTime) which measures only the gap since the previous chunk;
replace that with the end-to-end stream duration (e.g.,
time.Since(streamStartTime).Milliseconds() or the existing request/start-time
variable used when the stream began) and set chunk.ExtraFields.Latency to that
total duration; if no stream-start timestamp exists, capture one at the start of
the image-edit stream and use it here (leave
chunk.BackfillParams(&schemas.BifrostRequest{ImageEditRequest: request})
unchanged).

In `@core/schemas/images.go`:
- Around line 71-89: The BackfillParams method on BifrostImageGenerationResponse
(and the other image backfill method on the same type around lines 181-195)
currently dereference the receiver and will panic if r is nil; add a nil-check
at the top of each backfill function (e.g., if r == nil { return }) so the
function returns early when the response is nil before touching r. Ensure you
apply this guard in BackfillParams and the other image backfill method on
BifrostImageGenerationResponse to prevent nil dereference.

In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 30-34: The modality summary is inconsistent with the billing code:
update the Text Operations bullet to remove any implication that embeddings
inherit provider-level cache token pricing (computeEmbeddingCost does not use
cache token details), and expand the Video Processing bullet to reflect that
computeVideoCost prefers token-based output pricing and only falls back to
per-second output pricing when completion/output tokens are not reported (and
that input can be billed by duration via input_cost_per_video_per_second when
prompt-token usage is absent). Edit the bullets referencing computeEmbeddingCost
and computeVideoCost to match those behaviors and use the same terms/field names
(e.g., input_cost_per_video_per_second, output_cost_per_video_per_second) so the
doc aligns with the billing functions.

In `@framework/modelcatalog/pricing.go`:
- Around line 462-468: The per-pixel pricing fallback incorrectly defaults
imageUsage.NumInputImages to 1, causing text-to-image and omitted counts to be
billed; update the block that checks pricing.InputCostPerPixel and pixels to
treat NumInputImages == 0 as zero billed images (i.e., return 0) instead of
defaulting to 1, then compute the cost as float64(pixels * NumInputImages) *
*pricing.InputCostPerPixel when NumInputImages > 0; refer to
pricing.InputCostPerPixel, pixels, and imageUsage.NumInputImages to locate and
change the logic.

---

Outside diff comments:
In `@framework/modelcatalog/main.go`:
- Around line 303-323: GetPricingEntryForModel currently scans mc.pricingData by
exact key and misses the shared fallback logic in getPricing; update
GetPricingEntryForModel to call the shared getPricing fallback for each mode
instead of directly indexing mc.pricingData. Specifically, inside the loop over
schemas.RequestType values call mc.getPricing(model, provider,
normalizeRequestType(mode)) (or the appropriate getPricing signature), and if it
returns a table pricing result use
convertTableModelPricingToPricingData(&pricing) and return that; keep the
existing mutex usage and return nil if no mode yields a price.

---

Duplicate comments:
In `@framework/modelcatalog/overrides_test.go`:
- Line 26: Unskip the unconditional t.Skip() calls in overrides_test.go so the
pricing override test suite runs again: remove each t.Skip() (or replace it with
a conditional skip) at the listed locations and restore the tests that assert
override precedence and patching rules; if a subset truly must remain disabled,
gate them on an explicit prerequisite (e.g., a named env var check like
os.Getenv("PRICING_OVERRIDE_TESTS") or a helper function prerequisitesMet(t)
that calls t.Skipf with a clear reason) rather than skipping unconditionally,
and update any test helper names (e.g., prerequisitesMet, ensureEnvironment)
referenced in the tests accordingly.

In `@plugins/logging/operations.go`:
- Around line 941-953: The current check deserializes only token/cache fields
and bails when TokenUsageParsed == nil, preventing modality-specific logs from
reaching cost reconstruction; change the flow in the same block that calls
logEntry.DeserializeFields() so you first deserialize the modality outputs
(e.g., any method on the logEntry that loads serialized outputs or response
payloads) before deciding to return, and tighten the error condition so you only
error when a request type that needs tokens actually lacks TokenUsageParsed
(leave CacheDebugParsed/CacheDebug.CacheHit logic intact); ensure CalculateCost
is invoked for image/video/transcription/speech logs by allowing those paths
through after outputs are deserialized even if TokenUsageParsed is nil.

---

Nitpick comments:
In `@framework/configstore/migrations.go`:
- Around line 4136-4205: The pricing column slice is duplicated inside the
Migrate and Rollback closures; extract the slice (e.g., columns :=
[]string{...}) into a single variable shared by both Migrate and Rollback
(declare it once in the surrounding scope of the migration entry), then have
both Migrate and Rollback iterate over that single columns variable when calling
mg.HasColumn/mg.AddColumn and tx.Migrator().DropColumn (or similar) for
tables.TableModelPricing so additions/removals stay consistent between the two
paths.

In `@framework/configstore/tables/modelpricing.go`:
- Around line 12-13: Add explicit GORM `column:` tags to the InputCostPerToken
and OutputCostPerToken struct fields to match the other cost fields for
consistency and clarity; update the struct fields named InputCostPerToken and
OutputCostPerToken in modelpricing.go to include
`gorm:"column:input_cost_per_token;not null"` and
`gorm:"column:output_cost_per_token;not null"` respectively so the schema and
JSON tags remain clear and consistent with the rest of the model.

In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: Remove the local pointer helper functions ptr and intPtr
and switch their call sites to the shared repository helper bifrost.Ptr(...);
replace all uses of ptr(x) and intPtr(x) with bifrost.Ptr(x) (using the
appropriate typed literal where needed) and delete the ptr and intPtr functions
so the test follows the repo-wide pointer convention.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx`:
- Line 341: The Cost field is rendering raw floats (log.cost) which can display
scientific notation or precision noise; before passing to LogEntryDetailsView,
format log.cost (e.g., use Intl.NumberFormat or toFixed with appropriate
decimals) and prepend the dollar sign, while preserving the existing null check
(keep "-" when cost is null/undefined). Update the usage where
LogEntryDetailsView is called (the line with label="Cost" and value={log.cost
...}) to compute a formattedCost string (or inline formatted expression) that
reliably displays currency like "$0.30" instead of raw floats.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1bbaac46-b62e-4178-afa8-b4a8357ebf5c

📥 Commits

Reviewing files that changed from the base of the PR and between de6493a and 1e98928.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • framework/modelcatalog/overrides.go
  • transports/bifrost-http/handlers/providers.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • plugins/governance/main.go
  • plugins/logging/main.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • core/providers/azure/azure.go
  • docs/contributing/setting-up-repo.mdx
  • core/bifrost.go
  • framework/modelcatalog/utils.go

Comment thread core/providers/openai/openai.go
Comment thread core/schemas/images.go
Comment thread docs/architecture/framework/model-catalog.mdx Outdated
Comment thread framework/modelcatalog/pricing.go Outdated
@Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 1e98928 to 0af7f9d on March 12, 2026, 11:47
Contributor

@coderabbitai (bot) left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
framework/modelcatalog/overrides_test.go (1)

26-26: ⚠️ Potential issue | 🟠 Major

Re-enable the override pricing tests.

These unconditional t.Skip() calls remove regression coverage for override precedence and patch behavior in the same pricing refactor. Please either delete them or make the skip conditional on a real prerequisite so CI still exercises this suite.

Also applies to: 38-38, 78-78, 111-111, 145-145, 172-172, 199-199, 231-231, 265-265, 298-298, 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` at line 26, Multiple tests in
overrides_test.go are being unconditionally skipped via t.Skip(), removing
regression coverage; remove those unconditional t.Skip() calls (or replace them
with a real conditional check) so CI runs the tests: delete each plain t.Skip()
occurrence or change it to something like if !prereqSatisfied() {
t.Skip("reason: missing prerequisite") } where prereqSatisfied checks a real
condition (env var, feature flag, or runtime capability). Ensure you update or
add a small helper (e.g., prereqSatisfied or requireFeature) referenced by the
test functions to detect the actual prerequisite so tests only skip when
appropriate and CI executes them otherwise.
core/providers/openai/openai.go (1)

4400-4401: ⚠️ Potential issue | 🟡 Minor

Move startTime to the request boundary.

Line 4400 starts the timer only after client.Do(req, resp) has already returned, so the latency written at Line 4610 still excludes the multipart upload and time to first byte (TTFB). HandleOpenAIImageGenerationStreaming already measures from before Do and seeds lastChunkTime from that same start; the image-edit path should match that behavior.

Suggested fix
-	// Make the request
-	err := client.Do(req, resp)
+	startTime := time.Now()
+	// Make the request
+	err := client.Do(req, resp)
@@
-		startTime := time.Now()
-		lastChunkTime := time.Now()
+		lastChunkTime := startTime

Also applies to: 4610-4613

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 4400 - 4401, The startTime is
being set after client.Do(req, resp) so measured latency omits upload and TTFB;
move the startTime initialization to just before making the request (before
client.Do(req, resp)) and initialize lastChunkTime from that same startTime,
matching HandleOpenAIImageGenerationStreaming behavior; then ensure the later
latency/reporting logic that writes the elapsed time (the block that logs/writes
latency around the response/chunk loop) uses this moved startTime so multipart
upload and TTFB are included in the reported metric.
plugins/logging/operations.go (2)

1122-1128: ⚠️ Potential issue | 🟠 Major

Preserve video token usage in the synthetic response.

This constructor discards the usage argument, so token-priced video logs can only bill from duration and fall back to 0 when seconds are missing. Populate the video response's usage field here the same way the image, speech, and transcription branches do, then keep patching Seconds afterward.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 1122 - 1128, The VideoGeneration
branch currently discards the usage argument; update the constructor that
returns &schemas.BifrostResponse{ VideoGenerationResponse:
&schemas.BifrostVideoGenerationResponse{ ExtraFields: extra, ... } } to populate
the VideoGenerationResponse.Usage field with the provided usage value (same
pattern used in the image/speech/transcription branches), so token-priced video
logs carry usage instead of defaulting to 0, and continue to let the caller
patch Seconds from VideoGenerationOutputParsed afterward.
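The fix reduces to threading the usage argument through the constructor instead of dropping it. A minimal sketch with stand-in types — the field names (Usage, Seconds) follow the review comment, but the struct shapes are illustrative, not the real schemas:

```go
package main

import "fmt"

// Minimal stand-ins for the real schema types.
type tokenUsage struct{ CompletionTokens int }

type videoResponse struct {
	Usage   *tokenUsage
	Seconds float64
}

// newSyntheticVideoResponse mirrors the fix: carry the usage argument
// into the synthetic response instead of discarding it, so token-priced
// video logs bill from tokens and only fall back to duration.
func newSyntheticVideoResponse(u *tokenUsage, seconds float64) *videoResponse {
	return &videoResponse{Usage: u, Seconds: seconds}
}

func main() {
	resp := newSyntheticVideoResponse(&tokenUsage{CompletionTokens: 4200}, 8)
	fmt.Println("usage preserved:", resp.Usage != nil)
}
```

With usage attached, the caller can still patch Seconds from VideoGenerationOutputParsed afterward, and the token-based branch of the video cost computation sees real counts.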

951-954: ⚠️ Potential issue | 🟠 Major

Don't bail out before the modality-specific patching runs.

This return still skips image, video, speech, and transcription logs that can be priced from the parsed modality outputs restored below (TranscriptionOutputParsed, ImageGenerationOutputParsed, VideoGenerationOutputParsed, SpeechOutputParsed). As written, RecalculateCosts keeps failing whenever token_usage is absent even though the later branches have enough billing data. Make this guard request-type aware, or move it after the synthetic response has been patched.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 951 - 954, The current early
return in RecalculateCosts (checking usage == nil && (cacheDebug == nil ||
!cacheDebug.CacheHit)) prevents modality-specific cost patching for
image/video/speech/transcription logs; update the guard to be request-type aware
or move it after the synthetic-response patching: either (a) only return an
error if token usage is missing AND the log has no parsed modality outputs
(TranscriptionOutputParsed, ImageGenerationOutputParsed,
VideoGenerationOutputParsed, SpeechOutputParsed) and no cache hit, or (b) move
this check to after the code that restores/patches synthetic responses so those
parsed outputs can supply billing info; reference the existing variables usage,
cacheDebug, logEntry.ID and the parsed fields named above when implementing the
fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@framework/configstore/tables/modelpricing.go`:
- Around line 45-49: The GORM column tag for fields like
OutputCostPerImageAbove1024x1024PixelsPremium (and
OutputCostPerImageAbove512x512Pixels / Above512x512PixelsPremium) is too long
for MySQL/MariaDB (exceeds 64 chars) and will break deployments; update the
`gorm:"column:..."` aliases for those struct fields
(OutputCostPerImagePremiumImage, OutputCostPerImageAbove512x512Pixels,
OutputCostPerImageAbove512x512PixelsPremium,
OutputCostPerImageAbove1024x1024Pixels,
OutputCostPerImageAbove1024x1024PixelsPremium) to shorter, unique column names
under 64 characters (e.g., remove redundant words or shorten "and" to
ampersand-like tokens) while leaving the `json:"..."` tags unchanged to preserve
the API contract.

In `@framework/modelcatalog/pricing.go`:
- Around line 186-190: The video branch parses Seconds but never sets token
usage, so computeVideoCost never sees prompt/completion tokens; update the block
that handles result.VideoGenerationResponse (the same place where
input.videoSeconds is set) to extract the video response usage and assign it to
input.usage before routing to computeVideoCost — e.g., read the usage structure
on result.VideoGenerationResponse (the field used elsewhere for other response
types) and set input.usage = &usage (or appropriate type) so computeVideoCost
and CalculateCost receive token usage for video responses.

---

Duplicate comments:
In `@core/providers/openai/openai.go`:
- Around line 4400-4401: The startTime is being set after client.Do(req, resp)
so measured latency omits upload and TTFB; move the startTime initialization to
just before making the request (before client.Do(req, resp)) and initialize
lastChunkTime from that same startTime, matching
HandleOpenAIImageGenerationStreaming behavior; then ensure the later
latency/reporting logic that writes the elapsed time (the block that logs/writes
latency around the response/chunk loop) uses this moved startTime so multipart
upload and TTFB are included in the reported metric.

In `@framework/modelcatalog/overrides_test.go`:
- Line 26: Multiple tests in overrides_test.go are being unconditionally skipped
via t.Skip(), removing regression coverage; remove those unconditional t.Skip()
calls (or replace them with a real conditional check) so CI runs the tests:
delete each plain t.Skip() occurrence or change it to something like if
!prereqSatisfied() { t.Skip("reason: missing prerequisite") } where
prereqSatisfied checks a real condition (env var, feature flag, or runtime
capability). Ensure you update or add a small helper (e.g., prereqSatisfied or
requireFeature) referenced by the test functions to detect the actual
prerequisite so tests only skip when appropriate and CI executes them otherwise.

In `@plugins/logging/operations.go`:
- Around line 1122-1128: The VideoGeneration branch currently discards the usage
argument; update the constructor that returns &schemas.BifrostResponse{
VideoGenerationResponse: &schemas.BifrostVideoGenerationResponse{ ExtraFields:
extra, ... } } to populate the VideoGenerationResponse.Usage field with the
provided usage value (same pattern used in the image/speech/transcription
branches), so token-priced video logs carry usage instead of defaulting to 0,
and continue to let the caller patch Seconds from VideoGenerationOutputParsed
afterward.
- Around line 951-954: The current early return in RecalculateCosts (checking
usage == nil && (cacheDebug == nil || !cacheDebug.CacheHit)) prevents
modality-specific cost patching for image/video/speech/transcription logs;
update the guard to be request-type aware or move it after the
synthetic-response patching: either (a) only return an error if token usage is
missing AND the log has no parsed modality outputs (TranscriptionOutputParsed,
ImageGenerationOutputParsed, VideoGenerationOutputParsed, SpeechOutputParsed)
and no cache hit, or (b) move this check to after the code that restores/patches
synthetic responses so those parsed outputs can supply billing info; reference
the existing variables usage, cacheDebug, logEntry.ID and the parsed fields
named above when implementing the fix.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 65e9041c-7f67-41e5-91ef-344d52d58c02

📥 Commits

Reviewing files that changed from the base of the PR and between 1e98928 and 0af7f9d.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • framework/modelcatalog/overrides.go
  • transports/bifrost-http/handlers/providers.go
🚧 Files skipped from review as they are similar to previous changes (13)
  • docs/contributing/setting-up-repo.mdx
  • framework/tracing/tracer.go
  • framework/configstore/migrations.go
  • plugins/logging/main.go
  • core/schemas/provider.go
  • core/schemas/images.go
  • plugins/governance/main.go
  • framework/modelcatalog/pricing_test.go
  • core/providers/gemini/gemini.go
  • core/providers/anthropic/anthropic.go
  • core/bifrost.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • framework/streaming/audio.go

Comment thread framework/configstore/tables/modelpricing.go Outdated
Comment thread framework/modelcatalog/pricing.go
@Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 0af7f9d to a29ad2f on March 12, 2026, 12:18
Contributor

@coderabbitai (bot) left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
plugins/logging/operations.go (1)

941-945: ⚠️ Potential issue | 🟠 Major

Deserialize the modality payloads before patching them back in.

The new restore logic below reads TranscriptionOutputParsed, ImageGenerationOutputParsed, VideoGenerationOutputParsed, and SpeechOutputParsed, but this gate only calls DeserializeFields() for token_usage and cache_debug. For image/video-only logs those parsed fields stay nil, so the pricing-critical data never reaches CalculateCost.

♻️ Suggested fix
-	if (logEntry.TokenUsageParsed == nil && logEntry.TokenUsage != "") ||
-		(logEntry.CacheDebugParsed == nil && logEntry.CacheDebug != "") {
+	if (logEntry.TokenUsageParsed == nil && logEntry.TokenUsage != "") ||
+		(logEntry.CacheDebugParsed == nil && logEntry.CacheDebug != "") ||
+		(logEntry.TranscriptionOutputParsed == nil && logEntry.TranscriptionOutput != "") ||
+		(logEntry.ImageGenerationOutputParsed == nil && logEntry.ImageGenerationOutput != "") ||
+		(logEntry.VideoGenerationOutputParsed == nil && logEntry.VideoGenerationOutput != "") ||
+		(logEntry.SpeechOutputParsed == nil && logEntry.SpeechOutput != "") {
 		if err := logEntry.DeserializeFields(); err != nil {
 			return 0, fmt.Errorf("failed to deserialize fields for log %s: %w", logEntry.ID, err)
 		}
 	}

Also applies to: 976-1012

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 941 - 945, The current
conditional only calls logEntry.DeserializeFields() when TokenUsageParsed or
CacheDebugParsed are missing, leaving TranscriptionOutputParsed,
ImageGenerationOutputParsed, VideoGenerationOutputParsed, and SpeechOutputParsed
nil for modality-only logs; update the gate around DeserializeFields() to also
check for those parsed fields (TranscriptionOutputParsed,
ImageGenerationOutputParsed, VideoGenerationOutputParsed, SpeechOutputParsed) OR
simply call logEntry.DeserializeFields() whenever any of the raw payload fields
(TokenUsage, CacheDebug, TranscriptionOutput, ImageGenerationOutput,
VideoGenerationOutput, SpeechOutput) are non-empty, so that all parsed modality
payloads are populated before passing the logEntry into CalculateCost.
core/providers/openai/openai.go (1)

3361-3378: ⚠️ Potential issue | 🟠 Major

Keep the image stream open until the last image completes.

This branch sets the stream-end flag and returns on the first completed event. For multi-image streaming requests, that closes the channel after the first image finishes, so later images never reach the caller and the new backfill/cost path runs on partial usage. The identical isCompleted/return pattern at Line 4598 fails the same way.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 3361 - 3378, The handler
currently sets the stream-end flag and immediately returns when isCompleted is
true, which closes the stream after the first finished image; instead, change
the logic in this block (referencing chunk, isCompleted, sendBackRawRequest,
ctx.SetValue, providerUtils.ProcessAndSendResponse) to only mark stream-end and
return when the final image has finished (e.g., detect finality via a
total-images count or a last-image flag on chunk.ExtraFields), otherwise
continue streaming subsequent images; specifically, remove the unconditional
return on isCompleted, add a check for "this is the last image" before setting
schemas.BifrostContextKeyStreamEndIndicator and returning, and ensure
providerUtils.ParseAndSetRawRequest and providerUtils.ProcessAndSendResponse
still run for intermediate completed chunks.
core/bifrost.go (2)

4972-4999: ⚠️ Potential issue | 🟠 Major

Backfill is only wired for sync responses right now.

This updates the non-streaming path only. handleProviderStreamRequest still forwards speech/image stream responses without applying the new request-derived backfill, so pricing/usage fields can now differ between sync and stream requests for the same model.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4972 - 4999, The streaming path in
handleProviderStreamRequest is missing the same BackfillParams calls you added
for the sync cases, so streamed speech/image responses never get request-derived
backfill (pricing/usage) applied; update handleProviderStreamRequest to call the
same BackfillParams on the streamed response wrappers (e.g., speechResponse,
imageResponse, imageEditResponse, imageVariationResponse) using the incoming
*req.BifrostRequest before forwarding chunks to the client, ensuring you invoke
the BackfillParams method at the point you construct or wrap each provider
stream response so streamed and non-streamed flows receive identical backfill
data.
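One way to keep the sync and stream paths identical is to route both through a single nil-safe backfill helper. The types below are illustrative stand-ins for the real request/response schemas:

```go
package main

import "fmt"

// Hypothetical request/response shapes; the real types are the
// schemas.* speech structs.
type speechRequest struct{ Voice string }
type speechResponse struct{ Voice string }

// BackfillParams copies request-derived fields the provider omitted.
func (r *speechResponse) BackfillParams(req *speechRequest) {
	if r.Voice == "" && req != nil {
		r.Voice = req.Voice
	}
}

// backfillSpeech guards against nil provider payloads before backfilling,
// so the sync handler and the stream wrapper can share one call site.
func backfillSpeech(resp *speechResponse, req *speechRequest) *speechResponse {
	if resp != nil {
		resp.BackfillParams(req)
	}
	return resp
}

func main() {
	fmt.Println(backfillSpeech(nil, &speechRequest{Voice: "alloy"}))
	fmt.Println(backfillSpeech(&speechResponse{}, &speechRequest{Voice: "alloy"}).Voice)
}
```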

4968-4999: ⚠️ Potential issue | 🔴 Critical

Guard BackfillParams against nil provider responses.

Line 4972, Line 4985, Line 4992, and Line 4999 call into the provider response before checking whether it is nil. The image request wrappers already expect handleRequest can surface a nil response object, so these new calls can panic instead of returning the intended BifrostError. The speech path now has the same regression.

Proposed fix
 	case schemas.SpeechRequest:
 		speechResponse, bifrostError := provider.Speech(req.Context, key, req.BifrostRequest.SpeechRequest)
 		if bifrostError != nil {
 			return nil, bifrostError
 		}
-		speechResponse.BackfillParams(req.BifrostRequest.SpeechRequest)
+		if speechResponse != nil {
+			speechResponse.BackfillParams(req.BifrostRequest.SpeechRequest)
+		}
 		response.SpeechResponse = speechResponse
@@
 	case schemas.ImageGenerationRequest:
 		imageResponse, bifrostError := provider.ImageGeneration(req.Context, key, req.BifrostRequest.ImageGenerationRequest)
 		if bifrostError != nil {
 			return nil, bifrostError
 		}
-		imageResponse.BackfillParams(&req.BifrostRequest)
+		if imageResponse != nil {
+			imageResponse.BackfillParams(&req.BifrostRequest)
+		}
 		response.ImageGenerationResponse = imageResponse
@@
 	case schemas.ImageEditRequest:
 		imageEditResponse, bifrostError := provider.ImageEdit(req.Context, key, req.BifrostRequest.ImageEditRequest)
 		if bifrostError != nil {
 			return nil, bifrostError
 		}
-		imageEditResponse.BackfillParams(&req.BifrostRequest)
+		if imageEditResponse != nil {
+			imageEditResponse.BackfillParams(&req.BifrostRequest)
+		}
 		response.ImageGenerationResponse = imageEditResponse
@@
 	case schemas.ImageVariationRequest:
 		imageVariationResponse, bifrostError := provider.ImageVariation(req.Context, key, req.BifrostRequest.ImageVariationRequest)
 		if bifrostError != nil {
 			return nil, bifrostError
 		}
-		imageVariationResponse.BackfillParams(&req.BifrostRequest)
+		if imageVariationResponse != nil {
+			imageVariationResponse.BackfillParams(&req.BifrostRequest)
+		}
 		response.ImageGenerationResponse = imageVariationResponse
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4968 - 4999, Several branches call
BackfillParams on provider responses without checking for nil (provider.Speech,
provider.Transcription branch uses transcriptionResponse,
provider.ImageGeneration -> imageResponse, provider.ImageEdit ->
imageEditResponse, provider.ImageVariation -> imageVariationResponse); this can
panic if the provider returned a nil response. After the existing bifrostError
check for each call, add a nil check (e.g., if speechResponse != nil {
speechResponse.BackfillParams(...) }) before invoking BackfillParams so you only
call it when the response is non-nil and preserve returning the existing
bifrostError/response behavior.
framework/modelcatalog/main.go (1)

302-323: 🛠️ Refactor suggestion | 🟠 Major

Use the shared fallback resolver in GetPricingEntryForModel.

This helper still reads mc.pricingData directly, so it skips the fallback chain now implemented in getPricing/resolvePricing (for example Gemini→Vertex, the Bedrock `anthropic.` prefix handling, and the Responses→Chat fallback). Callers can get nil here for models that CalculateCost can already price correctly.

♻️ Suggested fix
 func (mc *ModelCatalog) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *PricingEntry {
-	mc.mu.RLock()
-	defer mc.mu.RUnlock()
 	// Check all modes
 	for _, mode := range []schemas.RequestType{
 		schemas.TextCompletionRequest,
 		schemas.ChatCompletionRequest,
@@
 		schemas.ImageVariationRequest,
 		schemas.VideoGenerationRequest,
 	} {
-		key := makeKey(model, string(provider), normalizeRequestType(mode))
-		pricing, ok := mc.pricingData[key]
+		pricing, ok := mc.getPricing(model, string(provider), mode)
 		if ok {
-			return convertTableModelPricingToPricingData(&pricing)
+			return convertTableModelPricingToPricingData(pricing)
 		}
 	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 302 - 323,
GetPricingEntryForModel currently reads mc.pricingData directly and bypasses the
shared fallback resolver (getPricing/resolvePricing), causing misses for
fallbacks like Gemini→Vertex, Bedrock anthropic. prefix handling, and
Responses→Chat; update GetPricingEntryForModel to use the existing resolver
(call mc.getPricing or mc.resolvePricing with model, provider, and request type
via normalizeRequestType) for each mode instead of direct map lookup, then if a
pricing is returned convert it with convertTableModelPricingToPricingData and
return it; ensure callers like CalculateCost that rely on resolver behavior now
get the same results.
♻️ Duplicate comments (3)
plugins/logging/operations.go (2)

1122-1128: ⚠️ Potential issue | 🟠 Major

Add the streamed video request type to the video branch.

schemas.VideoGenerationStreamRequest still falls through to the default chat path here, so the later Seconds patch never reaches the video compute function and duration-based backfills remain wrong/zero for streamed video logs.

♻️ Suggested fix
-	case schemas.VideoGenerationRequest:
+	case schemas.VideoGenerationRequest, schemas.VideoGenerationStreamRequest:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 1122 - 1128, The video branch
handles schemas.VideoGenerationRequest but misses
schemas.VideoGenerationStreamRequest, causing streamed video requests to fall
through to the chat path and skip the later Seconds patch; update the
switch/case that returns a *schemas.BifrostResponse with a
VideoGenerationResponse (the block that currently references
schemas.VideoGenerationRequest and returns &schemas.BifrostResponse{
VideoGenerationResponse: &schemas.BifrostVideoGenerationResponse{ ExtraFields:
extra, } }) to also match schemas.VideoGenerationStreamRequest so streamed video
requests return the same VideoGenerationResponse shape and allow the later
Seconds/backfill from VideoGenerationOutputParsed to run.

951-953: ⚠️ Potential issue | 🟠 Major

Don't require token_usage for non-token request types.

This early return still skips image/video backfills before the request-type-specific branches can price from seconds or output metadata, so RecalculateCosts will keep missing valid non-cache logs whose billable unit is not tokens. Move this check after requestType resolution and only enforce it for request types that actually need token counts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 951 - 953, The early-return that
requires `usage` (token counts) blocks non-token request types from being
priced; move the block that checks `if usage == nil && (cacheDebug == nil ||
!cacheDebug.CacheHit) { return 0, fmt.Errorf("token usage not available for log
%s", logEntry.ID) }` so it runs after `requestType` is resolved inside
RecalculateCosts, and change the condition to only enforce token usage for
token-based request types (e.g., chat/completion/embedding) — for image/video or
other time/metadata-based types, skip this check and allow pricing logic that
uses seconds or output metadata to proceed. Ensure you reference `requestType`,
`usage`, `cacheDebug`, and `logEntry.ID` in the updated conditional so non-token
backfills are not prematurely returned.
framework/modelcatalog/pricing.go (1)

13-20: ⚠️ Potential issue | 🟠 Major

Priority pricing is still unreachable.

The normalized billing input still carries no priority/service-tier signal, so the downstream helpers never have a way to select InputCostPerTokenPriority, OutputCostPerTokenPriority, or CacheReadInputTokenCostPriority. Priority requests will continue to bill at the standard rates, which leaves this part of the new pricing surface dead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 13 - 20, The costInput struct
lacks a priority/service-tier field, so downstream pricing helpers cannot select
InputCostPerTokenPriority, OutputCostPerTokenPriority, or
CacheReadInputTokenCostPriority; add a priority (e.g., priority or serviceTier)
field to costInput and populate it wherever the normalized billing input is
constructed, then update any callers and pricing helpers that accept costInput
to read this new field and choose the corresponding
InputCostPerTokenPriority/OutputCostPerTokenPriority/CacheReadInputTokenCostPriority
rates accordingly (ensure symbols costInput, InputCostPerTokenPriority,
OutputCostPerTokenPriority, CacheReadInputTokenCostPriority are used to locate
changes).
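A minimal sketch of threading a service-tier signal through the normalized billing input; the struct shapes are illustrative subsets, and the "priority" tier string is an assumption:

```go
package main

import "fmt"

// pricingEntry is an illustrative subset; field names mirror the PR's
// schema but the struct itself is a sketch.
type pricingEntry struct {
	InputCostPerToken         float64
	InputCostPerTokenPriority *float64
}

// costInput gains a serviceTier field, populated wherever the
// normalized billing input is constructed from the request.
type costInput struct {
	inputTokens int
	serviceTier string
}

// inputRate prefers the priority rate when the request ran on the
// priority tier and a priority rate is configured, else falls back.
func inputRate(p pricingEntry, in costInput) float64 {
	if in.serviceTier == "priority" && p.InputCostPerTokenPriority != nil {
		return *p.InputCostPerTokenPriority
	}
	return p.InputCostPerToken
}

func main() {
	prio := 0.00002
	p := pricingEntry{InputCostPerToken: 0.00001, InputCostPerTokenPriority: &prio}
	fmt.Println(inputRate(p, costInput{serviceTier: "priority"}))
	fmt.Println(inputRate(p, costInput{serviceTier: "default"}))
}
```

The same selection pattern would apply to the output and cache-read priority rates.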
🧹 Nitpick comments (2)
ui/app/workspace/logs/sheets/logDetailsSheet.tsx (1)

341-341: Consider formatting the cost value with fixed decimal places.

The current implementation displays the raw cost value (e.g., $0.00012345678), which may be visually inconsistent. Consider rounding or formatting to a reasonable number of decimal places for better readability.

💡 Suggested formatting improvement
-<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost}` : "-"} />
+<LogEntryDetailsView className="w-full" label="Cost" value={log.cost != null ? `$${log.cost.toFixed(6)}` : "-"} />

Alternatively, for very small costs, you might want a smarter formatter that adapts precision based on magnitude.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/logs/sheets/logDetailsSheet.tsx` at line 341, The cost
display currently uses the raw value in LogEntryDetailsView (value={log.cost !=
null ? `$${log.cost}` : "-"}) which can produce long decimals; update the value
expression to format log.cost to a fixed number of decimal places (e.g., using
toFixed(2) or a helper formatter that adapts precision) before prefixing with
"$", ensuring null/undefined still renders "-" and update any helper name (e.g.,
formatCost) if you extract formatting logic for reuse.
framework/modelcatalog/pricing_test.go (1)

16-17: Prefer bifrost.Ptr() in these local pointer helpers.

If we keep local wrappers here, routing them through bifrost.Ptr() keeps the new fixtures aligned with the repo-wide pointer convention used elsewhere in Bifrost. Based on learnings: prefer using bifrost.Ptr() to create pointers instead of the address operator (&), including in test utilities.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Replace the
local pointer helper functions ptr and intPtr with calls to the repo-standard
bifrost.Ptr() helper: remove or stop using the functions ptr(v float64) *float64
and intPtr(v int) *int and instead create pointers in tests via bifrost.Ptr(v)
for float64 and bifrost.Ptr(v) for int to align with the repo convention; update
any test fixtures in pricing_test.go that call ptr(...) or intPtr(...) to call
bifrost.Ptr(...) and import or reference the bifrost package if not already
imported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/images.go`:
- Around line 99-105: The image-generation backfill currently only reads
Params.Size from req.ImageGenerationRequest and ignores Params.Resolution;
update the handling in the switch branch that checks req.ImageGenerationRequest
so it also checks for and propagates Params.Resolution (from
ImageGenerationParameters) into the response/dimension metadata alongside
Params.Size, ensuring requests that set Resolution but not Size still produce
correct dimension metadata used for pricing and cost-bucketing. Locate
references to req.ImageGenerationRequest, ImageGenerationParameters, Params.Size
and add analogous logic to read Params.Resolution and set the appropriate
response fields.

In `@framework/configstore/migrations.go`:
- Line 4149: The migration adds a column named
"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image" that does
not match the GORM mapping in TableModelPricing (field
OutputCostPerImageAbove1024x1024PixelsPremium which maps to
"output_cost_per_image_above_1024_and_1024_pixels_premium"); update the
migration (in migrations.go) to use the exact column identifier used by
TableModelPricing, or alternatively update TableModelPricing's column tag to the
new name so both sides use the same column name, and mirror the same name in the
rollback entry as well.

In `@framework/modelcatalog/pricing.go`:
- Around line 86-89: The current check in the cost computation ignores
provider-supplied zero costs by requiring input.usage.Cost.TotalCost > 0; update
the condition to treat any non-nil input.usage and input.usage.Cost as
authoritative and return input.usage.Cost.TotalCost regardless of its value.
Locate the early-return in the pricing computation that references input.usage
and input.usage.Cost (and the TotalCost field) and remove the > 0 gate so the
function returns the provider-supplied TotalCost even when it is zero.
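The zero-cost fix above amounts to keying the early return on presence rather than magnitude; the types below are illustrative stand-ins for the usage/cost structs:

```go
package main

import "fmt"

// Hypothetical stand-ins for the provider usage/cost payloads.
type cost struct{ TotalCost float64 }
type usage struct{ Cost *cost }

// providerCost treats any provider-supplied cost as authoritative,
// including an explicit zero, instead of gating on TotalCost > 0.
func providerCost(u *usage) (float64, bool) {
	if u != nil && u.Cost != nil {
		return u.Cost.TotalCost, true
	}
	return 0, false
}

func main() {
	c, ok := providerCost(&usage{Cost: &cost{TotalCost: 0}})
	fmt.Println(c, ok) // a free request stays free rather than falling through
}
```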

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8bd3ad8a-d796-4bd8-9283-ad593566628b

📥 Commits

Reviewing files that changed from the base of the PR and between 0af7f9d and a29ad2f.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • plugins/governance/main.go
  • framework/streaming/transcription.go
  • core/providers/azure/azure.go
  • framework/streaming/audio.go
  • framework/streaming/images.go
  • plugins/logging/main.go
  • framework/modelcatalog/overrides_test.go
  • core/schemas/provider.go
  • framework/tracing/tracer.go
  • core/providers/anthropic/anthropic.go

Comment thread core/schemas/images.go
Comment thread framework/configstore/migrations.go Outdated
Comment thread framework/modelcatalog/pricing.go
Contributor

@coderabbitai Bot left a comment
Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

4967-4999: ⚠️ Potential issue | 🟠 Major

Guard BackfillParams on nil provider payloads.

The image request wrappers already handle nil provider responses later, but these new calls run before that guard. A provider returning (nil, nil) will now panic here instead of surfacing the existing error path; SpeechRequest gets the same failure mode.

🩹 Suggested fix
 case schemas.SpeechRequest:
 	speechResponse, bifrostError := provider.Speech(req.Context, key, req.BifrostRequest.SpeechRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	speechResponse.BackfillParams(req.BifrostRequest.SpeechRequest)
+	if speechResponse != nil {
+		speechResponse.BackfillParams(req.BifrostRequest.SpeechRequest)
+	}
 	response.SpeechResponse = speechResponse
@@
 case schemas.ImageGenerationRequest:
 	imageResponse, bifrostError := provider.ImageGeneration(req.Context, key, req.BifrostRequest.ImageGenerationRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	imageResponse.BackfillParams(&req.BifrostRequest)
+	if imageResponse != nil {
+		imageResponse.BackfillParams(&req.BifrostRequest)
+	}
 	response.ImageGenerationResponse = imageResponse
@@
 case schemas.ImageEditRequest:
 	imageEditResponse, bifrostError := provider.ImageEdit(req.Context, key, req.BifrostRequest.ImageEditRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	imageEditResponse.BackfillParams(&req.BifrostRequest)
+	if imageEditResponse != nil {
+		imageEditResponse.BackfillParams(&req.BifrostRequest)
+	}
 	response.ImageGenerationResponse = imageEditResponse
@@
 case schemas.ImageVariationRequest:
 	imageVariationResponse, bifrostError := provider.ImageVariation(req.Context, key, req.BifrostRequest.ImageVariationRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	imageVariationResponse.BackfillParams(&req.BifrostRequest)
+	if imageVariationResponse != nil {
+		imageVariationResponse.BackfillParams(&req.BifrostRequest)
+	}
 	response.ImageGenerationResponse = imageVariationResponse
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4967 - 4999, The new BackfillParams calls
should be guarded against nil provider payloads to avoid panics when a provider
returns (nil, nil); update the handlers for Speech (provider.Speech ->
speechResponse), ImageGeneration (provider.ImageGeneration -> imageResponse),
ImageEdit (provider.ImageEdit -> imageEditResponse) and ImageVariation
(provider.ImageVariation -> imageVariationResponse) to check the returned
response is non-nil before invoking response.BackfillParams(...), and only call
BackfillParams when the response object exists, leaving the existing nil/err
handling path intact.
🧹 Nitpick comments (4)
core/providers/huggingface/huggingface.go (1)

1031-1339: Extract the shared fal-ai SSE pipeline into a private helper.

ImageGenerationStream and ImageEditStream are now near-clones. The transport setup, SSE loop, cancellation/error handling, raw request/response plumbing, and completion logic should live behind one private helper with small request-specific callbacks/params. That way a future fix cannot land in one path while silently missing the other.

Also applies to: 1435-1753

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` around lines 1031 - 1339,
ImageGenerationStream and ImageEditStream share nearly identical fal-ai SSE
transport, parsing, cancellation, error plumbing and completion logic; extract
that shared behavior into a private helper (e.g. handleFalAISSEStream or
streamFalAI) and have ImageGenerationStream and ImageEditStream call it with
small, request-specific callbacks/params. The helper should accept context,
postHookRunner, key, modelName, providerName, requestType
(schemas.ImageGenerationStreamRequest / schemas.ImageEditStreamRequest), the
prepared jsonBody, the fasthttp req/resp setup (or build them inside using the
passed modelName/headers), and three callbacks: 1) a parse callback to unmarshal
each jsonData into a typed response, 2) an item-processing callback invoked for
each parsed item to build and send partial chunks (maintaining chunkIndex, lastX
vars), and 3) a finalize callback to construct/send the completion chunk
(preserving raw request/response flags). Move cancellation
(SetupStreamCancellation), gzip decompression (DecompressStreamBody), SSE
reading (GetSSEDataReader), error branches (process provider errors, EOF, read
errors), response header extraction, and providerUtils
ReleaseStreamingResponse/close channel logic into the helper so both
ImageGenerationStream and ImageEditStream only provide the request-specific
parse/emit/finalize logic and call the helper to return the response channel and
error.
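The callback-based helper described above can be sketched as follows; all names (streamFalAI, sseCallbacks) and the event plumbing are illustrative, not the provider's actual API:

```go
package main

import "fmt"

// sseCallbacks carries the request-specific pieces of the shared fal-ai
// SSE pipeline; transport, cancellation, and error handling would live
// in the helper itself.
type sseCallbacks struct {
	parse    func(data []byte) (string, error) // unmarshal one SSE event
	emit     func(item string)                 // send a partial chunk
	finalize func()                            // send the completion chunk
}

// streamFalAI is a stand-in for the shared helper both
// ImageGenerationStream and ImageEditStream would delegate to.
func streamFalAI(events [][]byte, cb sseCallbacks) error {
	for _, ev := range events {
		item, err := cb.parse(ev)
		if err != nil {
			return err
		}
		cb.emit(item)
	}
	cb.finalize()
	return nil
}

func main() {
	events := [][]byte{[]byte("a"), []byte("b")}
	_ = streamFalAI(events, sseCallbacks{
		parse:    func(d []byte) (string, error) { return string(d), nil },
		emit:     func(s string) { fmt.Println("chunk:", s) },
		finalize: func() { fmt.Println("done") },
	})
}
```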
framework/modelcatalog/pricing_test.go (1)

16-17: Use the repository-standard pointer helper here.

This file introduces custom ptr/intPtr helpers and later still mixes in raw &value pointers. Please switch these tests to bifrost.Ptr() and drop the ad-hoc helpers so pointer creation stays consistent with the rest of the Go test utilities.

Based on learnings, "prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Replace the
ad-hoc pointer helpers ptr and intPtr with the repository-standard bifrost.Ptr
utility: remove the ptr and intPtr functions from this test and update every
use-site that calls ptr(...), intPtr(...), or uses raw &value pointer
expressions to instead call bifrost.Ptr(value) (using the generic bifrost.Ptr
for both float64 and int). Ensure all imports include the bifrost package if not
already present and remove the now-unused local helper functions (ptr, intPtr).
core/providers/openai/openai.go (1)

4400-4401: Capture startTime before dispatching the image-edit request.

startTime is initialized only after client.Do(req, resp) has already completed, so the “total latency” reported on Line 4610 excludes connect/TTFB time and under-reports end-to-end latency compared with the image-generation stream path.

🐛 Proposed fix
+	// Capture end-to-end latency before dispatching the request.
+	startTime := time.Now()
 	// Make the request
 	err := client.Do(req, resp)
-		startTime := time.Now()
-		lastChunkTime := time.Now()
+		lastChunkTime := startTime
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 4400 - 4401, Move the startTime
capture so it occurs immediately before dispatching the image-edit HTTP request
(i.e., before the call to client.Do(req, resp)); currently startTime is set
after client.Do which omits connect/TTFB from the reported latency. Initialize
startTime (and lastChunkTime if meant to represent the same baseline) right
before client.Do(req, resp) so the subsequent latency calculations use the true
request start time.
framework/modelcatalog/main.go (1)

139-152: Use encoding/json instead of sonic in custom UnmarshalJSON.

The coding guideline specifies: use sonic for hot-path marshaling, but use encoding/json for custom MarshalJSON/UnmarshalJSON implementations. Using sonic inside a custom unmarshaler can cause unexpected behavior when the caller uses a different JSON library.

♻️ Proposed fix
-	if err := sonic.Unmarshal(data, &raw); err != nil {
+	if err := json.Unmarshal(data, &raw); err != nil {
 		return err
 	}

After this change, the sonic import at line 13 can be removed if it's no longer used elsewhere in the file.

As per coding guidelines: "Use sonic (github.com/bytedance/sonic) for JSON marshaling in hot paths; use encoding/json for custom marshaling logic with MarshalJSON/UnmarshalJSON"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 139 - 152, Replace the use of
sonic.Unmarshal inside the custom UnmarshalJSON method for PricingEntry with
encoding/json's json.Unmarshal: in the PricingEntry.UnmarshalJSON function,
import and call json.Unmarshal(data, &raw) instead of sonic.Unmarshal to avoid
mixing JSON libraries in custom unmarshaling; after changing the call, remove
the sonic import if it is no longer referenced elsewhere in the file (keep
references to the type alias PricingEntryAlias and the raw struct intact).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@framework/modelcatalog/pricing_test.go`:
- Around line 1305-1328: The test currently doesn't exercise the cache-rate
branches because the BifrostLLMUsage passed to makeChatResponse lacks
PromptTokensDetails; update this test (or add a sibling test) to include prompt
token breakdown with cached read/write counts so CalculateCost hits
CacheReadInputTokenCostAbove200kTokens and
CacheCreationInputTokenCostAbove200kTokens — e.g., set resp's
PromptTokensDetails (or the equivalent field used by makeChatResponse) to
include CachedReadTokens and CachedCreationTokens values that push the
cached-token counts above the 200k threshold so the assertion validates the
tiered cache pricing paths.

In `@framework/modelcatalog/pricing.go`:
- Around line 654-662: tieredCacheCreationInputTokenRate currently ignores the
1-hour cache-write tier; change its signature to accept a TTL/flag (e.g.,
isOneHour bool or ttlHours int) and use it to consult
CacheCreationInputTokenCostAbove1hr and
CacheCreationInputTokenCostAbove1hrAbove200kTokens before falling back to the
existing Above200K and default CacheCreationInputTokenCost logic or
tieredInputRate; specifically, in tieredCacheCreationInputTokenRate check the
1-hour >200k field (CacheCreationInputTokenCostAbove1hrAbove200kTokens) then the
1-hour field (CacheCreationInputTokenCostAbove1hr) when isOneHour is true,
otherwise retain current behavior, and update callers to pass the TTL/flag
because costInput does not currently carry TTL information.

---

Outside diff comments:
In `@core/bifrost.go`:
- Around line 4967-4999: The new BackfillParams calls should be guarded against
nil provider payloads to avoid panics when a provider returns (nil, nil); update
the handlers for Speech (provider.Speech -> speechResponse), ImageGeneration
(provider.ImageGeneration -> imageResponse), ImageEdit (provider.ImageEdit ->
imageEditResponse) and ImageVariation (provider.ImageVariation ->
imageVariationResponse) to check the returned response is non-nil before
invoking response.BackfillParams(...), and only call BackfillParams when the
response object exists, leaving the existing nil/err handling path intact.

---

Nitpick comments:
In `@core/providers/huggingface/huggingface.go`:
- Around line 1031-1339: ImageGenerationStream and ImageEditStream share nearly
identical fal-ai SSE transport, parsing, cancellation, error plumbing and
completion logic; extract that shared behavior into a private helper (e.g.
handleFalAISSEStream or streamFalAI) and have ImageGenerationStream and
ImageEditStream call it with small, request-specific callbacks/params. The
helper should accept context, postHookRunner, key, modelName, providerName,
requestType (schemas.ImageGenerationStreamRequest /
schemas.ImageEditStreamRequest), the prepared jsonBody, the fasthttp req/resp
setup (or build them inside using the passed modelName/headers), and three
callbacks: 1) a parse callback to unmarshal each jsonData into a typed response,
2) an item-processing callback invoked for each parsed item to build and send
partial chunks (maintaining chunkIndex, lastX vars), and 3) a finalize callback
to construct/send the completion chunk (preserving raw request/response flags).
Move cancellation (SetupStreamCancellation), gzip decompression
(DecompressStreamBody), SSE reading (GetSSEDataReader), error branches (process
provider errors, EOF, read errors), response header extraction, and
providerUtils ReleaseStreamingResponse/close channel logic into the helper so
both ImageGenerationStream and ImageEditStream only provide the request-specific
parse/emit/finalize logic and call the helper to return the response channel and
error.

In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: Replace the ad-hoc pointer helpers ptr and intPtr with the
repository-standard bifrost.Ptr utility: remove the ptr and intPtr functions
from this test and update every use-site that calls ptr(...), intPtr(...), or
uses raw &value pointer expressions to instead call bifrost.Ptr(value) (using
the generic bifrost.Ptr for both float64 and int). Ensure all imports include
the bifrost package if not already present and remove the now-unused local
helper functions (ptr, intPtr).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3af8a3ca-7020-4622-8543-fafdc4caeade

📥 Commits

Reviewing files that changed from the base of the PR and between a29ad2f and 01fe6c5.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (13)
  • core/providers/azure/azure.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • framework/configstore/migrations.go
  • framework/streaming/images.go
  • framework/streaming/chat.go
  • framework/streaming/transcription.go
  • plugins/logging/main.go
  • framework/streaming/responses.go
  • docs/contributing/setting-up-repo.mdx
  • plugins/telemetry/main.go
  • core/providers/gemini/gemini.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx

Comment thread framework/modelcatalog/pricing_test.go
Comment thread framework/modelcatalog/pricing.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 5e6a6b8 to c017629 Compare March 12, 2026 13:20

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/bifrost.go (1)

4969-5001: ⚠️ Potential issue | 🔴 Critical

Guard BackfillParams when the provider returns a nil response.

These calls convert a nil, nil provider return into a worker panic. The image request paths in this same file already treat a nil response as a recoverable error later, so this should stay on the normal error path instead of crashing before handleRequest returns.

🐛 Proposed fix
 case schemas.SpeechRequest:
 	speechResponse, bifrostError := provider.Speech(req.Context, key, req.BifrostRequest.SpeechRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	speechResponse.BackfillParams(req.BifrostRequest.SpeechRequest)
+	if speechResponse != nil {
+		speechResponse.BackfillParams(req.BifrostRequest.SpeechRequest)
+	}
 	response.SpeechResponse = speechResponse
 ...
 case schemas.ImageGenerationRequest:
 	imageResponse, bifrostError := provider.ImageGeneration(req.Context, key, req.BifrostRequest.ImageGenerationRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	imageResponse.BackfillParams(&req.BifrostRequest)
+	if imageResponse != nil {
+		imageResponse.BackfillParams(&req.BifrostRequest)
+	}
 	response.ImageGenerationResponse = imageResponse
 ...
 case schemas.ImageEditRequest:
 	imageEditResponse, bifrostError := provider.ImageEdit(req.Context, key, req.BifrostRequest.ImageEditRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	imageEditResponse.BackfillParams(&req.BifrostRequest)
+	if imageEditResponse != nil {
+		imageEditResponse.BackfillParams(&req.BifrostRequest)
+	}
 	response.ImageGenerationResponse = imageEditResponse
 ...
 case schemas.ImageVariationRequest:
 	imageVariationResponse, bifrostError := provider.ImageVariation(req.Context, key, req.BifrostRequest.ImageVariationRequest)
 	if bifrostError != nil {
 		return nil, bifrostError
 	}
-	imageVariationResponse.BackfillParams(&req.BifrostRequest)
+	if imageVariationResponse != nil {
+		imageVariationResponse.BackfillParams(&req.BifrostRequest)
+	}
 	response.ImageGenerationResponse = imageVariationResponse
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/bifrost.go` around lines 4969 - 5001, The code calls BackfillParams on
provider responses without checking for nil, which panics when a provider
returns (nil, nil); update each case that calls BackfillParams (variables
speechResponse, imageResponse, imageEditResponse, imageVariationResponse
produced by provider.Speech, provider.ImageGeneration, provider.ImageEdit,
provider.ImageVariation) to only invoke response.BackfillParams(...) when the
response is non-nil (i.e., guard with a nil check) so nil responses follow the
existing recoverable error path instead of causing a crash.
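The panic happens because `BackfillParams` writes through its pointer receiver; a nil-receiver call compiles fine in Go and only fails at the dereference. A minimal sketch of the guard (the `speechResponse` type and `handle` function are illustrative, not the actual Bifrost types):

```go
package main

import "fmt"

// Hypothetical response type; BackfillParams dereferences the receiver,
// so calling it on a nil pointer would panic at runtime.
type speechResponse struct{ model string }

func (r *speechResponse) BackfillParams(model string) { r.model = model }

func handle(resp *speechResponse) {
	// Guard: a provider may legitimately return (nil, nil).
	if resp != nil {
		resp.BackfillParams("tts-1")
	}
}

func main() {
	handle(nil) // no panic thanks to the guard
	r := &speechResponse{}
	handle(r)
	fmt.Println(r.model) // tts-1
}
```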
core/providers/huggingface/huggingface.go (1)

1070-1106: ⚠️ Potential issue | 🟠 Major

Preserve context header override precedence in the inlined stream setup.

providerUtils.SetExtraHeaders() is documented to give schemas.BifrostContextKeyExtraHeaders priority, but this block calls it first and then unconditionally re-sets Authorization, Accept, Content-Type, and Cache-Control. Any caller trying to override those headers on this streaming path will silently lose the override.

Suggested fix
-	// Set any extra headers from network config
-	providerUtils.SetExtraHeaders(ctx, req, provider.networkConfig.ExtraHeaders, nil)
-
 	// Set headers
 	for key, value := range headers {
 		req.Header.Set(key, value)
 	}
+
+	// Apply network/context extra headers last so context overrides keep priority.
+	providerUtils.SetExtraHeaders(ctx, req, provider.networkConfig.ExtraHeaders, nil)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` around lines 1070 - 1106, The
current stream setup unconditionally resets headers after calling
providerUtils.SetExtraHeaders(), which violates the documented precedence
(schemas.BifrostContextKeyExtraHeaders should override defaults); modify the
sequence in the streaming path (around providerUtils.SetExtraHeaders, the local
headers map, and the req.Header.Set calls) so that you first apply the default
headers (Authorization, Accept, Content-Type, Cache-Control) into req only if
they are not already present, then call providerUtils.SetExtraHeaders(ctx, req,
provider.networkConfig.ExtraHeaders, nil) so context/network extra headers can
override them, and finally set any remaining headers from the local headers map
without overwriting existing header values.
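The precedence rule can be sketched with a set-if-absent helper for defaults, applying context-supplied extras last (plain maps here; the real code works on fasthttp request headers):

```go
package main

import "fmt"

// setIfAbsent applies a default only when the caller has not already set it,
// preserving override precedence for context-supplied headers.
func setIfAbsent(h map[string]string, k, v string) {
	if _, ok := h[k]; !ok {
		h[k] = v
	}
}

func main() {
	headers := map[string]string{}

	// 1. Defaults first, but only if absent.
	setIfAbsent(headers, "Accept", "text/event-stream")
	setIfAbsent(headers, "Cache-Control", "no-cache")

	// 2. Context/network extras last, so they win over defaults.
	extra := map[string]string{"Accept": "application/json"}
	for k, v := range extra {
		headers[k] = v
	}

	fmt.Println(headers["Accept"], headers["Cache-Control"])
}
```

Here the context override survives (`application/json`) while untouched defaults (`no-cache`) still apply, which is the documented behavior the current ordering breaks.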
♻️ Duplicate comments (4)
framework/modelcatalog/overrides_test.go (1)

26-26: ⚠️ Potential issue | 🟠 Major

Please re-enable the override-precedence tests.

These unconditional t.Skip() calls still disable the suite that exercises override matching and patch behavior, so this pricing refactor loses regression coverage on one of its most stack-sensitive paths. If a skip is temporarily necessary, make it conditional and document the prerequisite instead of turning the tests off unconditionally.

Also applies to: 38-38, 78-78, 111-111, 145-145, 172-172, 199-199, 231-231, 265-265, 298-298, 331-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/overrides_test.go` at line 26, Remove the
unconditional t.Skip() calls in overrides_test.go (the occurrences at lines
shown: the t.Skip() calls at 26, 38, 78, 111, 145, 172, 199, 231, 265, 298, 331)
and re-enable the override-precedence tests; if a temporary skip is required
make it conditional (e.g., check testing.Short() or a clearly named env/flag
like OVERRIDE_PRECEDENCE_TESTS_ENABLED) and add a short comment describing the
prerequisite/why it may be skipped so the suite remains covered and maintainers
can reproduce the condition. Ensure each test function that currently calls
t.Skip() uses the same conditional pattern and include the prerequisite note in
the test file header.
framework/modelcatalog/pricing_test.go (1)

1305-1328: ⚠️ Potential issue | 🟡 Minor

This case still doesn't exercise the >200k cache-rate branches.

CacheReadInputTokenCostAbove200kTokens and CacheCreationInputTokenCostAbove200kTokens are configured here, but the usage fixture has no PromptTokensDetails, so this only validates tiered base input/output pricing. Add cached read/write tokens here or in a sibling case so regressions in the 200k cache paths are actually covered.

♻️ One way to make this assertion hit the cache-tier logic
 	resp := makeChatResponse(schemas.Bedrock, "anthropic.claude-3-5-sonnet-20240620-v1:0", &schemas.BifrostLLMUsage{
 		PromptTokens:     190000,
 		CompletionTokens: 20000,
 		TotalTokens:      210000, // Above 200k
+		PromptTokensDetails: &schemas.ChatPromptTokensDetails{
+			CachedReadTokens:  50000,
+			CachedWriteTokens: 10000,
+		},
 	})

 	cost := mc.CalculateCost(resp)
-	// Tiered rate: input=0.000006, output=0.00003
-	// 190000*0.000006 + 20000*0.00003 = 1.14 + 0.6 = 1.74
-	assert.InDelta(t, 1.74, cost, 1e-9)
+	// Non-cached input: (190000-50000-10000)*0.000006 = 0.78
+	// Cached read: 50000*0.0000006 = 0.03
+	// Cached write: 10000*0.0000075 = 0.075
+	// Output: 20000*0.00003 = 0.6
+	// Total: 1.485
+	assert.InDelta(t, 1.485, cost, 1e-9)
core/providers/openai/openai.go (1)

4400-4401: ⚠️ Potential issue | 🟡 Minor

Start the image-edit latency timer before the provider call.

startTime is initialized only after the SSE response is already established, so the terminal chunk still excludes request dispatch and TTFB from the “total” latency. The image-generation stream path in this same file starts the timer before Do, and image edit should match that.

🐛 Proposed fix
-	// Make the request
-	err := client.Do(req, resp)
+	// Capture start time before making the HTTP request for end-to-end latency
+	startTime := time.Now()
+
+	// Make the request
+	err := client.Do(req, resp)
 	if err != nil {
 		defer providerUtils.ReleaseStreamingResponse(resp)
 		if errors.Is(err, context.Canceled) {
 			return nil, &schemas.BifrostError{
@@
-		startTime := time.Now()
-		lastChunkTime := time.Now()
+		lastChunkTime := startTime

Also applies to: 4610-4613

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 4400 - 4401, The image-edit
path initializes startTime and lastChunkTime after the SSE/stream has been
established, so move the startTime (and lastChunkTime if needed) initialization
to immediately before the provider HTTP request/Do call (the same place the
image-generation path does) so the measured "total" latency includes request
dispatch and TTFB; update the variables referenced in the image-edit stream
handler (startTime, lastChunkTime) accordingly and apply the same change in the
other occurrence around the code referenced at the second spot (the block at
4610-4613).
framework/modelcatalog/pricing.go (1)

11-20: ⚠️ Potential issue | 🟠 Major

Thread priority/service-tier through the pricing pipeline.

costInput still only carries usage counters and media metadata. Nothing in this file captures a priority/service-tier signal, and none of the tier helpers below accept one, so the new priority columns can never be selected. Priority requests will still be billed at standard rates.

At minimum, plumb a boolean/enum from the request/extra fields into costInput, then teach the text/cache helpers to switch between standard and priority rates off that signal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 11 - 20, Add a service-tier
signal to the pricing pipeline by adding a new field on costInput (e.g.,
ServiceTier string or Priority bool) and propagate it into all downstream
pricing helpers that currently accept a costInput (for example the text and
cache pricing helpers referenced below); update those helper signatures (e.g.,
computeTextCost, computeCacheCost or similar functions) to accept and branch on
the new ServiceTier/Priority value and select priority rates vs standard rates
accordingly, ensuring the boolean/enum is read from the request/extra fields and
passed when constructing costInput.
🧹 Nitpick comments (1)
framework/modelcatalog/pricing_test.go (1)

16-17: Use bifrost.Ptr() in these test helpers for consistency.

These helpers reintroduce &value in a repo that otherwise standardizes on bifrost.Ptr(), including test code paths.

Based on learnings, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even in test utilities.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Replace the
local test pointer helpers ptr and intPtr with calls to the standardized
bifrost.Ptr utility: update the functions ptr and intPtr to return
bifrost.Ptr(v) and bifrost.Ptr(v) (respectively, using the correct bifrost.Ptr
overloads/types) so tests use the repo-standard pointer constructor; ensure
imports include the bifrost package and remove direct uses of &value in these
helpers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@framework/modelcatalog/main.go`:
- Around line 139-170: The current PricingEntry.UnmarshalJSON unmarshals only the
tiered struct and fails on plain-number JSON; update PricingEntry.UnmarshalJSON
to first try decoding into the existing raw struct (as now) and if that
unmarshal fails or raw.SearchContextCostPerQuery is nil/empty, attempt a
fallback sonic.Unmarshal into a plain float64 variable and set
p.SearchContextCostPerQuery to its address; ensure you return the original error
only if both the tiered and plain-float unmarshal attempts fail, and keep all
existing logic that prefers Medium→Low→High when the tiered object is present.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d622707f-b3dd-4089-a045-468db84a101b

📥 Commits

Reviewing files that changed from the base of the PR and between 01fe6c5 and c017629.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • core/providers/azure/azure.go
  • plugins/logging/main.go
  • docs/contributing/setting-up-repo.mdx
  • framework/tracing/tracer.go
  • framework/streaming/audio.go
  • plugins/governance/main.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • core/schemas/provider.go
  • core/providers/anthropic/anthropic.go
  • framework/streaming/responses.go

Comment thread framework/modelcatalog/main.go
@TejasGhatte TejasGhatte mentioned this pull request Mar 12, 2026
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from c017629 to dbf2248 Compare March 12, 2026 17:20

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
framework/modelcatalog/main.go (1)

349-373: ⚠️ Potential issue | 🟠 Major

Route this helper through the unified pricing resolver.

This still walks a hardcoded mode order and returns the first exact-key hit from mc.pricingData. After this refactor, that can diverge from the unified cost path whenever a model/provider has multiple mode rows or only resolves via fallback, so callers can still get the wrong entry or nil. Please delegate to the same mode-aware lookup the pricing engine uses instead of maintaining a second resolution path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 349 - 373,
GetPricingEntryForModel currently iterates hardcoded request modes and looks up
mc.pricingData (using makeKey and convertTableModelPricingToPricingData), which
can diverge from the unified pricing engine; instead route this helper through
the central pricing resolver used by the pricing engine. Replace the manual loop
and direct mc.pricingData access with a call to the shared resolver on
ModelCatalog (e.g., mc.pricingResolver.Resolve or the existing method used by
the engine) while preserving the RLock/RUnlock, then convert/return the
resolver's result using convertTableModelPricingToPricingData (or return nil if
the resolver yields none).
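A sketch of the delegation shape (the `resolver` type, key format, and mode list are illustrative simplifications of the unified lookup, not the actual ModelCatalog API):

```go
package main

import "fmt"

// Hypothetical resolver shared with CalculateCost; mode order and fallback
// behavior live in one place so helpers cannot drift from the engine.
type resolver struct {
	data map[string]string // key "provider/model/mode" -> entry (simplified)
}

func (r *resolver) Resolve(provider, model string, modes []string) (string, bool) {
	for _, mode := range modes {
		if e, ok := r.data[provider+"/"+model+"/"+mode]; ok {
			return e, true
		}
	}
	return "", false
}

// GetPricingEntryForModel delegates instead of re-implementing the lookup.
func GetPricingEntryForModel(r *resolver, provider, model string) (string, bool) {
	return r.Resolve(provider, model, []string{"chat", "completion", "embedding"})
}

func main() {
	r := &resolver{data: map[string]string{"openai/gpt-4o/chat": "entry"}}
	e, ok := GetPricingEntryForModel(r, "openai", "gpt-4o")
	fmt.Println(e, ok) // entry true
}
```

With one resolver, a change to mode precedence or fallback order automatically applies to both the cost path and this helper.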
♻️ Duplicate comments (2)
plugins/logging/operations.go (1)

949-1012: ⚠️ Potential issue | 🟠 Major

Non-token logs can still never reach the new modality-specific pricing path.

The usage == nil guard on Line 952 still returns before the speech/transcription/image/video restore blocks below can help, and this function only guarantees deserialization of token/cache fields before reading ...OutputParsed. Logs billed from duration, image count, or character count can still be skipped even though the serialized response already contains the data needed to recalculate cost.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 949 - 1012, The early return when
usage is nil (the usage == nil && (cacheDebug == nil || !cacheDebug.CacheHit)
guard) prevents modality-specific restores from running for logs billed by
duration/characters/images; change the guard so we only return when there is
truly no way to compute cost — i.e. only return if usage is nil AND there's no
cache hit AND none of the modality parsed outputs exist on logEntry (check
logEntry.TranscriptionOutputParsed, logEntry.ImageGenerationOutputParsed,
logEntry.VideoGenerationOutputParsed, logEntry.SpeechOutputParsed); keep using
buildResponseForRequestType(requestType, usage, extraFields) with usage possibly
nil so the subsequent resp.* restore blocks can populate modality fields before
cost extraction.
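The relaxed guard can be sketched as a predicate that only short-circuits when no billing signal exists at all (field names are illustrative stand-ins for the parsed-output fields on the real log entry):

```go
package main

import "fmt"

// Simplified log entry; only the fields relevant to the guard are shown.
type logEntry struct {
	Usage                 *int // token usage; nil for duration/image-billed logs
	CacheHit              bool
	TranscriptionOutput   *string
	ImageGenerationOutput *string
}

// shouldSkip returns true only when there is truly no way to compute cost:
// no token usage, no cache hit, and no modality-specific parsed output.
func shouldSkip(e logEntry) bool {
	return e.Usage == nil &&
		!e.CacheHit &&
		e.TranscriptionOutput == nil &&
		e.ImageGenerationOutput == nil
}

func main() {
	img := "3 images"
	fmt.Println(shouldSkip(logEntry{}), shouldSkip(logEntry{ImageGenerationOutput: &img}))
}
```

Under the original `usage == nil` guard, the second case (image-billed log with no token usage) would have been skipped even though its cost is recomputable.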
framework/modelcatalog/pricing.go (1)

654-662: ⚠️ Potential issue | 🟠 Major

The 1-hour cache-write pricing path is still unreachable.

tieredCacheCreationInputTokenRate() only selects the default and above_200k cache-write columns. The new cache_creation_input_token_cost_above_1hr / cache_creation_input_token_cost_above_1hr_above_200k_tokens fields are never consulted, and nothing in costInput carries the cache TTL needed to choose them. Requests using the 1-hour prompt-cache tier will still bill at the standard cache-creation rate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 654 - 662,
tieredCacheCreationInputTokenRate currently only checks
CacheCreationInputTokenCost and CacheCreationInputTokenCostAbove200kTokens, so
the 1-hour cache-write rates are never used; update this function (or its
callers) to accept or read the cache TTL from costInput and branch to use
CacheCreationInputTokenCostAbove1hr and
CacheCreationInputTokenCostAbove1hrAbove200kTokens when the request TTL >= 1
hour, falling back to the existing above_200k and default fields otherwise;
ensure you reference and check the nullable fields
CacheCreationInputTokenCostAbove1hr and
CacheCreationInputTokenCostAbove1hrAbove200kTokens and prefer them when present
for TTL >= 1h, otherwise continue existing logic in
tieredCacheCreationInputTokenRate (or pass TTL through to tieredInputRate if
that’s the canonical path).
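A sketch of the TTL-aware selection order the comment describes, from most specific column to the default (the column fields mirror the names quoted above but the struct itself is illustrative):

```go
package main

import "fmt"

// Illustrative rate columns; nil means the datasheet has no such tier.
type pricingEntry struct {
	CacheCreationInputTokenCost                        float64
	CacheCreationInputTokenCostAbove200kTokens         *float64
	CacheCreationInputTokenCostAbove1hr                *float64
	CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64
}

const tierThreshold = 200_000

// cacheCreationRate checks the most specific tier first (1hr + >200k),
// then 1hr, then >200k, then falls back to the default column.
func cacheCreationRate(p pricingEntry, totalTokens int, isOneHour bool) float64 {
	above200k := totalTokens > tierThreshold
	if isOneHour {
		if above200k && p.CacheCreationInputTokenCostAbove1hrAbove200kTokens != nil {
			return *p.CacheCreationInputTokenCostAbove1hrAbove200kTokens
		}
		if p.CacheCreationInputTokenCostAbove1hr != nil {
			return *p.CacheCreationInputTokenCostAbove1hr
		}
	}
	if above200k && p.CacheCreationInputTokenCostAbove200kTokens != nil {
		return *p.CacheCreationInputTokenCostAbove200kTokens
	}
	return p.CacheCreationInputTokenCost
}

func f(v float64) *float64 { return &v }

func main() {
	p := pricingEntry{
		CacheCreationInputTokenCost:         0.00000375,
		CacheCreationInputTokenCostAbove1hr: f(0.000006),
	}
	fmt.Println(cacheCreationRate(p, 100_000, false), cacheCreationRate(p, 100_000, true))
}
```

The `isOneHour` flag is exactly the signal `costInput` would need to carry; without it, the two 1hr columns remain dead code.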
🧹 Nitpick comments (3)
framework/configstore/migrations.go (1)

4139-4166: Extract the pricing column list once.

The same 27-column slice is duplicated in both migrate and rollback. Hoisting it into one shared list will reduce drift as this schema keeps expanding across framework/configstore/tables/modelpricing.go and framework/modelcatalog/utils.go.

Also applies to: 4181-4208

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 4139 - 4166, The pricing
column slice is duplicated in the migrate and rollback blocks in migrations.go;
extract that 27-item slice into a single shared variable (e.g., pricingColumns
or modelPricingColumns) declared once at the top of the file (or package scope)
and replace the inline slices in both the migrate and rollback code paths with a
reference to that variable; ensure the variable name matches usage in both
blocks and keep the same column strings so schema behavior is unchanged, and
consider adding a short comment pointing to
framework/configstore/tables/modelpricing.go and framework/modelcatalog/utils.go
for future synchronization.
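A minimal sketch of the hoisting, with illustrative column names (the real migration carries the full 27-column list):

```go
package main

import "fmt"

// modelPricingColumns is the single source of truth for the pricing columns
// this migration touches. Keep in sync with
// framework/configstore/tables/modelpricing.go and framework/modelcatalog/utils.go.
var modelPricingColumns = []string{
	"input_cost_per_token",
	"output_cost_per_token",
	"cache_read_input_token_cost",
	// ... remaining columns ...
}

// migrateUp and migrateDown both range over the shared slice, so a new
// pricing field can only be added in one place.
func migrateUp() []string {
	stmts := make([]string, 0, len(modelPricingColumns))
	for _, col := range modelPricingColumns {
		stmts = append(stmts, fmt.Sprintf("ALTER TABLE model_pricing ADD COLUMN %s DECIMAL", col))
	}
	return stmts
}

func migrateDown() []string {
	stmts := make([]string, 0, len(modelPricingColumns))
	for _, col := range modelPricingColumns {
		stmts = append(stmts, fmt.Sprintf("ALTER TABLE model_pricing DROP COLUMN %s", col))
	}
	return stmts
}

func main() {
	fmt.Println(len(migrateUp()) == len(migrateDown()))
}
```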
framework/modelcatalog/utils.go (1)

92-153: Add a round-trip guard for the expanded pricing mapping.

These two converters now hand-maintain the same pricing surface in opposite directions. A table-driven round-trip test or a shared mapping helper would make it much less likely that a future field addition silently disappears on DB round-trips.

Also applies to: 157-218

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/utils.go` around lines 92 - 153, The pricing struct
mapping is duplicated in opposite converters causing silent drift; add a shared
mapping helper and a round-trip unit test to prevent regressions. Refactor the
two places that build/parse configstoreTables.TableModelPricing (the converter
that constructs TableModelPricing from entry and its inverse) into a single pair
of functions (e.g., ToTableModelPricing(entry) and FromTableModelPricing(table))
and replace the inline mappings with calls to those helpers; then add a
table-driven round-trip test that iterates all pricing fields to ensure
FromTableModelPricing(ToTableModelPricing(entry)) equals entry (also apply the same
change to the other mapping region referenced around lines 157-218).
framework/modelcatalog/main.go (1)

63-64: Align the search_context_cost_per_query comments with the real wire format.

The docs here still say this field may be a plain float and that the struct mirrors the datasheet exactly, but this type actually collapses the tiered datasheet object into a single *float64. Please tighten the comments so they describe the tiered-only input and the collapse explicitly.

📝 Suggested wording
-// Field names and JSON tags match the datasheet schema exactly.
+// Field names and JSON tags mostly match the datasheet schema directly.
+// search_context_cost_per_query is decoded from the tiered datasheet object
+// and collapsed to a single value on read.
@@
-// It handles the special case where search_context_cost_per_query may arrive as either
-// a plain float64 or a tiered object {"search_context_size_low":...,
-// "search_context_size_medium":..., "search_context_size_high":...}.
+// It handles the tiered search_context_cost_per_query datasheet object
+// {"search_context_size_low":..., "search_context_size_medium":...,
+// "search_context_size_high":...} and collapses it to one float64.

Based on learnings, search_context_cost_per_query is always represented as a tiered object {"search_context_size_low": ..., "search_context_size_medium": ..., "search_context_size_high": ...} and will never be a plain float64.

Also applies to: 131-138

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 63 - 64, Update the comment on
the PricingEntry struct field search_context_cost_per_query to reflect the
actual wire format and the collapse behavior: note that the input is always a
tiered object (keys like "search_context_size_low",
"search_context_size_medium", "search_context_size_high") rather than a plain
float, and document that the code collapses that tiered object into a single
*float64; apply the same clarified comment wording to the other occurrence
referenced (lines covering the alternate declaration around the 131-138 region)
so both spots describe tiered-only input and the collapse explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/openai/openai.go`:
- Around line 4400-4401: The startTime is being set after stream setup so
measured latency omits request/handshake time; move the initialization of
startTime (and keep lastChunkTime) to before the outbound HTTP request is
performed (i.e., before calling client.Do(req, resp) and before spawning the
goroutine that processes chunks) in the image-edit/stream path, and apply the
same change to the other occurrence around the lines referencing
startTime/lastChunkTime (the second block at ~4610-4613); ensure existing code
still uses lastChunkTime for per-chunk deltas and that time.Since(startTime)
then reflects total end-to-end latency from just before the request.

In `@framework/configstore/tables/modelpricing.go`:
- Line 7: The struct field BaseModel in modelpricing.go must be made nullable in
Go by changing its type from string to *string so SQL NULL is preserved (retain
the gorm tag `type:varchar(255);default:null` and the
`json:"base_model,omitempty"` tag); update any code that constructs, reads or
compares ModelPricing.BaseModel (e.g., initializers, setters, equality checks,
DB scan/assignment sites) to handle nil pointers (use nil checks or helper
functions) so the unset vs empty-string distinction is preserved on round-trip.
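A sketch of the nullable field plus a nil-safe accessor; `baseModelOrDefault` is a hypothetical helper, not an existing symbol:

```go
package main

import "fmt"

// ModelPricing with BaseModel made nullable so SQL NULL round-trips as nil
// rather than being collapsed into the empty string.
type ModelPricing struct {
	BaseModel *string `gorm:"type:varchar(255);default:null" json:"base_model,omitempty"`
}

// baseModelOrDefault is a hypothetical nil-safe accessor for call sites that
// previously compared the plain string value.
func baseModelOrDefault(p *ModelPricing, def string) string {
	if p.BaseModel == nil {
		return def
	}
	return *p.BaseModel
}

func main() {
	unset := &ModelPricing{} // SQL NULL scans to nil
	name := "gpt-4o"
	set := &ModelPricing{BaseModel: &name} // explicit value
	fmt.Println(baseModelOrDefault(unset, "(none)"))
	fmt.Println(baseModelOrDefault(set, "(none)"))
}
```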

In `@plugins/logging/operations.go`:
- Around line 1122-1129: The switch currently handles
schemas.VideoGenerationRequest but not VideoRemix, so VideoRemix responses fall
through to the chat/default branch and lose Seconds and pricing path; add a
branch that treats VideoRemix like the video branch (return
&schemas.BifrostResponse{VideoGenerationResponse:
&schemas.BifrostVideoGenerationResponse{ExtraFields: extra}}) so VideoRemix uses
BifrostVideoGenerationResponse; ensure you reference the VideoRemix symbol and
preserve the existing comment about Seconds being patched from
VideoGenerationOutputParsed by the caller.
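The shape of the fix, reduced to a self-contained sketch with stand-in request-type constants rather than the real `schemas` types:

```go
package main

import "fmt"

type requestType string

const (
	videoGenerationRequest requestType = "video_generation"
	videoRemixRequest      requestType = "video_remix"
	chatCompletionRequest  requestType = "chat_completion"
)

// buildResponseKind sketches the fix: both video request types route to the
// video response shape, so VideoRemix no longer falls through to the chat
// branch and loses Seconds and the video pricing path.
func buildResponseKind(rt requestType) string {
	switch rt {
	case videoGenerationRequest, videoRemixRequest:
		// Seconds is patched from VideoGenerationOutputParsed by the caller.
		return "video"
	default:
		return "chat"
	}
}

func main() {
	fmt.Println(buildResponseKind(videoRemixRequest))
	fmt.Println(buildResponseKind(chatCompletionRequest))
}
```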

---

Outside diff comments:
In `@framework/modelcatalog/main.go`:
- Around line 349-373: GetPricingEntryForModel currently iterates hardcoded
request modes and looks up mc.pricingData (using makeKey and
convertTableModelPricingToPricingData), which can diverge from the unified
pricing engine; instead route this helper through the central pricing resolver
used by the pricing engine. Replace the manual loop and direct mc.pricingData
access with a call to the shared resolver on ModelCatalog (e.g.,
mc.pricingResolver.Resolve or the existing method used by the engine) while
preserving the RLock/RUnlock, then convert/return the resolver's result using
convertTableModelPricingToPricingData (or return nil if the resolver yields
none).

---

Duplicate comments:
In `@framework/modelcatalog/pricing.go`:
- Around line 654-662: tieredCacheCreationInputTokenRate currently only checks
CacheCreationInputTokenCost and CacheCreationInputTokenCostAbove200kTokens, so
the 1-hour cache-write rates are never used; update this function (or its
callers) to accept or read the cache TTL from costInput and branch to use
CacheCreationInputTokenCostAbove1hr and
CacheCreationInputTokenCostAbove1hrAbove200kTokens when the request TTL >= 1
hour, falling back to the existing above_200k and default fields otherwise;
ensure you reference and check the nullable fields
CacheCreationInputTokenCostAbove1hr and
CacheCreationInputTokenCostAbove1hrAbove200kTokens and prefer them when present
for TTL >= 1h, otherwise continue existing logic in
tieredCacheCreationInputTokenRate (or pass TTL through to tieredInputRate if
that’s the canonical path).

In `@plugins/logging/operations.go`:
- Around line 949-1012: The early return when usage is nil (the usage == nil &&
(cacheDebug == nil || !cacheDebug.CacheHit) guard) prevents modality-specific
restores from running for logs billed by duration/characters/images; change the
guard so we only return when there is truly no way to compute cost — i.e. only
return if usage is nil AND there's no cache hit AND none of the modality parsed
outputs exist on logEntry (check logEntry.TranscriptionOutputParsed,
logEntry.ImageGenerationOutputParsed, logEntry.VideoGenerationOutputParsed,
logEntry.SpeechOutputParsed); keep using
buildResponseForRequestType(requestType, usage, extraFields) with usage possibly
nil so the subsequent resp.* restore blocks can populate modality fields before
cost extraction.

---

Nitpick comments:
In `@framework/configstore/migrations.go`:
- Around line 4139-4166: The pricing column slice is duplicated in the migrate
and rollback blocks in migrations.go; extract that 27-item slice into a single
shared variable (e.g., pricingColumns or modelPricingColumns) declared once at
the top of the file (or package scope) and replace the inline slices in both the
migrate and rollback code paths with a reference to that variable; ensure the
variable name matches usage in both blocks and keep the same column strings so
schema behavior is unchanged, and consider adding a short comment pointing to
framework/configstore/tables/modelpricing.go and framework/modelcatalog/utils.go
for future synchronization.

In `@framework/modelcatalog/main.go`:
- Around line 63-64: Update the comment on the PricingEntry struct field
search_context_cost_per_query to reflect the actual wire format and the collapse
behavior: note that the input is always a tiered object (keys like
"search_context_size_low", "search_context_size_medium",
"search_context_size_high") rather than a plain float, and document that the
code collapses that tiered object into a single *float64; apply the same
clarified comment wording to the other occurrence referenced (lines covering the
alternate declaration around the 131-138 region) so both spots describe
tiered-only input and the collapse explicitly.

In `@framework/modelcatalog/utils.go`:
- Around line 92-153: The pricing struct mapping is duplicated in opposite
converters causing silent drift; add a shared mapping helper and a round-trip
unit test to prevent regressions. Refactor the two places that build/parse
configstoreTables.TableModelPricing (the converter that constructs
TableModelPricing from entry and its inverse) into a single pair of functions
(e.g., ToTableModelPricing(entry) and FromTableModelPricing(table)) and replace
the inline mappings with calls to those helpers; then add a table-driven
round-trip test that iterates all pricing fields to ensure
FromTableModelPricing(ToTableModelPricing(entry)) equals entry (also apply the same
change to the other mapping region referenced around lines 157-218).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 01c9f873-6f93-4f40-89f5-3e959154a450

📥 Commits

Reviewing files that changed from the base of the PR and between c017629 and dbf2248.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (13)
  • plugins/logging/main.go
  • plugins/telemetry/main.go
  • framework/streaming/audio.go
  • framework/modelcatalog/overrides_test.go
  • framework/streaming/transcription.go
  • core/bifrost.go
  • core/providers/azure/azure.go
  • framework/streaming/images.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • framework/tracing/tracer.go
  • framework/modelcatalog/pricing_test.go
  • docs/contributing/setting-up-repo.mdx
  • core/providers/anthropic/anthropic.go

Comment thread core/providers/openai/openai.go Outdated
Comment thread framework/configstore/tables/modelpricing.go
Comment thread plugins/logging/operations.go Outdated
@TejasGhatte TejasGhatte force-pushed the 02-26-refactor_pricing_module_refactor branch from dbf2248 to c017629 Compare March 12, 2026 17:55
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from c017629 to 76d7e6d Compare March 12, 2026 20:24
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/huggingface/huggingface.go (1)

1112-1116: ⚠️ Potential issue | 🟡 Minor

Move startTime to after the stream handshake in both handlers.

Both ImageGenerationStream and ImageEditStream stamp startTime before provider.client.Do, so the first-chunk and final latencies include connect/headers time instead of just streaming time. That makes these metrics inconsistent with the rest of the stack’s streaming latency semantics.

Initialize startTime inside the goroutine after the body stream/reader is ready, then seed lastChunkTime from that value.

Based on learnings: "In Go streaming handlers under core/providers (e.g., core/providers/openai/openai.go), initialize the streaming startTime inside the streaming goroutine after client.Do and the stream is set up, so the final-chunk latency reflects post-handshake streaming time. Compute per-chunk deltas using lastChunkTime."

Also applies to: 1185-1186, 1303-1312, 1525-1529, 1598-1599, 1716-1725

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` around lines 1112 - 1116, Move the
streaming startTime initialization out of the pre-request section and into the
goroutine after provider.client.Do and after the response body/reader is
successfully established in both ImageGenerationStream and ImageEditStream (and
the other streaming handlers referenced); set startTime := time.Now() there,
initialize lastChunkTime = startTime, and use lastChunkTime to compute per-chunk
deltas and the final-chunk latency so metrics reflect only post-handshake
streaming time.
docs/architecture/framework/model-catalog.mdx (1)

67-72: ⚠️ Potential issue | 🟡 Minor

Use pointers in these Config examples.

modelcatalog.Config.PricingURL is *string and PricingSyncInterval is *time.Duration. The examples at lines 67-72 and 309-314 assign non-pointer values directly, which will not compile.

Fix
+pricingURL := "https://my-custom-url.com/pricing.json"
 config := &modelcatalog.Config{
-    PricingURL: "https://my-custom-url.com/pricing.json",
+    PricingURL: &pricingURL,
 }
 ...
+syncInterval := 12 * time.Hour
 newConfig := &modelcatalog.Config{
-    PricingSyncInterval: 12 * time.Hour,
+    PricingSyncInterval: &syncInterval,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/framework/model-catalog.mdx` around lines 67 - 72, The
examples create a modelcatalog.Config with non-pointer fields but PricingURL and
PricingSyncInterval are declared as *string and *time.Duration; update the
examples that construct modelcatalog.Config so they provide pointers (e.g.,
create local variables for the string and duration and take their addresses, or
use a small helper like ptr.String/ptr.Duration) and pass those pointer values
into PricingURL and PricingSyncInterval before calling modelcatalog.Init; ensure
you update both example blocks that build modelcatalog.Config so they compile.
framework/modelcatalog/main.go (1)

340-360: ⚠️ Potential issue | 🟠 Major

Route GetPricingEntryForModel through getPricing.

This helper still does direct map lookups, so it can return nil for model/provider pairs that CalculateCost can resolve via the new fallback path in framework/modelcatalog/pricing.go (for example provider-specific aliases and other lookup retries). That leaves the public catalog API out of sync with the actual billing engine.

Suggested change
 func (mc *ModelCatalog) GetPricingEntryForModel(model string, provider schemas.ModelProvider) *PricingEntry {
-	mc.mu.RLock()
-	defer mc.mu.RUnlock()
 	// Check all modes
 	for _, mode := range []schemas.RequestType{
 		schemas.TextCompletionRequest,
 		schemas.ChatCompletionRequest,
 		schemas.ResponsesRequest,
@@
 		schemas.ImageEditRequest,
 		schemas.ImageVariationRequest,
 		schemas.VideoGenerationRequest,
 	} {
-		key := makeKey(model, string(provider), normalizeRequestType(mode))
-		pricing, ok := mc.pricingData[key]
+		pricing, ok := mc.getPricing(model, string(provider), mode)
 		if ok {
-			return convertTableModelPricingToPricingData(&pricing)
+			return convertTableModelPricingToPricingData(pricing)
 		}
 	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/main.go` around lines 340 - 360, Replace the direct
map lookups in ModelCatalog.GetPricingEntryForModel with calls to the internal
lookup that applies the billing fallbacks: for each mode in the loop call
mc.getPricing(model, provider, normalizeRequestType(mode)) (or
mc.getPricing(model, provider, mode) if getPricing expects a RequestType)
instead of building makeKey and reading mc.pricingData; if getPricing returns a
non-nil pricing result, return convertTableModelPricingToPricingData(...) for
that result. Ensure you reference the existing symbols GetPricingEntryForModel,
getPricing, convertTableModelPricingToPricingData, makeKey/pricingData only to
remove the direct map usage so the public API matches CalculateCost's fallback
behavior.
♻️ Duplicate comments (3)
framework/modelcatalog/pricing_test.go (1)

1305-1328: ⚠️ Potential issue | 🟡 Minor

This still doesn't exercise the >200k cache-rate path.

The fixture sets CacheReadInputTokenCostAbove200kTokens and CacheCreationInputTokenCostAbove200kTokens, but usage has no PromptTokensDetails, so this assertion only proves tiered base input/output pricing. Add cached read/write tokens here or in a sibling case so the new cache-tier logic is actually covered.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 1305 - 1328, The test
doesn't exercise the cache-tier (>200k) pricing because the BifrostLLMUsage has
no PromptTokensDetails; update the makeChatResponse call (or a sibling test) to
populate PromptTokensDetails with cached token counts that push cached
reads/writes over the 200k threshold so the code paths using
CacheReadInputTokenCostAbove200kTokens and
CacheCreationInputTokenCostAbove200kTokens are executed (e.g., set
PromptTokensDetails.CachedReadTokens and/or CachedWriteTokens to values summing
>200000), then assert expected cost using CalculateCost.
framework/modelcatalog/pricing.go (2)

13-20: ⚠️ Potential issue | 🟠 Major

The 1-hour cache-write rates still can't be selected.

CacheCreationInputTokenCostAbove1hr and CacheCreationInputTokenCostAbove1hrAbove200kTokens are persisted in the schema, but no TTL/retention choice is extracted into costInput, and tieredCacheCreationInputTokenRate never checks those fields. Requests using the 1-hour prompt-cache tier will still be charged with the default cache-creation rate.

Also applies to: 281-282, 654-661

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 13 - 20, The cost logic never
reads the 1-hour cache-tier selection because the TTL/retention choice isn’t
carried in costInput and tieredCacheCreationInputTokenRate doesn’t consult the
1-hour fields; add a field to the costInput struct (e.g., cacheRetention or
cacheTTL enum/boolean) to represent the selected cache tier, populate that new
field where costInput is constructed, and update
tieredCacheCreationInputTokenRate to check CacheCreationInputTokenCostAbove1hr
and CacheCreationInputTokenCostAbove1hrAbove200kTokens when the new
cacheRetention indicates the 1-hour tier so the correct 1-hour rates are
returned.

13-20: ⚠️ Potential issue | 🟠 Major

Priority pricing columns are still unreachable.

The runtime path never carries a priority/service-tier signal, and the billing helpers only read the standard token/cache rates. InputCostPerTokenPriority, OutputCostPerTokenPriority, and CacheReadInputTokenCostPriority are therefore dead columns right now, so priority requests still bill at standard prices.

Also applies to: 279-282, 644-651

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 13 - 20, The
priority/service-tier signal is never threaded into pricing calculations, so the
priority price columns (InputCostPerTokenPriority, OutputCostPerTokenPriority,
CacheReadInputTokenCostPriority) are never used; add a priority field (e.g.
priority or serviceTier as an enum/string) to the costInput struct and ensure
every call site that builds costInput in the runtime path populates it, then
update the billing helper functions that compute per-token/cache costs to
consult costInput.priority and select the priority rates instead of always using
standard rates (modify the pricing lookup logic where InputCostPerTokenPriority,
OutputCostPerTokenPriority, and CacheReadInputTokenCostPriority are read).
Ensure the priority field is propagated through any helper functions that accept
costInput so the priority-aware rates are applied end-to-end.
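A sketch of threading the tier through `costInput`; the `serviceTier` field and the `"priority"` sentinel are assumptions, and the structs are trimmed stand-ins:

```go
package main

import "fmt"

type textPricing struct {
	InputCostPerToken          float64
	OutputCostPerToken         float64
	InputCostPerTokenPriority  *float64
	OutputCostPerTokenPriority *float64
}

type costInput struct {
	promptTokens     int
	completionTokens int
	serviceTier      string // hypothetical new field carrying the tier signal
}

// computeTextCost consults the priority columns when the request carries the
// priority tier, falling back to standard rates when a column is nil.
func computeTextCost(p *textPricing, in costInput) float64 {
	inRate, outRate := p.InputCostPerToken, p.OutputCostPerToken
	if in.serviceTier == "priority" {
		if p.InputCostPerTokenPriority != nil {
			inRate = *p.InputCostPerTokenPriority
		}
		if p.OutputCostPerTokenPriority != nil {
			outRate = *p.OutputCostPerTokenPriority
		}
	}
	return float64(in.promptTokens)*inRate + float64(in.completionTokens)*outRate
}

func f(v float64) *float64 { return &v }

func main() {
	p := &textPricing{
		InputCostPerToken: 1e-6, OutputCostPerToken: 2e-6,
		InputCostPerTokenPriority: f(2e-6), OutputCostPerTokenPriority: f(4e-6),
	}
	fmt.Println(computeTextCost(p, costInput{1000, 500, ""}))         // standard
	fmt.Println(computeTextCost(p, costInput{1000, 500, "priority"})) // priority
}
```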
🧹 Nitpick comments (3)
framework/modelcatalog/pricing_test.go (2)

16-17: Prefer bifrost.Ptr() over local &v helpers.

Reintroducing ptr and intPtr here works, but it spreads another local pointer helper through the test file and diverges from the repo-wide convention.

Based on learnings, "prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, Remove the
local helper functions ptr and intPtr and replace their call sites to use the
repo-standard bifrost.Ptr() helper (e.g., use bifrost.Ptr(1.23) for float64 and
bifrost.Ptr(42) for int) so the test follows the global convention; update any
references to ptr(...) and intPtr(...) in this file (pricing_test.go) to
bifrost.Ptr(...) and delete the ptr and intPtr function declarations.

100-225: Add a priority-pricing case to the text-cost suite.

These tests cover base, cache, tiered, and search-query paths, but nothing hits InputCostPerTokenPriority, OutputCostPerTokenPriority, or CacheReadInputTokenCostPriority. A small focused case here would keep the new priority branch from regressing silently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 100 - 225, Add a focused
unit test (e.g., TestComputeTextCost_PriorityRates) that constructs a pricing
object via chatPricing and sets InputCostPerTokenPriority,
OutputCostPerTokenPriority, and CacheReadInputTokenCostPriority (and optionally
CacheCreationInputTokenCostPriority) to non-nil values, create a
schemas.BifrostLLMUsage with PromptTokens, CompletionTokens, TotalTokens and
PromptTokensDetails.CachedReadTokens (and CachedWriteTokens if testing cache
creation), call computeTextCost(&p, usage) and assert the cost equals the manual
calculation that uses the priority rates for input/output and cache-read
portions; this ensures the priority-rate branch (InputCostPerTokenPriority,
OutputCostPerTokenPriority, CacheReadInputTokenCostPriority) is executed and
verified.
framework/configstore/migrations.go (1)

4139-4166: Keep the refactor column list in one place.

The same column slice is duplicated in both Migrate and Rollback, so a contributor adding a pricing field can easily update one branch and miss the other. Hoisting this into a shared slice/helper would make the migration much less drift-prone.

Also applies to: 4181-4208

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 4139 - 4166, The columns
slice is duplicated between the Migrate and Rollback functions; extract that
slice into a single shared symbol (e.g., pricingColumns or getPricingColumns())
and have both Migrate and Rollback reference it so future additions are made in
one place; update both places where the duplicate slice appears (the second
occurrence after the first block) to use the shared variable/helper and remove
the duplicated literal slice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/openai/openai.go`:
- Around line 7032-7035: The passthrough stream's startTime is being set before
the HTTP request/handshake, making PassthroughStream's latency include request
time; move the startTime initialization inside the streaming goroutine after the
client.Do()/stream is successfully established so the terminal latency measures
only post-handshake streaming time. Locate where
providerUtils.PrepareResponseStreaming(...) returns activeClient and where
PassthroughStream begins (the goroutine that calls activeClient.Do or sets up
the stream) and set startTime := time.Now() there, not immediately after
providerUtils.PrepareResponseStreaming, ensuring all other streaming handlers
remain using post-handshake timing.

In `@core/schemas/speech.go`:
- Around line 30-37: The SpeechUsage token fields (InputTokens, OutputTokens,
TotalTokens) currently serialize zero values and should be optional; modify the
SpeechUsage type to make these three fields pointers (e.g., *int) and add
json:",omitempty" tags, and update BifrostSpeechResponse.BackfillParams so it
does not allocate a new SpeechUsage{} unless you actually set token values (keep
allocating/setting Usage.InputChars as needed but leave token pointers nil when
unknown). Also apply the same change to the other BackfillParams variant
referenced (lines ~149-156) so token fields remain nil/omitted when usage is not
reported.

In `@docs/architecture/framework/model-catalog.mdx`:
- Line 31: Update the "Audio Processing" bullet to explicitly call out
character-based speech billing: mention the new input_cost_per_character pricing
field and that speech responses populate usage.input_chars so speech can be
billed by input characters in addition to tokens/duration; modify the bullet
text (the "**Audio Processing**:" line) to include both "character-based
(input_cost_per_character / usage.input_chars)" and the existing "token-based
and duration-based" phrasing so the documentation reflects the active character
billing path.

In `@plugins/logging/operations.go`:
- Around line 1042-1074: The OutputTokensDetails mapping for
CompletionTokensDetails is incomplete: update the conversion that builds
schemas.ResponsesResponseOutputTokens (used when creating the
ResponsesResponseUsage inside the case for schemas.ResponsesRequest /
schemas.ResponsesStreamRequest) to copy all corresponding fields from
usage.CompletionTokensDetails (not just ReasoningTokens and NumSearchQueries).
Specifically, include TextTokens, AcceptedPredictionTokens, AudioTokens,
ImageTokens, CitationTokens, and RejectedPredictionTokens when constructing
respUsage.OutputTokensDetails so the full token breakdown in
schemas.BifrostResponse -> BifrostResponsesResponse -> Usage is preserved for
cost calculations.
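A reduced sketch of the complete mapping, with simplified stand-in structs in place of the real `schemas` types:

```go
package main

import "fmt"

// Simplified stand-ins for the two token-detail shapes.
type completionTokensDetails struct {
	ReasoningTokens, NumSearchQueries, TextTokens, AudioTokens int
	AcceptedPredictionTokens, RejectedPredictionTokens         int
	ImageTokens, CitationTokens                                int
}
type responsesOutputTokens struct {
	ReasoningTokens, NumSearchQueries, TextTokens, AudioTokens int
	AcceptedPredictionTokens, RejectedPredictionTokens         int
	ImageTokens, CitationTokens                                int
}

// toOutputTokensDetails copies every field, not just ReasoningTokens and
// NumSearchQueries, so the full breakdown survives for cost calculation.
func toOutputTokensDetails(d completionTokensDetails) responsesOutputTokens {
	return responsesOutputTokens{
		ReasoningTokens:          d.ReasoningTokens,
		NumSearchQueries:         d.NumSearchQueries,
		TextTokens:               d.TextTokens,
		AudioTokens:              d.AudioTokens,
		AcceptedPredictionTokens: d.AcceptedPredictionTokens,
		RejectedPredictionTokens: d.RejectedPredictionTokens,
		ImageTokens:              d.ImageTokens,
		CitationTokens:           d.CitationTokens,
	}
}

func main() {
	out := toOutputTokensDetails(completionTokensDetails{ReasoningTokens: 7, TextTokens: 100})
	fmt.Println(out.ReasoningTokens, out.TextTokens)
}
```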

---

Outside diff comments:
In `@core/providers/huggingface/huggingface.go`:
- Around line 1112-1116: Move the streaming startTime initialization out of the
pre-request section and into the goroutine after provider.client.Do and after
the response body/reader is successfully established in both
ImageGenerationStream and ImageEditStream (and the other streaming handlers
referenced); set startTime := time.Now() there, initialize lastChunkTime =
startTime, and use lastChunkTime to compute per-chunk deltas and the final-chunk
latency so metrics reflect only post-handshake streaming time.

In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 67-72: The examples create a modelcatalog.Config with non-pointer
fields but PricingURL and PricingSyncInterval are declared as *string and
*time.Duration; update the examples that construct modelcatalog.Config so they
provide pointers (e.g., create local variables for the string and duration and
take their addresses, or use a small helper like ptr.String/ptr.Duration) and
pass those pointer values into PricingURL and PricingSyncInterval before calling
modelcatalog.Init; ensure you update both example blocks that build
modelcatalog.Config so they compile.

In `@framework/modelcatalog/main.go`:
- Around line 340-360: Replace the direct map lookups in
ModelCatalog.GetPricingEntryForModel with calls to the internal lookup that
applies the billing fallbacks: for each mode in the loop call
mc.getPricing(model, provider, normalizeRequestType(mode)) (or
mc.getPricing(model, provider, mode) if getPricing expects a RequestType)
instead of building makeKey and reading mc.pricingData; if getPricing returns a
non-nil pricing result, return convertTableModelPricingToPricingData(...) for
that result. Ensure you reference the existing symbols GetPricingEntryForModel,
getPricing, convertTableModelPricingToPricingData, makeKey/pricingData only to
remove the direct map usage so the public API matches CalculateCost's fallback
behavior.

---

Duplicate comments:
In `@framework/modelcatalog/pricing_test.go`:
- Around line 1305-1328: The test doesn't exercise the cache-tier (>200k)
pricing because the BifrostLLMUsage has no PromptTokensDetails; update the
makeChatResponse call (or a sibling test) to populate PromptTokensDetails with
cached token counts that push cached reads/writes over the 200k threshold so the
code paths using CacheReadInputTokenCostAbove200kTokens and
CacheCreationInputTokenCostAbove200kTokens are executed (e.g., set
PromptTokensDetails.CachedReadTokens and/or CachedWriteTokens to values summing
>200000), then assert expected cost using CalculateCost.

In `@framework/modelcatalog/pricing.go`:
- Around line 13-20: The cost logic never reads the 1-hour cache-tier selection
because the TTL/retention choice isn’t carried in costInput and
tieredCacheCreationInputTokenRate doesn’t consult the 1-hour fields; add a field
to the costInput struct (e.g., cacheRetention or cacheTTL enum/boolean) to
represent the selected cache tier, populate that new field where costInput is
constructed, and update tieredCacheCreationInputTokenRate to check
CacheCreationInputTokenCostAbove1hr and
CacheCreationInputTokenCostAbove1hrAbove200kTokens when the new cacheRetention
indicates the 1-hour tier so the correct 1-hour rates are returned.
- Around line 13-20: The priority/service-tier signal is never threaded into
pricing calculations, so the priority price columns (InputCostPerTokenPriority,
OutputCostPerTokenPriority, CacheReadInputTokenCostPriority) are never used; add
a priority field (e.g. priority or serviceTier as an enum/string) to the
costInput struct and ensure every call site that builds costInput in the runtime
path populates it, then update the billing helper functions that compute
per-token/cache costs to consult costInput.priority and select the priority
rates instead of always using standard rates (modify the pricing lookup logic
where InputCostPerTokenPriority, OutputCostPerTokenPriority, and
CacheReadInputTokenCostPriority are read). Ensure the priority field is
propagated through any helper functions that accept costInput so the
priority-aware rates are applied end-to-end.
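The two comments above can be sketched together. These are hypothetical, trimmed-down shapes — the real `costInput` and pricing columns live in `framework/modelcatalog/pricing.go`; the field names `cacheTTL1hr` and `priority` are illustrative stand-ins for whatever enum the implementation chooses:

```go
package main

import "fmt"

// Hypothetical costInput carrying the two missing signals.
type costInput struct {
	cacheCreationTokens int
	cacheTTL1hr         bool // selected cache retention tier
	priority            bool // service-tier/priority flag
}

type pricingEntry struct {
	InputCostPerToken                            float64
	InputCostPerTokenPriority                    float64
	CacheCreationInputTokenCost                  float64
	CacheCreationInputTokenCostAbove200kTokens   float64
	CacheCreationInputTokenCostAbove1hr          float64
	CacheCreationInputTokenCostAbove1hrAbove200k float64
}

// inputTokenRate consults the priority flag instead of always
// returning the standard rate.
func inputTokenRate(p pricingEntry, in costInput) float64 {
	if in.priority {
		return p.InputCostPerTokenPriority
	}
	return p.InputCostPerToken
}

// tieredCacheCreationInputTokenRate branches on both the 200k-token
// threshold and the cache-retention tier, so the 1-hour columns
// become reachable.
func tieredCacheCreationInputTokenRate(p pricingEntry, in costInput) float64 {
	above200k := in.cacheCreationTokens > 200_000
	switch {
	case in.cacheTTL1hr && above200k:
		return p.CacheCreationInputTokenCostAbove1hrAbove200k
	case in.cacheTTL1hr:
		return p.CacheCreationInputTokenCostAbove1hr
	case above200k:
		return p.CacheCreationInputTokenCostAbove200kTokens
	default:
		return p.CacheCreationInputTokenCost
	}
}

func main() {
	p := pricingEntry{1, 2, 10, 20, 30, 40}
	fmt.Println(inputTokenRate(p, costInput{priority: true}))                                          // 2
	fmt.Println(tieredCacheCreationInputTokenRate(p, costInput{cacheCreationTokens: 250_000, cacheTTL1hr: true})) // 40
}
```

Every call site that builds `costInput` would also need to populate the new fields, otherwise the zero values silently reproduce the current behavior.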

---

Nitpick comments:
In `@framework/configstore/migrations.go`:
- Around line 4139-4166: The columns slice is duplicated between the Migrate and
Rollback functions; extract that slice into a single shared symbol (e.g.,
pricingColumns or getPricingColumns()) and have both Migrate and Rollback
reference it so future additions are made in one place; update both places where
the duplicate slice appears (the second occurrence after the first block) to use
the shared variable/helper and remove the duplicated literal slice.
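The deduplication could look like the following sketch. The column names and statement strings are illustrative — the real migration in `framework/configstore/migrations.go` uses its own schema and migration framework; the point is only that both directions iterate one shared slice:

```go
package main

import "fmt"

// Single source of truth: future columns are added in exactly one place.
var pricingColumns = []string{
	"input_cost_per_token",
	"output_cost_per_token",
	"cache_read_input_token_cost",
}

// migrate and rollback both derive their statements from pricingColumns,
// so the two lists can never drift apart.
func migrate() []string {
	stmts := make([]string, 0, len(pricingColumns))
	for _, col := range pricingColumns {
		stmts = append(stmts, "ALTER TABLE model_pricing ADD COLUMN "+col+" REAL")
	}
	return stmts
}

func rollback() []string {
	stmts := make([]string, 0, len(pricingColumns))
	for _, col := range pricingColumns {
		stmts = append(stmts, "ALTER TABLE model_pricing DROP COLUMN "+col)
	}
	return stmts
}

func main() {
	fmt.Println(len(migrate()) == len(rollback())) // true: both driven by the same slice
}
```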

In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: Remove the local helper functions ptr and intPtr and
replace their call sites to use the repo-standard bifrost.Ptr() helper (e.g.,
use bifrost.Ptr(1.23) for float64 and bifrost.Ptr(42) for int) so the test
follows the global convention; update any references to ptr(...) and intPtr(...)
in this file (pricing_test.go) to bifrost.Ptr(...) and delete the ptr and intPtr
function declarations.
- Around line 100-225: Add a focused unit test (e.g.,
TestComputeTextCost_PriorityRates) that constructs a pricing object via
chatPricing and sets InputCostPerTokenPriority, OutputCostPerTokenPriority, and
CacheReadInputTokenCostPriority (and optionally
CacheCreationInputTokenCostPriority) to non-nil values, create a
schemas.BifrostLLMUsage with PromptTokens, CompletionTokens, TotalTokens and
PromptTokensDetails.CachedReadTokens (and CachedWriteTokens if testing cache
creation), call computeTextCost(&p, usage) and assert the cost equals the manual
calculation that uses the priority rates for input/output and cache-read
portions; this ensures the priority-rate branch (InputCostPerTokenPriority,
OutputCostPerTokenPriority, CacheReadInputTokenCostPriority) is executed and
verified.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 529332fe-02f5-4891-b6b2-b7d257f3fae2

📥 Commits

Reviewing files that changed from the base of the PR and between dbf2248 and 76d7e6d.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (13)
  • plugins/telemetry/main.go
  • plugins/logging/main.go
  • docs/contributing/setting-up-repo.mdx
  • core/schemas/provider.go
  • framework/modelcatalog/overrides_test.go
  • framework/streaming/audio.go
  • core/bifrost.go
  • plugins/governance/main.go
  • core/providers/anthropic/anthropic.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • framework/tracing/tracer.go

Comment thread core/providers/openai/openai.go
Comment thread core/schemas/speech.go
Comment thread docs/architecture/framework/model-catalog.mdx Outdated
Comment thread plugins/logging/operations.go
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 76d7e6d to 1df9e09 Compare March 12, 2026 20:53
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
core/providers/openai/openai.go (1)

7033-7035: ⚠️ Potential issue | 🟡 Minor

Move passthrough startTime into the streaming goroutine (post-handshake).

startTime is still captured before activeClient.Do(...), so passthrough final latency includes request/handshake time and diverges from the repo’s streaming convention.

♻️ Proposed fix
 	activeClient := providerUtils.PrepareResponseStreaming(ctx, provider.client, resp)
-
-	startTime := time.Now()
 
 	if err := activeClient.Do(fasthttpReq, resp); err != nil {
 		providerUtils.ReleaseStreamingResponse(resp)
@@
 	ch := make(chan *schemas.BifrostStreamChunk, schemas.DefaultStreamBufferSize)
 	go func() {
+		startTime := time.Now()
 		defer func() {
 			if ctx.Err() == context.Canceled {
 				providerUtils.HandleStreamCancellation(ctx, postHookRunner, ch, provider.GetProviderKey(), req.Model, schemas.PassthroughStreamRequest, provider.logger)
Based on learnings: In Go streaming handlers under core/providers, initialize the streaming startTime inside the streaming goroutine after client.Do and stream setup, so final-chunk latency reflects post-handshake streaming time.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/openai/openai.go` around lines 7033 - 7035, The passthrough
startTime is captured before activeClient.Do, so final-chunk latency incorrectly
includes handshake time; move the startTime initialization into the streaming
goroutine after activeClient.Do and after you set up the response stream (i.e.,
inside the goroutine that reads/writes chunks), so startTime is assigned
post-handshake and passthrough final latency only measures streaming; update any
uses of startTime (and related latency calculation) in that goroutine
accordingly.
plugins/logging/operations.go (1)

951-953: ⚠️ Potential issue | 🟠 Major

Allow non-token modalities to proceed when usage is nil.

Line 951 currently rejects logs that can still be priced from modality-specific fields (e.g., image count/pixels, video seconds, speech/transcription usage), so recalculation skips valid entries.

♻️ Suggested fix
-	// If no cache hit and no usage, we can't calculate cost
-	if usage == nil && (cacheDebug == nil || !cacheDebug.CacheHit) {
-		return 0, fmt.Errorf("token usage not available for log %s", logEntry.ID)
-	}
-
-	requestType := schemas.RequestType(logEntry.Object)
+	requestType := schemas.RequestType(logEntry.Object)
+
+	tokenUsageRequired := true
+	switch requestType {
+	case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest,
+		schemas.ImageEditRequest, schemas.ImageEditStreamRequest, schemas.ImageVariationRequest,
+		schemas.VideoGenerationRequest, schemas.VideoRemixRequest,
+		schemas.SpeechRequest, schemas.SpeechStreamRequest,
+		schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest:
+		tokenUsageRequired = false
+	}
+
+	if tokenUsageRequired && usage == nil && (cacheDebug == nil || !cacheDebug.CacheHit) {
+		return 0, fmt.Errorf("token usage not available for log %s", logEntry.ID)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/logging/operations.go` around lines 951 - 953, The early-return
currently treats any nil `usage` as fatal (in the block checking `usage == nil
&& (cacheDebug == nil || !cacheDebug.CacheHit)`), which prevents pricing logs
that have modality-specific fields (e.g., image count/pixels, video seconds,
speech/transcription) on `logEntry`; change the guard to allow continuing when
`usage` is nil but `logEntry` contains modality fields — i.e., only return the
error if `usage` is nil, there is no cache hit, AND there are no
modality-specific indicators present on the log (check fields like
`logEntry.ImageCount`, `logEntry.ImagePixels`, `logEntry.VideoSeconds`,
`logEntry.AudioSeconds`/transcription counters or similar); update the condition
and downstream cost-calculation code in this area so those modality fields are
used to compute cost when `usage` is absent.
docs/architecture/framework/model-catalog.mdx (1)

395-399: ⚠️ Potential issue | 🟡 Minor

Line 397 still says “warnings” for a debug-only path.

The missing-pricing/model path described elsewhere in this page now logs at debug level, so this best-practices bullet is still telling operators to watch for a warning that never fires.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/framework/model-catalog.mdx` around lines 395 - 399, The
docs' bullet about "Logging" incorrectly tells operators to monitor "missing
model warnings" even though the missing-pricing/model path now logs at debug
level; update the sentence referencing ModelCatalog/CalculateCost to reflect
that missing-model events are logged at debug (not warning) level — e.g., change
"missing model warnings" to "missing model debug logs" or instruct operators to
enable debug logging for ModelCatalog/CalculateCost pricing syncs, and ensure
Cleanup() remains mentioned for shutdown.
framework/modelcatalog/pricing.go (1)

11-20: ⚠️ Potential issue | 🟠 Major

Priority and 1-hour cache-write price columns are still unreachable.

costInput never carries a service-tier/priority flag or cache-lifetime signal, and the selectors used from computeTextCost only key off totalTokens. As written, InputCostPerTokenPriority, OutputCostPerTokenPriority, CacheReadInputTokenCostPriority, and CacheCreationInputTokenCostAbove1hr* can never be selected, so those requests still fall back to the standard rates.

Also applies to: 279-282, 644-662

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing.go` around lines 11 - 20, costInput currently
lacks service-tier/priority and cache-lifetime signals so computeTextCost (and
related selectors that only inspect totalTokens) can never pick the priority or
1-hour cache-write pricing columns; add explicit fields to costInput (e.g.,
Priority/ServiceTier and CacheLifetime or CacheSignal) and propagate those from
the Bifrost response creation paths, then update computeTextCost and any other
price-selection logic that currently keys only on totalTokens to branch on the
new costInput.Priority and costInput.CacheLifetime values so selectors can
choose InputCostPerTokenPriority, OutputCostPerTokenPriority,
CacheReadInputTokenCostPriority, and the CacheCreationInputTokenCostAbove1hr*
columns when appropriate.
🧹 Nitpick comments (2)
core/providers/huggingface/huggingface.go (1)

1116-1116: Initialize stream startTime after handshake, not before client.Do.

With the current placement, chunk/final stream latency includes connection/headers time; move stream timing init into the goroutine after stream setup to track post-handshake streaming duration.

Based on learnings: “initialize the streaming startTime inside the streaming goroutine after client.Do and the stream is set up, so the final-chunk latency reflects post-handshake streaming time.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/huggingface/huggingface.go` at line 1116, The current code
initializes the stream timing (startTime) before calling provider.client.Do(req,
resp), which makes chunk/final stream latency include connection/handshake time;
move the startTime initialization into the streaming goroutine after
provider.client.Do returns and the stream/resp is validated (i.e., inside the
goroutine that reads chunks from resp.Body or the stream reader), so startTime
is set once the handshake is complete and subsequent latency measurements
(per-chunk and final-chunk) reflect post-handshake streaming duration; update
any references that calculate elapsed time to use this relocated startTime.
framework/modelcatalog/pricing_test.go (1)

16-17: Use bifrost.Ptr() in these helpers for repo consistency.

These helpers are the shared pointer factory for the whole file, so it’s worth keeping them aligned with the repository convention.

♻️ Suggested cleanup
 import (
 	"testing"
 
+	bifrost "github.com/maximhq/bifrost/core"
 	"github.com/maximhq/bifrost/core/schemas"
 	configstoreTables "github.com/maximhq/bifrost/framework/configstore/tables"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )
 
-func ptr(v float64) *float64 { return &v }
-func intPtr(v int) *int      { return &v }
+func ptr(v float64) *float64 { return bifrost.Ptr(v) }
+func intPtr(v int) *int      { return bifrost.Ptr(v) }

Based on learnings, "In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/modelcatalog/pricing_test.go` around lines 16 - 17, The test
helpers ptr and intPtr use the address operator instead of the
repository-standard bifrost.Ptr helper; replace their implementations so they
call bifrost.Ptr for float64 and int respectively (update functions ptr and
intPtr to return bifrost.Ptr(v) for their value types) to keep pointer creation
consistent with the rest of the codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/images.go`:
- Around line 177-180: Update the comment for BackfillParams to correctly state
which field is backfilled: mention that NumInputImages is set on ImageUsage and
that Size is backfilled on BifrostImageGenerationStreamResponse (r.Size), not on
ImageGenerationResponseParameters; edit the comment text accordingly so it
accurately names the target struct and field (e.g., "Size on
BifrostImageGenerationStreamResponse (r.Size)").

In `@framework/modelcatalog/utils.go`:
- Around line 92-153: The converters that map between TableModelPricing and
PricingEntry are copying pointer addresses instead of cloning values; update
both conversion directions (the code that constructs
configstoreTables.TableModelPricing from a PricingEntry and the reverse path
used by GetPricingEntryForModel) to allocate new *float64 values and copy the
numeric values for every optional cost field so callers never share pointer
addresses with mc.pricingData; specifically adjust the converter functions that
populate fields like InputCostPerToken, OutputCostPerToken,
CacheCreationInputTokenCost, InputCostPerImage, InputCostPerAudioToken, etc.,
and ensure GetPricingEntryForModel returns newly allocated pointers (deep
copies) while still protecting mc.pricingData behind mc.mu to avoid races.
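The deep-copy the comment asks for reduces to cloning each optional `*float64` rather than copying its address. These are hypothetical, trimmed-down shapes with only two of the many cost fields; the real converters map between `TableModelPricing` and `PricingEntry` and must also hold `mc.mu`:

```go
package main

import "fmt"

// Hypothetical subset of the pricing struct's optional cost fields.
type tablePricing struct {
	InputCostPerToken  *float64
	OutputCostPerToken *float64
}

// cloneFloat allocates a fresh *float64 so the caller never shares
// an address with the catalog's internal data.
func cloneFloat(p *float64) *float64 {
	if p == nil {
		return nil
	}
	v := *p
	return &v
}

// convert returns a copy whose optional fields are newly allocated;
// mutating the result cannot race with or corrupt the source.
func convert(src tablePricing) tablePricing {
	return tablePricing{
		InputCostPerToken:  cloneFloat(src.InputCostPerToken),
		OutputCostPerToken: cloneFloat(src.OutputCostPerToken),
	}
}

func main() {
	in := 1.5
	src := tablePricing{InputCostPerToken: &in}
	dst := convert(src)
	*dst.InputCostPerToken = 9.9
	fmt.Println(*src.InputCostPerToken) // 1.5 — source unchanged
}
```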

---

Duplicate comments:
In `@core/providers/openai/openai.go`:
- Around line 7033-7035: The passthrough startTime is captured before
activeClient.Do, so final-chunk latency incorrectly includes handshake time;
move the startTime initialization into the streaming goroutine after
activeClient.Do and after you set up the response stream (i.e., inside the
goroutine that reads/writes chunks), so startTime is assigned post-handshake and
passthrough final latency only measures streaming; update any uses of startTime
(and related latency calculation) in that goroutine accordingly.

In `@docs/architecture/framework/model-catalog.mdx`:
- Around line 395-399: The docs' bullet about "Logging" incorrectly tells
operators to monitor "missing model warnings" even though the
missing-pricing/model path now logs at debug level; update the sentence
referencing ModelCatalog/CalculateCost to reflect that missing-model events are
logged at debug (not warning) level — e.g., change "missing model warnings" to
"missing model debug logs" or instruct operators to enable debug logging for
ModelCatalog/CalculateCost pricing syncs, and ensure Cleanup() remains mentioned
for shutdown.

In `@framework/modelcatalog/pricing.go`:
- Around line 11-20: costInput currently lacks service-tier/priority and
cache-lifetime signals so computeTextCost (and related selectors that only
inspect totalTokens) can never pick the priority or 1-hour cache-write pricing
columns; add explicit fields to costInput (e.g., Priority/ServiceTier and
CacheLifetime or CacheSignal) and propagate those from the Bifrost response
creation paths, then update computeTextCost and any other price-selection logic
that currently keys only on totalTokens to branch on the new costInput.Priority
and costInput.CacheLifetime values so selectors can choose
InputCostPerTokenPriority, OutputCostPerTokenPriority,
CacheReadInputTokenCostPriority, and the CacheCreationInputTokenCostAbove1hr*
columns when appropriate.

In `@plugins/logging/operations.go`:
- Around line 951-953: The early-return currently treats any nil `usage` as
fatal (in the block checking `usage == nil && (cacheDebug == nil ||
!cacheDebug.CacheHit)`), which prevents pricing logs that have modality-specific
fields (e.g., image count/pixels, video seconds, speech/transcription) on
`logEntry`; change the guard to allow continuing when `usage` is nil but
`logEntry` contains modality fields — i.e., only return the error if `usage` is
nil, there is no cache hit, AND there are no modality-specific indicators
present on the log (check fields like `logEntry.ImageCount`,
`logEntry.ImagePixels`, `logEntry.VideoSeconds`,
`logEntry.AudioSeconds`/transcription counters or similar); update the condition
and downstream cost-calculation code in this area so those modality fields are
used to compute cost when `usage` is absent.

---

Nitpick comments:
In `@core/providers/huggingface/huggingface.go`:
- Line 1116: The current code initializes the stream timing (startTime) before
calling provider.client.Do(req, resp), which makes chunk/final stream latency
include connection/handshake time; move the startTime initialization into the
streaming goroutine after provider.client.Do returns and the stream/resp is
validated (i.e., inside the goroutine that reads chunks from resp.Body or the
stream reader), so startTime is set once the handshake is complete and
subsequent latency measurements (per-chunk and final-chunk) reflect
post-handshake streaming duration; update any references that calculate elapsed
time to use this relocated startTime.

In `@framework/modelcatalog/pricing_test.go`:
- Around line 16-17: The test helpers ptr and intPtr use the address operator
instead of the repository-standard bifrost.Ptr helper; replace their
implementations so they call bifrost.Ptr for float64 and int respectively
(update functions ptr and intPtr to return bifrost.Ptr(v) for their value types)
to keep pointer creation consistent with the rest of the codebase.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4450525b-ba03-40f8-b9a1-72b58230ed98

📥 Commits

Reviewing files that changed from the base of the PR and between 76d7e6d and 1df9e09.

📒 Files selected for processing (31)
  • core/bifrost.go
  • core/providers/anthropic/anthropic.go
  • core/providers/azure/azure.go
  • core/providers/gemini/gemini.go
  • core/providers/huggingface/huggingface.go
  • core/providers/openai/openai.go
  • core/schemas/images.go
  • core/schemas/provider.go
  • core/schemas/speech.go
  • docs/architecture/framework/model-catalog.mdx
  • docs/contributing/setting-up-repo.mdx
  • framework/configstore/migrations.go
  • framework/configstore/tables/modelpricing.go
  • framework/modelcatalog/main.go
  • framework/modelcatalog/overrides.go
  • framework/modelcatalog/overrides_test.go
  • framework/modelcatalog/pricing.go
  • framework/modelcatalog/pricing_test.go
  • framework/modelcatalog/utils.go
  • framework/streaming/audio.go
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/streaming/responses.go
  • framework/streaming/transcription.go
  • framework/tracing/tracer.go
  • plugins/governance/main.go
  • plugins/logging/main.go
  • plugins/logging/operations.go
  • plugins/telemetry/main.go
  • transports/bifrost-http/handlers/providers.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
💤 Files with no reviewable changes (2)
  • transports/bifrost-http/handlers/providers.go
  • framework/modelcatalog/overrides.go
🚧 Files skipped from review as they are similar to previous changes (14)
  • framework/streaming/transcription.go
  • core/providers/azure/azure.go
  • framework/tracing/tracer.go
  • plugins/logging/main.go
  • framework/streaming/responses.go
  • core/providers/gemini/gemini.go
  • core/schemas/speech.go
  • plugins/telemetry/main.go
  • plugins/governance/main.go
  • core/bifrost.go
  • ui/app/workspace/logs/sheets/logDetailsSheet.tsx
  • framework/streaming/chat.go
  • framework/streaming/images.go
  • framework/modelcatalog/overrides_test.go

Comment thread core/schemas/images.go
Comment thread framework/modelcatalog/utils.go
Collaborator Author

Pratham-Mishra04 commented Mar 13, 2026

Merge activity

  • Mar 13, 4:57 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 13, 4:58 AM UTC: Graphite rebased this pull request as part of a merge.
  • Mar 13, 4:59 AM UTC: @Pratham-Mishra04 merged this pull request with Graphite.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 02-26-refactor_pricing_module_refactor branch from 1df9e09 to 58e3f4a Compare March 13, 2026 04:58
@Pratham-Mishra04 Pratham-Mishra04 merged commit 75806e8 into main Mar 13, 2026
9 of 11 checks passed
@Pratham-Mishra04 Pratham-Mishra04 deleted the 02-26-refactor_pricing_module_refactor branch March 13, 2026 04:59
