proxy: support timings for /infill from llama-server #510

Merged
mostlygeek merged 1 commit into main from mg/infill-timings-463 on Feb 8, 2026
@mostlygeek mostlygeek commented Feb 8, 2026

fixes: #463

Summary by CodeRabbit

  • Bug Fixes
    • Fixed metrics collection for the /infill endpoint to properly extract timing data and accurately report token usage (input, output, and cached tokens) and performance metrics including throughput rates and response duration calculations.

coderabbitai bot commented Feb 8, 2026

Walkthrough

The changes add special-case handling for /infill requests to extract timing metrics from the last element of parsed JSON array responses instead of from the root object, addressing metrics reporting failures for the /infill endpoint.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Metrics Monitor /infill Handling (proxy/metrics_monitor.go, proxy/metrics_monitor_test.go) | Adds conditional logic to extract timings from the last JSON array element for /infill endpoint responses. Includes tests for successful array parsing and empty array edge cases, verifying token counts and timing calculations. |
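
The special case described above can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the code merged in this PR: the helper name `extractTimings`, the `timings` struct, the suffix-based path check, and the JSON field names (`prompt_n`, `prompt_ms`, `predicted_n`, `predicted_ms`, `cache_n`) are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"strings"
)

// timings holds a subset of the "timings" object from a llama-server
// response. The field names here are assumed for illustration.
type timings struct {
	PromptN     int     `json:"prompt_n"`
	PromptMS    float64 `json:"prompt_ms"`
	PredictedN  int     `json:"predicted_n"`
	PredictedMS float64 `json:"predicted_ms"`
	CacheN      int     `json:"cache_n"`
}

// extractTimings is a hypothetical helper. For /infill it expects the body
// to be a JSON array and reads "timings" from the last element; for any
// other path it reads "timings" from the root object.
func extractTimings(path string, body []byte) (timings, bool) {
	if strings.HasSuffix(path, "/infill") {
		var items []json.RawMessage
		if err := json.Unmarshal(body, &items); err != nil || len(items) == 0 {
			// Not an array, or an empty array: nothing to report.
			return timings{}, false
		}
		body = items[len(items)-1]
	}

	var holder struct {
		Timings *timings `json:"timings"`
	}
	if err := json.Unmarshal(body, &holder); err != nil || holder.Timings == nil {
		return timings{}, false
	}
	return *holder.Timings, true
}
```

In this sketch an empty array returns `false` instead of indexing out of range, which is the edge case the summary says the tests cover. A caller could then derive throughput from the counts and durations, for example `float64(t.PredictedN) / (t.PredictedMS / 1000)` tokens per second, after checking that the duration is non-zero.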

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding support for extracting timings/metrics from /infill responses through llama-server. |
| Linked Issues check | ✅ Passed | The PR correctly addresses issue #463 by implementing special-case handling for /infill to extract timings from the JSON array response, restoring metrics reporting [#463]. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to /infill metrics handling in proxy/metrics_monitor.go and corresponding tests; no extraneous modifications detected. |

No actionable comments were generated in the recent review. 🎉


@mostlygeek mostlygeek merged commit 8d6d949 into main Feb 8, 2026
3 checks passed
@mostlygeek mostlygeek deleted the mg/infill-timings-463 branch February 8, 2026 01:16
Development

Successfully merging this pull request may close these issues.

/infill endpoint does not report usage metrics
