Improve logging performance and separation (#421) #428
Conversation
Replace container/ring.Ring with a custom circularBuffer that uses a single contiguous []byte slice. This fixes the original implementation which created 10,240 ring elements instead of 10KB of storage. GetHistory is now 139x faster (145μs → 1μs) and uses 117x less memory (1.2MB → 10KB). Allocations reduced from 2 to 1 per write operation. ref #421
Walkthrough

Replaces LogMonitor's `container/ring` buffering with a new unexported `circularBuffer` type (pre-allocated `[]byte`, head and size) and integrates it into LogMonitor. Adds unit tests and benchmarks, updates CLI README examples, and adjusts process lifecycle to clear per-process log buffers and expose a `Logger` accessor.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
proxy/logMonitor.go (1)
37-37: Consider renaming local variable to avoid shadowing built-in.

The variable name `cap` shadows Go's built-in `cap()` function. While this works correctly, consider using `capacity` for clarity.

```diff
- cap := len(cb.data)
+ capacity := len(cb.data)
```

Note: This same pattern appears at line 75 in GetHistory().
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)

- `ai-plans/2025-12-14-efficient-ring-buffer.md` (1 hunks)
- `proxy/logMonitor.go` (4 hunks)
- `proxy/logMonitor_test.go` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
proxy/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Run `make test-dev` when making iterative changes to code under the `proxy/` directory - this runs `go test` and `staticcheck`, and all static checking errors must be fixed
Files:
- `proxy/logMonitor_test.go`
- `proxy/logMonitor.go`
ai-plans/**
📄 CodeRabbit inference engine (CLAUDE.md)
When improving plans, identify inconsistencies, expand plans to be detailed specifications, and ensure plans include Title, Overview, Design Requirements, Testing Plan, and Checklist sections
Files:
ai-plans/2025-12-14-efficient-ring-buffer.md
🧠 Learnings (1)
📚 Learning: 2025-11-25T05:40:06.444Z
Learnt from: CR
Repo: mostlygeek/llama-swap PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-25T05:40:06.444Z
Learning: Applies to proxy/**/*.go : Run `make test-dev` when making iterative changes to code under the `proxy/` directory - this runs `go test` and `staticcheck`, and all static checking errors must be fixed
Applied to files:
proxy/logMonitor_test.go
🧬 Code graph analysis (1)
proxy/logMonitor_test.go (1)
- proxy/logMonitor.go (1)
  - NewLogMonitorWriter (123-132)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: run-tests
- GitHub Check: run-tests
🔇 Additional comments (7)
proxy/logMonitor.go (3)
14-91: Excellent circular buffer implementation!

The circularBuffer implementation is correct and efficient:

- Properly handles wrap-around logic in Write()
- Correctly calculates start position in GetHistory() using `(head - size + cap) % cap`
- Handles edge cases (empty input, oversized writes, exact capacity)
- Maintains O(1) write and O(n) read complexity as designed
- Preserves immutability by returning copies from GetHistory()

The algorithm correctly manages the circular nature: when the buffer is full, `head` points to both the next write position and the oldest byte (which will be overwritten).
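The logic described above can be sketched as follows. This is an illustrative reimplementation from the walkthrough's description (the `circularBuffer`, `head`, and `size` names match the walkthrough, but the code itself is not copied from the PR):

```go
package main

import "fmt"

// circularBuffer: one contiguous slice, head = next write position,
// size = number of valid bytes. When full, head is also the oldest byte.
type circularBuffer struct {
	data []byte
	head int
	size int
}

func newCircularBuffer(capacity int) *circularBuffer {
	return &circularBuffer{data: make([]byte, capacity)}
}

// Write appends p, overwriting the oldest bytes on wrap-around.
// At most two copy calls and zero allocations per write.
func (cb *circularBuffer) Write(p []byte) {
	capacity := len(cb.data)
	if len(p) == 0 || capacity == 0 {
		return
	}
	// An oversized write can only ever keep its last `capacity` bytes.
	if len(p) > capacity {
		p = p[len(p)-capacity:]
	}
	n := copy(cb.data[cb.head:], p)
	copy(cb.data, p[n:]) // wrap the remainder to the front
	cb.head = (cb.head + len(p)) % capacity
	cb.size += len(p)
	if cb.size > capacity {
		cb.size = capacity
	}
}

// GetHistory returns a fresh copy of the buffered bytes in FIFO order,
// using the (head - size + cap) % cap start index noted above.
func (cb *circularBuffer) GetHistory() []byte {
	if cb.size == 0 {
		return nil
	}
	capacity := len(cb.data)
	start := (cb.head - cb.size + capacity) % capacity
	out := make([]byte, cb.size) // the single allocation per call
	n := copy(out, cb.data[start:])
	copy(out[n:], cb.data[:cb.size-n])
	return out
}

func main() {
	cb := newCircularBuffer(10)
	cb.Write([]byte("hello"))
	cb.Write([]byte("world12345"))
	fmt.Println(string(cb.GetHistory())) // world12345
}
```

Returning a copy from GetHistory (rather than a view into `data`) is what preserves the immutability guarantee mentioned in the last bullet.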
148-151: Good practice: preserving immutability.

Creating a copy for broadcast correctly ensures that subsequent modifications to the input slice don't affect subscribers, maintaining the immutability guarantee tested in TestWrite_ImmutableBuffer.
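The copy-before-broadcast pattern being praised looks roughly like this. The subscriber plumbing (channel slice, `broadcast` function) is illustrative, not the PR's actual code:

```go
package main

import "fmt"

// broadcast sends one independent copy of msg to all subscribers, so a
// caller mutating its slice after the call cannot corrupt what
// subscribers receive.
func broadcast(subscribers []chan []byte, msg []byte) {
	msgCopy := make([]byte, len(msg))
	copy(msgCopy, msg)
	for _, sub := range subscribers {
		sub <- msgCopy
	}
}

func main() {
	sub := make(chan []byte, 1)
	msg := []byte("hello")
	broadcast([]chan []byte{sub}, msg)
	msg[0] = 'X' // mutate the caller's slice after broadcasting
	fmt.Println(string(<-sub)) // hello
}
```

One copy is shared by all subscribers, which is safe as long as subscribers treat received slices as read-only.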
1-12: Confirm import cleanup.

The `container/ring` import has been successfully removed from `proxy/logMonitor.go`. Verify locally that `make test-dev` passes to ensure all tests and static checks are compliant with coding guidelines.

proxy/logMonitor_test.go (3)
117-144: Comprehensive wrap-around testing.

TestCircularBuffer_WrapAround provides excellent coverage:

- Sequential writes filling the buffer
- Wrap-around overwrites ("world12345" correctly keeps last 10 bytes)
- Oversized write handling (16 bytes → last 10 bytes kept)

The test validates that the circular buffer correctly maintains FIFO ordering when overwriting old data.
146-166: Good boundary condition coverage.

TestCircularBuffer_BoundaryConditions correctly validates:

- Empty buffer returns nil (line 149)
- Exact capacity writes (10 bytes in 10-byte buffer)
- Split writes at capacity boundary (5+5 bytes)

These edge cases ensure the buffer handles capacity limits correctly.
168-248: Excellent benchmark suite with impressive results.

The benchmarks comprehensively measure:

- Various message sizes (small, medium, large)
- Impact of subscribers
- GetHistory performance

The documented results show significant improvements:

- GetHistory: 139x faster, 117x less memory
- Consistent 2→1 allocation reduction across all operations
- Small/medium writes: 1.1-1.6x faster

These dramatic improvements, especially for GetHistory, validate the design decision to replace ring.Ring with the custom circular buffer.
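The "1 allocation per GetHistory" figure can be sanity-checked outside the PR's own benchmark suite with `testing.Benchmark`. This stand-in only benchmarks the copy-out step (allocate a 10KB slice and copy into it), which is the core of the new GetHistory:

```go
package main

import (
	"fmt"
	"testing"
)

func main() {
	data := make([]byte, 10*1024) // stand-in for the 10KB circular buffer
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			// Same shape as GetHistory's copy-out: one variable-size make
			// (which must heap-allocate) plus a copy.
			out := make([]byte, len(data))
			copy(out, data)
			_ = out
		}
	})
	fmt.Println(res.AllocsPerOp()) // 1
}
```

Absolute ns/op numbers will differ from the M1 Pro figures above, but the allocation count per operation is machine-independent.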
ai-plans/2025-12-14-efficient-ring-buffer.md (1)
1-85: Well-structured plan document with clear specifications.

The plan document follows the required format and includes all necessary sections:

- ✓ Title and Overview clearly state the objective
- ✓ Design Requirements provide detailed implementation guidance (struct fields, API methods, complexity guarantees)
- ✓ Implementation Details include concrete code examples
- ✓ Testing Plan covers existing and new test scenarios
- ✓ Checklist provides actionable steps

The plan correctly identifies the issues with `ring.Ring` (10,240 elements ≠ 10KB storage, poor cache locality, excessive allocations) and the implementation successfully addresses all these concerns as evidenced by the benchmark results.

As per coding guidelines for ai-plans files, this plan meets all requirements for detailed specifications.
This update creates a LogMonitor per proxy.Process, replacing the usage of a shared one. The buffer in LogMonitor is lazily allocated on the first call to Write and freed when the Process is stopped. This reduces unnecessary memory usage when a model is not active.
The /logs/stream/{model_id} endpoint was added to stream logs from a
specific process.
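The allocate-on-first-write / free-on-stop lifecycle described above can be sketched like this. The `lazyLog` type and a plain append-only `[]byte` are illustrative stand-ins, not the PR's LogMonitor/circularBuffer code:

```go
package main

import (
	"fmt"
	"sync"
)

// lazyLog holds no storage until the first Write, and Clear releases
// the storage when the owning process stops, so idle models cost no
// log memory.
type lazyLog struct {
	mu   sync.Mutex
	data []byte // nil until first Write
}

func (l *lazyLog) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.data == nil {
		l.data = make([]byte, 0, 10*1024) // lazy allocation on first write
	}
	l.data = append(l.data, p...)
	return len(p), nil
}

// Clear frees the buffer; in the PR this is called (via a defer) when
// the process is stopped.
func (l *lazyLog) Clear() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.data = nil
}

func main() {
	var l lazyLog
	fmt.Println(l.data == nil) // true: nothing allocated yet
	l.Write([]byte("starting model"))
	fmt.Println(len(l.data)) // 14
	l.Clear()
	fmt.Println(l.data == nil) // true: memory released on stop
}
```

Setting the slice to nil makes the backing array eligible for garbage collection; the next Write reallocates transparently.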
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- `README.md` (1 hunks)
- `proxy/logMonitor.go` (3 hunks)
- `proxy/logMonitor_test.go` (1 hunks)
- `proxy/process.go` (2 hunks)
- `proxy/processgroup.go` (2 hunks)
- `proxy/proxymanager_loghandlers.go` (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- README.md
🚧 Files skipped from review as they are similar to previous changes (2)
- proxy/logMonitor_test.go
- proxy/logMonitor.go
🧰 Additional context used
📓 Path-based instructions (1)
proxy/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Run `make test-dev` when making iterative changes to code under the `proxy/` directory - this runs `go test` and `staticcheck`, and all static checking errors must be fixed
Files:
- `proxy/processgroup.go`
- `proxy/process.go`
- `proxy/proxymanager_loghandlers.go`
🧠 Learnings (1)
📚 Learning: 2025-06-05T22:57:53.793Z
Learnt from: mostlygeek
Repo: mostlygeek/llama-swap PR: 155
File: proxy/process.go:359-372
Timestamp: 2025-06-05T22:57:53.793Z
Learning: In Go, cmdWaitChan in proxy/process.go is a signal channel (chan struct{}) that follows the pattern where waitForCmd() closes the channel when cmd.Wait() returns, and stopCommand() waits on it. Reading from a closed channel returns immediately, so <-cmdWaitChan won't block indefinitely.
Applied to files:
proxy/process.go
🧬 Code graph analysis (2)
proxy/processgroup.go (2)
- proxy/logMonitor.go (1)
  - NewLogMonitorWriter (125-134)
- proxy/process.go (2)
  - NewProcess (84-128)
  - Process (43-82)

proxy/process.go (1)
- proxy/logMonitor.go (1)
  - LogMonitor (104-119)
🔇 Additional comments (4)
proxy/proxymanager_loghandlers.go (1)
86-93: LGTM! Switch statement improves readability.

The refactoring from an if-else chain to a switch statement with early returns is clean and improves code readability. The handling of the default, proxy, and upstream loggers is correct.
proxy/process.go (2)
417-420: LGTM! Memory cleanup on process stop.

The addition of `processLogger.Clear()` in the defer ensures that per-process log buffers are freed when a process stops, which aligns with the PR's goal of improving logging performance and memory management. The comment clearly explains the intent.
652-655: LGTM! Clean public API for logger access.

The `Logger()` method provides clean public access to the process-specific logger, enabling the model-specific logger resolution in `proxymanager_loghandlers.go`. The implementation is straightforward and correct.

proxy/processgroup.go (1)
49-50: LGTM! Per-process logger creation improves separation.

The change to create a per-process logger via `NewLogMonitorWriter(upstreamLogger)` for each group member improves logging separation and aligns with the PR's objectives. Each process now has its own log buffer that can be independently managed and cleared, as evidenced by the `Clear()` call added in `process.go:419`.
```go
func (pg *ProcessGroup) GetMember(modelName string) (*Process, bool) {
	if pg.HasMember(modelName) {
		return pg.processes[modelName], true
	}
	return nil, false
}
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Find the ProcessGroup struct definition
fd -e go proxy/ | xargs grep -l "type ProcessGroup"
```

Repository: mostlygeek/llama-swap
Length of output: 396
🏁 Script executed:

```shell
#!/bin/bash
# Examine the ProcessGroup struct and its methods
rg -nP --type=go "type ProcessGroup|func \(pg" proxy/processgroup.go | head -50
```

Length of output: 539
🏁 Script executed:

```shell
#!/bin/bash
# Search for modifications to pg.processes map
rg -nP --type=go -C3 'pg\.processes\[.*\]\s*=' proxy/
# Search for delete operations on processes map
rg -nP --type=go -C3 'delete\s*\(\s*pg\.processes' proxy/
# Search for any assignments or operations on .processes field
rg -nP --type=go -C2 'processes\s*=' proxy/
```

Length of output: 509
🏁 Script executed:

```shell
#!/bin/bash
# Read the first 150 lines of processgroup.go to see struct and initialization
head -150 proxy/processgroup.go | cat -n
```

Length of output: 4492
🏁 Script executed:

```shell
#!/bin/bash
# Get all occurrences of pg.processes access and Lock/Unlock calls
rg -nP --type=go "pg\.processes|pg\.Lock|pg\.Unlock|HasMember" proxy/processgroup.go
```

Length of output: 736
🏁 Script executed:

```shell
#!/bin/bash
# Run make test-dev to check if data race is detected
cd . && make test-dev 2>&1 | head -200
```

Length of output: 107
Add mutex protection to GetMember and other map accesses.
The GetMember method accesses pg.processes without acquiring the ProcessGroup mutex, creating a data race. While HasMember safely reads the config, the map access at line 94 is unprotected. Additionally, ProxyRequest (line 84) accesses the map without the lock when pg.swap is false, and Shutdown iterates over processes without synchronization. Acquire pg.Lock() before all map accesses to match the pattern used in StopProcess and StopProcesses.
🤖 Prompt for AI Agents
In proxy/processgroup.go around lines 92 to 97 (and also check ProxyRequest at
~line 84 and Shutdown), the code accesses the pg.processes map without holding
the ProcessGroup mutex which causes data races; wrap all map accesses with
pg.Lock() and defer pg.Unlock() (or use the same locking pattern used in
StopProcess/StopProcesses) — specifically, acquire pg.Lock() before
checking/returning pg.processes[modelName] in GetMember, before reading
pg.processes in ProxyRequest when pg.swap is false, and around the iteration in
Shutdown, then release the lock with defer pg.Unlock() to ensure safe concurrent
access.
This is not necessary because pg.processes is static and never changes. The code only needs the reference to the Process.
@mostlygeek, thank you for the clarification! You're absolutely right—if pg.processes is immutable after initialization in NewProcessGroup, then concurrent reads are safe without locking since Go maps support concurrent reads when there are no concurrent writes. The reference retrieval is indeed safe in this case.
I see now that the locking in methods like StopProcess and StopProcesses is protecting other mutable state (like lastUsedProcess) rather than the map itself. This is a helpful design invariant to know!
✏️ Learnings added
Learnt from: mostlygeek
Repo: mostlygeek/llama-swap PR: 428
File: proxy/processgroup.go:92-97
Timestamp: 2025-12-19T05:58:38.415Z
Learning: In proxy/processgroup.go, the pg.processes map is immutable after initialization in NewProcessGroup—it's populated once with all group members and never modified afterward (no additions or deletions). Therefore, concurrent reads from pg.processes (such as in GetMember) are safe without mutex protection.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: mostlygeek
Repo: mostlygeek/llama-swap PR: 371
File: proxy/process.go:0-0
Timestamp: 2025-10-29T05:26:34.964Z
Learning: In proxy/process.go, the loading message "llama-swap loading model: {name}" intentionally uses p.ID (Process.ID) rather than the realModelName from the request context. This is the correct design choice.
```go
default:
	// search for a models specific logger
	if name, found := pm.config.RealModelName(logMonitorId); found {
		for _, group := range pm.processGroups {
			if process, found := group.GetMember(name); found {
				return process.Logger(), nil
			}
		}
	}

	return nil, fmt.Errorf("invalid logger. Use 'proxy' or 'upstream'")
```
Update the error message to reflect model-specific logger support.
The default case now correctly supports model-specific loggers through RealModelName lookup and GetMember, but the error message on Line 104 still only mentions 'proxy' or 'upstream'. Users might be confused when they try to use a valid model name and receive an error message that doesn't mention this option.
🔎 Proposed fix

```diff
- return nil, fmt.Errorf("invalid logger. Use 'proxy' or 'upstream'")
+ return nil, fmt.Errorf("invalid logger. Use 'proxy', 'upstream', or a valid model name")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
default:
	// search for a models specific logger
	if name, found := pm.config.RealModelName(logMonitorId); found {
		for _, group := range pm.processGroups {
			if process, found := group.GetMember(name); found {
				return process.Logger(), nil
			}
		}
	}
	return nil, fmt.Errorf("invalid logger. Use 'proxy', 'upstream', or a valid model name")
```
🤖 Prompt for AI Agents
In proxy/proxymanager_loghandlers.go around lines 94 to 104, the error returned
in the default branch still only mentions "proxy" or "upstream" even though the
code now supports model-specific loggers via RealModelName/GetMember; update the
fmt.Errorf to include model names as valid options (e.g., mention model names or
"model-specific logger") so the message reflects that users can request 'proxy',
'upstream', or a model-specific logger; keep the message concise and
user-friendly.
Performance improvements using MBP M1 Pro
Before (ring.Ring):
After (circularBuffer):
Summary:
Summary by CodeRabbit

New Features

- Log streaming endpoints (`/logs/stream`, `/logs/stream/proxy`, `/logs/stream/upstream`, `/logs/stream/{model_id}`) for enhanced log monitoring on the CLI.

Documentation

- `?no-history` parameter details.

Tests