in memory request logging and observability by khimaros · Pull Request #503 · mostlygeek/llama-swap

khimaros · 2026-02-01T20:59:08Z

i tried to keep this as isolated as possible and minimize impact to the rest of the codebase. sorry i didn't run gofmt in a separate changelist before submitting this one so there is a bit of whitespace noise. fixes #464

Summary by CodeRabbit

New Features
- Request monitoring: capture per-request metadata, durations, and streaming response bodies with real-time events.
- New API endpoints to list requests and fetch request details; SSE now emits request events.
- New "Requests" UI page and header nav item with list/detail panels, streaming and full-JSON views.
- JsonView component for pretty-printed JSON display.
- Frontend types and stores updated to support request logs and request detail fetching.


coderabbitai · 2026-02-01T20:59:23Z

Walkthrough

Adds end-to-end request recording and streaming: new request monitor and events, captures request/response bodies in proxy handlers and metrics, exposes GET /api/requests and GET /api/requests/:id, and adds a Svelte Requests UI, types, and store integration.

Changes

Cohort / File(s)	Summary
Event ID `proxy/events.go`	Adds `RequestEventID = 0x07`.
Request monitor `proxy/request_monitor.go`	New thread-safe `requestMonitor`, public `RequestEntry`/`RequestEvent` types, Add/Update/AppendResponse/Get methods, buffering and throttled event emission.
Proxy integration & metrics `proxy/proxymanager.go`, `proxy/metrics_monitor.go`	Adds `ProxyManager.requestMonitor` field and `recordRequest` method; proxy handlers now create per-request recorder to capture small request bodies and stream responses; `responseBodyCopier` gains `onWrite func([]byte)` and reuse logic.
API & SSE `proxy/proxymanager_api.go`	Adds `GET /api/requests` and `GET /api/requests/:id`, new SSE message type `request`, initial sync includes recent requests (bodies stripped), SSE buffer increased.
Frontend types & store `ui-svelte/src/lib/types.ts`, `ui-svelte/src/stores/api.ts`	Adds `RequestLog` type and `request` event union member; relaxes `ChatMessage` typing; adds `requests` writable store and `getRequestDetail` helper; SSE handling updated to merge request events.
Frontend UI & components `ui-svelte/src/routes/Requests.svelte`, `ui-svelte/src/components/JsonView.svelte`, `ui-svelte/src/components/playground/ChatMessage.svelte`	Adds Requests page (list/detail, streaming and JSON parsing, two-panel UI), `JsonView` component, and updates ChatMessage.svelte to use shared `ChatMessage` type.
Routing & header `ui-svelte/src/App.svelte`, `ui-svelte/src/components/Header.svelte`	Registers `/requests` route and adds "Requests" navigation item in header.
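
For readers who want a feel for the request monitor described in the table, here is a minimal sketch of a mutex-guarded store with Add/AppendResponse/Update/Get. It is not the code from this PR: the `entry` and `monitor` types and their fields are simplified, hypothetical stand-ins, and event emission, throttling and eviction are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a simplified, hypothetical stand-in for RequestEntry.
type entry struct {
	ID           int
	Path         string
	Status       int
	Pending      bool
	ResponseBody string
	respBuf      []byte // streamed chunks accumulate here until completion
}

// monitor guards all entries with one mutex so handlers can record concurrently.
type monitor struct {
	mu      sync.Mutex
	nextID  int
	entries map[int]*entry
}

func newMonitor() *monitor {
	return &monitor{entries: make(map[int]*entry)}
}

// Add registers a pending request and returns its ID.
func (m *monitor) Add(path string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	id := m.nextID
	m.nextID++
	m.entries[id] = &entry{ID: id, Path: path, Pending: true}
	return id
}

// AppendResponse buffers one streamed chunk for a pending request.
func (m *monitor) AppendResponse(id int, chunk string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if e, ok := m.entries[id]; ok && e.Pending {
		e.respBuf = append(e.respBuf, chunk...)
	}
}

// Update finalizes the entry and drops the buffer so the body is held only once.
func (m *monitor) Update(id, status int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if e, ok := m.entries[id]; ok {
		e.Status = status
		e.ResponseBody = string(e.respBuf)
		e.respBuf = nil
		e.Pending = false
	}
}

// Get returns a copy without the internal buffer, so callers cannot touch stored state.
func (m *monitor) Get(id int) (entry, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	e, ok := m.entries[id]
	if !ok {
		return entry{}, false
	}
	cp := *e
	cp.respBuf = nil
	return cp, true
}

func main() {
	m := newMonitor()
	id := m.Add("/v1/chat/completions")
	m.AppendResponse(id, "Hel")
	m.AppendResponse(id, "lo")
	m.Update(id, 200)
	e, _ := m.Get(id)
	fmt.Printf("%d %q pending=%v\n", e.Status, e.ResponseBody, e.Pending)
}
```

A single mutex keeps the sketch easy to reason about; a real monitor would also need a size bound so long sessions do not grow memory without limit.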

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

ui: add Svelte port of React UI #487: Overlapping Svelte UI changes (App.svelte, Header.svelte, types/stores) that add the requests UI and types.

Suggested labels

enhancement

Suggested reviewers

mostlygeek

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'in memory request logging and observability' directly reflects the main purpose of the changeset: implementing request logging and observability features.
Linked Issues check	✅ Passed	The PR implements request body logging and observability to address issue `#464`'s need to inspect HTTP requests and debug peer integration issues.
Out of Scope Changes check	✅ Passed	Changes include request monitoring, API endpoints, UI components for viewing requests, and type updates that support the core objective of request logging observability.


khimaros · 2026-02-01T21:01:25Z

provides a user friendly interface for browsing through different sections of a request (system prompt, user prompt, tools) and response (thoughts, tool calls, and answer)

coderabbitai

Actionable comments posted: 5

🤖 Fix all issues with AI agents

In `@proxy/proxymanager.go`:
- Around line 739-740: The recorded request body can be arbitrarily large;
truncate the body to a safe cap (e.g. 1 MiB) before passing it to
pm.recordRequest to avoid unbounded in-memory growth—modify the call site around
recorder, done := pm.recordRequest(c, modelID, string(bodyBytes)) to pass a
capped/truncated string (or add truncation inside pm.recordRequest) so only the
first 1<<20 bytes are retained, and keep the recorder/done flow unchanged;
reference pm.recordRequest, recorder, done and the bodyBytes variable when
making the change.
- Around line 603-612: The read-failure branch must stop proxying and return an
error to the client: when io.ReadAll(c.Request.Body) returns an err, log the
error with pm.proxyLogger.Errorf and then abort the request handling (do not
continue to upstream) by sending an HTTP error response (e.g., 500) and
returning from the handler; do not attempt to proxy with a partially consumed
c.Request.Body or set requestBody in that case. Update the block around
io.ReadAll, c.Request.Body, requestBody and the surrounding handler logic to
perform this early return on error (use the framework's abort/return method such
as c.AbortWithStatus/AbortWithStatusJSON or equivalent).
- Around line 1055-1078: In recordRequest, the responseBodyCopier created by
newBodyCopier is never assigned to the Gin context writer so error responses
(e.g., those sent via sendErrorResponse) bypass it; fix by assigning the
recorder (responseBodyCopier) to c.Writer immediately after creation so it
implements gin.ResponseWriter and captures all writes (ensure recorder.onWrite
remains set and cleanup restores original writer if needed).

In `@ui-svelte/src/routes/Requests.svelte`:
- Around line 23-30: The current merge in selectedRequest ({ ...detailedRequest,
...fromList }) allows empty list fields to overwrite fetched detail bodies;
instead, keep detailedRequest as the source of truth for bodies and only pull
live status fields from the list. Update the selectedRequest derivation to merge
so detailedRequest properties win for request_body/response_body (e.g., merge
detailedRequest last) and, if fromList exists, copy only the live status fields
(like status, statusText or whatever live fields your app uses) from fromList
into the final object; reference the selectedRequest variable and the
detailedRequest/fromList identifiers when making this change.
- Around line 257-277: The clickable <tr> currently uses onclick with viewDetail
and blocks keyboard users; remove the row-level onclick and instead render a
native interactive element (a <button> or <a>) inside a <td> for each row entry
(e.g., wrap the row content in a full-width button inside the first or a
dedicated <td>), keep using viewDetail(req) as the click handler on that
element, style it with CSS (display:block; width:100%; padding:inherit) so it
visually spans the row, and preserve selection logic using selectedId and ARIA
attributes on the button (e.g., aria-pressed or aria-current) while keeping
existing helpers like formatRelativeTime and formatDuration unchanged.

🧹 Nitpick comments (3)

ui-svelte/src/components/JsonView.svelte (1)

16-22: Consider adding clipboard error handling.

The navigator.clipboard.writeText call can fail (e.g., in non-HTTPS contexts or when clipboard permissions are denied). Consider adding user feedback for success/failure.

♻️ Optional: Add clipboard feedback

+<script lang="ts">
+  let { content = "" } = $props();
+  let copyStatus = $state<"idle" | "copied" | "error">("idle");
+
+  let formattedContent = $derived.by(() => {
+    try {
+      const obj = JSON.parse(content);
+      return JSON.stringify(obj, null, 2);
+    } catch (e) {
+      return content;
+    }
+  });
+
+  async function copyToClipboard() {
+    try {
+      await navigator.clipboard.writeText(formattedContent);
+      copyStatus = "copied";
+      setTimeout(() => copyStatus = "idle", 1500);
+    } catch {
+      copyStatus = "error";
+      setTimeout(() => copyStatus = "idle", 1500);
+    }
+  }
+</script>

Then update the button:

   <button
     class="absolute top-2 right-2 p-1 bg-white/10 hover:bg-white/20 rounded text-xs opacity-0 group-hover:opacity-100 transition-opacity"
-    onclick={() => navigator.clipboard.writeText(formattedContent)}
+    onclick={copyToClipboard}
     title="Copy to clipboard"
   >
-    Copy
+    {copyStatus === "copied" ? "Copied!" : copyStatus === "error" ? "Failed" : "Copy"}
   </button>

ui-svelte/src/stores/api.ts (1)

12-12: Consider adding a size limit to the requests store.

Unlike proxyLogs which has LOG_LENGTH_LIMIT, the requests array can grow unbounded during long sessions. If the backend sends many request events, this could consume significant browser memory over time.

♻️ Suggested: Add requests limit

 const LOG_LENGTH_LIMIT = 1024 * 100; /* 100KB of log data */
+const MAX_REQUESTS = 1000; /* Maximum number of requests to keep in memory */

 // Stores
 export const models = writable<Model[]>([]);

Then in the request handler:

           case "request": {
             const req = JSON.parse(message.data) as RequestLog;
             requests.update((prev) => {
               const index = prev.findIndex((r) => r.id === req.id);
               if (index === -1) {
-                return [req, ...prev];
+                const updated = [req, ...prev];
+                return updated.length > MAX_REQUESTS ? updated.slice(0, MAX_REQUESTS) : updated;
               } else {
                 const updated = [...prev];
                 updated[index] = req;
                 return updated;
               }
             });
             break;
           }

proxy/request_monitor.go (1)

66-84: Consider clearing respBuf after completion to reduce memory retention.

Once ResponseBody is set, keeping the buffer doubles memory for large responses.

♻️ Suggested tweak

 			e.Pending = false
 			event.Emit(RequestEvent{Entry: *e})
+			e.respBuf = bytes.Buffer{}
 			return

coderabbitai · 2026-02-01T21:10:41Z

proxy/proxymanager.go

+	var requestBody string
+	if c.Request.ContentLength > 0 && c.Request.ContentLength < 1024*1024 { // Only capture small bodies
+		bodyBytes, err := io.ReadAll(c.Request.Body)
+		if err != nil {
+			pm.proxyLogger.Errorf("Error reading request body for recording: %v", err)
+		} else {
+			c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
+			requestBody = string(bodyBytes)
+		}
+	}


⚠️ Potential issue | 🟠 Major

Don’t continue proxying after a failed body read.
If io.ReadAll fails, the request body may be partially consumed, and the upstream call will see a truncated or empty body. Bail out early and return an error response.

🔧 Suggested fix

-		bodyBytes, err := io.ReadAll(c.Request.Body)
-		if err != nil {
-			pm.proxyLogger.Errorf("Error reading request body for recording: %v", err)
-		} else {
-			c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
-			requestBody = string(bodyBytes)
-		}
+		bodyBytes, err := io.ReadAll(c.Request.Body)
+		if err != nil {
+			pm.sendErrorResponse(c, http.StatusBadRequest, "failed to read request body")
+			pm.proxyLogger.Errorf("Error reading request body for recording: %v", err)
+			return
+		}
+		c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
+		requestBody = string(bodyBytes)

🤖 Prompt for AI Agents

In `@proxy/proxymanager.go` around lines 603 - 612, The read-failure branch must stop proxying and return an error to the client: when io.ReadAll(c.Request.Body) returns an err, log the error with pm.proxyLogger.Errorf and then abort the request handling (do not continue to upstream) by sending an HTTP error response (e.g., 500) and returning from the handler; do not attempt to proxy with a partially consumed c.Request.Body or set requestBody in that case. Update the block around io.ReadAll, c.Request.Body, requestBody and the surrounding handler logic to perform this early return on error (use the framework's abort/return method such as c.AbortWithStatus/AbortWithStatusJSON or equivalent).
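
A rough sketch of the two points raised for this handler (cap what gets recorded, and stop on a failed read), written against plain net/http rather than Gin so it runs on its own. `captureBody`, `handler`, `maxRecordedBody` and `demo` are hypothetical names, not identifiers from the PR. The sketch still reads the whole body into memory before truncating the recorded copy.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxRecordedBody is an assumed cap of 1 MiB on the copy kept for logging.
const maxRecordedBody = 1 << 20

// captureBody reads the body, restores it for the upstream call, and returns
// a copy truncated to maxRecordedBody. On a read error nothing is restored.
func captureBody(r *http.Request) (string, error) {
	bodyBytes, err := io.ReadAll(r.Body)
	if err != nil {
		return "", err
	}
	r.Body = io.NopCloser(bytes.NewReader(bodyBytes))
	if len(bodyBytes) > maxRecordedBody {
		bodyBytes = bodyBytes[:maxRecordedBody]
	}
	return string(bodyBytes), nil
}

// handler is a hypothetical stand-in for the proxy handler.
func handler(w http.ResponseWriter, r *http.Request) {
	recorded, err := captureBody(r)
	if err != nil {
		// Bail out: the body may be partially consumed, so do not proxy it.
		http.Error(w, "failed to read request body", http.StatusBadRequest)
		return
	}
	forwarded, _ := io.ReadAll(r.Body) // what an upstream would receive
	fmt.Fprintf(w, "recorded=%d forwarded=%d", len(recorded), len(forwarded))
}

// failingReader simulates a client connection that breaks mid-read.
type failingReader struct{}

func (failingReader) Read([]byte) (int, error) { return 0, errors.New("boom") }

func run(body io.Reader) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", body)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code, rec.Body.String()
}

// demo exercises the success path and the read-failure path.
func demo() (okCode int, okBody string, failCode int) {
	okCode, okBody = run(strings.NewReader(`{"model":"m"}`))
	failCode, _ = run(failingReader{})
	return
}

func main() {
	fmt.Println(demo())
}
```

Whether the failure should map to 400 or 500 is a judgment call; the suggested diff uses 400 while the agent prompt mentions 500 as an example.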

coderabbitai · 2026-02-01T21:10:42Z

proxy/proxymanager.go

+func (pm *ProxyManager) recordRequest(c *gin.Context, modelID string, requestBody string) (*responseBodyCopier, func()) {
+	startTime := time.Now()
+	requestID := pm.requestMonitor.Add(&RequestEntry{
+		Timestamp:   startTime,
+		Method:      c.Request.Method,
+		Path:        c.Request.URL.Path,
+		Model:       modelID,
+		RequestBody: requestBody,
+	})
+
+	recorder := newBodyCopier(c.Writer)
+	recorder.onWrite = func(b []byte) {
+		pm.requestMonitor.AppendResponse(requestID, string(b))
+	}
+
+	return recorder, func() {
+		duration := time.Since(startTime)
+		respBody := ""
+		isStreaming := strings.Contains(recorder.Header().Get("Content-Type"), "text/event-stream")
+		if !isStreaming {
+			respBody = recorder.body.String()
+		}
+		pm.requestMonitor.Update(requestID, recorder.Status(), duration, respBody)
+	}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's locate the responseBodyCopier implementation
cd /tmp/repo 2>/dev/null || cd .
git ls-files | head -20
rg -l "responseBodyCopier|newBodyCopier" --type go

Repository: mostlygeek/llama-swap

Length of output: 686

🏁 Script executed:

#!/bin/bash
# Find the responseBodyCopier type definition and newBodyCopier function
rg "type responseBodyCopier|newBodyCopier" -A 5 --type go

Repository: mostlygeek/llama-swap

Length of output: 2756

🏁 Script executed:

#!/bin/bash
# Find all methods implemented by responseBodyCopier
rg "func \(.*\*?responseBodyCopier\)" --type go

Repository: mostlygeek/llama-swap

Length of output: 388

🏁 Script executed:

#!/bin/bash
# Check how sendErrorResponse is implemented
rg "func.*sendErrorResponse" -A 20 --type go

Repository: mostlygeek/llama-swap

Length of output: 1185

🏁 Script executed:

#!/bin/bash
# Find where recordRequest is called
rg "recordRequest" -B 3 -A 10 --type go

Repository: mostlygeek/llama-swap

Length of output: 4222

🏁 Script executed:

#!/bin/bash
# Find the handler/middleware where recordRequest is used
rg "func.*gin.HandlerFunc|func.*gin.Context" -B 2 -A 25 --type go | head -200

Repository: mostlygeek/llama-swap

Length of output: 10675

🏁 Script executed:

#!/bin/bash
# Check if wrapHandler or other handlers assign c.Writer = recorder
rg "c.Writer\s*=" -B 5 -A 5 --type go proxy/

Repository: mostlygeek/llama-swap

Length of output: 47

🏁 Script executed:

#!/bin/bash
# Check the wrapHandler signature and implementation
rg "func.*wrapHandler" -A 30 --type go

Repository: mostlygeek/llama-swap

Length of output: 1814

Assign the recorder to c.Writer to capture error responses.

recordRequest creates a recorder but doesn't attach it to c.Writer, so error responses written via sendErrorResponse bypass the recorder entirely. This leaves request logs with empty bodies on error paths.

The fix is valid: responseBodyCopier embeds gin.ResponseWriter, satisfying the interface for assignment to c.Writer.

Suggested fix

 	recorder := newBodyCopier(c.Writer)
+	c.Writer = recorder
 	recorder.onWrite = func(b []byte) {

🤖 Prompt for AI Agents

In `@proxy/proxymanager.go` around lines 1055 - 1078, In recordRequest, the responseBodyCopier created by newBodyCopier is never assigned to the Gin context writer so error responses (e.g., those sent via sendErrorResponse) bypass it; fix by assigning the recorder (responseBodyCopier) to c.Writer immediately after creation so it implements gin.ResponseWriter and captures all writes (ensure recorder.onWrite remains set and cleanup restores original writer if needed).
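
To illustrate why the wrapper has to be the writer the rest of the handler chain actually uses, here is a small sketch with plain net/http. `copyingWriter`, `record` and `captureError` are hypothetical names standing in for `responseBodyCopier` and `recordRequest`; handing `cw` to the next handler plays the role of `c.Writer = recorder` in Gin.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// copyingWriter is a hypothetical stand-in for responseBodyCopier: it embeds
// the real writer and reports every chunk to onWrite before passing it on.
type copyingWriter struct {
	http.ResponseWriter
	status  int
	onWrite func([]byte)
}

func (w *copyingWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

func (w *copyingWriter) Write(b []byte) (int, error) {
	if w.status == 0 {
		w.status = http.StatusOK // implicit 200 when WriteHeader was never called
	}
	if w.onWrite != nil {
		w.onWrite(b)
	}
	return w.ResponseWriter.Write(b)
}

// record wraps next so that all writes, including error responses, are seen.
func record(next http.HandlerFunc, captured *strings.Builder, status *int) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		cw := &copyingWriter{ResponseWriter: w}
		cw.onWrite = func(b []byte) { captured.Write(b) }
		next(cw, r) // downstream code only ever sees the wrapper
		*status = cw.status
	}
}

// captureError runs a handler that fails and returns what the wrapper saw.
func captureError() (int, string) {
	var captured strings.Builder
	var status int
	h := record(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "upstream unavailable", http.StatusBadGateway)
	}, &captured, &status)
	h(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/v1/models", nil))
	return status, captured.String()
}

func main() {
	status, body := captureError()
	fmt.Printf("%d %q\n", status, body)
}
```

If the error path wrote to the original writer instead of the wrapper, `captured` would stay empty, which matches the empty-body log entries described above.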

coderabbitai · 2026-02-01T21:10:42Z

ui-svelte/src/routes/Requests.svelte

+              {#each sortedRequests as req (req.id)}
+                <tr
+                  class="whitespace-nowrap text-sm cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-800 {selectedId === req.id ? 'bg-gray-100 dark:bg-gray-800' : ''}"
+                  onclick={() => viewDetail(req)}
+                >
+                  <td class="px-4 py-3">{req.id + 1}</td>
+                  <td class="px-4 py-3">{formatRelativeTime(req.timestamp)}</td>
+                  <td class="px-4 py-3 font-mono">{req.method}</td>
+                  <td class="px-4 py-3 font-mono text-xs">{req.path}</td>
+                  <td class="px-4 py-3">{req.model}</td>
+                  <td class="px-4 py-3">
+                    {#if req.pending}
+                      <span class="text-yellow-500">pending</span>
+                    {:else}
+                      <span class={req.status >= 200 && req.status < 300 ? 'text-green-500' : 'text-red-500'}>
+                        {req.status}
+                      </span>
+                    {/if}
+                  </td>
+                  <td class="px-4 py-3 text-gray-500">{req.pending ? "-" : formatDuration(req.duration)}</td>
+                </tr>


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Svelte event handler syntax on:keydown keyboard events

💡 Result:

In Svelte, keyboard events use the normal DOM event directive syntax:

<input on:keydown={handleKeydown} />

The handler receives a KeyboardEvent, so you typically branch on event.key:

<script>
  function handleKeydown(event) {
    if (event.key === 'Enter') {
      // ...
    }
  }
</script>

<input on:keydown={handleKeydown} />

[1], [2]

You can also use Svelte’s event modifiers with | (these affect event behavior, not which key was pressed):

<input on:keydown|preventDefault|stopPropagation={handleKeydown} />

[1], [3]

If you want a global key handler, use <svelte:window>:

<svelte:window on:keydown={handleKeydown} />

[2]

Sources: [1] Svelte docs — Element directives (on:eventname, modifiers). [2] Svelte docs — <svelte:window> example with on:keydown and event.key. [3] Svelte docs — on: modifier list.

🌐 Web query:

Svelte accessible clickable table rows keyboard accessibility best practices

💡 Result:

Best practice (recommended): keep the <table> semantic and put a real interactive element in a cell

Don’t make the <tr> itself the “button/link”. Instead, put an actual <a> (navigation) or <button> (action) inside a <td> and style it to look like the whole row is clickable. Native links/buttons already have correct keyboard behavior (Enter; and Space for buttons). Svelte’s a11y guidance explicitly recommends this over adding click handlers to non-interactive elements. [1][2]

Svelte example (row navigates):

<tr>
  <td class="row">
    <a class="rowLink" href={`/users/${user.id}`}>
      {user.name}
    </a>
  </td>
  <td>{user.email}</td>
</tr>

<style>
  td.row { padding: 0; }
  a.rowLink { display: block; padding: .75rem 1rem; } /* makes the hit area “row-like” */
  tr:hover a.rowLink { text-decoration: underline; }
  a.rowLink:focus-visible { outline: 2px solid currentColor; outline-offset: 2px; }
</style>

If you must attach on:click to something non-interactive

Svelte will warn unless you add equivalent keyboard support and make it focusable (typically tabindex="0"), but it still advises using native <button>/<a> where possible. [1][2]

- Add on:keydown (or on:keyup) for Enter (and Space if it’s acting like a button).
- Avoid positive tabindex values (they break expected tab order). [1]

Why “clickable <tr>” is usually a bad idea

ARIA “table/row” patterns are structural and don’t define row-level keyboard interaction (“Keyboard Interaction: not applicable / none”). So if you turn rows into controls, you’re inventing interactions that assistive tech won’t expect. [3][4]

Sources: Svelte accessibility warnings (keyboard handlers + prefer <button>/<a>) [1][2], WAI-ARIA Authoring Practices / table pattern [3], MDN ARIA table/row roles (no keyboard interaction) [4].

Clickable rows lack keyboard accessibility. Users relying on keyboards cannot open a request, which blocks this workflow.

The suggested fix has two issues:

Syntax error: Svelte uses on:keydown (not onkeydown). The directive syntax is on:eventname.

Anti-pattern: Making <tr> act like a clickable button violates ARIA table semantics and accessibility best practices. ARIA table/row roles don't define keyboard interaction patterns, so assistive technology won't expect row-level interactions.

Recommended approach: Place a native <button> or <a> inside a <td>, and use CSS to style it to visually span the row (e.g., display: block; padding: inherit;). This gives keyboard support natively and maintains proper semantic HTML.

🤖 Prompt for AI Agents

In `@ui-svelte/src/routes/Requests.svelte` around lines 257 - 277, The clickable <tr> currently uses onclick with viewDetail and blocks keyboard users; remove the row-level onclick and instead render a native interactive element (a <button> or <a>) inside a <td> for each row entry (e.g., wrap the row content in a full-width button inside the first or a dedicated <td>), keep using viewDetail(req) as the click handler on that element, style it with CSS (display:block; width:100%; padding:inherit) so it visually spans the row, and preserve selection logic using selectedId and ARIA attributes on the button (e.g., aria-pressed or aria-current) while keeping existing helpers like formatRelativeTime and formatDuration unchanged.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@proxy/proxymanager_api.go`:
- Around line 257-265: apiGetRequests is mutating the entries returned by
pm.requestMonitor.GetEntries() which may be references to stored data; instead
create a new slice of copied entries and clear RequestBody/ResponseBody on the
copies so the original data isn't modified. Locate apiGetRequests and GetEntries
usage, allocate a new slice with the same length, copy each entry (e.g., by
value or deep copy), set the RequestBody and ResponseBody on the copy to empty,
and return the new slice in c.JSON to avoid corrupting the stored entries.

🧹 Nitpick comments (4)

ui-svelte/src/lib/types.ts (1)

69-74: Type loosening: role is now string instead of a union.

Changing role from "user" | "assistant" | "system" to string removes compile-time safety for role values. If this is intentional to support additional roles (e.g., "tool"), consider documenting or using a broader union type for clarity.

ui-svelte/src/components/JsonView.svelte (1)

16-22: Add error handling for clipboard API.

The copy button directly calls navigator.clipboard.writeText() without error handling. This can throw if the clipboard API is unavailable (non-HTTPS contexts) or if the user denies permission. The ChatMessage.svelte component in this same codebase has a more robust implementation with fallback.

Suggested fix

+<script lang="ts">
+  let { content = "" } = $props();
+  let copied = $state(false);
+
+  let formattedContent = $derived.by(() => {
+    try {
+      const obj = JSON.parse(content);
+      return JSON.stringify(obj, null, 2);
+    } catch (e) {
+      return content;
+    }
+  });
+
+  async function copyToClipboard() {
+    try {
+      await navigator.clipboard.writeText(formattedContent);
+      copied = true;
+      setTimeout(() => (copied = false), 2000);
+    } catch (err) {
+      console.error("Failed to copy:", err);
+    }
+  }
+</script>

Then update the button:

   <button
     class="absolute top-2 right-2 p-1 bg-white/10 hover:bg-white/20 rounded text-xs opacity-0 group-hover:opacity-100 transition-opacity"
-    onclick={() => navigator.clipboard.writeText(formattedContent)}
+    onclick={copyToClipboard}
     title="Copy to clipboard"
   >
-    Copy
+    {copied ? "Copied!" : "Copy"}
   </button>

ui-svelte/src/routes/Requests.svelte (2)

32-42: Consider surfacing fetch errors to the user.

The error is logged to console but the user sees no indication that the detail fetch failed. For a debugging tool, this may cause confusion if the detail panel appears empty without explanation.

💡 Optional: Add error state feedback

 let isLoadingDetail = $state(false);
+let detailError = $state<string | null>(null);
 ...
 async function viewDetail(req: RequestLog) {
   selectedId = req.id;
   isLoadingDetail = true;
+  detailError = null;
   try {
     detailedRequest = await getRequestDetail(req.id);
   } catch (err) {
     console.error(err);
+    detailError = "Failed to load request details";
   } finally {
     isLoadingDetail = false;
   }
 }

450-484: Array tool arguments will display with numeric indices.

The typeof parsedArgs === 'object' check (line 457) includes arrays. If tool arguments are an array, Object.entries will show indices like "0", "1" as argument names. This may be confusing but won't break functionality.

💡 Optional: Add array check for cleaner display

-{#if parsedArgs && typeof parsedArgs === 'object'}
+{#if parsedArgs && typeof parsedArgs === 'object' && !Array.isArray(parsedArgs)}
   
+{:else if Array.isArray(parsedArgs)}
+  <JsonView content={JSON.stringify(parsedArgs)} />

coderabbitai · 2026-02-01T21:39:26Z

proxy/proxymanager_api.go

+func (pm *ProxyManager) apiGetRequests(c *gin.Context) {
+	entries := pm.requestMonitor.GetEntries()
+	// Strip bodies for list view
+	for i := range entries {
+		entries[i].RequestBody = ""
+		entries[i].ResponseBody = ""
+	}
+	c.JSON(http.StatusOK, entries)
+}


⚠️ Potential issue | 🟠 Major

Same mutation issue — modifying returned slice elements.

Similar to the SSE initial sync, this modifies entries[i] directly. If GetEntries() returns references, this corrupts the stored data.

Suggested fix

 func (pm *ProxyManager) apiGetRequests(c *gin.Context) {
 	entries := pm.requestMonitor.GetEntries()
-	// Strip bodies for list view
-	for i := range entries {
-		entries[i].RequestBody = ""
-		entries[i].ResponseBody = ""
-	}
-	c.JSON(http.StatusOK, entries)
+	// Strip bodies for list view - create copies to avoid mutating stored data
+	stripped := make([]RequestEntry, len(entries))
+	for i, e := range entries {
+		stripped[i] = e
+		stripped[i].RequestBody = ""
+		stripped[i].ResponseBody = ""
+	}
+	c.JSON(http.StatusOK, stripped)
 }


🤖 Prompt for AI Agents

In `@proxy/proxymanager_api.go` around lines 257 - 265, apiGetRequests is mutating the entries returned by pm.requestMonitor.GetEntries() which may be references to stored data; instead create a new slice of copied entries and clear RequestBody/ResponseBody on the copies so the original data isn't modified. Locate apiGetRequests and GetEntries usage, allocate a new slice with the same length, copy each entry (e.g., by value or deep copy), set the RequestBody and ResponseBody on the copy to empty, and return the new slice in c.JSON to avoid corrupting the stored entries.
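
A self-contained sketch of the copy-then-strip idea, to show that the stored entries keep their bodies when the list view works on copies. `RequestEntry` here is a cut-down, hypothetical version of the real struct, and `stripBodies` and `demo` are illustrative helpers, not functions from the PR.

```go
package main

import "fmt"

// RequestEntry is a cut-down, hypothetical version of the struct in the PR.
type RequestEntry struct {
	ID           int
	Path         string
	RequestBody  string
	ResponseBody string
}

// stripBodies returns a new slice whose elements are copies with the bodies
// cleared. The input slice and its elements are left untouched.
func stripBodies(entries []RequestEntry) []RequestEntry {
	stripped := make([]RequestEntry, len(entries))
	for i, e := range entries { // e is already a copy of entries[i]
		e.RequestBody = ""
		e.ResponseBody = ""
		stripped[i] = e
	}
	return stripped
}

// demo returns the body seen by the list view and the body still in storage.
func demo() (listBody, storedBody string) {
	stored := []RequestEntry{{
		ID:           0,
		Path:         "/v1/chat/completions",
		RequestBody:  `{"a":1}`,
		ResponseBody: "ok",
	}}
	list := stripBodies(stored)
	return list[0].RequestBody, stored[0].RequestBody
}

func main() {
	listBody, storedBody := demo()
	fmt.Printf("list=%q stored=%q\n", listBody, storedBody)
}
```

Copying by value is enough here because the fields are strings; an entry holding slices or maps would need a deeper copy.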

mostlygeek · 2026-02-03T08:46:16Z

Hi,

Thanks for submitting this PR. After building a few llm traffic capture tools I want this functionality to be independent of llama-swap.

The main reason is so there is more room for it to develop its own unique feature set. That would be better for llama-swap and the inspector tool.

h3po · 2026-02-03T16:13:23Z

would you maybe consider exposing the request-response streams via an api, so one does not have to put another proxy in front of llama-swap?

i'm also looking for a logging solution and came to the conclusion that it would be easiest to contribute it to llama-swap, landing me here. there is existing standalone software for this, for example llm-proxy - but it would require node, which i could do without... other solutions are heavy and need other supporting infrastructure.

mostlygeek · 2026-02-03T17:02:28Z

That's a good suggestion to expose the data via some sort of API.

@h3po what is the use case you have for req/resp in llama-swap?

h3po · 2026-02-03T17:48:12Z

i use llama-swap a lot for quickly trying out different models/sampling configs/quantizations and also frontend software like rag databases, chat ui etc. it would be useful to gather logs for debugging and auditing on the llama-swap side instead of coming up with a way to log each and every client software separately.
i just spent a few hours trying to coax the logs out of a backend llama-server --log-verbose itself but with streaming responses it's hard to correlate the response with the request, since the full response text is never gathered/logged by the http server thread itself (just some other jinja parsing related outputs). apart from patching llama.cpp, the proxy that handles the client connection would have the easiest way to accumulate the response tokens i think.

mostlygeek · 2026-02-03T18:16:56Z

Thanks that's helpful context. I think perhaps there is a good middle ground here. A lightweight UI similar to what @khimaros created and also an API/hooks/plugin system for a deeper inspection. What is llama-swap anymore!? :D

khimaros · 2026-02-03T19:01:39Z

this is my use case exactly as well. I've been running this PR on my Strix Halo and have found it incredibly useful for understanding how different clients behave.

@mostlygeek if there is anything you need from me to make this more attractive for merge, please let me know. happy to iterate on design or technical approach.

khimaros · 2026-02-03T19:04:34Z

FWIW, i did try out some other proxies like Bifrost and LiteLLM but it's quite annoying maintaining two identical sources of truth for model lists (with a lot of clicking around a web admin interface). useful if also working with remote models but not helpful for my case.

h3po · 2026-02-04T09:12:55Z

@khimaros thanks for implementing this. i'll suggest adding a filter for the request path; my log is full of /metrics requests because i use llama-server with --metrics (https://gist.github.com/h3po/f7703e7cc08cf7151b58820eaeccfbd9). also it would be nice if you would drop large components like type image_url from the body before the 1MB check
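
One way a path filter like this could look, as a sketch only: `shouldRecord` and the ignore list are hypothetical and not part of this PR. A path is skipped when it equals an ignored entry or sits below it.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRecord reports whether a request path should be logged.
// ignored is a hypothetical, user-supplied list such as []string{"/metrics"}.
func shouldRecord(path string, ignored []string) bool {
	for _, p := range ignored {
		if path == p || strings.HasPrefix(path, p+"/") {
			return false
		}
	}
	return true
}

func main() {
	ignored := []string{"/metrics"}
	for _, p := range []string{"/metrics", "/v1/chat/completions"} {
		fmt.Println(p, shouldRecord(p, ignored))
	}
}
```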

mostlygeek · 2026-02-05T06:39:37Z

@khimaros thanks for opening up this issue. After reviewing this PR, I think integrating capture data into the metrics monitor makes a lot of sense. I opened #508 and would appreciate your feedback if you have any.

Add saving request and response headers and bodies that go through llama-swap in memory.

- captureBuffer added to configuration. Captures are enabled by default.
- 5MB of memory is allocated for req/response captures in a ring buffer. Setting captureBuffer to 0 will disable captures.
- UI elements to view captured data added to Activity page. Includes some QOL features like json formatting and recombining SSE chat streams
- capture saving is done at the byte level and has minimal impact on llama-swap performance

Fixes #464 Ref #503
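
The byte-budgeted ring buffer mentioned in that description can be pictured roughly as follows. This is a guess at the general technique, not the implementation in #508: `capture`, `captureRing` and their methods are hypothetical, the sketch is not safe for concurrent use, and a budget of 0 simply rejects every capture.

```go
package main

import "fmt"

// capture is a hypothetical record of one request/response pair.
type capture struct {
	ID   int
	Data []byte
}

// captureRing keeps the newest captures whose total size fits the budget.
type captureRing struct {
	budget int // maximum total bytes; 0 disables capturing
	used   int
	items  []capture // oldest first
}

func newCaptureRing(budget int) *captureRing {
	return &captureRing{budget: budget}
}

// Add stores c, evicting the oldest captures until it fits. It returns false
// when capturing is disabled or c alone is larger than the whole budget.
func (r *captureRing) Add(c capture) bool {
	if r.budget <= 0 || len(c.Data) > r.budget {
		return false
	}
	for r.used+len(c.Data) > r.budget {
		r.used -= len(r.items[0].Data)
		r.items = r.items[1:]
	}
	r.items = append(r.items, c)
	r.used += len(c.Data)
	return true
}

// IDs lists the retained capture IDs, oldest first.
func (r *captureRing) IDs() []int {
	ids := make([]int, 0, len(r.items))
	for _, c := range r.items {
		ids = append(ids, c.ID)
	}
	return ids
}

func main() {
	ring := newCaptureRing(10) // tiny budget to make eviction visible
	for id := 1; id <= 3; id++ {
		ring.Add(capture{ID: id, Data: []byte("abcd")})
	}
	fmt.Println(ring.IDs(), ring.used)
}
```

Bounding by bytes rather than by entry count keeps memory predictable when a few requests carry very large bodies.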

mostlygeek · 2026-02-07T23:40:32Z

closed by #508

khimaros · 2026-02-08T14:47:11Z

awesome! a little harder to read than my PoC and doesn't show requests until they complete, but it will get the job done and save me from maintaining a fork 🤓

coderabbitai bot reviewed Feb 1, 2026


khimaros force-pushed the main branch from 8c4d351 to b776a41 · February 1, 2026 21:35

coderabbitai bot reviewed Feb 1, 2026


in memory request logging and observability

5c64989

khimaros force-pushed the main branch from b776a41 to 5c64989 · February 2, 2026 02:05

mostlygeek mentioned this pull request Feb 5, 2026

proxy,ui-svelte: add request/response capturing #508

Merged

mostlygeek closed this Feb 7, 2026
