Skip to content

in memory request logging and observability#503

Closed
khimaros wants to merge 1 commit intomostlygeek:mainfrom
khimaros:main
Closed

in memory request logging and observability#503
khimaros wants to merge 1 commit intomostlygeek:mainfrom
khimaros:main

Conversation

@khimaros
Copy link

@khimaros khimaros commented Feb 1, 2026

i tried to keep this as isolated as possible and minimize impact to the rest of the codebase. sorry i didn't run gofmt in a separate changelist before submitting this one so there is a bit of whitespace noise. fixes #464

Summary by CodeRabbit

  • New Features
    • Request monitoring: capture per-request metadata, durations, and streaming response bodies with real-time events.
    • New API endpoints to list requests and fetch request details; SSE now emits request events.
    • New "Requests" UI page and header nav item with list/detail panels, streaming and full-JSON views.
    • JsonView component for pretty-printed JSON display.
    • Frontend types and stores updated to support request logs and request detail fetching.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Feb 1, 2026

Walkthrough

Adds end-to-end request recording and streaming: new request monitor and events, captures request/response bodies in proxy handlers and metrics, exposes GET /api/requests and GET /api/requests/:id, and adds a Svelte Requests UI, types, and store integration.

Changes

Cohort / File(s) Summary
Event ID
proxy/events.go
Adds RequestEventID = 0x07.
Request monitor
proxy/request_monitor.go
New thread-safe requestMonitor, public RequestEntry/RequestEvent types, Add/Update/AppendResponse/Get methods, buffering and throttled event emission.
Proxy integration & metrics
proxy/proxymanager.go, proxy/metrics_monitor.go
Adds ProxyManager.requestMonitor field and recordRequest method; proxy handlers now create per-request recorder to capture small request bodies and stream responses; responseBodyCopier gains onWrite func([]byte) and reuse logic.
API & SSE
proxy/proxymanager_api.go
Adds GET /api/requests and GET /api/requests/:id, new SSE message type request, initial sync includes recent requests (bodies stripped), SSE buffer increased.
Frontend types & store
ui-svelte/src/lib/types.ts, ui-svelte/src/stores/api.ts
Adds RequestLog type and request event union member; relaxes ChatMessage typing; adds requests writable store and getRequestDetail helper; SSE handling updated to merge request events.
Frontend UI & components
ui-svelte/src/routes/Requests.svelte, ui-svelte/src/components/JsonView.svelte, ui-svelte/src/components/playground/ChatMessage.svelte
Adds Requests page (list/detail, streaming and JSON parsing, two-panel UI), JsonView component, and updates ChatMessage.svelte to use shared ChatMessage type.
Routing & header
ui-svelte/src/App.svelte, ui-svelte/src/components/Header.svelte
Registers /requests route and adds "Requests" navigation item in header.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • mostlygeek
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'in memory request logging and observability' directly reflects the main purpose of the changeset: implementing request logging and observability features.
Linked Issues check ✅ Passed The PR implements request body logging and observability to address issue #464's need to inspect HTTP requests and debug peer integration issues.
Out of Scope Changes check ✅ Passed Changes include request monitoring, API endpoints, UI components for viewing requests, and type updates that support the core objective of request logging observability.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@khimaros
Copy link
Author

khimaros commented Feb 1, 2026

provides a user friendly interface for browsing through different sections of a request (system prompt, user prompt, tools) and response (thoughts, tool calls, and answer)

image

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@proxy/proxymanager.go`:
- Around line 739-740: The recorded request body can be arbitrarily large;
truncate the body to a safe cap (e.g. 1 MiB) before passing it to
pm.recordRequest to avoid unbounded in-memory growth—modify the call site around
recorder, done := pm.recordRequest(c, modelID, string(bodyBytes)) to pass a
capped/truncated string (or add truncation inside pm.recordRequest) so only the
first 1<<20 bytes are retained, and keep the recorder/done flow unchanged;
reference pm.recordRequest, recorder, done and the bodyBytes variable when
making the change.
- Around line 603-612: The read-failure branch must stop proxying and return an
error to the client: when io.ReadAll(c.Request.Body) returns an err, log the
error with pm.proxyLogger.Errorf and then abort the request handling (do not
continue to upstream) by sending an HTTP error response (e.g., 500) and
returning from the handler; do not attempt to proxy with a partially consumed
c.Request.Body or set requestBody in that case. Update the block around
io.ReadAll, c.Request.Body, requestBody and the surrounding handler logic to
perform this early return on error (use the framework's abort/return method such
as c.AbortWithStatus/AbortWithStatusJSON or equivalent).
- Around line 1055-1078: In recordRequest, the responseBodyCopier created by
newBodyCopier is never assigned to the Gin context writer so error responses
(e.g., those sent via sendErrorResponse) bypass it; fix by assigning the
recorder (responseBodyCopier) to c.Writer immediately after creation so it
implements gin.ResponseWriter and captures all writes (ensure recorder.onWrite
remains set and cleanup restores original writer if needed).

In `@ui-svelte/src/routes/Requests.svelte`:
- Around line 23-30: The current merge in selectedRequest ({ ...detailedRequest,
...fromList }) allows empty list fields to overwrite fetched detail bodies;
instead, keep detailedRequest as the source of truth for bodies and only pull
live status fields from the list. Update the selectedRequest derivation to merge
so detailedRequest properties win for request_body/response_body (e.g., merge
detailedRequest last) and, if fromList exists, copy only the live status fields
(like status, statusText or whatever live fields your app uses) from fromList
into the final object; reference the selectedRequest variable and the
detailedRequest/fromList identifiers when making this change.
- Around line 257-277: The clickable <tr> currently uses onclick with viewDetail
and blocks keyboard users; remove the row-level onclick and instead render a
native interactive element (a <button> or <a>) inside a <td> for each row entry
(e.g., wrap the row content in a full-width button inside the first or a
dedicated <td>), keep using viewDetail(req) as the click handler on that
element, style it with CSS (display:block; width:100%; padding:inherit) so it
visually spans the row, and preserve selection logic using selectedId and ARIA
attributes on the button (e.g., aria-pressed or aria-current) while keeping
existing helpers like formatRelativeTime and formatDuration unchanged.
🧹 Nitpick comments (3)
ui-svelte/src/components/JsonView.svelte (1)

16-22: Consider adding clipboard error handling.

The navigator.clipboard.writeText call can fail (e.g., in non-HTTPS contexts or when clipboard permissions are denied). Consider adding user feedback for success/failure.

♻️ Optional: Add clipboard feedback
+<script lang="ts">
+  let { content = "" } = $props();
+  let copyStatus = $state<"idle" | "copied" | "error">("idle");
+
+  let formattedContent = $derived.by(() => {
+    try {
+      const obj = JSON.parse(content);
+      return JSON.stringify(obj, null, 2);
+    } catch (e) {
+      return content;
+    }
+  });
+
+  async function copyToClipboard() {
+    try {
+      await navigator.clipboard.writeText(formattedContent);
+      copyStatus = "copied";
+      setTimeout(() => copyStatus = "idle", 1500);
+    } catch {
+      copyStatus = "error";
+      setTimeout(() => copyStatus = "idle", 1500);
+    }
+  }
+</script>

Then update the button:

   <button
     class="absolute top-2 right-2 p-1 bg-white/10 hover:bg-white/20 rounded text-xs opacity-0 group-hover:opacity-100 transition-opacity"
-    onclick={() => navigator.clipboard.writeText(formattedContent)}
+    onclick={copyToClipboard}
     title="Copy to clipboard"
   >
-    Copy
+    {copyStatus === "copied" ? "Copied!" : copyStatus === "error" ? "Failed" : "Copy"}
   </button>
ui-svelte/src/stores/api.ts (1)

12-12: Consider adding a size limit to the requests store.

Unlike proxyLogs which has LOG_LENGTH_LIMIT, the requests array can grow unbounded during long sessions. If the backend sends many request events, this could consume significant browser memory over time.

♻️ Suggested: Add requests limit
 const LOG_LENGTH_LIMIT = 1024 * 100; /* 100KB of log data */
+const MAX_REQUESTS = 1000; /* Maximum number of requests to keep in memory */

 // Stores
 export const models = writable<Model[]>([]);

Then in the request handler:

           case "request": {
             const req = JSON.parse(message.data) as RequestLog;
             requests.update((prev) => {
               const index = prev.findIndex((r) => r.id === req.id);
               if (index === -1) {
-                return [req, ...prev];
+                const updated = [req, ...prev];
+                return updated.length > MAX_REQUESTS ? updated.slice(0, MAX_REQUESTS) : updated;
               } else {
                 const updated = [...prev];
                 updated[index] = req;
                 return updated;
               }
             });
             break;
           }
proxy/request_monitor.go (1)

66-84: Consider clearing respBuf after completion to reduce memory retention.
Once ResponseBody is set, keeping the buffer doubles memory for large responses.

♻️ Suggested tweak
 			e.Pending = false
 			event.Emit(RequestEvent{Entry: *e})
+			e.respBuf = bytes.Buffer{}
 			return

Comment on lines +603 to +612
var requestBody string
if c.Request.ContentLength > 0 && c.Request.ContentLength < 1024*1024 { // Only capture small bodies
bodyBytes, err := io.ReadAll(c.Request.Body)
if err != nil {
pm.proxyLogger.Errorf("Error reading request body for recording: %v", err)
} else {
c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
requestBody = string(bodyBytes)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t continue proxying after a failed body read.
If io.ReadAll fails, the request body may be partially consumed, and the upstream call will see a truncated or empty body. Bail out early and return an error response.

🔧 Suggested fix
-		bodyBytes, err := io.ReadAll(c.Request.Body)
-		if err != nil {
-			pm.proxyLogger.Errorf("Error reading request body for recording: %v", err)
-		} else {
-			c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
-			requestBody = string(bodyBytes)
-		}
+		bodyBytes, err := io.ReadAll(c.Request.Body)
+		if err != nil {
+			pm.sendErrorResponse(c, http.StatusBadRequest, "failed to read request body")
+			pm.proxyLogger.Errorf("Error reading request body for recording: %v", err)
+			return
+		}
+		c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
+		requestBody = string(bodyBytes)
🤖 Prompt for AI Agents
In `@proxy/proxymanager.go` around lines 603 - 612, The read-failure branch must
stop proxying and return an error to the client: when io.ReadAll(c.Request.Body)
returns an err, log the error with pm.proxyLogger.Errorf and then abort the
request handling (do not continue to upstream) by sending an HTTP error response
(e.g., 500) and returning from the handler; do not attempt to proxy with a
partially consumed c.Request.Body or set requestBody in that case. Update the
block around io.ReadAll, c.Request.Body, requestBody and the surrounding handler
logic to perform this early return on error (use the framework's abort/return
method such as c.AbortWithStatus/AbortWithStatusJSON or equivalent).

Comment on lines +1055 to +1078
func (pm *ProxyManager) recordRequest(c *gin.Context, modelID string, requestBody string) (*responseBodyCopier, func()) {
startTime := time.Now()
requestID := pm.requestMonitor.Add(&RequestEntry{
Timestamp: startTime,
Method: c.Request.Method,
Path: c.Request.URL.Path,
Model: modelID,
RequestBody: requestBody,
})

recorder := newBodyCopier(c.Writer)
recorder.onWrite = func(b []byte) {
pm.requestMonitor.AppendResponse(requestID, string(b))
}

return recorder, func() {
duration := time.Since(startTime)
respBody := ""
isStreaming := strings.Contains(recorder.Header().Get("Content-Type"), "text/event-stream")
if !isStreaming {
respBody = recorder.body.String()
}
pm.requestMonitor.Update(requestID, recorder.Status(), duration, respBody)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's locate the responseBodyCopier implementation
cd /tmp/repo 2>/dev/null || cd . 
git ls-files | head -20
rg -l "responseBodyCopier|newBodyCopier" --type go

Repository: mostlygeek/llama-swap

Length of output: 686


🏁 Script executed:

#!/bin/bash
# Find the responseBodyCopier type definition and newBodyCopier function
rg "type responseBodyCopier|newBodyCopier" -A 5 --type go

Repository: mostlygeek/llama-swap

Length of output: 2756


🏁 Script executed:

#!/bin/bash
# Find all methods implemented by responseBodyCopier
rg "func \(.*\*?responseBodyCopier\)" --type go

Repository: mostlygeek/llama-swap

Length of output: 388


🏁 Script executed:

#!/bin/bash
# Check how sendErrorResponse is implemented
rg "func.*sendErrorResponse" -A 20 --type go

Repository: mostlygeek/llama-swap

Length of output: 1185


🏁 Script executed:

#!/bin/bash
# Find where recordRequest is called
rg "recordRequest" -B 3 -A 10 --type go

Repository: mostlygeek/llama-swap

Length of output: 4222


🏁 Script executed:

#!/bin/bash
# Find the handler/middleware where recordRequest is used
rg "func.*gin.HandlerFunc|func.*gin.Context" -B 2 -A 25 --type go | head -200

Repository: mostlygeek/llama-swap

Length of output: 10675


🏁 Script executed:

#!/bin/bash
# Check if wrapHandler or other handlers assign c.Writer = recorder
rg "c.Writer\s*=" -B 5 -A 5 --type go proxy/

Repository: mostlygeek/llama-swap

Length of output: 47


🏁 Script executed:

#!/bin/bash
# Check the wrapHandler signature and implementation
rg "func.*wrapHandler" -A 30 --type go

Repository: mostlygeek/llama-swap

Length of output: 1814


Assign the recorder to c.Writer to capture error responses.

recordRequest creates a recorder but doesn't attach it to c.Writer, so error responses written via sendErrorResponse bypass the recorder entirely. This leaves request logs with empty bodies on error paths.

The fix is valid: responseBodyCopier embeds gin.ResponseWriter, satisfying the interface for assignment to c.Writer.

Suggested fix
 recorder := newBodyCopier(c.Writer)
+c.Writer = recorder
 recorder.onWrite = func(b []byte) {
🤖 Prompt for AI Agents
In `@proxy/proxymanager.go` around lines 1055 - 1078, In recordRequest, the
responseBodyCopier created by newBodyCopier is never assigned to the Gin context
writer so error responses (e.g., those sent via sendErrorResponse) bypass it;
fix by assigning the recorder (responseBodyCopier) to c.Writer immediately after
creation so it implements gin.ResponseWriter and captures all writes (ensure
recorder.onWrite remains set and cleanup restores original writer if needed).

Comment on lines +257 to +277
{#each sortedRequests as req (req.id)}
<tr
class="whitespace-nowrap text-sm cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-800 {selectedId === req.id ? 'bg-gray-100 dark:bg-gray-800' : ''}"
onclick={() => viewDetail(req)}
>
<td class="px-4 py-3">{req.id + 1}</td>
<td class="px-4 py-3">{formatRelativeTime(req.timestamp)}</td>
<td class="px-4 py-3 font-mono">{req.method}</td>
<td class="px-4 py-3 font-mono text-xs">{req.path}</td>
<td class="px-4 py-3">{req.model}</td>
<td class="px-4 py-3">
{#if req.pending}
<span class="text-yellow-500">pending</span>
{:else}
<span class={req.status >= 200 && req.status < 300 ? 'text-green-500' : 'text-red-500'}>
{req.status}
</span>
{/if}
</td>
<td class="px-4 py-3 text-gray-500">{req.pending ? "-" : formatDuration(req.duration)}</td>
</tr>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Svelte event handler syntax on:keydown keyboard events

💡 Result:

In Svelte, keyboard events use the normal DOM event directive syntax:

<input on:keydown={handleKeydown} />

The handler receives a KeyboardEvent, so you typically branch on event.key:

<script>
  function handleKeydown(event) {
    if (event.key === 'Enter') {
      // ...
    }
  }
</script>

<input on:keydown={handleKeydown} />

[1], [2]

You can also use Svelte’s event modifiers with | (these affect event behavior, not which key was pressed):

<input on:keydown|preventDefault|stopPropagation={handleKeydown} />

[1], [3]

If you want a global key handler, use <svelte:window>:

<svelte:window on:keydown={handleKeydown} />

[2]

Sources: [1] Svelte docs — Element directives (on:eventname, modifiers). [2] Svelte docs — <svelte:window> example with on:keydown and event.key. [3] Svelte docs — on: modifier list.


🌐 Web query:

Svelte accessible clickable table rows keyboard accessibility best practices

💡 Result:

Best practice (recommended): keep the <table> semantic and put a real interactive element in a cell

  • Don’t make the <tr> itself the “button/link”. Instead, put an actual <a> (navigation) or <button> (action) inside a <td> and style it to look like the whole row is clickable. Native links/buttons already have correct keyboard behavior (Enter; and Space for buttons). Svelte’s a11y guidance explicitly recommends this over adding click handlers to non-interactive elements. [1][2]

Svelte example (row navigates):

<tr>
  <td class="row">
    <a class="rowLink" href={`/users/${user.id}`}>
      {user.name}
    </a>
  </td>
  <td>{user.email}</td>
</tr>

<style>
  td.row { padding: 0; }
  a.rowLink { display: block; padding: .75rem 1rem; } /* makes the hit area “row-like” */
  tr:hover a.rowLink { text-decoration: underline; }
  a.rowLink:focus-visible { outline: 2px solid currentColor; outline-offset: 2px; }
</style>

If you must attach on:click to something non-interactive

Svelte will warn unless you add equivalent keyboard support and make it focusable (typically tabindex="0"), but it still advises using native <button>/<a> where possible. [1][2]

  • Add on:keydown (or on:keyup) for Enter (and Space if it’s acting like a button).
  • Avoid positive tabindex values (they break expected tab order). [1]

Why “clickable <tr>” is usually a bad idea

  • ARIA “table/row” patterns are structural and don’t define row-level keyboard interaction (“Keyboard Interaction: not applicable / none”). So if you turn rows into controls, you’re inventing interactions that assistive tech won’t expect. [3][4]

Sources: Svelte accessibility warnings (keyboard handlers + prefer <button>/<a>) [1][2], WAI-ARIA Authoring Practices / table pattern [3], MDN ARIA table/row roles (no keyboard interaction) [4].


Clickable rows lack keyboard accessibility. Users relying on keyboards cannot open a request, which blocks this workflow.

The suggested fix has two issues:

  1. Syntax error: Svelte uses on:keydown (not onkeydown). The directive syntax is on:eventname.
  2. Anti-pattern: Making <tr> act like a clickable button violates ARIA table semantics and accessibility best practices. ARIA table/row roles don't define keyboard interaction patterns, so assistive technology won't expect row-level interactions.

Recommended approach: Place a native <button> or <a> inside a <td>, and use CSS to style it to visually span the row (e.g., display: block; padding: inherit;). This gives keyboard support natively and maintains proper semantic HTML.

🤖 Prompt for AI Agents
In `@ui-svelte/src/routes/Requests.svelte` around lines 257 - 277, The clickable
<tr> currently uses onclick with viewDetail and blocks keyboard users; remove
the row-level onclick and instead render a native interactive element (a
<button> or <a>) inside a <td> for each row entry (e.g., wrap the row content in
a full-width button inside the first or a dedicated <td>), keep using
viewDetail(req) as the click handler on that element, style it with CSS
(display:block; width:100%; padding:inherit) so it visually spans the row, and
preserve selection logic using selectedId and ARIA attributes on the button
(e.g., aria-pressed or aria-current) while keeping existing helpers like
formatRelativeTime and formatDuration unchanged.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@proxy/proxymanager_api.go`:
- Around line 257-265: apiGetRequests is mutating the entries returned by
pm.requestMonitor.GetEntries() which may be references to stored data; instead
create a new slice of copied entries and clear RequestBody/ResponseBody on the
copies so the original data isn't modified. Locate apiGetRequests and GetEntries
usage, allocate a new slice with the same length, copy each entry (e.g., by
value or deep copy), set the RequestBody and ResponseBody on the copy to empty,
and return the new slice in c.JSON to avoid corrupting the stored entries.
🧹 Nitpick comments (4)
ui-svelte/src/lib/types.ts (1)

69-74: Type loosening: role is now string instead of a union.

Changing role from "user" | "assistant" | "system" to string removes compile-time safety for role values. If this is intentional to support additional roles (e.g., "tool"), consider documenting or using a broader union type for clarity.

ui-svelte/src/components/JsonView.svelte (1)

16-22: Add error handling for clipboard API.

The copy button directly calls navigator.clipboard.writeText() without error handling. This can throw if the clipboard API is unavailable (non-HTTPS contexts) or if the user denies permission. The ChatMessage.svelte component in this same codebase has a more robust implementation with fallback.

Suggested fix
+<script lang="ts">
+  let { content = "" } = $props();
+  let copied = $state(false);
+
+  let formattedContent = $derived.by(() => {
+    try {
+      const obj = JSON.parse(content);
+      return JSON.stringify(obj, null, 2);
+    } catch (e) {
+      return content;
+    }
+  });
+
+  async function copyToClipboard() {
+    try {
+      await navigator.clipboard.writeText(formattedContent);
+      copied = true;
+      setTimeout(() => (copied = false), 2000);
+    } catch (err) {
+      console.error("Failed to copy:", err);
+    }
+  }
+</script>

Then update the button:

   <button
     class="absolute top-2 right-2 p-1 bg-white/10 hover:bg-white/20 rounded text-xs opacity-0 group-hover:opacity-100 transition-opacity"
-    onclick={() => navigator.clipboard.writeText(formattedContent)}
+    onclick={copyToClipboard}
     title="Copy to clipboard"
   >
-    Copy
+    {copied ? "Copied!" : "Copy"}
   </button>
ui-svelte/src/routes/Requests.svelte (2)

32-42: Consider surfacing fetch errors to the user.

The error is logged to console but the user sees no indication that the detail fetch failed. For a debugging tool, this may cause confusion if the detail panel appears empty without explanation.

💡 Optional: Add error state feedback
 let isLoadingDetail = $state(false);
+let detailError = $state<string | null>(null);
 ...
 async function viewDetail(req: RequestLog) {
   selectedId = req.id;
   isLoadingDetail = true;
+  detailError = null;
   try {
     detailedRequest = await getRequestDetail(req.id);
   } catch (err) {
     console.error(err);
+    detailError = "Failed to load request details";
   } finally {
     isLoadingDetail = false;
   }
 }

450-484: Array tool arguments will display with numeric indices.

The typeof parsedArgs === 'object' check (line 457) includes arrays. If tool arguments are an array, Object.entries will show indices like "0", "1" as argument names. This may be confusing but won't break functionality.

💡 Optional: Add array check for cleaner display
-{`#if` parsedArgs && typeof parsedArgs === 'object'}
+{`#if` parsedArgs && typeof parsedArgs === 'object' && !Array.isArray(parsedArgs)}
   <!-- table rendering -->
+{:else if Array.isArray(parsedArgs)}
+  <JsonView content={JSON.stringify(parsedArgs)} />

Comment on lines +257 to +265
func (pm *ProxyManager) apiGetRequests(c *gin.Context) {
entries := pm.requestMonitor.GetEntries()
// Strip bodies for list view
for i := range entries {
entries[i].RequestBody = ""
entries[i].ResponseBody = ""
}
c.JSON(http.StatusOK, entries)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Same mutation issue — modifying returned slice elements.

Similar to the SSE initial sync, this modifies entries[i] directly. If GetEntries() returns references, this corrupts the stored data.

Suggested fix
 func (pm *ProxyManager) apiGetRequests(c *gin.Context) {
 	entries := pm.requestMonitor.GetEntries()
-	// Strip bodies for list view
-	for i := range entries {
-		entries[i].RequestBody = ""
-		entries[i].ResponseBody = ""
-	}
-	c.JSON(http.StatusOK, entries)
+	// Strip bodies for list view - create copies to avoid mutating stored data
+	stripped := make([]RequestEntry, len(entries))
+	for i, e := range entries {
+		stripped[i] = e
+		stripped[i].RequestBody = ""
+		stripped[i].ResponseBody = ""
+	}
+	c.JSON(http.StatusOK, stripped)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (pm *ProxyManager) apiGetRequests(c *gin.Context) {
entries := pm.requestMonitor.GetEntries()
// Strip bodies for list view
for i := range entries {
entries[i].RequestBody = ""
entries[i].ResponseBody = ""
}
c.JSON(http.StatusOK, entries)
}
func (pm *ProxyManager) apiGetRequests(c *gin.Context) {
entries := pm.requestMonitor.GetEntries()
// Strip bodies for list view - create copies to avoid mutating stored data
stripped := make([]RequestEntry, len(entries))
for i, e := range entries {
stripped[i] = e
stripped[i].RequestBody = ""
stripped[i].ResponseBody = ""
}
c.JSON(http.StatusOK, stripped)
}
🤖 Prompt for AI Agents
In `@proxy/proxymanager_api.go` around lines 257 - 265, apiGetRequests is mutating
the entries returned by pm.requestMonitor.GetEntries() which may be references
to stored data; instead create a new slice of copied entries and clear
RequestBody/ResponseBody on the copies so the original data isn't modified.
Locate apiGetRequests and GetEntries usage, allocate a new slice with the same
length, copy each entry (e.g., by value or deep copy), set the RequestBody and
ResponseBody on the copy to empty, and return the new slice in c.JSON to avoid
corrupting the stored entries.

@mostlygeek
Copy link
Owner

Hi,

Thanks for submitting this PR. After building a few llm traffic capture tools I want this functionality to be independent of llama-swap.

The main reason is so there is a more room for it to develop its own unique feature set. That would be better for llama-swap and the inspector tool.

@h3po
Copy link

h3po commented Feb 3, 2026

would you maybe consider exposing the request-response streams via an api, so one does not have to put another proxy in front of llama-swap?

i'm also looking for a logging solution and came to the conclusion that it would be easiest to contribute it to llama-swap, landing me here. there is existing standalone software for this, for example llm-proxy - but it would require node, which i could do without... other solutions are heavy and need other supporting infrastructure.

@mostlygeek
Copy link
Owner

That's a good suggestion to expose the data via some sort of API.

@h3po what is the use case you have for req/resp in llama-swap?

@h3po
Copy link

h3po commented Feb 3, 2026

i use llama-swap a lot for quickly trying out different models/sampling configs/quantizations and also frontend software like rag databases, chat ui etc. it would be useful to gather logs for debugging and auditing on the llama-swap side instead of coming up with a way to log each and every client software separately.
i just spent a few hours trying to coax the logs out of a backend llama-server --log-verbose itself but with streaming responses it's hard to correlate the response with the request, since the full response text is never gathered/logged by the http server thread itself (just some other jinja parsing related outputs). apart from patching llama.cpp, the proxy that handles the client connection would have the easiest way to accumulate the response tokens i think.

@mostlygeek
Copy link
Owner

Thanks that's helpful context. I think perhaps there is a good middle ground here. A lightweight UI similar to what @khimaros created and also an API/hooks/plugin system for a deeper inspection. What is llama-swap anymore!? :D

@khimaros
Copy link
Author

khimaros commented Feb 3, 2026

this is my use case exactly as well. I've been running this PR on my Strix Halo and have found it incredibly useful for understanding how different clients behave.

@mostlygeek if there is anything you need from me to make this more attractive for merge, please let me know. happy to iterate on design or technical approach.

@khimaros
Copy link
Author

khimaros commented Feb 3, 2026

FWIW, i did try out some other proxies like Bifrost and LiteLLM but it's quite annoying maintaining two identical sources of truth for model lists (with a lot of clicking around a web admin interface). useful if also working with remote models but not helpful for my case.

@h3po
Copy link

h3po commented Feb 4, 2026

@khimaros thanks for implementing this. i'll suggest adding a filter for the request path; my log is full of /metrics requests because i use llama-server with --metrics (https://gist.github.com/h3po/f7703e7cc08cf7151b58820eaeccfbd9). also it would be nice if you would drop large components like type image_url from the body before the 1MB check

@mostlygeek
Copy link
Owner

@khimaros thanks for opening up this issue. After reviewing this PR, I think integrating capture data into the metrics monitor makes a lot of sense. I opened #508 and would appreciate your feedback if you have any.

mostlygeek added a commit that referenced this pull request Feb 7, 2026
Add saving request and response headers and bodies that go through
llama-swap in memory.

- captureBuffer added to configuration. Captures are enabled by default.
- 5MB of memory is allocated for req/response captures in a ring buffer.
Setting captureBuffer to 0 will disable captures.
- UI elements to view captured data added to Activity page. Includes
some
QOL features like json formatting and recombining SSE chat streams
- capture saving is done at the byte level and has minimal impact on
llama-swap performance

Fixes #464 
Ref #503
@mostlygeek
Copy link
Owner

mostlygeek commented Feb 7, 2026

closed by #508

@mostlygeek mostlygeek closed this Feb 7, 2026
@khimaros
Copy link
Author

khimaros commented Feb 8, 2026

awesome! a little harder to read than my PoC and doesn't show requests until they complete, but it will get the job done and save me from maintaining a fork 🤓

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: Request body logging

3 participants