Skip to content

merge upstream#622

Closed
overcuriousity wants to merge 46 commits intomostlygeek:mainfrom
overcuriousity:main
Closed

merge upstream#622
overcuriousity wants to merge 46 commits intomostlygeek:mainfrom
overcuriousity:main

Conversation

@overcuriousity
Copy link
Copy Markdown

No description provided.

overcuriousity and others added 30 commits January 30, 2026 15:22
Fix parseEndpointList to handle single and double quotes that are
treated as literal characters on Windows.

- Strip surrounding quotes before parsing comma-separated endpoints
- Fixes test failures on Windows CI

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
RPC health checking now runs continuously from process creation until
proxy shutdown, completely independent of whether the model is loaded,
starting, stopped, or in any other state.

- Start health checker in NewProcess when rpcHealthCheck is enabled
- Remove stopRPCHealthChecker - only stops on proxy shutdown
- Remove state checks from health checker goroutine
- Health status always reflects current RPC endpoint availability

Previously, the health checker only ran while a process was in StateReady,
causing stale health data when processes stopped. Now /v1/models always
shows accurate RPC health regardless of model state.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Work in progress on web configuration feature.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The requestTimeout feature was not working because the timeout
context was not connected to the HTTP request. When the timeout
fired, it attempted to kill the process but the reverse proxy
continued waiting for a response indefinitely.

- Use context.WithTimeout() to create a timeout context for the HTTP request
- Clone the request with the timeout context before proxying
- When timeout fires, the HTTP request is immediately cancelled
- Fix StopImmediately() to handle timeouts during model loading (StateStarting)
- Add unit test to verify timeout behavior

Before: requests would run for 60+ seconds despite requestTimeout: 20
After: requests terminate in exactly 20 seconds as configured

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add brief mention of requestTimeout feature in the customizable
features section of README.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…thcheck

Feat  conditional rpc healthcheck
Adjust RPC health check parameters to reduce false positives when endpoints
are under load and fix multiple security/correctness issues.

- increase RPC health check timeout from 500ms to 3s to handle busy servers
- decrease check interval from 30s to 10s for faster detection
- fix process_timeout_test missing context parameter
- fix config_embed to return cloned byte slice preventing mutation
- update ui dependencies: js-yaml 4.1.1, @codemirror/state 6.5.4, @codemirror/lang-yaml 6.1.2

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix editor cleanup and improve dark mode appearance with better colors,
contrast, and styling.

- Add proper editor disposal in $effect cleanup
- Update theme colors for better dark mode visibility
- Improve button styling with teal export button
- Better text contrast and subtle borders
- Refine error message styling

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Prevent stopCommand() from hanging when a timeout transitions
StateStarting to StateStopping before cmd.Start() completes.

- Close cmdWaitChan when cmd.Start() fails
- Add guard in stopCommand() to skip wait if process never started
- Add tests for hang scenarios
- Fix TestProcess_RequestTimeout to avoid calling t.Fatal from handler goroutine

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix ParseRPCEndpoints to handle Windows shlex behavior where
single-quoted strings with spaces are split into multiple arguments.

- Collect all non-flag arguments after --rpc flag
- Join them with space before parsing endpoint list
- Fixes test failure: TestParseRPCEndpoints_ValidFormats/multiple_endpoints_with_spaces_trimmed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix data race between Process.start() and Process.stopCommand() by using a cmdStarted flag instead of checking cmd.Process directly. Also fix hang when StopImmediately() is called during startup.

Changes:
- Add cmdStarted bool flag to track if Start() completed successfully
- Initialize cancelUpstream and cmdWaitChan before Start() so stopCommand() can access them during startup
- Always call cancelUpstream() in stopCommand() to cancel context, even if Start() hasn't completed
- Only wait on cmdWaitChan if cmdStarted is true to avoid hanging
- Reset cmdStarted=false when process exits
- Change cancelUpstream==nil error to debug message (expected in tests using forceState())
- Use platform-appropriate sleep command in tests (timeout on Windows, sleep on Unix)

Fixes race condition and Windows test failures.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix theme compartment sharing bug and improve error response handling.

- create separate Compartment instances for each CodeMirror editor
- update createEditor to accept compartment parameter
- improve saveConfig error handling to parse both JSON and non-JSON responses
- include status code and statusText in error messages
Fix race condition in StopImmediately where state changes between
CurrentState() and swapState() could cause stop to abort. Add retry
loop to handle ErrExpectedStateMismatch.

Fix test race condition where assert.Error was called inside goroutine.
Move error assertion to main test goroutine using buffered channel.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix getSleepCommand to use full path to timeout.exe on Windows. This avoids conflict with GNU coreutils timeout command in Git Bash environments on CI.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Separate request context from timeout monitoring context to avoid misattributing parent-imposed deadlines. Create monitoring context from context.Background() to reliably detect configured timeout, while maintaining request timeout for proper cancellation.

- requestCtx: with timeout for request cancellation
- timeoutCtx: from Background() for monitoring only
- prevents false positives from parent context deadlines

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
overcuriousity and others added 16 commits January 31, 2026 20:18
…thcheck

Feat  conditional rpc healthcheck
I/O timeout errors now don't mark RPC endpoints as unhealthy.
Timeouts are logged at debug level and health state is preserved.

- detect timeout errors using net.Error.Timeout()
- add test for timeout handling behavior

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…thcheck

proxy: ignore I/O timeout in RPC health checks
Fix cursor jumping to top after typing by preventing reactive effect
from re-running on content changes. Use untrack() to read config state
without creating reactive dependency, ensuring editor is only created
once and not destroyed/recreated on each keystroke.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
ui-svelte: fix Config editor cursor jumping on input
Resolved conflicts:
- Merged both Playground and Config features in App.svelte
- Combined dependencies from both branches in package.json
- Removed old ui/ files (replaced by ui-svelte/)
- Regenerated package-lock.json with merged dependencies
Merge latest upstream changes including:
- global TTL feature
- setParamsByID filter
- request/response capturing
- playground UI (chat, images, speech, transcription, rerank)
- incremental streaming markdown
- infill timings support
- in-flight request tracking
- smart auto-scroll in LogPanel
- copy button for code blocks
- OGG audio format support
- cuda13 architecture support

Conflict resolutions:
- proxy/process_test.go: kept context.Background() param + upstream var rename
- proxy/proxymanager_api.go: kept both config editor and captures APIs
- ui-svelte/package.json: kept both codemirror and markdown/katex deps
- ui-svelte/src/App.svelte: kept both Config and Playground routes
- ui-svelte/src/components/Header.svelte: fixed $location to $currentRoute
- ui/package*: accepted upstream deletion of old React UI
Merge upstream changes (globalTTL, captures, playground, streaming markdown,
rerank, infill timings, in-flight tracking, setParamsByID) with fork features
(config editor, timeout handling, RPC health checks, context parameter).

- resolve App.svelte conflict: keep Config + PlaygroundStub routes
Copilot AI review requested due to automatic review settings April 1, 2026 09:52
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 1, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 62bedcfd-472d-4ede-afb1-a6d3e5403c35

📥 Commits

Reviewing files that changed from the base of the PR and between 9559009 and e295d15.

⛔ Files ignored due to path filters (1)
  • ui-svelte/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (22)
  • Makefile
  • README.md
  • config-schema.json
  • config.example.yaml
  • config_embed.go
  • docs/configuration.md
  • llama-swap.go
  • proxy/config/config.go
  • proxy/config/config_test.go
  • proxy/config/model_config.go
  • proxy/process.go
  • proxy/process_rpc_health_test.go
  • proxy/process_test.go
  • proxy/process_timeout_test.go
  • proxy/processgroup.go
  • proxy/processgroup_test.go
  • proxy/proxymanager.go
  • proxy/proxymanager_api.go
  • ui-svelte/package.json
  • ui-svelte/src/App.svelte
  • ui-svelte/src/components/Header.svelte
  • ui-svelte/src/routes/Config.svelte

Walkthrough

The pull request introduces request timeout enforcement for model inference, adds RPC health checking for distributed inference models with TCP endpoint monitoring, implements configuration API endpoints, replaces the React UI with a new Svelte-based UI with configuration editor, updates the Makefile to build the Svelte UI, and refactors configuration hot-reloading in the main application.

Changes

Cohort / File(s) Summary
Build System & Config Metadata
Makefile, config_embed.go
Switched UI build target from ui/ to ui-svelte/, updated node_modules caching. Added Go embed of config.example.yaml with public accessor function.
Configuration Schema & Documentation
config-schema.json, config.example.yaml, docs/configuration.md, README.md
Added two new per-model config properties: requestTimeout (integer, enforces request deadline) and rpcHealthCheck (boolean, enables RPC endpoint health monitoring). Updated example configs and documentation accordingly.
Core App & Config Parsing
llama-swap.go, proxy/config/config.go, proxy/config/config_test.go, proxy/config/model_config.go
Refactored config change event handling to always listen (not conditionally deferred), added ProxyManager config path/example setters. Implemented ParseRPCEndpoints() parser with validation for RPC endpoint extraction from command strings. Added requestTimeout and rpcHealthCheck fields to ModelConfig.
Process Lifecycle & Concurrency
proxy/process.go, proxy/process_test.go, proxy/process_rpc_health_test.go, proxy/process_timeout_test.go
Added startup tracking (cmdStarted flag) to prevent hangs when command start fails. Implemented per-request timeout enforcement with background monitoring and forced process termination. Added RPC health checking with background ticker-based TCP health monitoring, initial unhealthy state, and IsRPCHealthy() accessor. Improved StopImmediately() state transition robustness and stopCommand() error handling.
Process Group & ProxyManager
proxy/processgroup.go, proxy/processgroup_test.go, proxy/proxymanager.go, proxy/proxymanager_api.go
Threaded shutdownCtx through ProcessGroup into each Process. Added RPC health filtering to /v1/models endpoint (excludes unhealthy models). Return HTTP 503 for inference on unhealthy RPC models. Introduced new API endpoints: GET /config/current, GET /config/example, POST /config for configuration management.
Svelte UI with Config Editor
ui-svelte/package.json, ui-svelte/src/App.svelte, ui-svelte/src/components/Header.svelte, ui-svelte/src/routes/Config.svelte
Added CodeMirror and yaml parsing dependencies. Registered new /config route with navigation link in header. Implemented Config.svelte component: dual CodeMirror editors (editable current config, read-only example), YAML validation, load/save/import/export functionality, dynamic dark-mode theming via Compartments.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • mostlygeek
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR merges upstream changes that add a web-based configuration editor to the Svelte UI, plus new backend support for reading/writing the running config file, and introduces two new model-level operational controls: RPC endpoint health gating and per-request timeouts.

Changes:

  • Add /config UI route with a CodeMirror YAML editor (current config editable + embedded example config read-only), including import/export and client-side YAML validation.
  • Add authenticated backend API endpoints to fetch the current config file, fetch the embedded example config, and update the config (triggering a reload event).
  • Introduce rpcHealthCheck (filter unhealthy RPC models from /v1/models and return 503 on inference) and requestTimeout (kill model process if a request exceeds a configured duration), with accompanying schema/docs/tests.

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
ui-svelte/src/routes/Config.svelte New configuration editor page (CodeMirror + YAML validation + import/export/save).
ui-svelte/src/components/Header.svelte Add navigation link to the new /config route.
ui-svelte/src/App.svelte Register /config route in the SPA router.
ui-svelte/package.json Add CodeMirror + js-yaml dependencies needed for the config editor.
ui-svelte/package-lock.json Lockfile updates for the new UI dependencies.
README.md Update feature list to mention request timeout protection and RPC health checking.
proxy/proxymanager.go Pass shutdown context into process groups; filter /v1/models based on RPC health; add setters for config path/example.
proxy/proxymanager_api.go Add /api/config/* endpoints to read/write config and trigger reload events.
proxy/processgroup.go Thread shutdown context through to each Process instance.
proxy/processgroup_test.go Update tests for new NewProcessGroup(..., shutdownCtx) signature.
proxy/process.go Implement requestTimeout enforcement and background RPC endpoint health checking; improve stop/start race handling.
proxy/process_timeout_test.go Add tests around request timeout behavior.
proxy/process_test.go Update tests for new NewProcess(..., shutdownCtx) signature and add stop/start edge-case tests.
proxy/process_rpc_health_test.go Add tests for RPC health checking behavior.
proxy/config/model_config.go Add rpcHealthCheck and requestTimeout fields to model config with defaults.
proxy/config/config.go Add ParseRPCEndpoints helper for extracting --rpc endpoints from model commands.
proxy/config/config_test.go Add unit tests for ParseRPCEndpoints.
Makefile Update UI build/install targets to use the Svelte UI directory.
llama-swap.go Persist config path/example into each newly created ProxyManager; always listen for config-change events to reload.
docs/configuration.md Document new config options/features (including requestTimeout and rpcHealthCheck).
config.example.yaml Add examples/documentation for requestTimeout and distributed RPC health checking.
config-schema.json Extend JSON schema with requestTimeout and rpcHealthCheck.
config_embed.go Embed config.example.yaml into the binary for the UI “Example Config” view.
Files not reviewed (1)
  • ui-svelte/package-lock.json: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread proxy/proxymanager_api.go
Comment on lines +332 to +333
// Write to config file
if err := os.WriteFile(configPath, body, 0644); err != nil {
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apiUpdateConfig writes the config with a fixed mode of 0644, which can unintentionally widen permissions (e.g., from 0600) and expose secrets such as API keys. Preserve the existing file mode (stat + write with same permissions) or default to a more restrictive mode (e.g., 0600) when creating the file.

Suggested change
// Write to config file
if err := os.WriteFile(configPath, body, 0644); err != nil {
// Determine file permissions: preserve existing mode if the file exists,
// otherwise use a restrictive default (0600).
var perm os.FileMode = 0600
if info, statErr := os.Stat(configPath); statErr == nil {
perm = info.Mode().Perm()
}
// Write to config file
if err := os.WriteFile(configPath, body, perm); err != nil {

Copilot uses AI. Check for mistakes.
Comment thread proxy/proxymanager_api.go
pm.sendErrorResponse(c, http.StatusBadRequest, "Config file path not set")
return
}

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apiUpdateConfig is a state-changing endpoint reachable from browsers, but it doesn't include any CSRF mitigation (e.g., Origin/Referer validation or requiring a custom header like x-api-key). Because apiKeyAuth() accepts Basic auth (which browsers may automatically attach cross-site), a malicious site could potentially trigger a config overwrite if the user has authenticated. Consider rejecting requests with missing/mismatched Origin, or requiring x-api-key for mutating /api/config operations.

Suggested change
// Basic CSRF mitigation: require a custom header that browsers cannot send
// with simple cross-site form submissions or image requests.
if c.GetHeader("X-API-Key") == "" {
pm.sendErrorResponse(c, http.StatusForbidden, "Missing X-API-Key header")
return
}

Copilot uses AI. Check for mistakes.
Comment thread proxy/process.go

ctx, cancel := context.WithCancel(p.shutdownCtx)
p.rpcHealthCancel = cancel
p.rpcHealthTicker = time.NewTicker(10 * time.Second)
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RPC health check interval is hard-coded to 10s (time.NewTicker(10 * time.Second)), but the docs/config examples added in this PR state 30 seconds. Either update the documentation to match, or make the interval configurable so behavior and docs stay consistent.

Suggested change
p.rpcHealthTicker = time.NewTicker(10 * time.Second)
p.rpcHealthTicker = time.NewTicker(30 * time.Second)

Copilot uses AI. Check for mistakes.
Comment thread proxy/process.go
Comment on lines +1023 to +1041
allHealthy := true

for _, endpoint := range p.rpcEndpoints {
dialer := net.Dialer{Timeout: 3 * time.Second}
conn, err := dialer.Dial("tcp", endpoint)
if err != nil {
// Ignore I/O timeout errors - don't mark as unhealthy
if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
p.proxyLogger.Debugf("<%s> RPC endpoint %s timeout (ignoring): %v", p.ID, endpoint, err)
continue
}
p.proxyLogger.Warnf("<%s> RPC endpoint %s unhealthy: %v", p.ID, endpoint, err)
allHealthy = false
break
}
conn.Close()
}

wasHealthy := p.rpcHealthy.Load()
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkRPCHealth() treats TCP dial timeouts as "ignore" and continues, which can result in allHealthy staying true and the model being considered healthy even when an endpoint is unreachable (a timeout is usually a strong signal of unhealthiness). If you want to avoid flapping, consider keeping the previous health state on timeout or marking the endpoint unhealthy after N consecutive timeouts, rather than counting timeouts as healthy.

Suggested change
allHealthy := true
for _, endpoint := range p.rpcEndpoints {
dialer := net.Dialer{Timeout: 3 * time.Second}
conn, err := dialer.Dial("tcp", endpoint)
if err != nil {
// Ignore I/O timeout errors - don't mark as unhealthy
if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
p.proxyLogger.Debugf("<%s> RPC endpoint %s timeout (ignoring): %v", p.ID, endpoint, err)
continue
}
p.proxyLogger.Warnf("<%s> RPC endpoint %s unhealthy: %v", p.ID, endpoint, err)
allHealthy = false
break
}
conn.Close()
}
wasHealthy := p.rpcHealthy.Load()
// Start with the assumption that all endpoints are healthy; adjust based on checks.
allHealthy := true
wasHealthy := p.rpcHealthy.Load()
anyTimeout := false
anyFailure := false
for _, endpoint := range p.rpcEndpoints {
dialer := net.Dialer{Timeout: 3 * time.Second}
conn, err := dialer.Dial("tcp", endpoint)
if err != nil {
// For I/O timeout errors, record the timeout but don't immediately flip health.
if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
p.proxyLogger.Debugf("<%s> RPC endpoint %s timeout (ignoring for health flip): %v", p.ID, endpoint, err)
anyTimeout = true
continue
}
// Non-timeout errors are treated as unhealthy.
p.proxyLogger.Warnf("<%s> RPC endpoint %s unhealthy: %v", p.ID, endpoint, err)
allHealthy = false
anyFailure = true
break
}
conn.Close()
}
// If there were only timeouts (no definite successes or failures), keep the previous state
// instead of forcing the health to healthy.
if anyFailure {
allHealthy = false
} else if anyTimeout && !anyFailure {
allHealthy = wasHealthy
} else {
// No failures and no timeouts: all endpoints responded successfully.
allHealthy = true
}

Copilot uses AI. Check for mistakes.
Comment thread config.example.yaml
# rpcHealthCheck: enable TCP health checks for RPC endpoints
# - optional, default: false
# - when enabled, parses --rpc host:port[,host:port,...] from cmd
# - performs TCP connectivity checks every 30 seconds
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment says RPC connectivity checks run "every 30 seconds", but the implementation introduced in proxy/process.go runs the ticker every 10 seconds. Please align the example documentation with the actual behavior (or make the interval configurable).

Suggested change
# - performs TCP connectivity checks every 30 seconds
# - performs TCP connectivity checks every 10 seconds

Copilot uses AI. Check for mistakes.
Comment thread config-schema.json
"rpcHealthCheck": {
"type": "boolean",
"default": false,
"description": "Enable TCP health checks for RPC endpoints specified in cmd. When enabled, parses --rpc host:port[,host:port,...] from cmd and performs health checks every 30 seconds. Models with unhealthy RPC endpoints are filtered from /v1/models and return 503 on inference requests."
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema description says RPC health checks run "every 30 seconds", but the current implementation introduced in proxy/process.go uses a 10-second ticker. Please update the schema description (or the code) so they match.

Suggested change
"description": "Enable TCP health checks for RPC endpoints specified in cmd. When enabled, parses --rpc host:port[,host:port,...] from cmd and performs health checks every 30 seconds. Models with unhealthy RPC endpoints are filtered from /v1/models and return 503 on inference requests."
"description": "Enable TCP health checks for RPC endpoints specified in cmd. When enabled, parses --rpc host:port[,host:port,...] from cmd and performs health checks every 10 seconds. Models with unhealthy RPC endpoints are filtered from /v1/models and return 503 on inference requests."

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +66
// TestProcess_RequestTimeout verifies that requestTimeout actually kills the process
func TestProcess_RequestTimeout(t *testing.T) {
// Create error channel to report handler errors from the mock server goroutine
srvErrCh := make(chan error, 1)

// Create a mock server that simulates a long-running inference
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Logf("Mock server received request")

// Simulate streaming response that takes 60 seconds
w.Header().Set("Content-Type", "text/event-stream")
w.WriteHeader(http.StatusOK)

flusher, ok := w.(http.Flusher)
if !ok {
srvErrCh <- fmt.Errorf("Expected http.ResponseWriter to be an http.Flusher")
return
}

// Stream data for 60 seconds
for i := 0; i < 60; i++ {
select {
case <-r.Context().Done():
t.Logf("Mock server: client disconnected after %d seconds", i)
return
default:
fmt.Fprintf(w, "data: token %d\n\n", i)
flusher.Flush()
time.Sleep(1 * time.Second)
}
}
t.Logf("Mock server completed full 60 second response")
}))
defer mockServer.Close()

// Setup process logger - use NewLogMonitor() to avoid race in test
processLogger := NewLogMonitor()
proxyLogger := NewLogMonitor()

// Create process with 5 second request timeout
cfg := config.ModelConfig{
Proxy: mockServer.URL,
CheckEndpoint: "none", // skip health check
RequestTimeout: 5, // 5 second timeout
}

p := NewProcess("test-timeout", 30, cfg, processLogger, proxyLogger, context.Background())
p.gracefulStopTimeout = 2 * time.Second // shorter for testing

// Manually set state to ready (skip actual process start)
p.forceState(StateReady)

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test description says it verifies requestTimeout "kills the process", but the test forces the process into StateReady without starting an upstream command. In that setup StopImmediately() can't actually terminate a real process (and stopCommand() short-circuits when cancelUpstream is nil), so the test is really only validating request cancellation/timeout behavior. Consider renaming/rewording the test (or adjusting it) so the assertion matches what is actually exercised.

Copilot uses AI. Check for mistakes.
Comment on lines +86 to +124
func TestProcess_RPCHealthCheckTimeoutIgnored(t *testing.T) {
testLogger := NewLogMonitorWriter(io.Discard)
proxyLogger := NewLogMonitorWriter(io.Discard)
ctx, cancel := context.WithCancel(context.Background())
defer cancel()

// Use an IP address that will timeout (non-routable IP)
// 192.0.2.0/24 is reserved for documentation/testing (RFC 5737)
modelConfig := config.ModelConfig{
Cmd: "llama-server --rpc 192.0.2.1:50051",
Proxy: "http://localhost:8080",
RPCHealthCheck: true,
}

process := NewProcess("test-model", 5, modelConfig, testLogger, proxyLogger, ctx)

// Verify endpoints were parsed
assert.NotEmpty(t, process.rpcEndpoints, "RPC endpoints should be parsed from cmd")
assert.Equal(t, []string{"192.0.2.1:50051"}, process.rpcEndpoints)

// Initially should be unhealthy (false) until first check
assert.False(t, process.rpcHealthy.Load(), "RPC health should start as false")

// Manually run health check - this should timeout but not mark as unhealthy
process.checkRPCHealth()

// After timeout, should remain at initial state (false) but not be marked unhealthy
// The key is that timeout doesn't change the state - it's effectively a no-op
// To test this properly, let's set it to healthy first, then see if timeout changes it
process.rpcHealthy.Store(true)
initialState := process.rpcHealthy.Load()
assert.True(t, initialState, "Should be healthy before timeout check")

// Run health check that will timeout
process.checkRPCHealth()

// After timeout, should still be healthy (timeout is ignored)
assert.True(t, process.rpcHealthy.Load(), "Should remain healthy after timeout")
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestProcess_RPCHealthCheckTimeoutIgnored assumes dialing 192.0.2.1:50051 will reliably time out, but on some OS/network setups it may fail fast (e.g., "no route to host"), making the test flaky and environment-dependent. Consider refactoring RPC health checking to allow injecting a dial function (so you can deterministically return a timeout), or otherwise avoid asserting on timeout-specific behavior using real networking.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants