
Improve testing around using SIGKILL#127

Merged
mostlygeek merged 2 commits into main from more-unload-125 on May 14, 2025

Conversation

@mostlygeek
Owner

@mostlygeek mostlygeek commented May 14, 2025

No functionality changes. Improving testing for stop timeout and the usage of SIGKILL (#125)

Summary by CodeRabbit

  • New Features

    • Added an option for the simple responder to ignore SIGTERM signals and enhanced signal handling for more robust shutdown control.
    • The simple responder now prints its process ID unless running in silent mode.
  • Improvements

    • Centralized and made configurable the graceful shutdown timeout for process management.
    • Enhanced test reliability by ensuring fresh test runs without caching.
    • Improved test helper modularity for port allocation.
  • Bug Fixes

    • Added a test to verify forced process termination when graceful shutdown is ignored.
  • Other

    • Updated logging level in a proxy manager test for clearer output.

@coderabbitai

coderabbitai bot commented May 14, 2025

Walkthrough

The changes introduce enhancements to signal handling in the simple responder, centralize the process graceful stop timeout as a configurable field, and improve test modularity and coverage. New command-line flags and helper functions are added, and a new test verifies forced process termination behavior. Minor adjustments are made to logging levels in tests.

Changes

File(s) | Change Summary
--- | ---
Makefile | Added -count=1 to go test commands in test and test-all targets to disable test caching.
misc/simple-responder/simple-responder.go | Introduced ignore-sig-term flag, changed signal handling to process signals in a loop, print PID unless silent, ignore SIGTERM when the flag is set, and require a second SIGINT before shutting down.
proxy/helpers_test.go | Added getTestPort() helper for unique port allocation; refactored getTestSimpleResponderConfig() to use new helper and delegate config creation.
proxy/process.go | Added gracefulStopTimeout field to Process struct, initialized in constructor, and used in shutdown methods instead of hardcoded timeout.
proxy/process_test.go | Added TestProcess_ForceStopWithKill to verify SIGKILL is sent when graceful stop timeout is reached and process ignores SIGTERM.
proxy/proxymanager_test.go | Changed logging level in test configuration from "debug" to "warn".

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SimpleResponder
    participant OS

    User->>SimpleResponder: Start with flags (e.g., --ignore-sig-term)
    SimpleResponder->>OS: Register signal handlers
    OS-->>SimpleResponder: Send SIGINT/SIGTERM/other signals
    SimpleResponder->>SimpleResponder: Loop: handle signals
    alt SIGINT (first)
        SimpleResponder->>SimpleResponder: Log received SIGINT
    else SIGINT (second)
        SimpleResponder->>SimpleResponder: Shutdown and exit
    else SIGTERM and ignore-sig-term=true
        SimpleResponder->>SimpleResponder: Ignore SIGTERM
    else SIGTERM and ignore-sig-term=false
        SimpleResponder->>SimpleResponder: Shutdown and exit
    else Other signal
        SimpleResponder->>SimpleResponder: Immediate shutdown
    end
sequenceDiagram
    participant Test
    participant Proxy
    participant UpstreamProcess

    Test->>Proxy: Start process with ignore SIGTERM
    Proxy->>UpstreamProcess: Launch command
    Test->>Proxy: Send HTTP request
    Proxy->>UpstreamProcess: Forward request
    Test->>Proxy: Call StopImmediately (short timeout)
    Proxy->>UpstreamProcess: Send SIGTERM
    UpstreamProcess-->>Proxy: Ignore SIGTERM
    Proxy->>UpstreamProcess: After timeout, send SIGKILL
    UpstreamProcess-->>Proxy: Terminated by SIGKILL
    Proxy->>Test: Responds with partial output and error

Suggested labels

enhancement


@mostlygeek mostlygeek added the enhancement New feature or request label May 14, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
Makefile (1)

22-27: Flagging test cache disable may lengthen CI runs

Adding -count=1 guarantees fresh test execution, which is useful when tests rely on side effects (random ports, file system, etc.).
Be aware that it bypasses Go's cached test results, so every package's tests re-run on each invocation and local/CI cycles will be slower; compiled packages are still reused from the build cache. Consider either:

  1. Leaving the flag off by default and letting devs run make test-no-cache when needed, or
  2. Guarding it behind an environment toggle (NO_CACHE=1 make test).

No change required if deterministic re-runs outweigh the extra minutes.

misc/simple-responder/simple-responder.go (2)

29-30: Flag description nitpick

ignore-sig-term is clear, but the flag comment could mention why you might need it—e.g. “simulate a mis-behaving child that ignores SIGTERM”. This helps future readers/tests.


209-214: Name shadowing & missing context import

Inside the run loop you redeclare signal := <-sigChan, shadowing the imported os/signal package. Although legal, it hampers readability and tooling autocomplete.

-    signal := <-sigChan
+    sig := <-sigChan

Also, if you adopt the graceful-shutdown suggestion below you’ll need to import "context".

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 519c3a4 and 40cc6e2.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • misc/simple-responder/simple-responder.go (3 hunks)
  • proxy/helpers_test.go (1 hunks)
  • proxy/process.go (4 hunks)
  • proxy/process_test.go (1 hunks)
  • proxy/proxymanager_test.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
proxy/helpers_test.go (1)
proxy/config.go (1)
  • ModelConfig (17-29)
🔇 Additional comments (11)
proxy/proxymanager_test.go (1)

337-344: LogLevel: "warn" change looks good

Lowering the log level from "debug" to "warn" keeps the test output quiet while still surfacing warnings. 👍

misc/simple-responder/simple-responder.go (1)

195-198: Nice touch: printing PID only when not silent

Conditional PID output is handy for tests that need the PID yet keeps logs quiet in CI when --silent is set. 👍

proxy/process.go (4)

70-72: Great addition: Process graceful stop timeout field.

Adding a dedicated field for the graceful stop timeout enhances configurability and testing flexibility. This change properly centralizes a previously hardcoded value.


99-100: LGTM: Proper initialization of the graceful stop timeout.

The default value of 5 seconds maintains backward compatibility with the previous implementation.


357-357: LGTM: Using the configurable timeout in StopImmediately.

Correctly uses the new field instead of the hardcoded timeout value.


370-370: LGTM: Using the configurable timeout in Shutdown.

Correctly uses the new field instead of the hardcoded timeout value.

proxy/process_test.go (3)

397-445: Excellent test coverage for the SIGKILL behavior.

This test thoroughly verifies the behavior when a process ignores SIGTERM and must be terminated with SIGKILL:

  1. It uses the --ignore-sig-term flag to create a process that won't respond to SIGTERM
  2. Sets a short timeout to make testing faster
  3. Verifies the process state transitions correctly
  4. Checks that HTTP responses are handled properly during forced termination

The test provides good coverage for issue #125 mentioned in the PR objectives.


417-417: LGTM: Setting a shorter timeout for faster testing.

Appropriately reduces the timeout for testing purposes without affecting production behavior.


433-435: Good assertion for verifying the interrupted response.

This assertion confirms that the response was interrupted by the SIGKILL signal, validating the end-to-end behavior of the forced termination.

proxy/helpers_test.go (2)

51-59: Good refactoring: Extracting port allocation logic.

Extracting the port allocation logic into a dedicated function improves code organization and maintainability. The mutex handling ensures thread safety for concurrent tests.


61-63: Clean implementation: Using the new port allocation function.

The refactored implementation properly delegates to the new getTestPort() function and then to getTestSimpleResponderConfigPort, improving modularity.
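
For readers unfamiliar with the pattern, a mutex-guarded port counter might look like the sketch below. The names nextTestPort, portMutex, and nextPort, as well as the starting port, are assumptions made for illustration; the actual getTestPort() in proxy/helpers_test.go may differ.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	portMutex sync.Mutex
	nextPort  = 12000 // hypothetical starting port
)

// nextTestPort hands out a unique port number on every call. The mutex makes
// the read-and-increment atomic, so tests running in parallel never receive
// the same port.
func nextTestPort() int {
	portMutex.Lock()
	defer portMutex.Unlock()
	port := nextPort
	nextPort++
	return port
}

func main() {
	var wg sync.WaitGroup
	var seenMu sync.Mutex
	seen := make(map[int]bool)

	// Request ports from several goroutines at once and record each result.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p := nextTestPort()
			seenMu.Lock()
			seen[p] = true
			seenMu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println("unique ports handed out:", len(seen))
}
```

A counter like this only guarantees uniqueness within one test binary; it does not check whether the port is actually free on the host.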

Comment on lines +215 to +238
	countSigInt := 0

runloop:
	for {
		signal := <-sigChan
		switch signal {
		case syscall.SIGINT:
			countSigInt++
			if countSigInt > 1 {
				break runloop
			} else {
				log.Println("Received SIGINT, send another SIGINT to shutdown")
			}
		case syscall.SIGTERM:
			if *ignoreSigTerm {
				log.Println("Ignoring SIGTERM")
			} else {
				log.Println("Received SIGTERM, shutting down")
				break runloop
			}
		default:
			break runloop
		}
	}

🛠️ Refactor suggestion

Graceful HTTP shutdown missing

Breaking the loop logs “shutting down” but never closes the http.Server, so existing connections may hang until the process is killed. A small addition ensures a tidy exit:

@@
 	default:
 		break runloop
 	}
 }
 
 log.Println("simple-responder shutting down")
+
+// attempt graceful shutdown of the HTTP server
+ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+defer cancel()
+if err := srv.Shutdown(ctx); err != nil {
+	log.Printf("server shutdown error: %v\n", err)
+}

Remember to add import "context" at the top.

@mostlygeek mostlygeek merged commit 7f37bcc into main May 14, 2025
2 checks passed
@mostlygeek mostlygeek deleted the more-unload-125 branch May 14, 2025 04:21
rohitpaul pushed a commit to rohitpaul/llama-swap that referenced this pull request Mar 29, 2026
* Add test for SIGKILL of process
* silent TestProxyManager_RunningEndpoint debug output
* Ref mostlygeek#125

Labels

enhancement New feature or request
