[client] Use setsid to avoid the parent process from being killed via HUP by login #4900

Merged
lixmal merged 2 commits into main from fix-util-linux-login-ssh on Dec 5, 2025

Conversation

@lixmal
Collaborator

@lixmal lixmal commented Dec 3, 2025

Describe your changes

Issue ticket number and link

#4869

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • New Features

    • Detects util-linux login on Linux and adapts how login commands are constructed and wrapped for sessions.
    • Platform-specific no-op detection added for non-Linux environments.
  • Bug Fixes

    • Validates and handles empty remote commands to prevent erroneous executions.
    • Logs full executed command lines and ensures session exit status is recorded for clearer troubleshooting.


@coderabbitai
Contributor

coderabbitai Bot commented Dec 3, 2025

Walkthrough

Adds a util‑linux probe and a Server field, loginIsUtilLinux, which is initialized in Start. On Linux, login command construction wraps the login invocation with setsid when util‑linux is detected. PTY handling now logs the full command arguments, and the JS and Windows PTY paths exit with status 1 on error.

Changes

Cohort: Util‑linux detection & Server state
File(s): client/ssh/server/server.go, client/ssh/server/command_execution_unix.go, client/ssh/server/command_execution_js.go, client/ssh/server/command_execution_windows.go
Summary: Added func (s *Server) detectUtilLinuxLogin(ctx context.Context) bool (Unix: runs login --version with 500ms timeout and checks for "util-linux"; JS/Windows return false). Added loginIsUtilLinux bool to Server and initialize it in Start.

Cohort: Linux login command & setsid wrapping
File(s): client/ssh/server/userswitching_unix.go
Summary: Added getLinuxLoginCmd(loginPath, username, remoteIP string) (string, []string) and made Linux getLoginCmd delegate to it: select login args (Arch/PAM remote logic) and, if loginIsUtilLinux, attempt to wrap the login invocation with setsid (resolved via exec.LookPath); log and fall back when setsid unavailable.

Cohort: PTY logging and guards
File(s): client/ssh/server/command_execution_unix.go, client/ssh/server/command_execution_js.go, client/ssh/server/command_execution_windows.go
Summary: Updated PTY handling logging to print joined execCmd.Args (full command). JS now calls session.Exit(1) and logs exit when PTY execution unsupported. Windows added a guard for empty command (log, session.Exit(1), return).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify the Unix probe's context timeout (500ms), subprocess invocation, and output matching for "util-linux".
  • Review setsid lookup and argument construction (placement of -w -c and argument ordering) and fallback logging behavior.
  • Ensure loginIsUtilLinux is set in Start before any login command construction paths run.
  • Check PTY logging change for sensitive data exposure and formatting; confirm JS/Windows exit handling is correct.

Poem

🐇 I sniff the login, quick and spry,
I check if util‑linux waves hi,
If found, I wrap commands with gentle setsid cheer,
If not, the old path stays near,
Hopping bytes and bash—ever nimbly here.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Description check: ⚠️ Warning. Explanation: The description is largely incomplete; the 'Describe your changes' section is empty, lacking substantive explanation of the modifications across multiple files. Resolution: Add a detailed explanation under 'Describe your changes' that summarizes the util-linux detection mechanism, setsid wrapping logic, and affected files.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. Explanation: The title accurately describes the main change: using setsid to prevent parent process termination via HUP when util-linux login is detected.
  • Docstring Coverage: ✅ Passed. Explanation: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@lixmal lixmal changed the title [client] Use setsid to avoid the parent process from being killed via HUP by l… [client] Use setsid to avoid the parent process from being killed via HUP by login Dec 3, 2025
@lixmal lixmal force-pushed the fix-util-linux-login-ssh branch from 40e8af6 to 8946677 Compare December 3, 2025 08:01
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
client/ssh/server/userswitching_unix.go (1)

109-126: Setsid wrapping logic is sound with graceful degradation.

The implementation correctly wraps util-linux login with setsid -w -c to prevent vhangup() from killing the parent process. The flags are appropriate:

  • -w ensures setsid waits for the child to exit
  • -c sets the controlling terminal

The graceful degradation (lines 118-122) logs a warning but continues without setsid if it's unavailable. While this means the fix won't be applied in such cases, it prevents complete failure and is a reasonable trade-off.

The TODO comment on line 112 suggests eliminating the external setsid dependency by using syscall.Setsid() + TIOCSCTTY + syscall.Exec(). Would you like me to help implement this approach or open an issue to track it?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a232cf6 and 8946677.

📒 Files selected for processing (5)
  • client/ssh/server/command_execution_js.go (1 hunks)
  • client/ssh/server/command_execution_unix.go (2 hunks)
  • client/ssh/server/command_execution_windows.go (1 hunks)
  • client/ssh/server/server.go (2 hunks)
  • client/ssh/server/userswitching_unix.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-14T13:05:31.729Z
Learnt from: lixmal
Repo: netbirdio/netbird PR: 4015
File: client/ssh/server/userswitching_windows.go:89-139
Timestamp: 2025-11-14T13:05:31.729Z
Learning: In client/ssh/server/executor_windows.go, the WindowsExecutorConfig struct's Pty, PtyWidth, and PtyHeight fields are intentionally left unused for now and will be implemented in a future update.

Applied to files:

  • client/ssh/server/command_execution_windows.go
🧬 Code graph analysis (3)
client/ssh/server/command_execution_js.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
client/ssh/server/userswitching_unix.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
client/ssh/server/command_execution_unix.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: iOS / Build
  • GitHub Check: Android / Build
  • GitHub Check: Client / Unit
  • GitHub Check: Linux
  • GitHub Check: Darwin
  • GitHub Check: Windows
  • GitHub Check: Client / Unit
  • GitHub Check: release
  • GitHub Check: JS / Lint
  • GitHub Check: Client / Unit
🔇 Additional comments (9)
client/ssh/server/command_execution_js.go (1)

45-48: LGTM!

The stub implementation correctly returns false for the JS/WASM platform, consistent with other platform-specific detection methods in this file.

client/ssh/server/userswitching_unix.go (2)

90-90: Good refactoring to dedicated helper.

Delegating Linux-specific login command construction to getLinuxLoginCmd improves code organization and maintainability.


98-107: Arch Linux special case handling looks correct.

The conditional logic properly detects Arch Linux without PAM remote support and adjusts the login arguments accordingly. This addresses platform-specific differences in login implementations.

client/ssh/server/server.go (2)

141-142: Field addition follows existing pattern.

The loginIsUtilLinux field is appropriately placed and follows the same caching pattern as suSupportsPty, storing the detection result for use during command execution.


197-197: Initialization follows established pattern.

The detection is appropriately performed once during server startup, matching the approach used for suSupportsPty detection on line 196.

client/ssh/server/command_execution_windows.go (2)

386-389: LGTM!

The stub implementation correctly returns false for Windows, matching the pattern of other Unix-specific detection methods and enabling cross-platform compilation.


393-400: Defensive validation added for PTY command execution.

This check prevents executeCommandWithPty from being called with an empty command string, logging an error and exiting with status 1 if the condition is violated.

Note: This validation appears unrelated to the main PR objective (setsid wrapping for util-linux login). It's a defensive check that may have been added as a separate fix. The Unix version of executeCommandWithPty (client/ssh/server/command_execution_unix.go) does not have an equivalent check.

Could you clarify if this validation change is intentionally part of this PR, or if it was discovered and fixed during related work? If it's a separate concern, consider extracting it to a dedicated commit for clearer history.

client/ssh/server/command_execution_unix.go (2)

166-166: Logging improvement provides better visibility.

Changing from logging just execCmd.Path to strings.Join(execCmd.Args, " ") provides full visibility into the constructed command, including the setsid wrapper and all arguments. This aids in debugging and verification of the changes introduced in this PR.


78-95: Detection logic is sound and assumptions are verified.

The implementation correctly identifies util-linux login by:

  1. Running login --version with an appropriate 500ms timeout
  2. Falling back to false (assuming shadow-utils) if the command fails—shadow-utils login does not support --version
  3. Checking for "util-linux" in the output string—util-linux consistently outputs login (util-linux) X.Y.Z

The string matching approach is pragmatic and reliable across distributions, with no concerns about format changes in the near term.
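A minimal sketch of the probe described above, assuming only the behavior stated in this review (run login --version with a 500ms timeout, treat any failure as "not util-linux", and look for the "util-linux" substring). The function bodies and the use of CombinedOutput are assumptions, not the PR's actual implementation.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// isUtilLinuxVersionOutput reports whether the version output names util-linux.
func isUtilLinuxVersionOutput(out string) bool {
	return strings.Contains(out, "util-linux")
}

// detectUtilLinuxLogin runs `login --version` with a short timeout.
// Any error (missing binary, unsupported flag, timeout) yields false.
func detectUtilLinuxLogin(ctx context.Context) bool {
	ctx, cancel := context.WithTimeout(ctx, 500*time.Millisecond)
	defer cancel()

	// CombinedOutput is an assumption here; it captures stdout and stderr.
	out, err := exec.CommandContext(ctx, "login", "--version").CombinedOutput()
	if err != nil {
		return false
	}
	return isUtilLinuxVersionOutput(string(out))
}

func main() {
	fmt.Println("util-linux login detected:", detectUtilLinuxLogin(context.Background()))
}
```

Keeping the string match in its own function lets it be tested without depending on which login implementation the host ships.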

@lixmal lixmal force-pushed the fix-util-linux-login-ssh branch from 8946677 to fcf8a17 Compare December 3, 2025 08:09
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
client/ssh/server/userswitching_unix.go (1)

128-132: Consider making fileExists a package-level helper.

This is a simple utility function with a receiver that isn't used. Consider making it a standalone function if used elsewhere, or keeping as-is if this is the only usage site.

-func (s *Server) fileExists(path string) bool {
+func fileExists(path string) bool {
 	_, err := os.Stat(path)
 	return err == nil
 }

You'd also need to update the call site at line 103.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8946677 and fcf8a17.

📒 Files selected for processing (5)
  • client/ssh/server/command_execution_js.go (1 hunks)
  • client/ssh/server/command_execution_unix.go (3 hunks)
  • client/ssh/server/command_execution_windows.go (1 hunks)
  • client/ssh/server/server.go (2 hunks)
  • client/ssh/server/userswitching_unix.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • client/ssh/server/server.go
  • client/ssh/server/command_execution_js.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-14T13:05:31.729Z
Learnt from: lixmal
Repo: netbirdio/netbird PR: 4015
File: client/ssh/server/userswitching_windows.go:89-139
Timestamp: 2025-11-14T13:05:31.729Z
Learning: In client/ssh/server/executor_windows.go, the WindowsExecutorConfig struct's Pty, PtyWidth, and PtyHeight fields are intentionally left unused for now and will be implemented in a future update.

Applied to files:

  • client/ssh/server/command_execution_windows.go
📚 Learning: 2025-11-13T00:29:53.247Z
Learnt from: lixmal
Repo: netbirdio/netbird PR: 4015
File: client/cmd/ssh_exec_unix.go:53-74
Timestamp: 2025-11-13T00:29:53.247Z
Learning: In client/ssh/server/executor_unix.go, the method ExecuteWithPrivilegeDrop(ctx context.Context, config ExecutorConfig) has a void return type (no error return). It handles failures by exiting the process directly with appropriate exit codes rather than returning errors to the caller.

Applied to files:

  • client/ssh/server/command_execution_windows.go
🧬 Code graph analysis (3)
client/ssh/server/userswitching_unix.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
client/ssh/server/command_execution_unix.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
client/ssh/server/command_execution_windows.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Signal / Unit (amd64)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Client / Unit
  • GitHub Check: Windows
  • GitHub Check: Linux
  • GitHub Check: Darwin
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: iOS / Build
  • GitHub Check: Android / Build
  • GitHub Check: JS / Lint
  • GitHub Check: release_ui_darwin
  • GitHub Check: release
🔇 Additional comments (6)
client/ssh/server/command_execution_unix.go (2)

79-100: Well-implemented detection with proper safeguards.

The implementation correctly:

  1. Guards with runtime.GOOS != "linux" even under unix build tag (defensive)
  2. Uses a reasonable 500ms timeout to avoid blocking
  3. Gracefully handles errors by returning false (shadow-utils assumption)
  4. Documents the rationale with a reference to the Debian bug

171-171: Improved logging for debugging.

Logging execCmd.Args instead of just execCmd.Path provides better visibility into the full command being executed, which is helpful for troubleshooting login issues. Since this is an interactive shell command (not user-provided secrets), this is safe to log.

client/ssh/server/command_execution_windows.go (2)

386-389: Correct platform stub implementation.

The Windows stub correctly returns false since util-linux is not applicable on Windows. This maintains interface consistency across platforms.


392-403: Good defensive guard for empty commands.

Adding explicit handling for empty command strings prevents unexpected behavior during PTY execution. The early exit with proper logging and session exit is appropriate.
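The shape of such a guard might look like the sketch below. The exiter interface, the rejectEmptyCommand helper, and the recordingSession type are hypothetical stand-ins introduced for illustration; the only detail taken from the review is that an empty command is logged and the session exits with status 1.

```go
package main

import "fmt"

// exiter is a hypothetical stand-in for the part of an SSH session used here.
type exiter interface {
	Exit(code int) error
}

// rejectEmptyCommand returns true, after exiting the session with status 1,
// when the command is empty. Callers would return early in that case.
func rejectEmptyCommand(session exiter, command string) bool {
	if command != "" {
		return false
	}
	fmt.Println("error: empty command for PTY execution")
	if err := session.Exit(1); err != nil {
		fmt.Println("session exit failed:", err)
	}
	return true
}

// recordingSession is a test double that remembers the exit code it was given.
type recordingSession struct {
	exited bool
	code   int
}

func (r *recordingSession) Exit(code int) error {
	r.exited = true
	r.code = code
	return nil
}

func main() {
	s := &recordingSession{}
	fmt.Println(rejectEmptyCommand(s, ""), s.exited, s.code)
}
```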

client/ssh/server/userswitching_unix.go (2)

90-90: Clean delegation to platform-specific helper.

The refactoring to delegate Linux-specific logic to getLinuxLoginCmd improves code organization and maintainability.


98-126: Solid implementation addressing the vhangup() issue.

The logic correctly:

  1. Handles the Arch Linux edge case for missing /etc/pam.d/remote
  2. Returns early if util-linux is not detected
  3. Gracefully falls back when setsid is unavailable (with warning)
  4. Uses the -w and -c flags appropriately: -w waits for the child process and returns its exit status, and -c sets the current terminal as the controlling terminal for the new session. Together they prevent vhangup() from killing the parent.

The TODO comment (lines 112-113) acknowledges a cleaner future approach using syscall.Setsid() + TIOCSCTTY + syscall.Exec() to eliminate the external dependency.

Comment thread client/ssh/server/userswitching_unix.go Outdated
@sonarqubecloud

sonarqubecloud Bot commented Dec 3, 2025

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
client/ssh/server/userswitching_unix.go (2)

99-127: getLinuxLoginCmd correctly handles Arch and util‑linux setsid wrapping; consider caching setsid lookup

The Arch‑specific PAM check, the default -f <user> -h <remoteIP> -p branch, and the conditional setsid -w -c <loginPath> ... wrapping when loginIsUtilLinux is true all look consistent with the intended behavior and provide a reasonable fallback when setsid is missing (warning + plain login).

One improvement you might consider (non‑blocking) is caching the setsid lookup / availability in the Server struct or via a sync.Once, so you avoid repeated exec.LookPath("setsid") calls and duplicate warnings on every new session.

Please verify this on at least one util‑linux system (and one non‑util‑linux system) to confirm that:

  • the setsid path solves the original HUP issue, and
  • the fallback path still behaves as before when loginIsUtilLinux is false or setsid is not present.
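One way the suggested caching could look, using sync.Once so the lookup runs only once. The setsidLookup type and its fields are hypothetical and not part of the PR; the lookup function is injected so the call count can be observed.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

// setsidLookup is a hypothetical cache around a single path lookup.
type setsidLookup struct {
	once     sync.Once
	path     string
	err      error
	lookPath func(string) (string, error) // normally exec.LookPath
}

// get resolves setsid on first use and returns the cached result afterwards,
// so a missing binary would be reported once rather than per session.
func (l *setsidLookup) get() (string, error) {
	l.once.Do(func() {
		l.path, l.err = l.lookPath("setsid")
	})
	return l.path, l.err
}

func main() {
	l := &setsidLookup{lookPath: exec.LookPath}
	for i := 0; i < 3; i++ {
		path, err := l.get()
		fmt.Println(path, err)
	}
}
```

A trade-off of caching is that a setsid binary installed after startup is not picked up until the server restarts.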

129-133: fileExists helper is fine for current use; optional refinement if you need richer error handling

Using os.Stat and returning err == nil is sufficient for simple checks like /etc/arch-release and /etc/pam.d/remote, where treating any error as “does not exist / not usable” is acceptable.

If you ever need to distinguish “file truly missing” from other failures (permissions, transient I/O), you could extend this helper to branch on specific error kinds rather than collapsing everything into a boolean.

If you decide to tighten this helper later, please review all its call sites to ensure they actually want differentiated error handling instead of the current simple boolean.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fcf8a17 and 06a92d8.

📒 Files selected for processing (1)
  • client/ssh/server/userswitching_unix.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/ssh/server/userswitching_unix.go (1)
client/ssh/server/server.go (1)
  • Server (115-143)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint
  • GitHub Check: release
  • GitHub Check: Windows
  • GitHub Check: release_ui_darwin
  • GitHub Check: Linux
  • GitHub Check: Android / Build
  • GitHub Check: Client / Unit
  • GitHub Check: Darwin
🔇 Additional comments (1)
client/ssh/server/userswitching_unix.go (1)

90-91: Linux branch delegation to getLinuxLoginCmd looks correct

Delegating the Linux-specific bits into getLinuxLoginCmd keeps getLoginCmd focused, and reusing addrPort.Addr().String() preserves the previous host handling while enabling the new util‑linux logic.

@alexmoras

Working on Ubuntu 25.10 on multiple devices. /proc/self/loginuid seems to be showing a random ID rather than the 0 I get when logging in as root normally, but it doesn't seem to be affecting anything. It's resolved the underlying issue for me.

@afonsofrancof

This is working perfectly for me.
/proc/self/loginuid is set to 1000 when I login with my user, which is the correct one.

@alexmoras

Interesting! For me, I get:

root@vps:~# cat /proc/self/loginuid
4294967295

4294967295 is the decimal version of 0xFFFFFFFF which is the "unset" value.

It doesn't matter whether I log in as root or my normal user, I still get the unset value. This is running on Ubuntu 25.10 with the binaries from this PR.

I'm not sure if this is expected behaviour?
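For anyone wanting to check this programmatically, a small sketch that reads the value and flags the unset sentinel (0xFFFFFFFF, i.e. 4294967295) is shown below. The parseLoginUID helper is hypothetical and only interprets the file contents; it says nothing about why the value is unset.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"strconv"
	"strings"
)

// parseLoginUID parses the contents of /proc/self/loginuid and reports
// whether the value is set, i.e. differs from the 0xFFFFFFFF sentinel.
func parseLoginUID(raw string) (uid uint32, set bool, err error) {
	v, err := strconv.ParseUint(strings.TrimSpace(raw), 10, 32)
	if err != nil {
		return 0, false, err
	}
	uid = uint32(v)
	return uid, uid != math.MaxUint32, nil
}

func main() {
	data, err := os.ReadFile("/proc/self/loginuid")
	if err != nil {
		fmt.Println("loginuid not available:", err)
		return
	}
	uid, set, err := parseLoginUID(string(data))
	fmt.Println(uid, set, err)
}
```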

@lixmal lixmal merged commit f538e6e into main Dec 5, 2025
35 of 36 checks passed
@lixmal lixmal deleted the fix-util-linux-login-ssh branch December 5, 2025 02:29
4 participants