Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix duplicate process detection #2184

Merged
merged 3 commits into from
Feb 11, 2025
Merged

Fix duplicate process detection #2184

merged 3 commits into from
Feb 11, 2025

Conversation

DL6ER
Copy link
Member

@DL6ER DL6ER commented Feb 9, 2025

What does this implement/fix?

So far, FTL tried to prevent multiple processes from being started on the same machine. However, as the pihole-FTL binary became kind of a Swiss-army-knife offering integrated features such as sqlite3, sqlite3_rsync, sha256sum, gzip, lua, dhcp-discover, and others (like a small TLS certificate managing tool set), it becomes more and more likely that there are long-lived interactive pihole-FTL instances on the machine. They may coexist without any issues, however, once such an interactive process is running, restarting the original pihole-FTL may be prevented under certain circumstances.

An earlier attempt was made in #2183 to allow full concurrent FTL processes but it had its own pitfalls and this PR proposes a different, much simpler solution:

  1. Remove procps scanning for other running pihole-FTL processes due to possible detection (and startup prevention) of legit long-lived other processes like pihole-FTL sqlite3, etc. - rely on the PID file alone
  2. Use individual shared memory objects per FTL process so accidentally running duplicates don't interfere with each other. This can be seen as the fallback solution in case the PID file-based duplicate detection did not work due to security restrictions concerning process detection on the system (see comment in function daemon.c:another_FTL() for further context)

No test changes are required as FTL continues to behave as before.

edit We use /proc/<PID>/status instead of kill(<PID>, 0) for the reasons mentioned below in the comments.


Related issue or feature (if applicable): https://discourse.pi-hole.net/t/i-cant-reach-the-pi-hole-v6-web-interface/75639/27

Pull request in docs with documentation (if applicable): N/A


By submitting this pull request, I confirm the following:

  1. I have read and understood the contributors guide, as well as this entire template. I understand which branch to base my commits and Pull Requests against.
  2. I have commented my proposed changes within the code.
  3. I am willing to help maintain this change if there are issues with it later.
  4. It is compatible with the EUPL 1.2 license
  5. I have squashed any insignificant commits. (git rebase)

Checklist:

  • The code change is tested and works locally.
  • I based my code and PRs against the repositories developmental branch.
  • I signed off all commits. Pi-hole enforces the DCO for all contributions
  • I signed all my commits. Pi-hole requires signatures to verify authorship
  • I have read the above and my PR is ready for review.

DL6ER added 2 commits February 9, 2025 19:04
…possible detection (and startup prevention) of legit long-lived other processes like "pihole-FTL sqlite3", etc.

Signed-off-by: DL6ER <[email protected]>
…running duplicates don't interfere with each other. This can be seen as the fallback solution in case the PID file-based duplicate detection did not work due to security restrictions concerning process deetection on the system (see comment in function daemon.c:another_FTL() for further context)

Signed-off-by: DL6ER <[email protected]>
@DL6ER DL6ER requested a review from a team February 9, 2025 18:14
@DL6ER DL6ER mentioned this pull request Feb 9, 2025
5 tasks
Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw two issues here:

2025-02-09 21:42:01.106 CET [552753M] INFO: All threads joined
2025-02-09 21:42:01.107 CET [552753M] WARNING: Unable to remove PID file: Permission denied
2025-02-09 21:42:01.110 CET [552753M] INFO: Stored 0 API sessions in the database
2025-02-09 21:42:02.088 CET [552753M] INFO: ########## FTL terminated after 1m 13s  (code 0)! ##########


Assuming PID 552815 is dead is a lie, its perfectly viable.

2025-02-09 21:49:38.960 CET [553153M] INFO: Parsed config file /etc/pihole/pihole.toml successfully
!!! WARNING: Writing to FTL's log file failed!
2025-02-09 21:49:38.962 CET [553153M] INFO: PID file contains PID 552815 (dead), we are 553153
!!! WARNING: Writing to FTL's log file failed!
2025-02-09 21:49:38.962 CET [553153M] WARNING: Insufficient permissions to set process priority to -10 (CAP_SYS_NICE required), process priority remains at 0
!!! WARNING: Writing to FTL's log file failed!
2025-02-09 21:49:38.987 CET [553153M] WARNING: Starting pihole-FTL as user nanopi is not recommended
!!! WARNING: Writing to FTL's log file failed!
2025-02-09 21:49:38.988 CET [553153M] WARNING: Unable to write PID to file: Permission denied
!!! WARNING: Writing to FTL's log file failed!
2025-02-09 21:49:38.988 CET [553153M] INFO: PID of FTL process: 553153
!!! WARNING: Writing to FTL's log file failed!

dnsmasq: failed to create listening socket for port 53: Permission denied

This can easily be tested: start one FTL process via systemctl start pihole-FTL and afterwards run pihole-FTL -f

… processes's state as the latter may not be allowed if the other process is running as another user and we don't have CAP_KILL (or am root)

Signed-off-by: DL6ER <[email protected]>
@DL6ER DL6ER requested a review from yubiuser February 10, 2025 17:19
@DL6ER
Copy link
Member Author

DL6ER commented Feb 10, 2025

Yeah, so using kill is not the right way when we don't have CAP_KILL. We can achieve the same using /proc, commit pushed.

Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a duplicated log entry

2025-02-11 17:31:33.268 CET [602209M] INFO: PID file does not exist or not readable
2025-02-11 17:31:33.268 CET [602209M] INFO: No other running FTL process found.
2025-02-11 17:31:33.275 CET [602209M] INFO: PID of FTL process: 602209
2025-02-11 17:31:33.276 CET [602209M] INFO: listening on 0.0.0.0 port 53
2025-02-11 17:31:33.277 CET [602209M] INFO: listening on :: port 53
2025-02-11 17:31:33.282 CET [602209M] INFO: PID of FTL process: 602209
2025-02-11 17:31:33.285 CET [602209M] INFO: Database version is 21

This looks good

2025-02-11 17:34:53.924 CET [602238M] INFO: Parsed config file /etc/pihole/pihole.toml successfully
!!! WARNING: Writing to FTL's log file failed!
2025-02-11 17:34:53.926 CET [602238M] CRIT: pihole-FTL is already running (PID 602209)!
!!! WARNING: Writing to FTL's log file failed!
nanopi@nanopi:~$ 

@DL6ER
Copy link
Member Author

DL6ER commented Feb 11, 2025

There is a duplicated log entry

These two lines normally log two different PIDs - once before and once after forking. The only case where these two are identical is running pihole-FTL -f. Did you test in docker?

@yubiuser
Copy link
Member

These two lines normally log two different PIDs - once before and once after forking. The only case where these two are identical is running pihole-FTL -f. Did you test in docker?

No. Bare metal. This is a pure systemctl restart pihole-FTL log snippet. (Looking at htop the systemd service we use starts FTL as pihole-FTL -f)

@rdwebdesign
Copy link
Member

Maybe the PR description should be adjusted to match the final solution using /proc/<PID>/status instead of a call to kill(<PID>, 0).

@DL6ER DL6ER merged commit d70af72 into development Feb 11, 2025
19 checks passed
@DL6ER DL6ER deleted the fix/duplicate_processes branch February 11, 2025 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants