
[Feature] Basic systemd service monitoring#1153

Merged
henrygd merged 3 commits into henrygd:1153-systemd-services from smtucker:systemd
Nov 10, 2025

Conversation

@smtucker
Contributor

@smtucker smtucker commented Sep 8, 2025

This introduces a new feature to monitor systemd services on Linux hosts. The agent now collects service statuses and sends them to the hub, where they are displayed on the system detail page and summarized on the main dashboard. #722

Key Changes:

  • Agent:

    • The agent now collects systemd service names and their statuses (active, inactive, failed, etc.).
    • It gracefully handles permissions by attempting to connect to the system-wide systemd instance first, then falling back to the user-level instance if necessary. This ensures functionality whether the agent is run as root or a standard user.
  • System Detail Page:

    • A new "Systemd Services" card displays a detailed list of all services.
    • The view is collapsed by default, showing only failed services for immediate attention. An expander reveals the full list.
    • Services are sorted to prioritize failed ones, and each status is color-coded with an indicator dot for quick visual assessment.
    • A filter allows users to search for specific services by name.
  • Dashboard:

    • The main systems table now includes a "Services" column.
    • This column displays the number of failed services in red or a green checkmark if all services are running correctly, providing an at-a-glance health check.
(screenshots: system detail page and dashboard)
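The failed-first ordering described above can be sketched in Go. This is a minimal illustration, not the PR's actual code; the `Service` struct and status strings are assumptions for the example:

```go
package main

import (
	"fmt"
	"sort"
)

// Service is a hypothetical, simplified view of one systemd unit
// as collected by the agent (field names are assumptions).
type Service struct {
	Name   string
	Status string // "active", "inactive", "failed", ...
}

// sortServices orders failed units first so they surface at the top
// of the collapsed card, then sorts the rest alphabetically by name.
func sortServices(svcs []Service) {
	sort.SliceStable(svcs, func(i, j int) bool {
		fi, fj := svcs[i].Status == "failed", svcs[j].Status == "failed"
		if fi != fj {
			return fi // failed units sort before everything else
		}
		return svcs[i].Name < svcs[j].Name
	})
}

func main() {
	svcs := []Service{
		{"sshd", "active"},
		{"nginx", "failed"},
		{"cron", "inactive"},
	}
	sortServices(svcs)
	fmt.Println(svcs[0].Name) // nginx sorts first because it failed
}
```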

@henrygd
Owner

henrygd commented Sep 9, 2025

Thanks, I'll pull and check it out 👍

@henrygd
Owner

henrygd commented Sep 9, 2025

Sorry, I was reorganizing things the past few days and had to force push to resolve the conflicts.

@tecosaur

After watching #510 with interest, it's great to see this 😀

As someone who's running a bunch of apps/services with systemd instead of docker (on NixOS), I'd be interested to know if individual services can be shown/monitored in the same way as containers shown in the screenshot at the start of the readme (with status, CPU usage, memory usage, etc.)?

Some way of configuring particular services to be shown/monitored in the same way as containers currently are would be fantastic.

@smtucker
Contributor Author

> After watching #510 with interest, it's great to see this 😀
>
> As someone who's running a bunch of apps/services with systemd instead of docker (on NixOS), I'd be interested to know if individual services can be shown/monitored in the same way as containers shown in the screenshot at the start of the readme (with status, CPU usage, memory usage, etc.)?
>
> Some way of configuring particular services to be shown/monitored in the same way as containers currently are would be fantastic.

Cool idea! This doesn't add resource monitoring for services, just an 'at a glance' for seeing if services have failed. It's definitely possible to add grabbing the memory and CPU usage since the agent here has access to the whole dbus properties context. However, I would be interested to know what @henrygd thinks about this and how to best configure specifying which services to do that for.

I'll have some time next week to play around with that.

@svenvg93
Collaborator

> Cool idea! This doesn't add resource monitoring for services, just an 'at a glance' for seeing if services have failed. It's definitely possible to add grabbing the memory and CPU usage since the agent here has access to the whole dbus properties context. However, I would be interested to know what @henrygd thinks about this and how to best configure specifying which services to do that for.
>
> I'll have some time next week to play around with that.

Just my two cents—if you decide to add it (which I’d love!), maybe consider adding a separate systemd tab, similar to what I did for the Docker section in this PR. That way it’s separated out, and we don’t end up with an endless scrolling page of charts 😆

@henrygd
Owner

henrygd commented Sep 11, 2025

Agreed, I think we need to start splitting things up into different pages. Eventually we should have the following:

I definitely want to pull resource usage from the systemd services if possible, so it's more in line with the Docker monitoring.

I also want to start putting the new data into their own tables / rows instead of using the JSON blobs. That way we can actually query the individual items effectively. For example, if you want to display the top 50 services across all systems using the most memory. This is something I'll change on my end, you don't need to do anything right now in the PRs.

@chrisdeeming

Looking forward to this. Good work @smtucker and @henrygd

@christophdb

I am so happy to see SMART data on your upcoming feature list. That is awesome.

@tecosaur

> I think we need to start splitting things up into different pages. Eventually we should have the following…

@henrygd that sounds great. Would you consider also providing the option to hide unused/less relevant pages?

For example, I'd want to hide the container and proxomox/vms pages, and I imagine some people who are running everything through containers might want to hide the systemd page.

@smtucker
Contributor Author

smtucker commented Sep 18, 2025

I've had a chance to spend some more time on this. While adding the service metrics, I refactored the implementation to better align with how container statistics are collected.

Here’s a summary of the changes:

  • Collects CPU and memory usage for each systemd service.
  • Adds a systemdManager to the agent.
  • Sends service statistics in CombinedData instead of SystemData.
  • Stores the records in a dedicated systemd_stats collection in the database.
  • The systemd services table now displays and allows sorting by CPU and memory usage.
(screenshot attached)

I'll likely wait until #928 is merged before doing any more frontend work on this. I'm looking forward to seeing how dividing the UI into separate pages fleshes things out and would like to make it easier for this to be consistent.

@henrygd
Owner

henrygd commented Sep 18, 2025

Awesome, thanks @smtucker! Really appreciate your work.

Just so you're aware of the timeline, I'm trying to wrap up with a big feature that includes some housekeeping to hopefully make the whole system a bit more flexible.

It will likely be another week or two before I will be able to merge this and the other PRs mentioned above.

@tecosaur Yes, we'll probably hide unused pages by default and add config options to exclude things you don't want to monitor.

@tecosaur

tecosaur commented Nov 4, 2025

Thanks for the updates, Shelby and Hank. Now that #928 has been merged, I'm hoping this PR is able to follow.

It looks like this work is pretty much good to go, is that right?

@henrygd
Owner

henrygd commented Nov 4, 2025

I'll get to this shortly, just have a few small things to wrap up.

@mufeedali

Would this add support for alerts when a systemd service goes down?

@henrygd
Owner

henrygd commented Nov 6, 2025

@mufeedali Not quite yet. We will look into adding that after this is merged.

@henrygd
Owner

henrygd commented Nov 6, 2025

Working on this now.

For performance reasons I think I'm going to change this to collect every 10 min instead of every 1 min.

Then I'm thinking we can have these columns for the services table:

Name | Status | CPU (10m avg) | CPU (24h max) | Memory | Memory (24h max) | Updated

We can also add a page similar to /containers that shows all services from all systems, though this may be a huge number of services.

@tecosaur

tecosaur commented Nov 7, 2025

> For performance reasons I think I'm going to change this to collect every 10 min instead of every 1 min.

What about live monitoring using the systemd dbus API? https://www.freedesktop.org/wiki/Software/systemd/dbus/ I see that Go even has conveniences around this in the dbus package: https://pkg.go.dev/github.com/coreos/go-systemd/dbus#Conn.SubscribeUnits

I'd think this would be worth having for unit status at least.

@henrygd henrygd mentioned this pull request Nov 7, 2025
@henrygd
Owner

henrygd commented Nov 7, 2025

That's what we're using. I improved perf a bit so it may be fine to collect every minute. I'll deploy and see how she goes.

@henrygd henrygd changed the base branch from main to 1153-systemd-services November 10, 2025 20:29
@henrygd henrygd merged commit 40b3951 into henrygd:1153-systemd-services Nov 10, 2025
henrygd added a commit that referenced this pull request Nov 10, 2025
Co-authored-by: Shelby Tucker <shelby.tucker@gmail.com>
@henrygd
Owner

henrygd commented Nov 10, 2025

This is merged now, thanks very much!

I did change it to collect every 10 minutes, otherwise I was seeing around 5x CPU usage for the agent.

Should have a release out in the next few days, just need to write the docs and test a little further.

(screenshots attached)

@Justinzobel

This is amazing, thank you very much!

@henrygd
Owner

henrygd commented Nov 13, 2025

I just realized I forgot to include the column in the 'All Systems' table. This will be added soon.

@FixNinja

Is there an environment variable we can use to hide the service we don't want to monitor?

@henrygd
Owner

henrygd commented Nov 13, 2025

@FixNinja May I ask why you want to do this?

In the next release we can let you supply your own patterns for which services you want to monitor, so you could do something like SERVICE_PATTERNS=*foo*,*bar*.

I think you could exclude something with SERVICE_PATTERNS=[!foo]*. Would that work for you?
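A sketch of how comma-separated glob patterns like these could be applied on the agent side, using Go's standard `path.Match`. This is an illustration only, not the actual implementation; the real agent may use a different glob library, and note that `path.Match` negates character classes with `^` rather than the shell-style `!` shown above:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesAny reports whether a service name matches at least one of the
// comma-separated glob patterns (e.g. "*foo*,*bar*"). An empty pattern
// list matches everything, mirroring "monitor all" as the default.
func matchesAny(patterns, name string) bool {
	if patterns == "" {
		return true
	}
	for _, p := range strings.Split(patterns, ",") {
		if ok, _ := path.Match(strings.TrimSpace(p), name); ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesAny("*foo*,*bar*", "myfoo.service")) // true
	fmt.Println(matchesAny("*foo*,*bar*", "sshd.service"))  // false
}
```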

henrygd added a commit that referenced this pull request Nov 13, 2025
@chrisdeeming

> @FixNinja May I ask why you want to do this?
>
> In the next release we can let you supply your own patterns for which services you want to monitor, so you could do something like SERVICE_PATTERNS=*foo*,*bar*.
>
> I think you could exclude something with SERVICE_PATTERNS=[!foo]*. Would that work for you?

I'd appreciate limited service monitoring too. Primarily so I can focus on things that represent issues for the services we run.

I know it has been mentioned already but alerts for those services on failure is going to be the most important part of this for us.

This is excellent work so far though and I love how this product is shaping up.

@OM-NATH

OM-NATH commented Nov 13, 2025

> @FixNinja May I ask why you want to do this?
>
> In the next release we can let you supply your own patterns for which services you want to monitor, so you could do something like SERVICE_PATTERNS=*foo*,*bar*.
>
> I think you could exclude something with SERVICE_PATTERNS=[!foo]*. Would that work for you?

Perfect, that's all I was looking for. The idea is that a server may have dozens of services running, and only a few of those are really important. SERVICE_PATTERNS will give us the flexibility to choose which ones we want to monitor. One small suggestion: it looks like right now we are pulling all the services, including exited ones. It would be better if we only pull the running (auto start) ones to reduce clutter.

@smtucker
Contributor Author

smtucker commented Nov 13, 2025

> Perfect, that's all I was looking for. The idea is that a server may have dozens of services running, and only a few of those are really important. SERVICE_PATTERNS will give us the flexibility to choose which ones we want to monitor. One small suggestion: it looks like right now we are pulling all the services, including exited ones. It would be better if we only pull the running (auto start) ones to reduce clutter.

I totally get why some people only want to monitor specific things, and for those users, having SERVICE_PATTERNS to specify which services to show in the frontend makes total sense for that use case.

However, regarding the suggestion to only pull running units by default, I personally feel it should still default to showing all the data received from the systemd API, including non-running units.

My personal thinking:

  • We only know if the unit is active or not after getting its status from the systemd API, so we already have the information. Filtering it in the backend at that point doesn't really save performance.
  • The frontend already allows you to filter if you want to reduce clutter.
  • It seems more fluid to manually filter if desired, than to require a user to manually enable everything if that's what they want.

@henrygd Thank you very much for considering, improving, and merging this pull request!
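The view-level filtering smtucker argues for (keep all collected data, narrow the display on demand) could look something like this at the data level. A hypothetical sketch, not the PR's code; the `Unit` struct and field names are assumptions:

```go
package main

import "fmt"

// Unit is a hypothetical, simplified record for one systemd unit.
type Unit struct {
	Name        string
	ActiveState string // "active", "inactive", "failed", ...
}

// onlyActive returns just the currently active units, the kind of
// optional display filter discussed above. The full slice is left
// untouched, so nothing collected by the agent is discarded.
func onlyActive(units []Unit) []Unit {
	out := make([]Unit, 0, len(units))
	for _, u := range units {
		if u.ActiveState == "active" {
			out = append(out, u)
		}
	}
	return out
}

func main() {
	units := []Unit{
		{"sshd", "active"},
		{"old-job", "inactive"},
		{"nginx", "failed"},
	}
	fmt.Println(len(onlyActive(units))) // 1
}
```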

@Justinzobel

Justinzobel commented Nov 14, 2025

My 2 cents, monitor everything by default and implement a way to configure what services to monitor (could be done in 2 ways):

Option 1:
Add it in Settings in the web UI
This would list every unique systemd service by name. Then the user can select which ones are important and they would be monitored on every instance. Basically a list of all names in one column on the left, then the user can click Add to move it to the monitored column (right).

Option 2:
Config file on the agent machines /etc/beszel-agent.conf with SYSTEMD_SERVICES='apache2;mysql;sshd'

@Darkrock04

> This introduces a new feature to monitor systemd services on Linux hosts. The agent now collects service statuses and sends them to the hub, where they are displayed on the system detail page and summarized on the main dashboard. […]

It is showing blank in my dashboard, no ✅ sign.

@jappi00

jappi00 commented Nov 17, 2025

@Darkrock04 can confirm that.

@henrygd
Owner

henrygd commented Nov 17, 2025

@Darkrock04 @jappi00 Are you on 0.16.1? I forgot to include the services column in 0.16.0.

@jappi00

jappi00 commented Nov 17, 2025

Yeah, I'm on 0.16.1.

Maybe it is Docker related? The agent runs in Docker.

@EvilBMP

EvilBMP commented Nov 20, 2025

I also have the problem, that agents with version 0.16.1 don't send any systemd service information anymore. Agents still on 0.16.0 send systemd service data.

My dashboard is already on 0.16.1 - agents differ from 0.15.2 to 0.16.1 - everything running via Docker.

@henrygd
Owner

henrygd commented Nov 24, 2025

Do the agent logs show any warnings or errors?

@EvilBMP

EvilBMP commented Nov 25, 2025

Yes, clients that have been upgraded to 0.16.1 show the known error/warning mentioned by other users above:

WARN Error connecting to systemd err="dial unix /var/run/dbus/system_bus_socket: connect: no such file or directory" ref=https://beszel.dev/guide/systemd

Even if I downgrade to 0.16.0 again (with complete docker prune between), the warning persists! Clients that weren't upgraded yet and are on 0.16.0 from a former version 0.15.x are working as expected.

Docker Compose Configs are the same everywhere - only SSH Key and Token vary:

services:
  beszel-agent:
    image: "henrygd/beszel-agent"
    container_name: "beszel-agent"
    restart: unless-stopped
    network_mode: host
    security_opt:
      - apparmor:unconfined
    volumes:
      - ./beszel_agent_data:/var/lib/beszel-agent
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /run/dbus/system_bus_socket:/run/dbus/system_bus_socket:ro
      # monitor other disks / partitions by mounting a folder in /extra-filesystems
      # - /mnt/disk/.beszel:/extra-filesystems/sda1:ro
    environment:
      LISTEN: 45876
      KEY: "..."
      TOKEN: ...
      HUB_URL: ...

I really can't explain this behavior :-\

@henrygd
Owner

henrygd commented Nov 25, 2025

Try changing the mount point to use /var/run instead of /run:

volumes:
    - /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket:ro

@jappi00

jappi00 commented Nov 27, 2025

Hello @henrygd,

I added the security_opt and the volume and it works now. I would suggest adding that information to the docs.

@henrygd
Owner

henrygd commented Nov 27, 2025

@jappi00 We have that documented on the Systemd Services page here: https://beszel.dev/guide/systemd

@RalphPungaKronbergs

RalphPungaKronbergs commented Dec 11, 2025

Hello,

using the env var SERVICE_PATTERNS to limit the services to monitor works like a charm.

What I would like to know is whether an alarm/notification is triggered if a service stops?

Thx,
Ralph

@henrygd
Owner

henrygd commented Dec 12, 2025

Service alerts have not been implemented yet.

@xd003

xd003 commented Jan 7, 2026

I am running agent v0.17.0
The agent docker volume contains both of the following

- /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket:ro
- /var/run/systemd/private:/var/run/systemd/private:ro

Logs don't show any errors, but I don't see any systemd information on my dashboard.
