docker healthcheck creates zombie processes if TLS is enabled with self signed certificate #7463

hendrik1120 · 2024-06-28T13:41:32Z

Version

v4.38.9

Deployment Method

Docker

Reverse Proxy

Traefik

Reverse Proxy Version

3.0.3

Description

Hi,

I would like to follow up on my Discord message and properly document this issue as a bug.

Although this issue is not within the Authelia application itself, it is within the Authelia codebase and is also present by default in every Docker image.

If Authelia has been configured with the TLS option enabled (see below), the Docker healthcheck command using wget will create a zombie process at every execution. The default Docker healthcheck interval is set to 30 seconds, which will create 2,880 processes per day (24 hours * 60 minutes/hour * 2 processes/minute). When using Debian as the Docker host, the process limit is around 7,000 processes per container. Therefore, the Authelia container will become unhealthy after about 2.5 days, leading Traefik to drop requests to this container.
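
The estimate above can be reproduced with a few lines of shell. This is a minimal sketch that assumes exactly one leaked process per healthcheck run and uses the figures from this report (30-second interval, a limit of roughly 7,000 processes); the function name is made up for illustration.

```shell
#!/bin/sh
# Rough estimate of how long a container survives when every healthcheck
# run leaks one zombie process (assumption: exactly one leak per run).
# $1 = healthcheck interval in seconds, $2 = PID limit of the container.
hours_until_pid_limit() {
  interval_s=$1
  pid_limit=$2
  checks_per_day=$(( 86400 / interval_s ))   # 2880 for a 30 s interval
  # Integer arithmetic, so the result is rounded down to whole hours.
  echo $(( pid_limit * 24 / checks_per_day ))
}

hours_until_pid_limit 30 7000   # prints 58, i.e. a bit under 2.5 days
```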

In the rare case where the host OS (Unraid) does not limit the processes, the host will eventually run out of PIDs and crash.

Reproduction

This issue can be replicated using the authelia docker image with the TLS option enabled in the configuration.yml.

server:
  tls:
    key: '/config/private.pem'
    certificate: '/config/public.crt'

The key and certificate can be created using the command from the docs.

authelia crypto certificate rsa generate --common-name example.com --directory /config

To reproduce the issue faster, I suggest shortening the healthcheck interval using docker-compose:

    healthcheck:
      interval: 0.01s

To check the number of processes created, I ran docker stats on the host or ps inside the container.
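
For a number that counts only the leaked processes, the zombie state can be read straight from `/proc`. The sketch below is an assumption about how one might do this, not the method used in this report; `count_zombies` is a made-up helper, and it relies only on the `State:` line that Linux exposes in `/proc/<pid>/status`.

```shell
#!/bin/sh
# Count processes in the zombie state by scanning /proc/<pid>/status.
# $1 = proc root (defaults to /proc; overridable for testing).
count_zombies() {
  proc_root=${1:-/proc}
  count=0
  for status_file in "$proc_root"/[0-9]*/status; do
    # Skip the unexpanded glob and processes that exited meanwhile.
    [ -r "$status_file" ] || continue
    if grep -q '^State:[[:space:]]*Z' "$status_file" 2>/dev/null; then
      count=$(( count + 1 ))
    fi
  done
  echo "$count"
}

count_zombies
```

Run inside the container (for example through `docker exec`), a count that grows by one per healthcheck interval would match the behaviour described above. On the host, `docker stats --no-stream` shows the container's total in its PIDS column.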

Expectations

The container healthcheck should not fail after a certain period of time.

Configuration (Authelia)

---
theme: dark

storage:
  encryption_key: 'a_very_important_secret'
  local:
    path: '/config/db.sqlite3'

log:
  level: trace

server:
  tls:
    key: '/config/private.pem'
    certificate: '/config/public.crt'

session:
  secret: 'insecure_session_secret'
  cookies:
    - domain: 'example.com'
      authelia_url: 'https://auth.example.com'
      default_redirection_url: 'https://www.example.com'
      name: 'authelia_session'

notifier:
  disable_startup_check: false
  filesystem:
    filename: '/config/notification.txt'

access_control:
  default_policy: 'one_factor'

identity_validation:
  reset_password:
    jwt_lifespan: '5 minutes'
    jwt_algorithm: 'HS256'
    jwt_secret: 'insecure_secret'

authentication_backend:
  file:
    path: '/config/users.yml'
    watch: false
    search:
      email: false
      case_insensitive: false
    password:
      algorithm: 'argon2'
...

Build Information

Last Tag: v4.38.9
State: tagged clean
Branch: v4.38.9
Commit: 2798576ee25a56fd4c14814acd087b6e92f3978b
Build Number: 30034
Build OS: linux
Build Arch: amd64
Build Compiler: gc
Build Date: Sun, 16 Jun 2024 19:47:16 +1000
Extra: 

Go:
    Version: go1.22.2
    Module Path: github.com/authelia/authelia/v4
    Executable Path: github.com/authelia/authelia/v4/cmd/authelia

Logs (Authelia)

time="2024-06-28T15:39:34+02:00" level=debug msg="Loaded Configuration Sources" caller="github.com/authelia/authelia/v4/internal/commands/context.go:266 (*CmdCtx).LogConfigure" files="[/config/configuration.yml]" filters="[template]"
time="2024-06-28T15:39:34+02:00" level=debug msg="Logging Initialized" caller="github.com/authelia/authelia/v4/internal/commands/context.go:267 (*CmdCtx).LogConfigure" fields.level=trace file= format= keep_stdout=false
time="2024-06-28T15:39:34+02:00" level=debug msg="Process user information" caller="github.com/authelia/authelia/v4/internal/commands/context.go:307 (*CmdCtx).LogProcessCurrentUserRunE" gid=99 uid=99
time="2024-06-28T15:39:34+02:00" level=warning msg="Configuration: access_control: no rules have been specified so the 'default_policy' of 'one_factor' is going to be applied to all requests" caller="github.com/authelia/authelia/v4/internal/commands/context.go:317 (*CmdCtx).ConfigValidateLogRunE"
time="2024-06-28T15:39:34+02:00" level=info msg="Authelia v4.38.9 is starting" caller="github.com/authelia/authelia/v4/internal/commands/root.go:62 (*CmdCtx).RootRunE"
time="2024-06-28T15:39:34+02:00" level=info msg="Log severity set to trace" caller="github.com/authelia/authelia/v4/internal/logging/logger.go:93 setLevelStr"
time="2024-06-28T15:39:34+02:00" level=trace msg="Starting scan of directory  for certificates" caller="github.com/authelia/authelia/v4/internal/utils/crypto.go:350 NewX509CertPool"
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:105 doStartupChecks" provider=storage
time="2024-06-28T15:39:34+02:00" level=info msg="Storage schema is being checked for updates" caller="github.com/authelia/authelia/v4/internal/storage/sql_provider.go:338 (*SQLProvider).StartupCheck"
time="2024-06-28T15:39:34+02:00" level=info msg="Storage schema is already up to date" caller="github.com/authelia/authelia/v4/internal/storage/sql_provider.go:356 (*SQLProvider).StartupCheck"
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:112 doStartupChecks" provider=storage
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:115 doStartupChecks" provider=user
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:122 doStartupChecks" provider=user
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:125 doStartupChecks" provider=notification
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:132 doStartupChecks" provider=notification
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:135 doStartupChecks" provider=ntp
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:146 doStartupChecks" provider=ntp
time="2024-06-28T15:39:34+02:00" level=trace msg="Starting Services" caller="github.com/spf13/[email protected]/command.go:985 (*Command).execute"
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:222 handleRouter" implementation=Legacy methods="*" path_prefix=/api/verify
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:242 handleRouter" implementation=AuthRequest methods="[GET HEAD]" path=/api/authz/auth-request
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:233 handleRouter" implementation=ExtAuthz methods="*" path_prefix=/api/authz/ext-authz
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:242 handleRouter" implementation=ForwardAuth methods="[GET HEAD]" path=/api/authz/forward-auth
time="2024-06-28T15:39:34+02:00" level=trace msg="Service Loaded" caller="github.com/authelia/authelia/v4/internal/commands/services.go:319 servicesRun" server=main service=server
time="2024-06-28T15:39:34+02:00" level=debug msg="Create Server Service (metrics) skipped" caller="github.com/authelia/authelia/v4/internal/commands/services.go:318 servicesRun"
time="2024-06-28T15:39:34+02:00" level=info msg="Startup complete" caller="github.com/authelia/authelia/v4/internal/commands/root.go:94 (*CmdCtx).RootRunE"
time="2024-06-28T15:39:34+02:00" level=info msg="Listening for TLS connections on '[::]:9091' path '/'" caller="github.com/authelia/authelia/v4/internal/commands/services.go:137 (*ServerService).Run" server=main service=server
time="2024-06-28T15:39:39+02:00" level=trace msg="Request hit" caller="github.com/authelia/authelia/v4/internal/middlewares/log_request.go:12 handleRouter.LogRequest.func40" method=GET path=/api/health remote_ip="::1"
time="2024-06-28T15:39:39+02:00" level=trace msg="Replied (status=200)" caller="github.com/authelia/authelia/v4/internal/middlewares/log_request.go:16 handleRouter.LogRequest.func40" method=GET path=/api/health remote_ip="::1"

Logs (Proxy / Application)

No response

Documentation

https://www.authelia.com/configuration/miscellaneous/server/#tls
https://www.authelia.com/overview/security/measures/#mutual-tls

Pre-Submission Checklist

I agree to follow the Code of Conduct
This is a bug report and not a support request
I have read the security policy and this bug report is not a security issue or security related issue
I have either included the complete configuration file or I am sure it's unrelated to the configuration
I have either included the complete debug / trace logs or the output of the build-info command if the logs are not relevant
I have provided all of the required information in full with the only alteration being reasonable sanitization in accordance with the Troubleshooting Sanitization reference guide
I have checked for related proxy or application logs and included them if available
I have checked for related issues and checked the documentation

james-d-elliott · 2024-07-06T00:03:07Z

Looks to be a bug with busybox wget's ssl_client. This should fix it #7498 (though it'll add ~4MB to our image). Please give it a shot and let me know.

hendrik1120 · 2024-07-06T07:39:03Z

Can confirm, #7498 does fix the issue.

LimeTech (creator of UNRAID) is also working on a fix for their docker implementation to prevent process leaks from crashing the host.

james-d-elliott · 2024-07-06T07:53:59Z

Great to know. Can't really see why the busybox wget has not been fixed. The issue occurs with any TLS request it looks like.

hendrik1120 · 2024-07-06T07:59:56Z

Yeah, just found this: https://bugs.busybox.net/show_bug.cgi?id=15967

Adding --init to the docker run command has been suggested, but I really don't think that is an acceptable solution.
Might be some weird interaction with Docker, as using --init does actually fix the issue as well.
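
For anyone who wants to try the `--init` workaround, a hedged sketch of the invocation follows. `--init` and `--pids-limit` are standard `docker run` flags; the image tag, container name, volume path, and the limit of 200 are assumptions chosen for illustration, and the `run` wrapper only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assumed values, adjust to your setup.
IMAGE=${IMAGE:-authelia/authelia:4.38.9}
CONFIG_DIR=${CONFIG_DIR:-/srv/authelia/config}

# --init runs a small init process as PID 1 that forwards signals and
# reaps orphaned processes, so zombies no longer pile up.
# --pids-limit caps the damage if processes leak anyway (200 is a guess).
run docker run -d --name authelia \
  --init \
  --pids-limit 200 \
  -v "$CONFIG_DIR:/config" \
  "$IMAGE"
```

In docker-compose the corresponding service settings are `init: true` and `pids_limit`.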

This should prevent the issues with the broken busybox wget by installing the musl wget. It should be noted that while this is labelled as a fix, it does not fix the actual issue, which is an upstream bug in the busybox wget TLS client where it appears to leave zombie processes on any TLS request. This is just a workaround.

Fixes #7463

Signed-off-by: James Elliott <[email protected]>

james-d-elliott · 2024-07-06T08:21:50Z

Hmm, I'm kind of inclined to say it may be the correct solution and I may revert the merged one, and instead update the docs. Considering the only instance where this has been discovered is an edge case (most people do not run the container with TLS but instead use a proxy to perform termination), it'd make more sense to do that than to force everyone to download a larger image.

Init runs tini, which has been included with Docker since 1.13. I will do some more research I think.

hendrik1120 · 2024-07-06T11:54:31Z

Yeah, the whole thing kinda is an edge case on an edge case and is very unlikely to cause these massive problems to anyone else.
I am just using TLS between authelia and traefik. It's probably overkill, but “For the best security protection, configuration with TLS is highly recommended. …” from the docs.

I agree with you, but if someone forgets to set this uncommon Docker setting, it could be really hard to troubleshoot. I am not sure how predictably the Authelia app itself fails when it is PID-limited, which might cause more edge cases (even security-related ones, maybe?). It also depends on the OS and whether the docker compose pids_limit is set.
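
Whether a limit applies can be checked from inside the container through the pids cgroup controller. The sketch below assumes cgroup v2, where `pids.max` and `pids.current` sit in `/sys/fs/cgroup`; on cgroup v1 they are typically under `/sys/fs/cgroup/pids`. `pid_headroom` is a made-up helper name.

```shell
#!/bin/sh
# Report how many more processes the current cgroup may create.
# $1 = cgroup directory (defaults to the cgroup v2 mount point).
pid_headroom() {
  cgroup_dir=${1:-/sys/fs/cgroup}
  if [ ! -r "$cgroup_dir/pids.max" ] || [ ! -r "$cgroup_dir/pids.current" ]; then
    echo "unknown"
    return 0
  fi
  max=$(cat "$cgroup_dir/pids.max")
  current=$(cat "$cgroup_dir/pids.current")
  if [ "$max" = "max" ]; then
    # No limit configured: leaked processes eat into the host's PID space.
    echo "unlimited"
  else
    echo $(( max - current ))
  fi
}

pid_headroom
```

A result of `unlimited` corresponds to the Unraid case described in the report, where the host itself eventually runs out of PIDs.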

james-d-elliott · 2024-07-07T02:04:40Z

I'll update the docs on that to clarify secure networks vs insecure ones, i.e. an iptables network on a host is significantly more secure, as an attacker would likely have to compromise the host with root-equivalent access as well to compromise that network, making the effort inconsequential in the vast majority of cases since they'd be able to alter configs themselves if they have root-equivalent access.

hendrik1120 · 2024-08-28T08:30:32Z

Resolved in v4.38.10

hendrik1120 added the priority/4/normal, status/needs-triage, and type/bug/unconfirmed labels Jun 28, 2024

james-d-elliott mentioned this issue Jul 6, 2024

fix: busybox wget zombie ssl client #7498

Merged

james-d-elliott closed this as completed in #7498 Jul 6, 2024

james-d-elliott reopened this Jul 6, 2024

hendrik1120 closed this as completed Aug 28, 2024
