docker healthcheck creates zombie processes if TLS is enabled with self signed certificate #7463

Closed
8 tasks done
hendrik1120 opened this issue Jun 28, 2024 · 8 comments · Fixed by #7498
Labels
priority/4/normal Normal priority items priority/6/very-low Very Low priority items status/investigating Needs investigation type/bug/third-party Bugs with third party software, not with Authelia itself.

Comments

@hendrik1120
Contributor

hendrik1120 commented Jun 28, 2024

Version

v4.38.9

Deployment Method

Docker

Reverse Proxy

Traefik

Reverse Proxy Version

3.0.3

Description

Hi,

I would like to follow up on my Discord message and properly document this issue as a bug.

Although this issue is not within the Authelia application itself, it is within the Authelia codebase and is also present by default in every Docker image.

If Authelia has been configured with the TLS option enabled (see below), the Docker healthcheck command using wget will create a zombie process at every execution. The default Docker healthcheck interval is set to 30 seconds, which will create 2,880 processes per day (24 hours * 60 minutes/hour * 2 processes/minute). When using Debian as the Docker host, the process limit is around 7,000 processes per container. Therefore, the Authelia container will become unhealthy after about 2.5 days, leading Traefik to drop requests to this container.
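The arithmetic above can be reproduced with a small sketch. It assumes exactly one leaked process per healthcheck run and that every check fires on schedule; the function names are illustrative and not part of Authelia or Docker.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def leaked_per_day(interval_seconds: float, leaked_per_check: int = 1) -> float:
    """Zombie processes accumulated per day for a given healthcheck interval."""
    return SECONDS_PER_DAY / interval_seconds * leaked_per_check


def days_until_limit(pid_limit: int, interval_seconds: float,
                     leaked_per_check: int = 1) -> float:
    """Days until the leaked processes alone reach the container's process limit."""
    return pid_limit / leaked_per_day(interval_seconds, leaked_per_check)


if __name__ == "__main__":
    # Default 30 s interval and the roughly 7,000-process limit quoted above.
    print(leaked_per_day(30))
    print(round(days_until_limit(7000, 30), 2))
```

With the 30-second default this gives 2,880 leaked processes per day and a little under 2.5 days before a 7,000-process limit is reached, matching the estimate in the report.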

In the rare case where the host OS (Unraid) does not limit the processes, the host will eventually run out of PIDs and crash.

Reproduction

This issue can be replicated using the authelia docker image with the TLS option enabled in the configuration.yml.

server:
  tls:
    key: '/config/private.pem'
    certificate: '/config/public.crt'

The key and certificate can be created using the command from the docs.

authelia crypto certificate rsa generate --common-name example.com --directory /config

To reproduce the issue faster, I suggest shortening the healthcheck interval using docker-compose:

    healthcheck:
      interval: 0.01s

To check the number of processes created, I ran docker stats on the host or ps inside the container.
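As an alternative to reading the ps output by eye, zombies can be counted programmatically. The sketch below is only an illustration, assuming a Linux /proc filesystem and a Python interpreter (the Authelia image is not assumed to ship one, so it would typically run on the host, where it counts zombies from all containers). The function name count_zombies is hypothetical.

```python
import os


def count_zombies(proc_root: str = "/proc") -> int:
    """Count processes whose state in <proc_root>/<pid>/stat is 'Z' (zombie)."""
    zombies = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                stat = fh.read()
        except OSError:
            continue  # process vanished between listdir() and open()
        # Layout is "pid (comm) state ..."; comm may itself contain spaces
        # or ')', so split on the last closing parenthesis.
        fields = stat.rsplit(")", 1)[-1].split()
        if fields and fields[0] == "Z":
            zombies += 1
    return zombies


if __name__ == "__main__":
    print(count_zombies())
```

Running it repeatedly while the healthcheck fires should show the count growing by roughly one per check if the leak is present.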

Expectations

The container healthcheck should not fail after a certain period of time.

Configuration (Authelia)

---
theme: dark

storage:
  encryption_key: 'a_very_important_secret'
  local:
    path: '/config/db.sqlite3'

log:
  level: trace

server:
  tls:
    key: '/config/private.pem'
    certificate: '/config/public.crt'

session:
  secret: 'insecure_session_secret'
  cookies:
    - domain: 'example.com'
      authelia_url: 'https://auth.example.com'
      default_redirection_url: 'https://www.example.com'
      name: 'authelia_session'

notifier:
  disable_startup_check: false
  filesystem:
    filename: '/config/notification.txt'

access_control:
  default_policy: 'one_factor'

identity_validation:
  reset_password:
    jwt_lifespan: '5 minutes'
    jwt_algorithm: 'HS256'
    jwt_secret: 'insecure_secret'

authentication_backend:
  file:
    path: '/config/users.yml'
    watch: false
    search:
      email: false
      case_insensitive: false
    password:
      algorithm: 'argon2'
...

Build Information

Last Tag: v4.38.9
State: tagged clean
Branch: v4.38.9
Commit: 2798576ee25a56fd4c14814acd087b6e92f3978b
Build Number: 30034
Build OS: linux
Build Arch: amd64
Build Compiler: gc
Build Date: Sun, 16 Jun 2024 19:47:16 +1000
Extra: 

Go:
    Version: go1.22.2
    Module Path: github.com/authelia/authelia/v4
    Executable Path: github.com/authelia/authelia/v4/cmd/authelia

Logs (Authelia)

time="2024-06-28T15:39:34+02:00" level=debug msg="Loaded Configuration Sources" caller="github.com/authelia/authelia/v4/internal/commands/context.go:266 (*CmdCtx).LogConfigure" files="[/config/configuration.yml]" filters="[template]"
time="2024-06-28T15:39:34+02:00" level=debug msg="Logging Initialized" caller="github.com/authelia/authelia/v4/internal/commands/context.go:267 (*CmdCtx).LogConfigure" fields.level=trace file= format= keep_stdout=false
time="2024-06-28T15:39:34+02:00" level=debug msg="Process user information" caller="github.com/authelia/authelia/v4/internal/commands/context.go:307 (*CmdCtx).LogProcessCurrentUserRunE" gid=99 uid=99
time="2024-06-28T15:39:34+02:00" level=warning msg="Configuration: access_control: no rules have been specified so the 'default_policy' of 'one_factor' is going to be applied to all requests" caller="github.com/authelia/authelia/v4/internal/commands/context.go:317 (*CmdCtx).ConfigValidateLogRunE"
time="2024-06-28T15:39:34+02:00" level=info msg="Authelia v4.38.9 is starting" caller="github.com/authelia/authelia/v4/internal/commands/root.go:62 (*CmdCtx).RootRunE"
time="2024-06-28T15:39:34+02:00" level=info msg="Log severity set to trace" caller="github.com/authelia/authelia/v4/internal/logging/logger.go:93 setLevelStr"
time="2024-06-28T15:39:34+02:00" level=trace msg="Starting scan of directory  for certificates" caller="github.com/authelia/authelia/v4/internal/utils/crypto.go:350 NewX509CertPool"
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:105 doStartupChecks" provider=storage
time="2024-06-28T15:39:34+02:00" level=info msg="Storage schema is being checked for updates" caller="github.com/authelia/authelia/v4/internal/storage/sql_provider.go:338 (*SQLProvider).StartupCheck"
time="2024-06-28T15:39:34+02:00" level=info msg="Storage schema is already up to date" caller="github.com/authelia/authelia/v4/internal/storage/sql_provider.go:356 (*SQLProvider).StartupCheck"
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:112 doStartupChecks" provider=storage
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:115 doStartupChecks" provider=user
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:122 doStartupChecks" provider=user
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:125 doStartupChecks" provider=notification
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:132 doStartupChecks" provider=notification
time="2024-06-28T15:39:34+02:00" level=trace msg="Performing Startup Check" caller="github.com/authelia/authelia/v4/internal/commands/root.go:135 doStartupChecks" provider=ntp
time="2024-06-28T15:39:34+02:00" level=trace msg="Startup Check Completed Successfully" caller="github.com/authelia/authelia/v4/internal/commands/root.go:146 doStartupChecks" provider=ntp
time="2024-06-28T15:39:34+02:00" level=trace msg="Starting Services" caller="github.com/spf13/[email protected]/command.go:985 (*Command).execute"
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:222 handleRouter" implementation=Legacy methods="*" path_prefix=/api/verify
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:242 handleRouter" implementation=AuthRequest methods="[GET HEAD]" path=/api/authz/auth-request
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:233 handleRouter" implementation=ExtAuthz methods="*" path_prefix=/api/authz/ext-authz
time="2024-06-28T15:39:34+02:00" level=trace msg="Registering Authz Endpoint" caller="github.com/authelia/authelia/v4/internal/server/handlers.go:242 handleRouter" implementation=ForwardAuth methods="[GET HEAD]" path=/api/authz/forward-auth
time="2024-06-28T15:39:34+02:00" level=trace msg="Service Loaded" caller="github.com/authelia/authelia/v4/internal/commands/services.go:319 servicesRun" server=main service=server
time="2024-06-28T15:39:34+02:00" level=debug msg="Create Server Service (metrics) skipped" caller="github.com/authelia/authelia/v4/internal/commands/services.go:318 servicesRun"
time="2024-06-28T15:39:34+02:00" level=info msg="Startup complete" caller="github.com/authelia/authelia/v4/internal/commands/root.go:94 (*CmdCtx).RootRunE"
time="2024-06-28T15:39:34+02:00" level=info msg="Listening for TLS connections on '[::]:9091' path '/'" caller="github.com/authelia/authelia/v4/internal/commands/services.go:137 (*ServerService).Run" server=main service=server
time="2024-06-28T15:39:39+02:00" level=trace msg="Request hit" caller="github.com/authelia/authelia/v4/internal/middlewares/log_request.go:12 handleRouter.LogRequest.func40" method=GET path=/api/health remote_ip="::1"
time="2024-06-28T15:39:39+02:00" level=trace msg="Replied (status=200)" caller="github.com/authelia/authelia/v4/internal/middlewares/log_request.go:16 handleRouter.LogRequest.func40" method=GET path=/api/health remote_ip="::1"

Logs (Proxy / Application)

No response

Documentation

https://www.authelia.com/configuration/miscellaneous/server/#tls
https://www.authelia.com/overview/security/measures/#mutual-tls

Pre-Submission Checklist

  • I agree to follow the Code of Conduct

  • This is a bug report and not a support request

  • I have read the security policy and this bug report is not a security issue or security related issue

  • I have either included the complete configuration file or I am sure it's unrelated to the configuration

  • I have either included the complete debug / trace logs or the output of the build-info command if the logs are not relevant

  • I have provided all of the required information in full with the only alteration being reasonable sanitization in accordance with the Troubleshooting Sanitization reference guide

  • I have checked for related proxy or application logs and included them if available

  • I have checked for related issues and checked the documentation

@hendrik1120 hendrik1120 added priority/4/normal Normal priority items status/needs-triage Issues which have not expressly been classified by a team member yet type/bug/unconfirmed Unconfirmed Bugs labels Jun 28, 2024
@james-d-elliott james-d-elliott added type/bug/third-party Bugs with third party software, not with Authelia itself. status/investigating Needs investigation priority/6/very-low Very Low priority items and removed type/bug/unconfirmed Unconfirmed Bugs status/needs-triage Issues which have not expressly been classified by a team member yet labels Jul 5, 2024
@james-d-elliott
Member

james-d-elliott commented Jul 6, 2024

Looks to be a bug with busybox wget's ssl_client. This should fix it #7498 (though it'll add ~4MB to our image). Please give it a shot and let me know.
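For illustration, a health probe that performs the TLS handshake in-process, rather than through a separate helper process such as ssl_client, leaves no child behind to be reaped. The following is a minimal sketch only: it is not the healthcheck shipped in the Authelia image, and it assumes the port 9091 and the /api/health path seen in the logs above. Certificate verification is disabled by default here because the report uses a self-signed certificate.

```python
import ssl
import urllib.error
import urllib.request


def probe(url: str = "https://localhost:9091/api/health",
          timeout: float = 5.0, verify: bool = False) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    ctx = ssl.create_default_context()
    if not verify:
        # Self-signed certificate: skip hostname and chain verification.
        # check_hostname must be cleared before verify_mode is relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection failures, TLS errors, timeouts and non-2xx replies.
        return False


if __name__ == "__main__":
    raise SystemExit(0 if probe() else 1)
```

The exit code follows the usual healthcheck convention of 0 for healthy and 1 for unhealthy.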

@hendrik1120
Contributor Author

Can confirm, #7498 does fix the issue.

LimeTech (creator of UNRAID) is also working on a fix for their docker implementation to prevent process leaks from crashing the host.

@james-d-elliott
Member

Great to know. Can't really see why the busybox wget has not been fixed. The issue occurs with any TLS request it looks like.

@hendrik1120
Contributor Author

hendrik1120 commented Jul 6, 2024

Yeah, just found this: https://bugs.busybox.net/show_bug.cgi?id=15967

Adding --init to the docker run command has been suggested, but I really don't think that is an acceptable solution.
It might be some weird interaction with Docker, as using --init does actually fix the issue as well.

james-d-elliott added a commit that referenced this issue Jul 6, 2024
This should prevent the issues with the broken busybox wget by installing the musl wget. It should be noted that while this is labelled as a fix this does not fix the actual issue as the issue is an upstream bug with the busybox wget TLS client where it appears to leave zombie processes on any TLS request. This is just a workaround.

Fixes #7463

Signed-off-by: James Elliott <[email protected]>
@james-d-elliott
Member

james-d-elliott commented Jul 6, 2024

Hmm, I'm kind of inclined to say it may be the correct solution and I may revert the merged one, and instead update the docs. Considering the only instance where this has been discovered is an edge case (most people do not run the container with TLS but instead use a proxy to perform termination) it'd make more sense to do that than to force everyone to download a larger image.

Init runs tini, which is bundled with Docker 1.13 and later. I will do some more research I think.
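The reason an init process helps can be shown with a short sketch. An exited child stays in the process table as a zombie until its parent collects it with a wait call, and a minimal init such as tini does this for every process that gets reparented to it. The code below is a Linux-only illustration of that reaping loop, not tini's actual implementation; the helper names are hypothetical.

```python
import os
import time


def process_state(pid: int):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as fh:
            data = fh.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    return data.rsplit(")", 1)[1].split()[0]


def spawn_short_lived_child() -> int:
    """Fork a child that exits immediately; nobody waits for it yet."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    return pid


def reap_children() -> list:
    """Collect every exited child without blocking, as an init process would."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    child = spawn_short_lived_child()
    time.sleep(0.5)
    print("before reaping:", process_state(child))
    reap_children()
    print("after reaping:", process_state(child))
```

Without --init, PID 1 in the container is the application itself, which presumably never makes this wait call for orphaned helper processes, so they accumulate.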

@hendrik1120
Contributor Author

hendrik1120 commented Jul 6, 2024

Yeah, the whole thing kinda is an edge case on an edge case and is very unlikely to cause these massive problems to anyone else.
I am just using TLS between authelia and traefik. It's probably overkill, but “For the best security protection, configuration with TLS is highly recommended. …” from the docs.

I agree with you, but if someone forgets to set this uncommon Docker setting, it could be really hard to troubleshoot. I am not sure how predictably the Authelia app itself fails when it is PID-limited, which might cause more edge cases (maybe even security-related ones?). It also depends on the OS and on whether the docker compose pids_limit is set.
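How close a container is to its process limit can be checked from the pids cgroup controller. The sketch below assumes cgroup v2, where the controller exposes pids.current and pids.max, and assumes the container's own cgroup is visible at /sys/fs/cgroup; both the path and the function name are assumptions for illustration.

```python
from pathlib import Path
from typing import Optional


def pid_headroom(cgroup_dir: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return how many more processes the cgroup may create, or None if unlimited."""
    base = Path(cgroup_dir)
    current = int((base / "pids.current").read_text().strip())
    limit_raw = (base / "pids.max").read_text().strip()
    if limit_raw == "max":
        return None  # no limit configured for this cgroup
    return int(limit_raw) - current


if __name__ == "__main__":
    print(pid_headroom())
```

A value that shrinks steadily between healthchecks would be an early sign of the leak, well before the container turns unhealthy.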

@james-d-elliott
Member

I'll update the docs on that to clarify secure networks vs insecure ones, i.e. an iptables network on a host is significantly more secure, as they'd likely have to compromise a host with root equivalent access as well to compromise that network, making the effort inconsequential in a vast majority of cases since they'd be able to alter configs themselves if they have root equivalent access.

@hendrik1120
Contributor Author

Resolved in v4.38.10
