Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a HEALTHCHECK to the images #1237

Open
kodedylf opened this issue May 16, 2024 · 7 comments
Open

Add a HEALTHCHECK to the images #1237

kodedylf opened this issue May 16, 2024 · 7 comments

Comments

@kodedylf
Copy link

kodedylf commented May 16, 2024

When helping people on StackOverflow, I often see people having problems with Docker Compose and applications that try to connect to the database immediately on startup. When using a simple docker compose 'depends_on' directive, compose only waits for the database to be started before starting the application. The application then fails because the database isn't ready to accept connections yet.

The solution is to add a healthcheck to the database container in docker compose and have the application wait for the database to enter a healthy state.

It would be nice if the healthcheck was included in the Postgres image by default. A simple HEALTHCHECK ["pg_isready"] should be enough.

@LaurentGoderre
Copy link
Member

Some of this is discussed in #920

@mwisnicki
Copy link

Why not include healthcheck script but not enable it by default to make things a bit easier?

@mwisnicki
Copy link

mwisnicki commented Aug 3, 2024

IMHO included healthcheck should have these features:

  • ensures all init scripts are done
  • ensures db can be queried (SELECT 1)
  • allows user to provide custom query via env variable

@hibuna
Copy link

hibuna commented Aug 10, 2024

I was also struggling with a health check until I discovered that pg_ready was not what I was looking for. Here I posted my solution on stackoverflow and explain why this is the case.

My healthcheck:

test: ["CMD-SHELL", "psql -U ${DB_USER} -d ${DB_MAIN} -c 'SELECT 1' || exit 1"]

@tianon
Copy link
Member

tianon commented Aug 12, 2024

See #920 (comment), especially:

Using --host 127.0.0.1 with pg_isready or psql for checking the status of the container from within it is going to be the safed/best supported option (see https://github.com/docker-library/healthcheck/blob/master/postgres/docker-healthcheck for an example that does exactly that).

(Your example will report "healthy" before the container is actually ready for external connections because it will happily connect to the unix socket of the initialization instance.)

@fumoboy007
Copy link

pg_isready --host=127.0.0.1 || exit 1 seems like a sufficiently good default, no? For more specific use cases, users of the image can override the HEALTHCHECK instruction to change the command, the options, or to disable it altogether.


Rebuttal of the FAQ below…

  • many users will have their own idea of what "healthy" means and credentials change over time making generic health checks hard to define

As mentioned above, users can customize the instruction as they wish. Checking whether the server is ready to accept a connection seems like a reasonable default.

after upgrading their images, current users will have extra unexpected load on their systems for healthchecks they don't necessarily need/want and may be unaware of

Yes, this is a small breaking change. But we should also be open to breaking changes; otherwise, we will be stuck with less-than-ideal solutions. Besides, in what use cases would it make sense not to add the HEALTHCHECK instruction?

Kubernetes does not use Docker's heath checks (opting instead for separate liveness and readiness probes)

Kubernetes disables Docker’s HEALTHCHECK instruction (like anyone else can!), so adding the instruction to the image does not impact Kubernetes.

sometimes things like databases will take too long to initialize, and a defined health check will often cause the orchestration system to prematurely kill the container (docker-library/mysql#439)

The root cause here is that the configured timeout is too small. The solution is to increase the timeout, not to disable the HEALTHCHECK instruction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants