[bitnami/postgresql-ha] Cloning data from primary node fails due to liveness/readiness probes #3556

Antiarchitect · 2020-08-30T14:38:37Z

Which chart:
bitnami/postgresql-ha 3.5.9

Describe the bug
Scaling up fails at "Cloning data from primary node..." with a large database (18 GB), probably due to the liveness/readiness probes.

To Reproduce

Set replicaCount to 1
Restore large db from the dump
Try to upscale to 2 replicas

Expected behavior
Some mechanism to avoid this.

P.S. If I turn off the liveness/readiness probes, everything works and both replicas end their logs with these lines:

postgresql-repmgr 14:33:15.70 INFO  ==> Starting PostgreSQL in background...
postgresql-repmgr 14:33:15.84 INFO  ==> ** Starting repmgrd **
[2020-08-30 14:33:15] [NOTICE] repmgrd (repmgrd 5.1.0) starting up
INFO:  set_repmgrd_pid(): provided pidfile is /opt/bitnami/repmgr/tmp/repmgr.pid
[2020-08-30 14:33:15] [NOTICE] starting monitoring of node "pg-ha-postgresql-1" (ID: 1001)

But when I turn the liveness/readiness probes back on, the second pod (pg-ha-postgresql-1) has to fully resync for some reason and starts failing again because the probes are enabled again.

Version of Helm and Kubernetes:

Output of helm version:

version.BuildInfo{Version:"v3.3.0", GitCommit:"8a4aeec08d67a7b84472007529e8097ec3742105", GitTreeState:"dirty", GoVersion:"go1.14.7"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:12:48Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:04:18Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Additional context
NONE

carrodher · 2020-08-31T14:32:37Z

Hi, thanks for using this Bitnami chart. Did you try modifying the probe parameters instead of disabling the probes? The cloning may simply take longer than the probes allow, in which case you need to increase the probe parameters; see https://github.com/bitnami/charts/blob/master/bitnami/postgresql-ha/values.yaml#L189

Apart from that, what error appears in the logs when the issue occurs? What does kubectl describe POD say?
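
To make the "increase the probe parameters" suggestion concrete, here is a minimal sketch of the arithmetic involved. It relies on the general Kubernetes behavior that a container is restarted after failureThreshold consecutive liveness failures, so the grace window is roughly initialDelaySeconds + periodSeconds * failureThreshold. The probe values and the 50 MiB/s clone throughput below are illustrative assumptions, not values taken from this chart or measured in this issue, and the function names are hypothetical.

```python
def liveness_grace_seconds(initial_delay_seconds: int,
                           period_seconds: int,
                           failure_threshold: int) -> int:
    """Approximate time a container can keep failing its liveness probe
    before the kubelet restarts it (probe timeouts are ignored here)."""
    return initial_delay_seconds + period_seconds * failure_threshold


def clone_seconds(data_gib: float, throughput_mib_per_s: float) -> float:
    """Rough time to copy data_gib GiB at a sustained throughput (assumed)."""
    return data_gib * 1024 / throughput_mib_per_s


# Illustrative probe settings (assumptions, not chart defaults).
grace = liveness_grace_seconds(initial_delay_seconds=30,
                               period_seconds=10,
                               failure_threshold=6)

# The 18 GB database from the report, at an assumed 50 MiB/s.
clone = clone_seconds(data_gib=18, throughput_mib_per_s=50)

print(f"liveness grace window: ~{grace}s, estimated clone time: ~{clone:.0f}s")
if clone > grace:
    print("The clone would outlast the probe window, so the pod would be restarted mid-clone.")
```

With numbers in this range the clone takes several times longer than the probe window, which would match the restart loop described above. The real throughput depends on storage and network, so it has to be measured.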

Antiarchitect · 2020-08-31T14:37:33Z

If I tune the readiness/liveness probes, I cannot predict even approximately how long a 1 TB database will take to replicate, so it's better to turn them off. My current issue for now is #3563.

stale · 2020-09-17T04:56:40Z

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

percenuage · 2024-10-15T16:32:41Z

Hello, I have exactly the same issue as @Antiarchitect. The primary node has 300Gi of data and I want to add a second replica (1 -> 2), but the liveness probe window is too short for the data sync. Do you have any solution? Changing or disabling the liveness probe seems strange. On the other hand, it is logical that the replica should not be considered live until the data is fully synchronized.
Thanks for your help.

Which chart:
bitnami/postgresql-ha 14.2.8
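
One Kubernetes mechanism that fits "not live until synchronized" is a startup probe: while it has not yet succeeded, the liveness and readiness probes are not run, and the container gets up to failureThreshold * periodSeconds to start. The sketch below only estimates a failureThreshold for a given data size. It assumes that the chart version in use lets you configure a startup probe and that this probe keeps failing while the clone runs; neither is confirmed in this thread. The 100 MiB/s throughput and the safety factor of 2 are made-up inputs, and the function name is hypothetical.

```python
import math


def startup_failure_threshold(data_gib: float,
                              throughput_mib_per_s: float,
                              period_seconds: int,
                              safety_factor: float = 2.0) -> int:
    """Smallest failureThreshold such that failureThreshold * periodSeconds
    covers the estimated clone time multiplied by a safety factor."""
    expected_seconds = data_gib * 1024 / throughput_mib_per_s
    return math.ceil(expected_seconds * safety_factor / period_seconds)


# 300 GiB as in the comment above; throughput and period are assumptions.
threshold = startup_failure_threshold(data_gib=300,
                                      throughput_mib_per_s=100,
                                      period_seconds=10)
print(f"failureThreshold >= {threshold} with periodSeconds=10")
```

This still needs a throughput estimate, but an overly generous value only delays detecting a replica that is genuinely stuck. It does not restart a healthy clone.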

Antiarchitect changed the title ~~[bitnami/postgresql-ha]~~ [bitnami/postgresql-ha] Cloning data from primary node fails due to liveness/readiness probes Aug 30, 2020

stale bot added the stale 15 days without activity label Sep 17, 2020

Antiarchitect closed this as completed Sep 17, 2020

jsoref mentioned this issue Jan 7, 2021: Cloning to resync should be part of init container #4894 (Closed)

carrodher added the postgresql-ha label Oct 16, 2024

percenuage mentioned this issue Oct 16, 2024: [bitnami/postgresql-ha] Cloning huge data from primary node fails due to livenessProbe #29948 (Merged, 3 tasks)
