Restoring from S3 on a different machine #1066

asoltesz · 2020-06-10T17:18:25Z

Please provide the following information when submitting an issue (feature requests or general comments can skip this):

pgBackRest version:

version 2.25

PostgreSQL version:

12.3

Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each:

CentOS 7

Did you install pgBackRest from source or from a package?

package (CrunchyData Postgres Operator)

Please attach the following as applicable:

Stanza-create fails with this:

time="2020-06-10T16:47:04Z" level=info msg="pgo-backrest starts"
time="2020-06-10T16:47:04Z" level=info msg="debug flag set to false"
time="2020-06-10T16:47:04Z" level=info msg="backrest stanza-create command requested"
time="2020-06-10T16:47:04Z" level=info msg="backrest command will be executed for both local and s3 storage"
time="2020-06-10T16:47:04Z" level=info msg="command to execute is [pgbackrest stanza-create  --db-host=172.18.0.13 --db-path=/pgdata/hippo && pgbackrest stanza-create  --db-host=172.18.0.13 --db-path=/pgdata/hippo --repo-type=s3]"
time="2020-06-10T16:47:04Z" level=info msg="command is pgbackrest stanza-create  --db-host=172.18.0.13 --db-path=/pgdata/hippo && pgbackrest stanza-create  --db-host=172.18.0.13 --db-path=/pgdata/hippo --repo-type=s3 "
time="2020-06-10T16:47:06Z" level=error msg="command terminated with exit code 28"
time="2020-06-10T16:47:06Z" level=info msg="output=[]"
time="2020-06-10T16:47:06Z" level=info msg="stderr=[ERROR: [028]: backup and archive info files exist but do not match the database\n       HINT: is this the correct stanza?\n       HINT: did an error occur during stanza-upgrade?\n]"
time="2020-06-10T16:47:06Z" level=error msg="command terminated with exit code 28"

Describe the issue:

I am having trouble understanding how one can restore in the following situation:

Backups were saved to a remote S3 repo
Postgres host and local backrest repo completely destroyed (whole machine crashed)
I have a new machine with the same Postgres/pgbackrest version installed
I would like to restore from S3, but the system-id on the new machine will always be different from the one that was used to create the backups in S3, so stanza-create fails on the existing S3 bucket

I didn't find information in the Guide for this situation.

How do I create the stanza and restore to the new machine?


dwsteele · 2020-06-10T18:33:05Z

This is covered here: https://pgbackrest.org/user-guide-centos7.html#replication/hot-standby. You just need to modify the recovery settings to whatever you need to recover your primary. PITR instructions are here: https://pgbackrest.org/user-guide-centos7.html#pitr

> ERROR: [028]: backup and archive info files exist but do not match the database

There's no need to create the stanza again -- it's already created. All you need is an empty PGDATA directory, or to specify --delta when you restore into a non-empty one.
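
A minimal sketch of the restore described above, written as a small Python helper so the command line can be inspected before it is run. The stanza name `db` and the helper `build_restore_cmd` are assumptions for illustration; use the stanza name from your own pgBackRest configuration. The repository (S3) settings are assumed to come from the existing configuration file or environment, and PostgreSQL is assumed to be stopped on the new machine.

```python
import shlex


def build_restore_cmd(stanza, delta=False):
    """Return the argv list for a pgBackRest restore of an existing stanza.

    `stanza` is a placeholder: it must be the stanza name that was used
    when the backups were taken, not a newly created one.
    """
    cmd = ["pgbackrest", f"--stanza={stanza}"]
    if delta:
        # --delta allows restoring into a PGDATA directory that is not empty.
        cmd.append("--delta")
    cmd.append("restore")
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it, so it can be reviewed first.
    print(shlex.join(build_restore_cmd("db", delta=True)))
```

Note that no stanza-create appears anywhere in this sequence: the stanza already exists in the repository, and the restored cluster carries the system-id recorded there.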

asoltesz · 2020-06-13T11:51:05Z

Thanks, I managed to do the restore.

Venryx · 2021-08-29T13:05:33Z

For others finding this, just wanted to mention that if you're getting the "backup and archive info files exist but do not match the database" error when using:

spec:
  backups:
    pgbackrest:
      restore:
        [...]

Then try using the alternate restore approach:

spec:
  dataSource:
    [...]

API reference: https://access.crunchydata.com/documentation/postgres-operator/v5/tutorial/disaster-recovery/

It appears the second type can work even if the database system-id differs between the backup and the target cluster, whereas the first cannot. (However, don't be like me and assume the restore is failing if it sits there for a while; in my case, I had to wait 2.5 minutes before any of the backup's files actually started being restored. So be patient before changing further settings or the like.)

For reference, here is the code that causes the error:

pgbackrest/src/command/check/common.c

Lines 152 to 157 in bd0081f

    
    if (pgVersion != archiveInfoPg.version || pgSystemId != archiveInfoPg.systemId)
    {
        THROW(FileInvalidError, "backup and archive info files exist but do not match the database\n"
            "HINT: is this the correct stanza?\n"
            "HINT: did an error occur during stanza-upgrade?");
    }
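
To make the condition concrete, here is a small Python model of the comparison in the C snippet above. It is an illustration only, not pgBackRest code: `PgIdentity`, `check_matches_repo`, and the sample identifier values are hypothetical. What it shows is that the check fails if either the PostgreSQL version or the system identifier differs from what the repository's info files recorded, which is why a freshly initialized cluster (with a new system identifier) trips it even on the same PostgreSQL version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PgIdentity:
    version: str    # PostgreSQL version, e.g. "12"
    system_id: int  # system identifier, assigned when the cluster is initialized


class FileInvalidError(Exception):
    """Stand-in for the error raised by the C code (hypothetical Python name)."""


def check_matches_repo(cluster, recorded):
    """Raise if the running cluster differs from what the repo info files recorded."""
    if cluster.version != recorded.version or cluster.system_id != recorded.system_id:
        raise FileInvalidError(
            "backup and archive info files exist but do not match the database"
        )


if __name__ == "__main__":
    recorded = PgIdentity("12", 1111)     # made-up value stored in the repo
    new_cluster = PgIdentity("12", 2222)  # made-up value from a fresh initdb
    try:
        check_matches_repo(new_cluster, recorded)
    except FileInvalidError as err:
        print(f"mismatch: {err}")
```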

Venryx · 2021-08-29T14:35:07Z

After more experimenting, I found that the error can occur for the dataSource approach as well; however, it only occurs in a specific scenario:

A backup-repo was created for the old system-identifier.
However, the backup-repo never had a base-backup pushed to it.

When a new postgres cluster is launched (with a new system-identifier), I'd tend to expect that the cluster would look into the repo, and either:
A) Load in the configuration from it (so the system-id matches for subsequent backups/restores), or...
B) Realize that there are no actual base-backups in the backup-repo, and thus ignore it (or just log a warning).

Instead, the postgres-operator notices the backup-repo, and complains about it, but doesn't offer an easy way to solve it:

You can't just ignore/overwrite the mismatched backup-repo, because PGO doesn't offer a way to do so. (it errors from the config/system-id mismatch, before you're able to have it do any backup reading/writing)
You can't tell PGO to load in the configuration (which includes the system-id), because there is no base-backup that you can point the dataSource entry to.

A third option, which does work, is to delete the backup-repo folder in the cloud manually. Then PGO sees there is no mismatch, creates a new cluster, and populates the backup-repo with its own configuration.

This works, but is not terribly obvious to new users; perhaps a special error message could be displayed for the "backup-repo exists, but without actual backups" case, to clarify to new users what should be done.

EDIT: I put some further (arguably more helpful) notes on stanza-related issues here: https://github.com/debate-map/app/blob/56180dca95148d3af65aa14626093d62dca432fc/README.md?plain=1#L618

…_ovh, read/write to different buckets (for the db-backups). * Finally figured out why the system-id mismatch-error is necessary, and how to avoid/deal-with it. (basically, if you're going to be using a backup-repo contents, you need to initialize your database instance from one of its backups; this is necessary because of the way postgres physical backups work; see here for some more info: pgbackrest/pgbackrest#1066 (comment)) Because of the limitations of physical backups, I plan to set up weekly (or so) logical backups as well. That's for another time though, as physical backups should be fine for now. (ie. while I'm on the same postgres version)

dwsteele · 2021-08-31T10:38:35Z

> A) Load in the configuration from it (so the system-id matches for subsequent backups/restores)

The system identifier cannot be updated in Postgres.

> B) Realize that there are no actual base-backups in the backup-repo, and thus ignore it (or just log a warning).

The thing to do here is issue a stanza-upgrade or maybe better, a stanza-delete/stanza-create since the repo is pretty useless without backups.
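
A rough sketch of the stanza-delete/stanza-create route, as a Python helper that only builds the command lines rather than running them. The stanza name `db` and the function name `build_stanza_reset_cmds` are assumptions. The ordering follows the documented pgBackRest procedure of running `stop` before `stanza-delete`; `--force` is only needed when PostgreSQL is still running. Check the command reference for your version before relying on this, since `stanza-delete` removes everything stored in the repository for that stanza.

```python
def build_stanza_reset_cmds(stanza, force=False):
    """Return the command lines to drop and recreate a stanza that has no backups.

    Assumes it is run on the host that has access to the repository and that
    the repository settings come from the existing pgBackRest configuration.
    """
    base = ["pgbackrest", f"--stanza={stanza}"]

    delete = list(base)
    if force:
        # --force skips the requirement that PostgreSQL be shut down first.
        delete.append("--force")
    delete.append("stanza-delete")

    return [
        base + ["stop"],           # block pgBackRest activity for this stanza
        delete,                    # remove the stanza's repository contents
        base + ["stanza-create"],  # recreate it against the current cluster
    ]


if __name__ == "__main__":
    for cmd in build_stanza_reset_cmds("db", force=True):
        print(" ".join(cmd))
```

This only makes sense for a repository that holds no backups worth keeping; if it does hold backups, restoring from them is the way to end up with a matching system-id.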

> This works, but is not terribly obvious to new users; perhaps a special error message could be displayed for the "backup-repo exists, but without actual backups" case, to clarify to new users what should be done.

This seems like something you should suggest at https://github.com/CrunchyData/postgres-operator. Actually, that pretty much goes for all of this.

dwsteele self-assigned this Jun 10, 2020

dwsteele added the question label Jun 10, 2020

asoltesz closed this as completed Jun 13, 2020

github-actions bot locked and limited conversation to collaborators May 4, 2022
