Add ability to specify custom schema for verification in postgres #139

Merged: 7 commits, Jan 17, 2025

Conversation

esune (Member) commented Dec 19, 2024

Added the ability to specify a custom schema name for backup verification in postgres when using the configuration file spec.
Following the convention used for JDBC, the schema can be specified by adding ?verifySchema=my_schema to the database spec.
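
A minimal sketch of how a query-style parameter such as verifySchema might be read from a database spec in bash. The helper name get_spec_param and the example spec string are assumptions made for illustration; this is not the code from this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of a query-style parameter
# (e.g. verifySchema) found after the '?' in a database spec string.
# Prints nothing if the spec has no parameters or the name is absent.
get_spec_param() {
  local spec="$1" name="$2" query pair
  local -a pairs
  # No '?' means no parameters were supplied.
  [[ "${spec}" == *\?* ]] || return 0
  # Keep everything after the first '?'.
  query="${spec#*\?}"
  # Split on '&' so several parameters could be combined.
  IFS='&' read -r -a pairs <<< "${query}"
  for pair in "${pairs[@]}"; do
    if [[ "${pair}" == "${name}="* ]]; then
      printf '%s\n' "${pair#"${name}="}"
      return 0
    fi
  done
  return 0
}

# Example spec (illustrative only).
get_spec_param "postgres=db-host:5432/my_db?verifySchema=my_schema" verifySchema
```

Splitting on '&' keeps the door open for additional parameters in the same spec, which is the kind of reuse suggested for the mongodb auth database.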

This was developed as a set of postgres-specific overrides because the other supported database providers do not have the concept of a schema. It was simpler to customize one instance of the function to grab the database name from the connection spec than to handle the different scenarios in the same function, but it can be refactored if necessary.

This pattern could also be used in the future to specify the auth database for mongodb in the spec, rather than using an environment variable as it is currently set up to do.

Opening in draft mode while testing.

esune requested review from i5okie and WadeBarnes December 19, 2024 22:45

WadeBarnes (Member) left a comment

Looking good so far. A few comments and recommendations ...

Until now, the TABLE_SCHEMA variable has been used to define the schema for verification. It's used by backup.mariadb.plugin and backup.postgres.plugin. The downside: it's global, hence this PR.

I think the TABLE_SCHEMA variable should be deprecated in favor of this new approach, where the new approach would contextually override the value in TABLE_SCHEMA.

Being able to back up a specific schema is an interesting addition, but I'm not sure it should be the default. Up to now, the backup has been a full backup, and the verification then checks the schema of interest. This way, when you restore the database, you're restoring a snapshot of everything, not just a single schema.

I also see that the new schema configuration setting is not being used for verification; the verification process is still only using the TABLE_SCHEMA variable.

esune (Member, Author) commented Dec 21, 2024

Implementation was updated to support backupSchema and verifySchema parameters:

  • if backupSchema is specified, only the matching schema will be backed up; otherwise the default mode will be used to back up everything
  • if verifySchema is specified, that schema name will be used for verification. If it is not specified, the code falls back to TABLE_SCHEMA, which in turn falls back to verifying public if unset
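
The verification fallback order described above can be sketched with nested bash parameter expansion. The function name resolve_verify_schema is hypothetical, and this is an illustration of the precedence rather than the PR's actual code.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose the schema to verify.
# Precedence: per-database verifySchema, then the global TABLE_SCHEMA
# environment variable, then "public".
resolve_verify_schema() {
  local verifySchema="${1:-}"
  # ':-' treats an empty value the same as an unset one.
  printf '%s\n' "${verifySchema:-${TABLE_SCHEMA:-public}}"
}

# Example: no per-database value, so TABLE_SCHEMA (if set) or "public" is used.
resolve_verify_schema ""
```

Because the per-database value is checked first, it contextually overrides TABLE_SCHEMA without breaking existing deployments that only set the global variable.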

Pondering whether to use backupSchema/verifySchema or just backup/verify as parameters: the first naming convention is explicit, but might be redundant given what the code is for?

More testing currently underway.

WadeBarnes (Member) commented

> Pondering whether to use backupSchema/verifySchema or just backup/verify as parameters: the first naming convention is explicit, but might be redundant given what the code is for?

I personally prefer the explicit backupSchema/verifySchema.

WadeBarnes (Member) left a comment

LGTM pending the results of further testing.

esune added 2 commits January 7, 2025 13:46
Signed-off-by: Emiliano Suñé <[email protected]>
Signed-off-by: Emiliano Suñé <[email protected]>
esune (Member, Author) commented Jan 8, 2025

I tweaked the pg_dump command. Unfortunately, leaving a space between ${_schemaParam} and ${_database} caused the command to fail when executed from a bash script, even though it succeeded when run directly in the shell.
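
The comment does not say why the spaced form failed. One common cause of this kind of failure is an empty variable being passed to the command as an empty argument; if that were the cause here (an assumption), collecting the arguments in a bash array would sidestep it. The helper name build_pg_dump_args and the PG_DUMP_ARGS variable are hypothetical; --schema is a standard pg_dump option.

```bash
#!/usr/bin/env bash
# Hypothetical helper: collect pg_dump arguments in an array so that an
# unset schema contributes no argument at all (not even an empty one).
build_pg_dump_args() {
  local database="$1" schema="${2:-}"
  PG_DUMP_ARGS=()
  if [[ -n "${schema}" ]]; then
    PG_DUMP_ARGS+=("--schema=${schema}")
  fi
  # pg_dump takes the database name as its final positional argument.
  PG_DUMP_ARGS+=("${database}")
}

build_pg_dump_args "my_db" "my_schema"
# Show the arguments, one per line, instead of running pg_dump here.
printf '%s\n' "${PG_DUMP_ARGS[@]}"
# A script would then run: pg_dump "${PG_DUMP_ARGS[@]}"
```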

This results in successful backups; however, I am getting errors when verification is run manually for any of the backed-up databases (usually an error about the user already existing).

esune (Member, Author) commented Jan 9, 2025

See #132 (comment) for info about the verification failures. Leaving this PR as draft until I can understand more about how to fix the issue.

esune linked an issue Jan 10, 2025 that may be closed by this pull request
esune marked this pull request as ready for review January 10, 2025 22:11
esune requested a review from WadeBarnes January 10, 2025 22:11

esune (Member, Author) commented Jan 10, 2025

Finally ready for review. I added code to post-process the roles dump and remove the default users (the user running the command as well as the postgres user), and verification is now successful. Any other users in the roles dump are retained.

Unfortunately, it is not (easily) feasible to take old backups generated with versions 2.7.0 through 2.9.0 and apply the same post-processing as they are verified/restored. I think a reasonable pattern would be to add documentation about the need to remove those lines from the backup before attempting a restore.
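
A rough sketch of the kind of post-processing described above, assuming the roles dump contains one `CREATE ROLE name;` or `ALTER ROLE name ...;` statement per line, as pg_dumpall produces. The function name filter_default_roles is hypothetical, role names are assumed to be plain identifiers without regex metacharacters, and this is not the PR's actual implementation.

```bash
#!/usr/bin/env bash
# Hypothetical filter: read a roles dump on stdin and drop the
# CREATE ROLE / ALTER ROLE lines for the role names given as arguments.
# All other lines, including other roles, pass through unchanged.
filter_default_roles() {
  local pattern="" role
  for role in "$@"; do
    # Join the names with '|' to form a regex alternation.
    pattern+="${pattern:+|}${role}"
  done
  # Require a space or ';' after the (optionally quoted) name so that
  # e.g. "postgres_exporter" is not matched by "postgres".
  grep -Ev "^(CREATE|ALTER) ROLE \"?(${pattern})\"?( |;)"
}

# Example: remove the postgres role, keep app_user.
printf '%s\n' \
  'CREATE ROLE postgres;' \
  'ALTER ROLE postgres WITH SUPERUSER;' \
  'CREATE ROLE app_user;' \
  'ALTER ROLE app_user WITH LOGIN;' | filter_default_roles postgres
```

The same filter could, in principle, be applied by hand to the roles section of an older backup before a restore, which is the manual step the documentation would need to describe.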

esune changed the title from "Add ability to specify custom schema to back-up in postgres" to "Add ability to specify custom schema for verification in postgres" Jan 16, 2025

esune (Member, Author) commented Jan 16, 2025

Removed backupSchema for the time being as it seems to not be working reliably. verifySchema works as expected.

WadeBarnes (Member) left a comment

LGTM

WadeBarnes merged commit b198ce5 into bcgov:master Jan 17, 2025
Development

Successfully merging this pull request may close these issues.

PostgreSQL backup verification is failing in v2.8.1 and v2.9.0
2 participants