Skip to content

Go library for diffing Postgres schemas and generating SQL migrations

License

Notifications You must be signed in to change notification settings

stripe/pg-schema-diff

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

76 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pg-schema-diff

run_tests Go Report Card Go Reference GitHub Release

Computes the diff(erences) between Postgres database schemas and generates the SQL required to get your database schema from point A to B with minimal downtime & locks. This enables you to take your database and migrate it to any desired schema defined in plain DDL.

The tooling attempts to use native postgres migration operations to perform online migrations and avoid locking wherever possible. Not all migrations will be lock-free and some might require downtime, but the hazards system will warn you ahead of time when that's the case. Stateful online migration techniques, like shadow tables, aren't yet supported.

Online index Replacement

Your project's diff:

$ git diff
diff --git a/schema/schema.sql b/schema/schema.sql
index cc3a14b..cf4b32d 100644
--- a/schema/schema.sql
+++ b/schema/schema.sql
@@ -2,5 +2,5 @@ CREATE TABLE foobar(
  	created_at timestamp,
 	message text
 );
-CREATE INDEX message_idx ON foobar(message);
+CREATE INDEX message_idx ON foobar(message, created_at);

The generated plan (queries using message_idx will always have an index backing them, even while the new index is being built):

$ pg-schema-diff plan --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir ./schema
################################ Generated plan ################################
1. ALTER INDEX "message_idx" RENAME TO "pgschemadiff_tmpidx_message_idx_IiaKzkvPQtyA7ob9piVqiQ";
        -- Statement Timeout: 3s

2. CREATE INDEX CONCURRENTLY message_idx ON public.foobar USING btree (message, created_at);
        -- Statement Timeout: 20m0s
        -- Lock Timeout: 3s
        -- Hazard INDEX_BUILD: This might affect database performance. Concurrent index builds require a non-trivial amount of CPU, potentially affecting database performance. They also can take a while but do not lock out writes.

3. DROP INDEX CONCURRENTLY "pgschemadiff_tmpidx_message_idx_IiaKzkvPQtyA7ob9piVqiQ";
        -- Statement Timeout: 20m0s
        -- Lock Timeout: 3s
        -- Hazard INDEX_DROPPED: Dropping this index means queries that use this index might perform worse because they will no longer will be able to leverage it.

Online NOT NULL constraint creation

Your project's diff:

diff --git a/schema/schema.sql b/schema/schema.sql
index cc3a14b..5a1cec2 100644
--- a/schema/schema.sql
+++ b/schema/schema.sql
@@ -1,5 +1,5 @@
 CREATE TABLE foobar(
- 	created_at timestamp,
+ 	created_at timestamp NOT NULL,
 	message text
 );
 CREATE INDEX message_idx ON foobar(message);

The generated plan (leverages check constraints to eliminate the need for a long-lived access-exclusive lock on the table):

$ pg-schema-diff plan --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir ./schema
################################ Generated plan ################################
1. ALTER TABLE "public"."foobar" ADD CONSTRAINT "pgschemadiff_tmpnn_BCOxMXqAQwaXlKPCRXoMMg" CHECK("created_at" IS NOT NULL) NOT VALID;
        -- Statement Timeout: 3s

2. ALTER TABLE "public"."foobar" VALIDATE CONSTRAINT "pgschemadiff_tmpnn_BCOxMXqAQwaXlKPCRXoMMg";
        -- Statement Timeout: 3s

3. ALTER TABLE "public"."foobar" ALTER COLUMN "created_at" SET NOT NULL;
        -- Statement Timeout: 3s

4. ALTER TABLE "public"."foobar" DROP CONSTRAINT "pgschemadiff_tmpnn_BCOxMXqAQwaXlKPCRXoMMg";
        -- Statement Timeout: 3s

Key features

  • Declarative schema migrations
  • The use of postgres native operations for zero-downtime migrations wherever possible:
    • Concurrent index builds
    • Online index replacement: If some index is changed, the new version will be built before the old version is dropped, preventing a window where no index is backing queries
    • Online constraint builds: Constraints (check, foreign key) are added as INVALID before being validated, eliminating the need for a long access-exclusive lock on the table
    • Online NOT NULL constraint creation using check constraints to eliminate the need for an access-exclusive lock on the table
    • Prioritized index builds: Building new indexes is always prioritized over deleting old indexes
  • A comprehensive set of features to ensure the safety of planned migrations:
    • Operators warned of dangerous operations.
    • Migration plans are validated first against a temporary database exactly as they would be performed against the real database.
  • Strong support of partitions

Install

CLI

go install github.com/stripe/pg-schema-diff/cmd/pg-schema-diff@latest

Library

go get -u github.com/stripe/pg-schema-diff@latest

Using CLI

1. Apply schema to fresh database

Create a directory to hold your schema files. Then, generate sql files and place them into a schema dir.

mkdir schema
echo "CREATE TABLE foobar (id int);" > schema/foobar.sql
echo "CREATE TABLE bar (id varchar(255), message TEXT NOT NULL);" > schema/bar.sql

Apply the schema to a fresh database. The connection string spec can be found here. Setting the PGPASSWORD env var will override any password set in the connection string and is recommended.

pg-schema-diff apply --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir schema 

2. Updating schema

Update the SQL file(s)

echo "CREATE INDEX message_idx ON bar(message)" >> schema/bar.sql

Apply the schema. Any hazards in the generated plan must be approved

pg-schema-diff apply --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir schema --allow-hazards INDEX_BUILD

Using Library

Docs to use the library can be found here. Check out the CLI for an example implementation with the library

1. Generating plan

// The tempDbFactory is used in plan generation to extract the new schema and validate the plan
tempDbFactory, err := tempdb.NewOnInstanceFactory(ctx, func(ctx context.Context, dbName string) (*sql.DB, error) {
	copiedConfig := connConfig.Copy()
	copiedConfig.Database = dbName
	return openDbWithPgxConfig(copiedConfig)
})
if err != nil {
	panic("Generating the TempDbFactory failed")
}
defer tempDbFactory.Close()
// Generate the migration plan
plan, err := diff.Generate(ctx, connPool, diff.DDLSchemaSource(ddl),
	diff.WithTempDbFactory(tempDbFactory),
	diff.WithDataPackNewTables(),
)
if err != nil {
	panic("Generating the plan failed")
}	

2. Applying plan

We leave plan application up to the user. For example, you might want to take out a session-level advisory lock if you are concerned about concurrent migrations on your database. You might also want a second user to approve the plan before applying it.

Example apply:

for _, stmt := range plan.Statements {
	if _, err := conn.ExecContext(ctx, fmt.Sprintf("SET SESSION statement_timeout = %d", stmt.Timeout.Milliseconds())); err != nil {
		panic(fmt.Sprintf("setting statement timeout: %s", err))
	}
	if _, err := conn.ExecContext(ctx, fmt.Sprintf("SET SESSION lock_timeout = %d", stmt.LockTimeout.Milliseconds())); err != nil {
		panic(fmt.Sprintf("setting lock timeout: %s", err))
	}
	if _, err := conn.ExecContext(ctx, stmt.ToSQL()); err != nil {
		panic(fmt.Sprintf("executing migration statement. the database maybe be in a dirty state: %s: %s", stmt, err))
	}
}

Supported Postgres versions

Supported: 14, 15, 16, 17
Unsupported: <= 13 are not supported. Use at your own risk.

Unsupported migrations

An abridged list of unsupported migrations:

  • Views (Planned)
  • Privileges (Planned)
  • Types (Only enums are currently supported)
  • Renaming. The diffing library relies on names to identify the old and new versions of a table, index, etc. If you rename an object, it will be treated as a drop and an add

Contributing

This project is in its early stages. We appreciate all the feature/bug requests we receive, but we have limited cycles to review direct code contributions at this time. See Contributing to learn more.

About

Go library for diffing Postgres schemas and generating SQL migrations

Topics

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Languages