Super fast and lightweight DB migration tool written in go. migrator outperforms other market leading DB migration frameworks by a few orders of magnitude when comparing both execution time and memory consumption (see PERFORMANCE.md).
migrator manages and versions all the DB changes for you and completely eliminates manual and error-prone administrative tasks. migrator versions can be used for auditing and compliance purposes. migrator not only supports single schemas, but also comes with a multi-schema support out of the box. Making it an ideal DB migrations solution for multi-tenant SaaS products.
migrator runs as a HTTP GraphQL service and can be easily integrated into existing continuous integration and continuous delivery pipelines. migrator can also sync existing migrations from legacy frameworks making the technology switch even more straightforward.
migrator supports reading DB migrations from:
- local folder (any Docker/Kubernetes deployments)
- AWS S3
- Azure Blob Containers
migrator supports the following multi-tenant databases:
- PostgreSQL 9.6+ and all its flavours
- MySQL 5.6+ and all its flavours
- Microsoft SQL Server 2008 R2+
The official docker image is available on:
- docker hub at: lukasz/migrator
- alternative mirror at: ghcr.io/lukaszbudnik/migrator
The image is ultra lightweight and has a size of 30MB. Ideal for micro-services deployments!
To return build information together with a list of supported API versions execute:
curl http://localhost:8080/
Sample HTTP response:
{
"release": "refs/tags/v2021.1.0",
"sha": "3ede93745e459e1214513b21ef76d94d09d10ae7",
"apiVersions": ["v2"]
}
API v2 is a GraphQL API. API v2 was introduced in migrator v2020.1.0.
API v2 introduced a formal concept of a DB version. Every migrator action creates a new DB version. Version logically groups all applied DB migrations for auditing and compliance purposes. You can browse versions together with executed DB migrations using the GraphQL API.
Returns migrator's config as application/x-yaml
.
Sample request:
curl http://localhost:8080/v2/config
Sample HTTP response:
baseLocation: test/migrations
driver: sqlserver
dataSource: sqlserver://SA:[email protected]:32774/?database=migratortest&connection+timeout=1&dial+timeout=1
singleMigrations:
- ref
- config
tenantMigrations:
- tenants
pathPrefix: /
Returns migrator's GraphQL schema as plain/text
.
Although migrator supports GraphQL introspection it is much more convenient to get the schema in the plain text.
Sample request:
curl http://localhost:8080/v2/schema
The API v2 GraphQL schema and its description is as follows:
schema {
query: Query
mutation: Mutation
}
enum MigrationType {
SingleMigration
TenantMigration
SingleScript
TenantScript
}
enum Action {
// Apply is the default action, migrator reads all source migrations and applies them
Apply
// Sync is an action where migrator reads all source migrations and marks them as applied in DB
// typical use cases are:
// importing source migrations from a legacy tool or synchronising tenant migrations when tenant was created using external tool
Sync
}
scalar Time
interface Migration {
name: String!
migrationType: MigrationType!
sourceDir: String!
file: String!
contents: String!
checkSum: String!
}
type SourceMigration implements Migration {
name: String!
migrationType: MigrationType!
sourceDir: String!
file: String!
contents: String!
checkSum: String!
}
type DBMigration implements Migration {
id: Int!
name: String!
migrationType: MigrationType!
sourceDir: String!
file: String!
contents: String!
checkSum: String!
schema: String!
created: Time!
}
type Tenant {
name: String!
}
type Version {
id: Int!
name: String!
created: Time!
dbMigrations: [DBMigration!]!
}
input SourceMigrationFilters {
name: String
sourceDir: String
file: String
migrationType: MigrationType
}
input VersionInput {
versionName: String!
action: Action = Apply
dryRun: Boolean = false
}
input TenantInput {
tenantName: String!
versionName: String!
action: Action = Apply
dryRun: Boolean = false
}
type Summary {
// date time operation started
startedAt: Time!
// how long the operation took in seconds
duration: Float!
// number of tenants in the system
tenants: Int!
// number of loaded and applied single schema migrations
singleMigrations: Int!
// number of loaded multi-tenant schema migrations
tenantMigrations: Int!
// number of applied multi-tenant schema migrations (equals to tenants * tenantMigrations)
tenantMigrationsTotal: Int!
// sum of singleMigrations and tenantMigrationsTotal
migrationsGrandTotal: Int!
// number of loaded and applied single schema scripts
singleScripts: Int!
// number of loaded multi-tenant schema scripts
tenantScripts: Int!
// number of applied multi-tenant schema migrations (equals to tenants * tenantScripts)
tenantScriptsTotal: Int!
// sum of singleScripts and tenantScriptsTotal
scriptsGrandTotal: Int!
}
type CreateResults {
summary: Summary!
version: Version
}
type Query {
// returns array of SourceMigration objects
// all parameters are optional and can be used to filter source migrations
// note that if the input query includes "contents" field this operation can produce large amounts of data
// if you want to return "contents" field it may be better to get individual source migrations using sourceMigration(file: String!)
sourceMigrations(filters: SourceMigrationFilters): [SourceMigration!]!
// returns a single SourceMigration
// this operation can be used to fetch a complete SourceMigration including "contents" field
// file is the unique identifier for a source migration file which you can get from sourceMigrations()
sourceMigration(file: String!): SourceMigration
// returns array of Version objects
// file is optional and can be used to return versions in which given source migration file was applied
// note that if input query includes DBMigration array and "contents" field this operation can produce large amounts of data
// if you want to return "contents" field it may be better to get individual versions using either
// version(id: Int!) or even get individual DB migration using dbMigration(id: Int!)
versions(file: String): [Version!]!
// returns a single Version
// id is the unique identifier of a version which you can get from versions()
// note that if input query includes "contents" field this operation can produce large amounts of data
// if you want to return "contents" field it may be better to get individual DB migration using dbMigration(id: Int!)
version(id: Int!): Version
// returns a single DBMigration
// this operation can be used to fetch a complete DBMigration including "contents" field
// id is the unique identifier of a DB migration which you can get from versions(file: String) or version(id: Int!)
dbMigration(id: Int!): DBMigration
// returns array of Tenant objects
tenants(): [Tenant!]!
}
type Mutation {
// creates new DB version by applying all eligible DB migrations & scripts
createVersion(input: VersionInput!): CreateResults!
// creates new tenant by applying only tenant-specific DB migrations & scripts, also creates new DB version
createTenant(input: TenantInput!): CreateResults!
}
This is a GraphQL endpoint which handles both query and mutation requests.
There are code generators available which can generate client code based on GraphQL schema. This would be the preferred way of consuming migrator's GraphQL endpoint.
In Quick Start Guide there are a few curl examples to get you started.
API v1 was sunset in v2021.0.0.
The documentation is available in a separate document API v1.
migrator uses request tracing via X-Request-ID
header. This header can be used with all requests for tracing and/or auditing purposes. If this header is absent migrator will generate one for you.
You can apply your first migrations with migrator in literally a few seconds. There is a ready-to-use docker-compose file which sets up migrator and test databases.
Get the source code:
git clone https://github.com/lukaszbudnik/migrator.git
cd migrator
Points to note:
- migrator aims to support 3 latest go versions (these versions are automatically built and tested by GitHub Actions)
- docker images are built using latest stable go version
- dependabot automatically updates go and docker dependencies on a weekly basis
- every merge to
main
branch triggers CI/CD pipeline which publishesedge
tag to both docker hub lukasz/migrator and ghcr.io/lukaszbudnik/migrator - major/minor releases are coordinated via GitHub Projects
Start migrator and setup test DB containers using docker-compose:
docker-compose -f ./test/docker-compose.yaml up
docker-compose will start and configure the following services:
migrator
- service using latest official migrator image, listening on port8181
migrator-dev
- service built from local branch, listening on port8282
postgres
- PostgreSQL service, listening on port54325
mysql
- MySQL service, listening on port3306
mssql
- MS SQL Server, listening on port1433
Note: Every database container has a ready-to-use migrator config in
test
directory. You can edittest/docker-compose.yaml
file and switch to a different database. By defaultmigrator
andmigrator-dev
services usetest/migrator-docker.yaml
which connects tomysql
service.
docker-compose will start 2 migrator services. The first one migrator
will use the latest official migrator docker image from docker hub lukasz/migrator. The second one migrator-dev
will be built automatically by docker-compose from your local branch.
In order to run the docker container remember to:
- mount a volume with migrations, for example:
/data
- specify location of migrator configuration file, for convenience it is usually located under
/data
directory; it defaults to/data/migrator.yaml
and can be overridden by setting environment variableMIGRATOR_YAML
The docker-compose will mount volumes with sample configuration and test migrations for you. See test/docker-compose.yaml
for details.
Note: For production deployments please see Tutorials section. It contains walkthoughs of deployments to AWS ECS, AWS EKS, and Azure AKS.
The docker-compose will start 2 migrator services as listed above. The latest stable migrator version listens on port 8181
. migrator built from the local branch listens on port 8282
.
Set the port accordingly:
MIGRATOR_PORT=8181
Create new version, return version id and name together with operation summary:
# versionName parameter is required and can be:
# 1. your version number
# 2. if you do multiple deploys to dev envs perhaps it could be a version number concatenated with current date time
# 3. or if you do CI/CD the commit sha (recommended)
COMMIT_SHA="acfd70fd1f4c7413e558c03ed850012627c9caa9"
# new lines are used for readability but have to be removed from the actual request
cat <<EOF | tr -d "\n" > create_version.txt
{
"query": "
mutation CreateVersion(\$input: VersionInput!) {
createVersion(input: \$input) {
version {
id,
name,
}
summary {
startedAt
tenants
migrationsGrandTotal
scriptsGrandTotal
}
}
}",
"operationName": "CreateVersion",
"variables": {
"input": {
"versionName": "$COMMIT_SHA"
}
}
}
EOF
# and now execute the above query
curl -d @create_version.txt http://localhost:$MIGRATOR_PORT/v2/service
Create new tenant, run in dry-run mode, run Sync
action (instead of default Apply
), return version id and name, DB migrations, and operation summary:
# versionName parameter is required and can be:
# 1. your version number
# 2. if you do multiple deploys to dev envs perhaps it could be a version number concatenated with current date time
# 3. or if you do CI/CD the commit sha (recommended)
COMMIT_SHA="acfd70fd1f4c7413e558c03ed850012627c9caa9"
# tenantName parameter is also required (should not come as a surprise since we want to create new tenant)
TENANT_NAME="new_customer_of_yours"
# new lines are used for readability but have to be removed from the actual request
cat <<EOF | tr -d "\n" > create_tenant.txt
{
"query": "
mutation CreateTenant(\$input: TenantInput!) {
createTenant(input: \$input) {
version {
id,
name,
dbMigrations {
id,
file,
schema
}
}
summary {
startedAt
tenants
migrationsGrandTotal
scriptsGrandTotal
}
}
}",
"operationName": "CreateTenant",
"variables": {
"input": {
"dryRun": true,
"action": "Sync",
"versionName": "$COMMIT_SHA - $TENANT_NAME",
"tenantName": "$TENANT_NAME"
}
}
}
EOF
# and now execute the above query
curl -d @create_tenant.txt http://localhost:$MIGRATOR_PORT/v2/service
Migrator supports multiple operations in a single GraphQL query. Let's fetch source single migrations, source tenant migrations, and tenants in a single GraphQL query:
# new lines are used for readability but have to be removed from the actual request
cat <<EOF | tr -d "\n" > query.txt
{
"query": "
query Data(\$singleMigrationsFilters: SourceMigrationFilters, \$tenantMigrationsFilters: SourceMigrationFilters) {
singleTenantSourceMigrations: sourceMigrations(filters: \$singleMigrationsFilters) {
file
migrationType
}
multiTenantSourceMigrations: sourceMigrations(filters: \$tenantMigrationsFilters) {
file
migrationType
checkSum
}
tenants {
name
}
}",
"operationName": "Data",
"variables": {
"singleMigrationsFilters": {
"migrationType": "SingleMigration"
},
"tenantMigrationsFilters": {
"migrationType": "TenantMigration"
}
}
}
EOF
# and now execute the above query
curl -d @query.txt http://localhost:$MIGRATOR_PORT/v2/service
For more GraphQL query and mutation examples see data/graphql_test.go
.
Let's see how to configure migrator.
migrator configuration file is a simple YAML file. Take a look at a sample migrator.yaml
configuration file which contains the description, correct syntax, and sample values for all available properties.
# required, location where all migrations are stored, see singleSchemas and tenantSchemas below
baseLocation: test/migrations
# required, SQL go driver implementation used, see section "Supported databases"
driver: postgres
# required, dataSource format is specific to SQL go driver implementation used, see section "Supported databases"
dataSource: "user=postgres dbname=migrator_test host=192.168.99.100 port=55432 sslmode=disable"
# optional, override only if you have a specific way of determining tenants, default is:
tenantSelectSQL: "select name from migrator.migrator_tenants"
# optional, override only if you have a specific way of creating tenants, default is:
tenantInsertSQL: "insert into migrator.migrator_tenants (name) values ($1)"
# optional, override only if you have a specific schema placeholder, default is:
schemaPlaceHolder: { schema }
# required, directories of single schema SQL migrations, these are subdirectories of baseLocation
singleMigrations:
- public
- ref
- config
# optional, directories of tenant schemas SQL migrations, these are subdirectories of baseLocation
tenantMigrations:
- tenants
# optional, directories of single SQL scripts which are applied always, these are subdirectories of baseLocation
singleScripts:
- config-scripts
# optional, directories of tenant SQL script which are applied always for all tenants, these are subdirectories of baseLocation
tenantScripts:
- tenants-scripts
# optional, default is 8080
port: 8080
# path prefix is optional and defaults to '/'
# path prefix is used for application HTTP request routing by Application Load Balancers/Application Gateways
# for example when deploying to AWS ECS and using AWS ALB the path prefix could set as below
# then all HTTP requests should be prefixed with that path, for example: /migrator/v1/config, /migrator/v1/migrations/source, etc.
pathPrefix: /migrator
# the webhook configuration section is optional
# the default Content-Type header is application/json but can be overridden via webHookHeaders below
webHookURL: https://your.server.com/services/TTT/BBB/XXX
# if the webhook expects a payload in a specific format there is an option to provide a payload template
# see webhook template for more information
webHookTemplate: '{"text": "New version: ${summary.versionId} started at: ${summary.startedAt} and took ${summary.duration}. Full results are: ${summary}"}'
# should you need more control over HTTP headers use below
webHookHeaders:
- "Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l"
- "Content-Type: application/json"
- "X-Custom-Header: value1,value2"
# optional, allows to filter logs produced by migrator, valid values are: DEBUG, INFO, ERROR, PANIC
# defaults to INFO
logLevel: INFO
migrator supports env variables substitution in config file. All patterns matching ${NAME}
will look for env variable NAME
. Below are some common use cases:
dataSource: "user=${DB_USER} password=${DB_PASSWORD} dbname=${DB_NAME} host=${DB_HOST} port=${DB_PORT}"
webHookHeaders:
- "X-Security-Token: ${SECURITY_TOKEN}"
By default when a webhook is configured migrator will post a JSON representation of Summary
struct to its endpoint.
If your webhook expects a payload in a specific format (say Slack or MS Teams incoming webhooks) there is an option to configure a webHookTemplate
property in migrator's configuration file. The template can have the following placeholders:
${summary}
- will be replaced by a JSON representation ofSummary
struct, all double quotes will be escaped so that the template remains a valid JSON document${summary.field}
- will be replaced by a given field ofSummary
struct
Placeholders can be mixed:
webHookTemplate: '{"text": "New version created: ${summary.versionId} started at: ${summary.startedAt} and took ${summary.duration}. Migrations/scripts total: ${summary.migrationsGrandTotal}/${summary.scriptsGrandTotal}. Full results are: ${summary}"}'
Migrations can be read from local disk, AWS S3, Azure Blob Containers. I'm open to contributions to add more cloud storage options.
If baseLocation
property is a path (either relative or absolute) local storage implementation is used:
# relative path
baseLocation: test/migrations
# absolute path
baseLocation: /project/migrations
If baseLocation
starts with s3://
prefix, AWS S3 implementation is used. In such case the baseLocation
property is treated as a bucket name followed by optional prefix:
# S3 bucket
baseLocation: s3://your-bucket-migrator
# S3 bucket with optional prefix
baseLocation: s3://your-bucket-migrator/appcodename/prod/artefacts
migrator uses official AWS SDK for Go and uses a well known default credential provider chain.
If baseLocation
matches ^https://.*\.blob\.core\.windows\.net/.*
regex, Azure Blob implementation is used. In such case the baseLocation
property is treated as a container URL. The URL can have optional prefix too:
# Azure Blob container URL
baseLocation: https://storageaccountname.blob.core.windows.net/mycontainer
# Azure Blob container URL with optional prefix
baseLocation: https://storageaccountname.blob.core.windows.net/mycontainer/appcodename/prod/artefacts
migrator uses official Azure SDK for Go and supports authentication using Storage Account Key (via AZURE_STORAGE_ACCOUNT
and AZURE_STORAGE_ACCESS_KEY
env variables) as well as much more flexible (and recommended) Azure Active Directory Managed Identity.
Currently migrator supports the following databases including their flavours (like Percona, MariaDB for MySQL, etc.). Please review the Go driver implementation for information about all supported features and how dataSource
configuration property should look like.
Schema-based multi-tenant database, with transactions spanning DDL statements, driver used: https://github.com/lib/pq.
The following versions and flavours are supported:
- PostgreSQL
- Amazon RDS PostgreSQL - PostgreSQL-compatible relational database built for the cloud
- Amazon Aurora PostgreSQL - PostgreSQL-compatible relational database built for the cloud
- Google CloudSQL PostgreSQL - PostgreSQL-compatible relational database built for the cloud
Database-based multi-tenant database, transactions do not span DDL statements, driver used: https://github.com/go-sql-driver/mysql.
The following versions and flavours are supported:
- MySQL
- MariaDB - enhanced near linearly scalable multi-master MySQL
- Percona - an enhanced drop-in replacement for MySQL
- Amazon RDS MySQL - MySQL-compatible relational database built for the cloud
- Amazon Aurora MySQL - MySQL-compatible relational database built for the cloud
- Google CloudSQL MySQL - MySQL-compatible relational database built for the cloud
A relational database management system developed by Microsoft, driver used: https://github.com/denisenkom/go-mssqldb.
The Go driver supports all Microsoft SQL Server versions starting with 2008 R2+.
migrator can be used with an already existing legacy DB migration framework.
If you have an existing way of storing information about your tenants you can configure migrator to use it. In the config file you need to provide 2 configuration properties:
tenantSelectSQL
- a select statement which returns names of the tenantstenantInsertSQL
- an insert statement which creates a new tenant entry, the insert statement should be a valid prepared statement for the SQL driver/database you use, it must accept the name of the new tenant as a parameter; finally should your table require additional columns you need to provide default values for them
Here is an example:
tenantSelectSQL: select name from global.customers
tenantInsertSQL: insert into global.customers (name, active, date_added) values (?, true, NOW())
SQL migrations and scripts can use {schema}
placeholder which will be automatically replaced by migrator with a current schema. For example:
create schema if not exists {schema};
create table if not exists {schema}.modules ( k int, v text );
insert into {schema}.modules values ( 123, '123' );
If you have an existing DB migrations legacy framework which uses different schema placeholder you can override the default one.
In the config file you need to provide schemaPlaceHolder
configuration property:
For example:
schemaPlaceHolder: :tenant
Before switching from a legacy tool you need to synchronise source migrations to migrator. migrator has no knowledge of migrations applied by other tools and as such will attempt to apply all found source migrations.
Synchronising will load all source migrations and mark them as applied. This can be done by CreateVersion
operation with action set to Sync
.
Once the initial synchronisation is done you can use migrator for all the consecutive DB migrations.
When using migrator please remember that:
- migrator creates
migrator
schema together withmigrator_versions
andmigrator_migrations
tables automatically - if you're not using Custom tenants support migrator creates
migrator_tenants
table automatically - when adding a new tenant migrator creates a new DB schema and applies all tenant migrations and scripts
- single schemas are not created automatically, you must add initial migration with
create schema {schema}
SQL statement (see sample migrations intest
folder)
migrator exposes Prometheus metrics at /metrics
endpoint. Apart from migrator-specific metrics, it exposes a lot of OS process and Go metrics.
The following metrics are available:
go_gc_*
- Go garbage collectiongo_memstats_*
- Go memoryprocess_*
- OS processmigrator_gin_request_*
- Gin request metricsmigrator_gin_response_*
- Gin response metricsmigrator_gin_tenants_created
- migrator tenants createdmigrator_gin_versions_created
- migrator versions createdmigrator_gin_migrations_applied{type="single_migrations"}
- migrator single migrations appliedmigrator_gin_migrations_applied{type="single_scripts"}
- migrator single scripts appliedmigrator_gin_migrations_applied{type="tenant_migrations_total"}
- migrator total tenant migrations applied (for all tenants)migrator_gin_migrations_applied{type="tenant_scripts_total"}
- migrator total tenant scripts applied (for all tenants)
Health checks are available at /health
endpoint. migrator implements Eclipse MicroProfile Health 3.0 RC4 spec.
A successful response returns HTTP 200 OK code:
{
"status": "UP",
"checks": [
{
"name": "DB",
"status": "UP"
},
{
"name": "Loader",
"status": "UP"
}
]
}
In case one of the checks has DOWN status then the overall status is DOWN. Failed check has data
field which provides more information on why its status is DOWN. Health check will also return HTTP 503 Service Unavailable code:
{
"status": "DOWN",
"checks": [
{
"name": "DB",
"status": "DOWN",
"data": {
"details": "failed to connect to database: dial tcp 127.0.0.1:5432: connect: connection refused"
}
},
{
"name": "Loader",
"status": "DOWN",
"data": {
"details": "open /nosuchdir/migrations: no such file or directory"
}
}
]
}
In this section I provide links to more in-depth migrator tutorials.
The goal of this tutorial is to deploy migrator to AWS ECS, load migrations from AWS S3 and apply them to AWS RDS DB while storing env variables securely in AWS Secrets Manager. The list of all AWS services used is: IAM, ECS, ECR, Secrets Manager, RDS, and S3.
You can find it in tutorials/aws-ecs.
The goal of this tutorial is to deploy migrator to AWS EKS, load migrations from AWS S3 and apply them to AWS RDS DB. The list of AWS services used is: IAM, EKS, ECR, RDS, and S3.
You can find it in tutorials/aws-eks.
The goal of this tutorial is to publish migrator image to Azure ACR private container repository, deploy migrator to Azure AKS, load migrations from Azure Blob Container and apply them to Azure Database for PostgreSQL. The list of Azure services used is: AKS, ACR, Blob Storage, and Azure Database for PostgreSQL.
You can find it in tutorials/azure-aks.
The goal of this tutorial is to secure migrator with OAuth2. It shows how to deploy oauth2-proxy in front of migrator which will off-load and transparently handle authorization for migrator end-users.
You can find it in tutorials/oauth2-proxy.
The goal of this tutorial is to secure migrator with OAuth2 and OIDC. It shows how to deploy oauth2-proxy and haproxy in front of migrator which will off-load and transparently handle both authorization (oauth2-proxy) and authentication (haproxy with custom lua script) for migrator end-users.
You can find it in tutorials/oauth2-proxy-oidc-haproxy.
Performance benchmarks were moved to a dedicated PERFORMANCE.md document.
Please navigate to migrator/releases for a complete list of versions, features, and change log.
Contributions are most welcomed!
For contributing, code style, running unit & integration tests please see CONTRIBUTING.md.
Copyright 2016-2021 Łukasz Budnik
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.