
[SIP-59] Proposal for Database migration standards #13351

Closed · robdiciuccio opened this issue Feb 25, 2021 · 17 comments
Labels
preset-io sip Superset Improvement Proposal

Comments

@robdiciuccio
Member

robdiciuccio commented Feb 25, 2021

[SIP] Proposal for database migration standards

Motivation

Reduce pain around metadata database migrations by ensuring standards are followed and appropriate reviews are obtained before merging.

Proposed Change

SIP-57 (Semantic Versioning) introduced standards for avoiding breaking changes and general best practices for database migrations. The proposed changes below are in addition to those standards:

  • All migrations must support rollbacks. Migrations must have a functional downgrade method to effectively roll back schema changes introduced in the upgrade method. If a migration makes changes to data that are not easily undone (e.g. fix: Retroactively add granularity param to charts #12960), the changes introduced must be non-breaking and idempotent.
  • Migrations should be atomic and configured to complete fully in a single run, using a single transaction where appropriate. Any failures should trigger a rollback to the previous state. Partial migrations should be avoided.
  • Any constraints added within a migration should include an explicit name, e.g. sa.ForeignKeyConstraint(["user_id"], ["ab_user.id"], name='fk_user_id') (see the example migration sketched after this list).
  • PRs introducing database migrations must include runtime estimates and downtime expectations.
  • Care should be taken to not introduce expensive DDL operations such as adding unnecessary constraints/indexes or setting column default values on tables potentially containing thousands of rows. [1][2]
    • Indexes in Postgres tables should be added and removed CONCURRENTLY.
  • Migrations for breaking changes and cleanup (e.g. removal of columns) that are to be held for the next major version, per the guidelines in SIP-57, should be accumulated in /superset/migrations/next/ for evaluation and inclusion in a future release.
  • Establish GitHub code owners on the superset/migrations directory to ensure PMC members are notified of new or updated migrations.
  • Require two approvals for PRs that include database migrations, including committers from multiple organizations.
  • PRs including database migrations should be kept open for a minimum review period of two business days to allow for adequate review, unless circumstances such as a critical vulnerability or breakage require faster turnaround.
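
As an illustration of the downgrade and explicit-naming guidelines, here is a minimal sketch of what a compliant Alembic migration might look like. The revision identifiers, table, column, and constraint names are hypothetical, not taken from an actual Superset migration:

```python
"""Illustrative migration: explicitly named constraint plus a functional downgrade."""
import sqlalchemy as sa
from alembic import op

# revision identifiers, used by Alembic (placeholders).
revision = "abc123def456"
down_revision = "fedcba654321"


def upgrade():
    # Schema change with an explicitly named constraint so it can be
    # dropped by name in downgrade() on any metadata database backend.
    op.add_column("my_table", sa.Column("user_id", sa.Integer(), nullable=True))
    op.create_foreign_key(
        "fk_my_table_user_id", "my_table", "ab_user", ["user_id"], ["id"]
    )


def downgrade():
    # Roll back everything done in upgrade(), in reverse order.
    op.drop_constraint("fk_my_table_user_id", "my_table", type_="foreignkey")
    op.drop_column("my_table", "user_id")
```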

New or Changed Public Interfaces

None.

New dependencies

No additional package dependencies.

Migration Plan and Compatibility

Workflow changes only. PR template will be updated with guidelines. Process for running migrations unchanged.

Rejected Alternatives

The status quo, which has resulted in quite a bit of thrash, deployment roadblocks and external discussions between Superset users.

@robdiciuccio robdiciuccio added the sip Superset Improvement Proposal label Feb 25, 2021
@etr2460
Member

etr2460 commented Feb 26, 2021

Love the suggestions, thanks for driving this! A couple pieces of feedback:

Migration files must have a functional down method to effectively rollback changes introduced in the up method.

This isn't always possible, especially when the migration is doing some repair of the metadata (vs. adding/removing columns and tables). See #12960 for an example of a migration for which it's impossible to write a down method. Maybe this could be made more precise by saying that all migrations that modify the structure of the DB or its columns must have a down method?

PRs introducing database migrations must include runtime estimates and downtime expectations.

Love this, let's plan to add these as fields in the PR template?

Establish Github code owners

This will be our first use of code owners I think, do you have any thoughts about using this more broadly across the repo? Or have you only thought about the migration use case so far?

Nothing here besides my first point should be considered blocking though, and I'll happily vote +1 on this initiative once the thread is created!

@john-bodley
Member

Should we also consider how we could provide near zero-downtime for migrations which involve DDL operations or is this outside the scope of this SIP?

@mistercrunch
Member

mistercrunch commented Mar 2, 2021

I was just talking to an engineer today (Arash @ Preset) about the idea of using ExtraJSONMixin or a similar pattern to accumulate / delay database migrations. In his case he wanted to add a few new fields to the highly contentious Query model, and I pointed him to it:
https://github.com/apache/superset/blob/master/superset/models/helpers.py#L456-L477

It seems like ExtraJSONMixin could be further improved to be more seamless if we wanted to, but I'm not sure how people feel about it.
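
For readers not familiar with it, here is a rough, simplified sketch of the pattern (not Superset's exact implementation; see the helpers.py link above): the mixin gives a model a generic JSON text column, so new low-churn attributes can be added without a schema migration.

```python
import json

import sqlalchemy as sa


class ExtraJSONMixin:
    """Simplified sketch: stash new attributes in a JSON text column
    instead of adding a dedicated column (and migration) for each one."""

    extra_json = sa.Column(sa.Text, default="{}")

    @property
    def extra(self) -> dict:
        try:
            return json.loads(self.extra_json or "{}")
        except (TypeError, json.JSONDecodeError):
            return {}

    def set_extra_json_key(self, key, value):
        extra = self.extra
        extra[key] = value
        self.extra_json = json.dumps(extra)
```

The trade-off is that fields stored this way are not indexed or directly queryable with plain SQL, so the pattern suits metadata-style attributes rather than anything filtered on in hot paths.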

A few other ideas around this SIP:

  • I would recommend using an accumulation pattern for cleanups that are not immediately needed: if, for instance, we want to remove columns in the database, we can remove the field from the model but delay the related database column cleanup in some sort of migrations/next.py, where we accumulate those cleanup migration scripts and defer them until the next major release (say 2.0.0) where downtime may be expected.
  • It'd be so nice to have blue/green forward-compatibility stamps on migrations, meaning the previous version of the app is guaranteed to work with a future version of the database. If a migration is not blue/green compatible, it should be clearly identified as requiring downtime. I'd recommend really pushing PRs to meet this requirement, and pushing to use the accumulation pattern when that's not the case.

@mistercrunch
Member

mistercrunch commented Mar 2, 2021

Migration files must have a functional down method to effectively rollback changes introduced in the up method.

This isn't always possible, especially when the migration is doing some repair of the metadata (vs. adding/removing columns and tables).

It could be possible in some cases by keeping the data as a backup or renaming the column to enable just that. Of course that doesn't always work: new objects get created and may be missing from the backup, and it can get very tricky to provide that guarantee, since you may have to maintain both the old and new field with the related old/new logic... Probably over-complicated, but we can see on a case-by-case basis whether it makes sense to try to guarantee that down-migration. If it's not possible, we may want to try to delay that migration until a bigger release if possible.

@rusackas
Member

rusackas commented Mar 2, 2021

must include runtime estimates and downtime expectations

We've seen instances in the past where one contributor thought runtime/downtime would be minimal based on their perceived use cases. When merged, other orgs had significantly/exponentially more data that needed migration, and the execution time was a pain point. How can we most accurately provide realistic/reasonable estimates given the fairly disparate use cases and datasets of Superset users/institutions?

@robdiciuccio
Member Author

Migration files must have a functional down method to effectively rollback changes introduced in the up method.

This isn't always possible, especially when the migration is doing some repair of the metadata (vs. adding/removing columns and tables).

Good point. The primary goal here is to be able to successfully roll back from any migration. The example you provided is idempotent and additive, which fits the criteria. How about this updated language?

All migrations must support rollbacks. Migrations must have a functional downgrade method to effectively roll back schema changes introduced in the upgrade method. If a migration makes changes to data that are not easily undone (e.g. #12960), the changes introduced must be non-breaking and idempotent.

@robdiciuccio
Member Author

This will be our first use of code owners I think, do you have any thoughts about using this more broadly across the repo? Or have you only thought about the migration use case so far?

Another use case I'm thinking about for code owners is the new ephemeral test environment workflow code: adding Preset code owners to ensure AWS resources are not changed without account owner approval.

@robdiciuccio
Member Author

We've seen instances in the past where one contributor thought runtime/downtime would be minimal based on their perceived use cases. When merged, other orgs had significantly/exponentially more data that needed migration, and the execution time was a pain point. How can we most accurately provide realistic/reasonable estimates given the fairly disparate use cases and datasets of Superset users/institutions?

Yeah, that's a bit tricky. One idea is to provide run times for different row counts, which could then be reasonably extrapolated for larger datasets. In general, committers notified via the proposed GitHub code owners should know if the tables being altered will incur significant migration overhead.
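
As a rough sketch of how such numbers might be gathered (the config path and row counts are illustrative, and seeding logic is model-specific and omitted), Alembic's Python API can be timed against a database seeded at known row counts:

```python
import time

from alembic import command
from alembic.config import Config


def time_upgrade(alembic_ini: str = "alembic.ini", revision: str = "head") -> float:
    """Run `alembic upgrade <revision>` and return the elapsed seconds."""
    cfg = Config(alembic_ini)
    start = time.monotonic()
    command.upgrade(cfg, revision)
    return time.monotonic() - start


if __name__ == "__main__":
    # Seed the affected tables at a known row count (e.g. 10k / 100k / 1M),
    # run the upgrade, then downgrade and reseed before the next run, so the
    # PR description can report elapsed time per row count.
    print(f"upgrade took {time_upgrade():.1f}s")
```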

Require two approvals for PRs that include database migrations, including committers from multiple organizations.

Should we also require that the PR be open for review for a minimum period of time (48h?) to ensure committers from different orgs have time to review?

@robdiciuccio
Member Author

Should we also consider how we could provide near zero-downtime for migrations which involve DDL operations or is this outside the scope of this SIP?

Making this work for all metadata DB types will be difficult, as the pitfalls and tooling are different for each. We could add some guidance around things like setting default values and creating indexes on tables with many rows, but DDL is going to potentially cause downtime on some systems unless you're using a tool like pt-online-schema-change (for MySQL).
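
For the Postgres index case specifically, here is a sketch of adding an index concurrently from within a migration (table and index names are illustrative; it assumes an Alembic version that provides op.get_context().autocommit_block(), since CREATE INDEX CONCURRENTLY cannot run inside a transaction):

```python
from alembic import op


def upgrade():
    # CONCURRENTLY must run outside a transaction block, so step out of
    # Alembic's per-migration transaction first.
    with op.get_context().autocommit_block():
        op.create_index(
            "ix_my_table_created_on",
            "my_table",
            ["created_on"],
            postgresql_concurrently=True,
        )


def downgrade():
    with op.get_context().autocommit_block():
        op.drop_index(
            "ix_my_table_created_on",
            table_name="my_table",
            postgresql_concurrently=True,
        )
```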

@robdiciuccio
Member Author

Ran across this guidance in the Alembic docs about naming constraints. Thoughts on including this as a requirement for migrations?
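
For reference, the approach recommended in that doc is a MetaData-level naming_convention, which makes autogenerated constraint names deterministic so they can later be dropped by name in a downgrade. Roughly (tokens are from the SQLAlchemy docs; adopting this would complement, not replace, explicit names in individual migrations):

```python
import sqlalchemy as sa

naming_convention = {
    "ix": "ix_%(column_0_label)s",
    "uq": "uq_%(table_name)s_%(column_0_name)s",
    "ck": "ck_%(table_name)s_%(constraint_name)s",
    "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
    "pk": "pk_%(table_name)s",
}

metadata = sa.MetaData(naming_convention=naming_convention)
```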

@craig-rueda
Member

To build on Rob's point above, I'd like to add that I've noticed several migrations that do things like call commit() on their current session multiple times (usually in a loop), which breaks the atomic guarantee of migrations. I'm sure Alembic wraps the current session and intercepts calls to commit() under the covers, but we should still be checking for this sort of thing.
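
To make the anti-pattern concrete, here is a sketch (assuming SQLAlchemy 1.4+; the Slice stand-in and fix_params are placeholders, not Superset's real code): committing inside the loop allows a partially applied data migration if a later row fails, while a single commit keeps the migration atomic.

```python
from alembic import op
from sqlalchemy import Column, Integer, Text
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Slice(Base):  # minimal stand-in for the real chart model
    __tablename__ = "slices"
    id = Column(Integer, primary_key=True)
    params = Column(Text)


def fix_params(params):  # placeholder for the actual repair logic
    return params or "{}"


def upgrade():
    session = Session(bind=op.get_bind())

    # Anti-pattern: session.commit() inside the loop; a failure halfway
    # through leaves the metadata database in a partially migrated state.
    #
    # for chart in session.query(Slice):
    #     chart.params = fix_params(chart.params)
    #     session.commit()

    # Preferred: mutate everything, then commit once, so the whole data
    # migration succeeds or is rolled back as a unit.
    for chart in session.query(Slice):
        chart.params = fix_params(chart.params)
    session.commit()
```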

@robdiciuccio
Member Author

I would recommend using an accumulation pattern for cleanups that are not immediately needed: if, for instance, we want to remove columns in the database, we can remove the field from the model but delay the related database column cleanup in some sort of migrations/next.py, where we accumulate those cleanup migration scripts and defer them until the next major release (say 2.0.0) where downtime may be expected.

@mistercrunch agreed, I added an item for accumulating breaking/cleanup migrations for the next major release

It'd be so nice to have blue/green forward-compatibility stamps on migrations, meaning the previous version of the app is guaranteed to work with a future version of the database. If a migration is not blue/green compatible, it should be clearly identified as requiring downtime. I'd recommend really pushing PRs to meet this requirement, and pushing to use the accumulation pattern when that's not the case.

I think the standards set forth in SIP-57 re: breaking changes should accomplish this goal, unless you have something else in mind?

@robdiciuccio
Member Author

@craig-rueda I added some detail around atomicity of migrations

@robdiciuccio
Member Author

Updated the SIP above based on feedback in this thread. Will send it for a vote on Friday if there are no other discussion items.

@betodealmeida
Member

@robdiciuccio @evans regarding "PRs introducing database migrations must include runtime estimates and downtime expectations", I'm working on a script to run benchmarks on migrations that pre-populates the models:

#13561

@robdiciuccio
Member Author

The SIP has been approved with nine binding +1 votes, four non-binding +1 votes, zero 0 votes and zero -1 votes.

@zytfo zytfo mentioned this issue Nov 20, 2024

Status: Implemented / Done