LG-228 Make migrations safer and more resilient #2127
I tested this locally by updating my local Figaro config, then creating a new migration file, then copying this into it:

    class StatementTimeoutTest < ActiveRecord::Migration[5.1]
      def change
        safety_assured { execute "SELECT pg_sleep(5);" }
      end
    end

Then I ran the migration, and it passed.
lib/deploy/migration.rb
I'm not sure how to test this. Alternatively, to keep Code Climate from complaining, we can use the slowpoke gem instead (which is where I got this code from) and require only this module.
Unfortunately, that won't work because the gem depends on a method in the main class that we don't want to require. So, our options are to tell Code Climate to ignore this file for test coverage or to build our own gem. I will try the former.
Note that this can't be merged until the infrastructure is updated to call the migration from here.
lib/deploy/migration.rb
My one thought here is that it would be nice to give this a more descriptive name, perhaps something like Deploy::MigrationDatabaseTimeout so that it is more clear what's being done in the initializer where we are using it.
jgsmith-usds
left a comment
This looks reasonable to me. My only concern is to make sure the timeout is long enough for reasonable migrations. I fear we'll run into it eventually and be surprised again.
config/database.yml
Would it make sense to add validation somewhere that database_statement_timeout is present? Looks like if it's nil this will set the timeout to 0, not sure what effect that will have.
Yep. Will add it to figaro.rb.
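For illustration, here is a minimal sketch of the kind of presence check discussed above. The helper name `validated_statement_timeout` is hypothetical and is not what the PR adds to `figaro.rb`. It only shows why a missing value matters: `nil.to_i` is `0`, and PostgreSQL treats `statement_timeout = 0` as "no timeout".

```ruby
# Hypothetical helper for illustration only; not the code added in this PR.
# Rejects a missing or blank database_statement_timeout instead of letting
# it silently become 0 (nil.to_i == 0), which PostgreSQL reads as
# "statement timeout disabled".
def validated_statement_timeout(raw)
  if raw.nil? || raw.to_s.strip.empty?
    raise ArgumentError, 'database_statement_timeout must be set'
  end

  # Kernel#Integer raises ArgumentError on non-numeric input such as "abc".
  Integer(raw.to_s.strip, 10)
end
```

Failing at boot with a clear message is arguably easier to debug than a deploy that quietly runs with the timeout disabled.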
Thanks, looks like this would've avoided last week's issue. Why is this blocked on the identity-devops PR? I would've thought this would just not be executed until that one is merged.
It's also worth noting that this doesn't address the race condition or rollback problems described in https://github.com/18F/identity-private/issues/2344, but it's a step in the right direction.
Do we think 60 seconds is a big enough statement timeout for migrations?
**Why**: We recently ran into an issue while deploying RC 56 to production due to a migration needing more time than allowed by our `statement_timeout`.

**How**:
- Add the `strong_migrations` gem, which will check your migrations for unsafe usage and best practices. Read more here: https://github.com/ankane/strong_migrations
- Allow configuring the `statement_timeout` in `database.yml` via Figaro
- Allow overriding the `statement_timeout` via an ENV var for migrations. Code shamelessly copied from the Slowpoke gem: https://github.com/ankane/slowpoke
- Add a `deploy/migrate` script so that it is maintained in this repo as opposed to the devops repo.
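As a rough sketch of the override idea described above (not the code in this PR, and not Slowpoke's implementation): the ENV var name `MIGRATION_STATEMENT_TIMEOUT` and both helper names below are assumptions made for illustration. The only PostgreSQL fact relied on is that `SET statement_timeout = <integer>` takes a value in milliseconds.

```ruby
# Hypothetical names throughout; the actual ENV var and module in the PR
# may differ.

# Returns the timeout (in ms) to use for migrations: the ENV override when
# present, otherwise the value configured for the app.
def migration_statement_timeout(env, configured_ms)
  override = env['MIGRATION_STATEMENT_TIMEOUT']
  return configured_ms if override.nil? || override.strip.empty?

  Integer(override.strip, 10)
end

# Builds the SQL that would be executed on the migration connection.
# Integer() guards against interpolating anything that is not a number.
def statement_timeout_sql(timeout_ms)
  "SET statement_timeout = #{Integer(timeout_ms)}"
end

# Example usage, assuming a configured app timeout of 2500 ms:
#   statement_timeout_sql(migration_statement_timeout(ENV, 2500))
```

Keeping the override in an ENV var means a long-running migration can get more time without loosening the timeout for normal web requests.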
Hi! Before submitting your PR for review, and/or before merging it, please go through the following checklist:

- For DB changes, check for missing indexes, check to see if the changes affect other apps (such as the dashboard), make sure the DB columns in the various environments are properly populated, coordinate with devops, plan migrations in separate steps.
- For route changes, make sure GET requests don't change state or result in destructive behavior. GET requests should only result in information being read, not written.
- For encryption changes, make sure it is compatible with data that was encrypted with the old code.
- For secrets changes, make sure to update the S3 secrets bucket with the new configs in all environments.
- Do not disable Rubocop or Reek offenses unless you are absolutely sure they are false positives. If you're not sure how to fix the offense, please ask a teammate.
- When reading data, write tests for nil values, empty strings, and invalid formats.
- When calling `redirect_to` in a controller, use `_url`, not `_path`.
- When adding user data to the session, use the `user_session` helper instead of the `session` helper so the data does not persist beyond the user's session.
- When adding a new controller that requires the user to be fully authenticated, make sure to add `before_action :confirm_two_factor_authenticated`.