Commit 387325f

Convert all tables to use UTF8MB4 charset
We have recently seen an increasing number of errors caused by publishers attempting to use 4-byte characters in Whitehall. This commit adds a data migration that converts all of the tables to the UTF8MB4 charset. It is a data migration rather than a schema migration because data migrations are not subject to the 15-minute limit that the Kubernetes configuration applies to schema migrations. The migration will need to be run out of hours because the tables are locked during conversion.
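
To illustrate the kind of input behind these errors: MySQL's legacy `utf8` charset stores at most three bytes per character, so any character that needs four bytes in UTF-8 (most emoji, for example) cannot be saved. The sketch below is not part of the commit; `four_byte_chars` is a hypothetical helper name used only to show how such characters could be spotted in a string.

```ruby
# Hypothetical helper (not part of this commit): return the characters in a
# string that need four bytes in UTF-8, and so cannot be stored in a column
# using MySQL's three-byte `utf8` charset.
def four_byte_chars(text)
  text.each_char.select { |char| char.bytesize == 4 }
end

# "\u00e9" (e with acute accent) needs 2 bytes, "\u2615" (hot beverage) needs 3,
# and "\u{1F600}" (grinning face) needs 4, so only the last one is returned.
four_byte_chars("Caf\u00e9 \u2615 \u{1F600}")
```

A check like this could be used to find affected content before the conversion runs, though the commit itself does not do so.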
1 parent 6d6c23c commit 387325f

3 files changed, +187 -175 lines changed

config/database.yml

+1 -1
@@ -1,5 +1,5 @@
 default: &default
-  encoding: utf8
+  encoding: utf8mb4
   adapter: mysql2
   prepared_statements: true
   variables:

@@ -0,0 +1,12 @@
+# Convert all tables to utf8mb4 in order to improve support for non-English languages.
+
+connection = ActiveRecord::Base.connection
+# Disable foreign key constraints because foreign keys using strings will prevent conversion due to a charset mismatch.
+connection.execute "SET foreign_key_checks = 0;"
+connection.tables.each do |table|
+  next if table == "schema_migrations"
+  puts "START: Converting table #{table} to utf8mb4"
+  connection.execute "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"
+  puts "END: Converted table #{table} to utf8mb4"
+end
+connection.execute "SET foreign_key_checks = 1;"
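
One possible variation on the migration above, shown only as a sketch and not as the committed code: wrapping the loop in a method with an `ensure` clause would restore `foreign_key_checks` even if one of the `ALTER TABLE` statements raises. The method name `convert_tables_to_utf8mb4` and the `skip:` parameter are assumptions introduced for illustration; `connection` is assumed to be any object that responds to `tables` and `execute`, as `ActiveRecord::Base.connection` does.

```ruby
# Hypothetical variant of the migration (an assumption, not the committed code).
# `connection` is any object responding to `tables` and `execute`.
def convert_tables_to_utf8mb4(connection, skip: ["schema_migrations"])
  # Foreign keys on string columns would otherwise block conversion
  # because of the temporary charset mismatch between tables.
  connection.execute "SET foreign_key_checks = 0;"

  (connection.tables - skip).each do |table|
    connection.execute "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"
  end
ensure
  # Runs whether or not a statement above raised, so the session
  # is not left with foreign key checks disabled.
  connection.execute "SET foreign_key_checks = 1;"
end
```

With ActiveRecord loaded, this would be called as `convert_tables_to_utf8mb4(ActiveRecord::Base.connection)`. The trade-off is a little extra structure in exchange for predictable cleanup if a conversion fails partway through.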
