You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While using gh-ost, I came across the error "FATAL No shared unique key can be found after ALTER! Bailing out" already reported in issue #647.
Consider the following table:
CREATE TABLE `txn` (
`txnId` int(10) unsigned NOT NULL AUTO_INCREMENT,
`submittedDate` date NOT NULL DEFAULT '0000-00-00',
`merchantId` int(10) unsigned NOT NULL DEFAULT '0',
`amt` decimal(19,6) DEFAULT NULL,
PRIMARY KEY (`txnId`,`submittedDate`)
)
If one tries to do the alter table to replace PRIMARY KEY (`txnId`) with PRIMARY KEY (`txnId`,`submittedDate`) with --alter="DROP PRIMARY KEY, ADD PRIMARY KEY (txnId, submittedDate)", gh-ost will abort the operation with the error "FATAL No shared unique key can be found after ALTER! Bailing out"
The example above is one of the "not allowed" cases. Looking at the code, the logic mandated that both original and ghost tables must share at least one "unique key". This makes sense to a certain point, otherwise, inserts in ghost table could fail with duplicate unique keys.
However, if we look at the alter table above, it is not possible to have duplicate unique key errors because the original unique key is still part of the new key. If the data in original table was unique according to the original unique key, it will still be unique if the new unique key in ghost table has the columns of the original unique key, even if the new column in the new unique key can have duplicate values.
In my view, the current code should be changed. Instead of trying to find pairs of same key in both tables, it should check if each unique key in original table exists as a subset of a unique key in the list of unique keys in ghost table.
I created a pull request with the code change to fix this problem.
Note 1:
The proposed 2-step migration for the case above could be ok, but our table has half billion records, consuming 1.5 TB of space. When we run gh-ost on this table, it runs for almost 30 hours.. doing alter table in 2 steps would mean 60 hours migration.
Note 2:
At first glance, the above alter table doesn't look like a real world use case, but we need to do this so we can create partitions on the table using the column submittedDate and that is not possible without it being part of the primary key. The real alter table that we are running with ghost is something like this and with the code fix in the PR it works really well.
--alter="DROP PRIMARY KEY, ADD PRIMARY KEY (txnId, submittedDate), ADD COLUMN(newColumn1 varchar(255) DEFAULT NULL, newColumn2 varchar(255) DEFAULT NULL, ...) PARTITION BY RANGE (TO_DAYS(submittedDate)) (PARTITION p_20220101 VALUES LESS THAN (TO_DAYS('2022-01-01')), PARTITION p_20220102 VALUES LESS THAN (TO_DAYS('2022-01-02')), (...), PARTITION p_20241230 VALUES LESS THAN (TO_DAYS('2024-12-30')), PARTITION p_20241231 VALUES LESS THAN (TO_DAYS('2024-12-31')), PARTITION p_max VALUES LESS THAN MAXVALUE)"
The text was updated successfully, but these errors were encountered:
While using gh-ost, I came across the error "FATAL No shared unique key can be found after ALTER! Bailing out" already reported in issue #647.
Consider the following table:
If one tries to do the alter table to replace
PRIMARY KEY (`txnId`)
withPRIMARY KEY (`txnId`,`submittedDate`)
with--alter="DROP PRIMARY KEY, ADD PRIMARY KEY (txnId, submittedDate)"
, gh-ost will abort the operation with the error "FATAL No shared unique key can be found after ALTER! Bailing out"On the page https://github.com/github/gh-ost/blob/master/doc/shared-key.md it is defined what is allowed and what is not and if this error happens, it is mentioned that a 2-step migration is required.
The example above is one of the "not allowed" cases. Looking at the code, the logic mandated that both original and ghost tables must share at least one "unique key". This makes sense to a certain point, otherwise, inserts in ghost table could fail with duplicate unique keys.
However, if we look at the alter table above, it is not possible to have duplicate unique key errors because the original unique key is still part of the new key. If the data in original table was unique according to the original unique key, it will still be unique if the new unique key in ghost table has the columns of the original unique key, even if the new column in the new unique key can have duplicate values.
In my view, the current code should be changed. Instead of trying to find pairs of same key in both tables, it should check if each unique key in original table exists as a subset of a unique key in the list of unique keys in ghost table.
I created a pull request with the code change to fix this problem.
Note 1:
The proposed 2-step migration for the case above could be ok, but our table has half billion records, consuming 1.5 TB of space. When we run gh-ost on this table, it runs for almost 30 hours.. doing alter table in 2 steps would mean 60 hours migration.
Note 2:
At first glance, the above alter table doesn't look like a real world use case, but we need to do this so we can create partitions on the table using the column
submittedDate
and that is not possible without it being part of the primary key. The real alter table that we are running with ghost is something like this and with the code fix in the PR it works really well.--alter="DROP PRIMARY KEY, ADD PRIMARY KEY (txnId, submittedDate), ADD COLUMN(newColumn1 varchar(255) DEFAULT NULL, newColumn2 varchar(255) DEFAULT NULL, ...) PARTITION BY RANGE (TO_DAYS(submittedDate)) (PARTITION p_20220101 VALUES LESS THAN (TO_DAYS('2022-01-01')), PARTITION p_20220102 VALUES LESS THAN (TO_DAYS('2022-01-02')), (...), PARTITION p_20241230 VALUES LESS THAN (TO_DAYS('2024-12-30')), PARTITION p_20241231 VALUES LESS THAN (TO_DAYS('2024-12-31')), PARTITION p_max VALUES LESS THAN MAXVALUE)"
The text was updated successfully, but these errors were encountered: