VReplication: Pad binlog values for binary() columns to match the value returned by mysql selects #7969
Just reviewed with Sugu, and he recommended that the padding be done right when the value is read from the binlog, not just when computing the keyspace id. That way the padded value, which is also what a mysql select returns, is sent to all consumers of the vstream. So moving it to draft for now.
… a select query. This also ensures that if such columns are used as sharding keys we get the same keyspace_id Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…rs see the padded value instead of doing it later in vstreamer or doing it just for keyspace id computation Signed-off-by: Rohit Nayak <rohit@planetscale.com>
shlomi-noach
left a comment
It's painful to see how small the change is :)
    paddedData := make([]byte, max)
    copy(paddedData[:l], mdata)
    mdata = paddedData
}
There's a bug here for CHAR columns using binary collations. If you have, e.g., CHAR(3) COLLATE utf8mb4_bin, then the padding goes out to the max byte length of 12 and you get this in the vreplication stream:
'FOO\0\0\0\0\0\0\0\0\0'
The SQL then fails because the value is too long for the column.
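The arithmetic behind this report can be sketched in Go. This is only an illustration, not the code under review: `padToMaxBytes` is a hypothetical helper, and the figure of 4 bytes per character is the utf8mb4 upper bound assumed from the "max byte length of 12" mentioned above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// padToMaxBytes is a hypothetical helper that null-pads a value out to
// maxBytes, the way the comment above describes for the binlog value.
func padToMaxBytes(value []byte, maxBytes int) []byte {
	if len(value) >= maxBytes {
		return value
	}
	padded := make([]byte, maxBytes) // make zero-fills the slice
	copy(padded, value)
	return padded
}

func main() {
	const declaredChars = 3 // CHAR(3)
	const bytesPerChar = 4  // assumed upper bound for utf8mb4

	padded := padToMaxBytes([]byte("FOO"), declaredChars*bytesPerChar)

	// Each 0x00 byte is a one-byte character, so the padded value holds
	// far more characters than the column declares.
	fmt.Printf("%q: %d bytes, %d characters, column holds %d\n",
		padded, len(padded), utf8.RuneCount(padded), declaredChars)
}
```

Because every padding byte counts as one character, padding by byte length only makes sense for a true fixed-length binary type, where length is measured in bytes.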
sougou
left a comment
I feel like a similar bug may exist with CHAR(N), where we may need to pad with spaces.
    }
  },
  "tables": {
  "tables": {
Let us plan to backport this to 10.0, 9.0 and 8.0.
Also, just realized there was an almost identical recent contribution to
…ue returned by mysql selects
Backport of vitessio#7969
* Pad binlog values for binary() columns to match the value returned by a select query. This also ensures that if such columns are used as sharding keys we get the same keyspace_id
* Pad binary() values in the binlog reader directly so that all consumers see the padded value instead of doing it later in vstreamer or doing it just for keyspace id computation
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Description
For fixed-length binary columns, MySQL internally pads values on the right with nulls. However, the binlogs contain the original, unpadded value. This causes a problem in vreplication workflows if a binary column is used as a sharding key.
During the copy phase, the row is read using a MySQL select and the column has the padded value. While replicating, however, we get the unpadded value from the binlog, so the same row yields two different keyspace_ids. We found bugs where the copy phase inserted the row in one shard but the update during the catchup phase routed the row to another shard where, of course, the update failed.
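A small Go sketch of why the two values route differently. SHA-256 is used here purely as a stand-in for a hash-based sharding function; `keyspaceID` and `shardFor` are hypothetical names and do not correspond to Vitess vindex code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// keyspaceID is a stand-in for a hash-based sharding function
// (assumption: any deterministic hash of the sharding key bytes).
func keyspaceID(shardingKey []byte) [sha256.Size]byte {
	return sha256.Sum256(shardingKey)
}

// shardFor maps a keyspace ID onto one of numShards equal ranges,
// using only the first byte of the ID.
func shardFor(id [sha256.Size]byte, numShards int) int {
	return int(id[0]) * numShards / 256
}

func main() {
	fromSelect := []byte("ab\x00\x00") // BINARY(4) value as a select returns it
	fromBinlog := []byte("ab")         // the same value as stored in the binlog

	a, b := keyspaceID(fromSelect), keyspaceID(fromBinlog)
	fmt.Printf("copy phase:  id=%x shard=%d\n", a[:4], shardFor(a, 4))
	fmt.Printf("replication: id=%x shard=%d\n", b[:4], shardFor(b, 4))
	fmt.Println("same keyspace id:", a == b)
}
```

Different IDs do not always land on different shards, but whenever they do, the replicated update is routed away from the shard that holds the copied row.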
To fix this issue, the PR pads the value found in the binlog to match the value returned by MySQL.
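A minimal sketch of that padding step, assuming the declared column length is known when the binlog value is decoded. `padBinary` and its signature are hypothetical and are not the function added by this PR.

```go
package main

import "fmt"

// padBinary right-pads a BINARY(N) value read from the binlog with 0x00
// bytes up to columnLen, so it matches what a select would return.
// Hypothetical helper: name and signature are assumptions.
func padBinary(value []byte, columnLen int) []byte {
	if len(value) >= columnLen {
		return value // already full length, nothing to pad
	}
	padded := make([]byte, columnLen) // make zero-fills the slice
	copy(padded, value)
	return padded
}

func main() {
	fmt.Printf("%q\n", padBinary([]byte("ab"), 4))
	fmt.Printf("%q\n", padBinary([]byte("abcd"), 4))
}
```

Doing this at the point where the binlog is read, as suggested in the review, means every consumer of the stream sees the padded value, not only the keyspace id computation.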
Related Issue(s)
#3984