Take MySQL Column Type Into Account in VStreamer #9331
Merged
deepthi merged 2 commits into vitessio:main on Dec 9, 2021
This is required when we need to match MySQL behavior for data that requires column type information as well. For example, the binlog event metadata makes no distinction between events for a BINARY(4) column and events for a CHAR(4) column with a binary collation like utf8mb4_bin. So we need to know the underlying MySQL column type in order to handle them distinctly: MySQL pads (fixed length) BINARY columns on the right side with null bytes, but it does NOT do that for (fixed length) CHAR columns, regardless of the collation.

Signed-off-by: Matt Lord <mattalord@gmail.com>
And use ToLower when looking for BINARY types to be safe.

Signed-off-by: Matt Lord <mattalord@gmail.com>
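The behavior described in the two commit messages above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `padIfBinary`, its parameters, and the assumption that the column type arrives as a string such as `binary(4)` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// padIfBinary is a hypothetical helper (not the actual Vitess function).
// columnType is assumed to be the MySQL column type string, for example
// "binary(4)" or "char(4)"; length is the declared fixed length in bytes.
func padIfBinary(value []byte, columnType string, length int) []byte {
	// Compare case-insensitively, in the spirit of the ToLower commit.
	// "varbinary(...)" does not start with "binary", so it is left alone.
	if !strings.HasPrefix(strings.ToLower(columnType), "binary") {
		return value // CHAR and every other type: never pad
	}
	if len(value) >= length {
		return value // already full width
	}
	// make() zero-fills, so the tail is right-side null-byte padding.
	padded := make([]byte, length)
	copy(padded, value)
	return padded
}

func main() {
	fmt.Printf("%q\n", padIfBinary([]byte("ab"), "BINARY(4)", 4)) // "ab\x00\x00"
	fmt.Printf("%q\n", padIfBinary([]byte("ab"), "char(4)", 4))   // "ab"
}
```

The point of keying on the column type rather than on the collation is that a CHAR(4) column with utf8mb4_bin and a BINARY(4) column are indistinguishable from the binlog event metadata alone.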
rohit-nayak-ps
approved these changes
Dec 9, 2021
Member
lgtm.
Very nice! An elegant solution to another binlog parser edge case.
deepthi
approved these changes
Dec 9, 2021
Collaborator
Very thorough code comments and nice test case. 💯
Description
In #7969 we added right side null-byte padding to BINARY columns after processing the binlog event in order to match the MySQL behavior and value, so that we correctly calculate the keyspace IDs for BINARY columns in vindex functions like binary_md5 and correctly apply vreplication filters using those columns.

It turns out, however, that row based binlog events make no distinction between a BINARY(4) column and a CHAR(4) column with a binary collation like utf8mb4_bin (you can see a detailed discussion here). So after #7969 we were also, incorrectly, adding right side null-byte padding to CHAR columns with binary collations.

Although we ensured in #8730 that we did not add more padding than the actual column would hold, we really shouldn't be adding any padding at all to CHAR columns if we want to match the MySQL behavior (otherwise we have discrepancies as described here). To do this we needed to thread the target MySQL column type info through the vstreamer and RBR binlog event processing components so that we ONLY add the padding to values for actual BINARY columns and nothing else (CHAR columns being the known problematic case today).

Now we only add the padding if the underlying MySQL column on the target is BINARY. In that case the value is just bytes, not chars made up of N bytes, so the subsequent pad trimming based on the charset was removed.

The manual test case here also now passes in this branch:
Related Issue(s)
Fixes: #9207
Checklist