Online DDL: avoid SQL's CONVERT(...), convert programmatically if needed #16597
…ify column's charset or collation Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…onvert for vplayer Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
go/vt/vttablet/onlineddl/vrepl.go
		sb.WriteString(fmt.Sprintf("CONCAT(%s)", escapeName(name)))
	case sourceCol.Type() == "json":
		sb.WriteString(fmt.Sprintf("convert(%s using utf8mb4)", escapeName(name)))
	case targetCol.Type() == "json" && sourceCol.Type() != "json":
This moves up from below so as to eliminate a case before we compare charsets for JSONs, which is not required and not beneficial.
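The ordering point can be sketched as follows — a simplified, hypothetical model of the case evaluation (plain strings stand in for the real column types; `expressionFor` is an invented name, not the actual function in vrepl.go). Because the cases are evaluated top to bottom, handling JSON sources first means the charset/collation comparison is never reached for them:

```go
package main

import "fmt"

// expressionFor is a simplified sketch: cases run top to bottom, so
// handling a JSON source first skips any charset comparison for it.
func expressionFor(sourceType, targetType, name string) string {
	switch {
	case sourceType == "json":
		// JSON source: convert to utf8mb4 regardless of collations.
		return fmt.Sprintf("convert(`%s` using utf8mb4)", name)
	case targetType == "json":
		// Non-JSON source going into a JSON target.
		return fmt.Sprintf("convert(`%s` using utf8mb4)", name)
	default:
		// Only non-JSON columns ever reach the charset/collation logic.
		return fmt.Sprintf("`%s`", name)
	}
}

func main() {
	fmt.Println(expressionFor("json", "json", "j1"))
	fmt.Println(expressionFor("varchar", "varchar", "name"))
}
```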
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files@@ Coverage Diff @@
## main #16597 +/- ##
==========================================
- Coverage 68.85% 68.84% -0.02%
==========================================
Files 1557 1557
Lines 199891 200003 +112
==========================================
+ Hits 137644 137697 +53
- Misses 62247 62306 +59
☔ View full report in Codecov by Sentry.
@@ -646,6 +654,24 @@ func appendFromRow(pq *sqlparser.ParsedQuery, buf *bytes2.Buffer, fields []*quer
			buf.WriteString(sqltypes.NullStr)
		} else {
			vv := sqltypes.MakeTrusted(typ, row.Values[col.offset:col.offset+col.length])
Does this also allocate and later on too? Is it worth avoiding creating this if we overwrite it later?
Done. No double allocation. Also, converged the two codepaths that do charset.Convert() into a single convertStringCharset() function.
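A converged conversion helper of the kind described could look roughly like this — a minimal sketch only, using the standard library in place of Vitess's charset package, with ISO-8859-1 standing in for "latin1" (every ISO-8859-1 byte maps to the same Unicode code point). The function name matches the one mentioned above, but the body and supported conversions are illustrative assumptions:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// convertStringCharset is a simplified stand-in for a single shared
// conversion helper. It only knows two conversions; the real code
// delegates to Vitess's charset machinery.
func convertStringCharset(val []byte, from, to string) ([]byte, error) {
	switch {
	case from == "latin1" && to == "utf8mb4":
		// Each ISO-8859-1 byte is the identically numbered code point.
		out := make([]byte, 0, len(val)*2)
		for _, b := range val {
			out = utf8.AppendRune(out, rune(b))
		}
		return out, nil
	case from == "utf8mb4" && to == "ascii":
		for _, r := range string(val) {
			if r > 127 {
				// Mimic MySQL's ERROR 1366 for unconvertible data.
				return nil, fmt.Errorf("Incorrect string value: %q", r)
			}
		}
		return val, nil
	}
	return nil, fmt.Errorf("unsupported conversion %s -> %s", from, to)
}

func main() {
	out, err := convertStringCharset([]byte{0xE9}, "latin1", "utf8mb4") // é
	fmt.Println(string(out), err)
	_, err = convertStringCharset([]byte("smiley 😀"), "utf8mb4", "ascii")
	fmt.Println(err != nil)
}
```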
	sqlbuffer.WriteString(", ")
}
- if err := appendFromRow(tp.BulkInsertValues, sqlbuffer, tp.Fields, row, tp.FieldsToSkip); err != nil {
+ if err := tp.appendFromRow(tp.BulkInsertValues, sqlbuffer, tp.Fields, row, tp.FieldsToSkip); err != nil {
If we make this change, which I'm OK with, then we don't need to pass in the other tp struct values:
tp.appendFromRow(sqlbuffer, row)
Good catch! Fixed.
mattlord left a comment:
💅 This is great! I think that this will solve so many edge cases we've seen in production. ❤️ Just a couple of minor points so far.
go/vt/vttablet/onlineddl/vrepl.go
if trivialCharset(fromCollation) && trivialCharset(toCollation) && targetCol.Type() != "json" {
	sb.WriteString(escapeName(name))
} else if fromCollation == toCollation && targetCol.Type() != "json" {
We don't want && targetCol.Type() != "json" here and just above, do we? We already handle the non-JSON to JSON case above. We'd fall into the else case below where we'd say there's a collation conversion necessary even though there isn't. No?
In any event, I don't think this is a major issue as the primary issue we've seen on the target/vplayer side is where we were unable to use the desired index because of the CONVERT usage and you can't add indexes directly on JSON columns anyway.
We already handle the non-JSON to JSON case above.
You're right! We changed the case ordering and now we don't need this check. Fixed: removed three unnecessary checks in total.
Yes, we're still left with a few CONVERT(...)s in the code: for JSONs and for ENUMs. For JSONs it's as you say - not something you can even put in a primary key or any unique key; for ENUMs it's more complex. I'll take it to another PR.
case sourceCol.Type() == "json":
	sb.WriteString(fmt.Sprintf("convert(%s using utf8mb4)", escapeName(name)))
@dbussink do you think this is still needed? I don't think so anymore, now that we have native JSON type support.
(against a v21 vtgate here)
❯ mysql commerce -e "create table json_test (id int not null primary key, j1 json); insert into json_test values (1, '{\"name\":\"Matt\"}')"
❯ mysql commerce -e "insert into json_test select id+10, j1 from json_test"
❯ mysql commerce -e "select * from json_test" --column-type-info
Field 1: `id`
Catalog: `def`
Database: `commerce`
Table: `json_test`
Org_table: `json_test`
Type: LONG
Collation: binary (63)
Length: 11
Max_length: 2
Decimals: 0
Flags: NOT_NULL PRI_KEY NO_DEFAULT_VALUE NUM PART_KEY
Field 2: `j1`
Catalog: `def`
Database: `commerce`
Table: `json_test`
Org_table: `json_test`
Type: JSON
Collation: binary (63)
Length: 4294967295
Max_length: 16
Decimals: 0
Flags: BLOB BINARY
+----+------------------+
| id | j1 |
+----+------------------+
| 1 | {"name": "Matt"} |
| 11 | {"name": "Matt"} |
+----+------------------+
I expect this to be bytes we pass on to MySQL "on the other side" and they are interpreted there as either a JSON field or serialized as a utf8mb4 string if some other type on the target.
Either way, I don't think it's a major deal on the source/vcopier side as the primary problems we've seen there are when these CONVERT calls then preclude us from using the desired index in the rowstreamer query and you can't add indexes directly on JSON columns anyway.
Let's leave it like so for now.
JSON is a bit special anyway, since we can't use the direct textual representation, but we turn it into a sql expression using JSON_OBJECT so we lose as little type information as possible.
if conversion, ok := tp.ConvertCharset[col.field.Name]; ok && col.length >= 0 {
	// Non-null string value, for which we have a charset conversion instruction
	fromCollation := tp.CollationEnv.DefaultCollationForCharset(conversion.FromCharset)
Do we have to rely on the default collation for the charset (on from and to side)? If we take utf8mb4 for example:
mysql> show collation where charset = 'utf8mb4';
+----------------------------+---------+-----+---------+----------+---------+---------------+
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+----------------------------+---------+-----+---------+----------+---------+---------------+
| utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes | Yes | 0 | NO PAD |
| utf8mb4_0900_as_ci | utf8mb4 | 305 | | Yes | 0 | NO PAD |
| utf8mb4_0900_as_cs | utf8mb4 | 278 | | Yes | 0 | NO PAD |
| utf8mb4_0900_bin | utf8mb4 | 309 | | Yes | 1 | NO PAD |
| utf8mb4_bg_0900_ai_ci | utf8mb4 | 318 | | Yes | 0 | NO PAD |
| utf8mb4_bg_0900_as_cs | utf8mb4 | 319 | | Yes | 0 | NO PAD |
| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 | PAD SPACE |
...
| utf8mb4_turkish_ci | utf8mb4 | 233 | | Yes | 8 | PAD SPACE |
| utf8mb4_unicode_520_ci | utf8mb4 | 246 | | Yes | 8 | PAD SPACE |
| utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 8 | PAD SPACE |
| utf8mb4_vietnamese_ci | utf8mb4 | 247 | | Yes | 8 | PAD SPACE |
| utf8mb4_vi_0900_ai_ci | utf8mb4 | 277 | | Yes | 0 | NO PAD |
| utf8mb4_vi_0900_as_cs | utf8mb4 | 300 | | Yes | 0 | NO PAD |
| utf8mb4_zh_0900_as_cs | utf8mb4 | 308 | | Yes | 0 | NO PAD |
+----------------------------+---------+-----+---------+----------+---------+---------------+
89 rows in set (0.00 sec)
If you're up for squeezing another change in here... I think we might want to make it ConvertCollation that we use in OnlineDDL — or if we leave the field name the same, just use the collation name when possible rather than the charset name. The collation is specific, and it implies the character set. Perhaps we truly only care about the character set in this scenario though... 🤔
Do we have to rely on the default collation for the charset (on from and to side)? If we take utf8mb4 for example:
It's a bit moot. We only use Collation as an intermediate step to get from the named charset (e.g. "latin1") into a Charset object. So we may as well use the default collation to get there.
Perhaps we truly only care about the character set in this scenario though... 🤔
This is worth digging into. If we do end up using collation rather than charset, then there's a few proto changes to make, so this will be outside the scope of this PR.
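The "default collation as a bridge" idea, together with the unsupported-charset validation, can be sketched like so — a hypothetical, self-contained model where `Unknown` mimics the `collations.Unknown` sentinel and the map stands in for the real collation environment (the IDs and names are illustrative, not authoritative):

```go
package main

import "fmt"

// CollationID mimics a numeric collation identifier; Unknown mimics
// the collations.Unknown sentinel used for unsupported charsets.
type CollationID int

const Unknown CollationID = 0

// defaultCollationForCharset is a stand-in for the collation
// environment's charset-name -> default-collation lookup.
var defaultCollationForCharset = map[string]CollationID{
	"latin1":  8,
	"utf8mb4": 255,
}

// lookup resolves a charset name via its default collation, erroring
// out when the charset is not supported — the collation is only an
// intermediate step toward the charset object.
func lookup(charsetName string) (CollationID, error) {
	id, ok := defaultCollationForCharset[charsetName]
	if !ok {
		return Unknown, fmt.Errorf("character set %s not supported", charsetName)
	}
	return id, nil
}

func main() {
	id, err := lookup("latin1")
	fmt.Println(id, err)
	_, err = lookup("klingon")
	fmt.Println(err != nil)
}
```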
// Non-null string value, for which we have a charset conversion instruction
fromCollation := tp.CollationEnv.DefaultCollationForCharset(conversion.FromCharset)
if fromCollation == collations.Unknown {
	return vterrors.Errorf(vtrpcpb.Code_INVALID_ARGUMENT, "Character set %s not supported for column %s", conversion.FromCharset, col.field.Name)
Nit, but errors aren't supposed to be capitalized (due to wrapping). That applies throughout the new code in the PR.
Fixed! One place where I did leave the message capitalized is in "Incorrect string value" - this string mimics the error message MySQL would have given for the equivalent SQL CONVERT(...) function, and I think we should keep this as it promotes consistency.
} else {
	vv := sqltypes.MakeTrusted(typ, row.Values[col.offset:col.offset+col.length])
	if conversion, ok := tp.ConvertCharset[col.field.Name]; ok && col.length >= 0 {
We don't want col.length > 0 here? If there are no chars/bytes then I wouldn't think we need to do anything in this regard.
Due to my bad English, I'm not sure if you mean we should use col.length >= 0 or if you mean we shouldn't use col.length >= 0.
Just in case you mean the former, we do have col.length >= 0 at the end of this line, in case you've missed it.
If you meant the latter, then col.length >= 0 in this context is an indicator that the value is not NULL, and we should test this or otherwise the conversion will break.
@dbussink pointed out that you meant to highlight > 0 rather than >= 0. Agreed, and fixed!
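The `>= 0` vs `> 0` point discussed here can be sketched in miniature — a hypothetical model in which, as in the discussion, a negative length encodes SQL NULL, so `length >= 0` means "not NULL" while `length > 0` additionally skips empty strings, which need no conversion (`colInfo` and `needsConversion` are invented names for illustration):

```go
package main

import "fmt"

// colInfo sketches the row-value layout under discussion: each column
// carries an offset and length into a shared values buffer, and a
// negative length encodes SQL NULL.
type colInfo struct {
	offset, length int
}

// needsConversion returns true only for non-NULL, non-empty values:
// NULLs must not be converted, and empty strings gain nothing from it.
func needsConversion(c colInfo) bool {
	return c.length > 0
}

func main() {
	fmt.Println(needsConversion(colInfo{0, -1})) // NULL
	fmt.Println(needsConversion(colInfo{0, 0}))  // empty string
	fmt.Println(needsConversion(colInfo{0, 5}))  // five-byte value
}
```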
I'm backporting this to all supported versions as I see this as an important bugfix.
Description
Fixes #16023
We have a clear picture and a fix to #16023. The original reason why we needed convert() in the first place is that vreplication and vstreamer both issue a SET NAMES binary. We will want to change that in the future, but this PR in the meantime conforms to the binary connection charset. So we used convert() to turn textual values into utf8mb4. On the other side, vplayer reads events from the binary log. It used programmatic conversion (charset.Convert()) of the data to utf8mb4 to align with vcopier.

What we are doing now:
- We no longer use convert(), solving the sorting issue described in Bug Report: OnlineDDL PK conversion results in table scans #16023 (comment)
- When vcopier reads data, we introduce programmatic conversion of non-UTF columns into their designated charsets.
- In vplayer, we do not convert at all if both source and target have the same charset.
- Otherwise, in vplayer, we apply programmatic conversion of non-UTF columns into their designated charsets, with similar logic as for vcopier.
- On a charset.Convert() error, we translate it into ERROR 1366 ("Incorrect string value ..."), which is a terminal error in vreplication, and so the migration bails out as soon as that happens. This can happen if e.g. we're converting a UTF column into ASCII and the UTF column contains a smiley emoji.

Because we do not convert the original charset to utf8mb4, we get to programmatically convert it to the specific target column. Previously (and this is perhaps the last piece of magic I have not dug into yet, and again likely to be caused by the binary charset) we did not need to convert into the target charset.

All the tests remain the same, and we introduce a couple of new ones.
Related Issue(s)
Backport
I wish to backport this to all supported versions, seeing that this is a bugfix: without this fix some migrations will slow down to a near halt.
Checklist
Deployment Notes