dev/core#2146 - Long unicode contact names get truncated badly causing a crash #18862

demeritcowboy · 2020-10-27T00:34:59Z

Overview

https://lab.civicrm.org/dev/core/-/issues/2146

Go to create a contact. It's slightly easier to see with Organization.
Enter a pretty long name using unicode characters into the organization name field, around 128 characters.
Click Save.
DB Error: Unknown error.

Before

DB Error: Unknown error when saving the contact.

After

No error. Names that are too long get truncated.

Technical Details

substr() isn't multibyte-aware, so it can cut a name in the middle of a multibyte character. A couple of lines up, the code already calls copyValues(), which in turn calls CRM_Utils_String::ellipsify(), which is multibyte-aware. So the string has already been truncated correctly before substr() then mangles it.
We can't call ellipsify() here, though, because that breaks the way it currently works for Individual.
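To make the byte-versus-character difference concrete, here is a minimal sketch in Python (not the CiviCRM code, which is PHP). `truncate_bytes` mimics what a byte-oriented `substr()` does to a UTF-8 string, and `truncate_chars` mimics a multibyte-aware cut. The function names, the sample name, and the limit of 128 are assumptions chosen for illustration.

```python
LIMIT = 128  # illustrative limit, assumed for this sketch


def truncate_bytes(text: str, limit: int) -> bytes:
    """Cut the UTF-8 encoding at `limit` bytes, as a byte-oriented substr() would."""
    return text.encode("utf-8")[:limit]


def truncate_chars(text: str, limit: int) -> str:
    """Cut at `limit` characters, as a multibyte-aware function would."""
    return text[:limit]


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# One ASCII letter followed by 127 Cyrillic letters: 128 characters, 255 bytes.
name = "A" + "я" * 127

cut_bytes = truncate_bytes(name, LIMIT)
cut_chars = truncate_chars(name, LIMIT)

print(is_valid_utf8(cut_bytes))                    # byte cut splits the last character
print(is_valid_utf8(cut_chars.encode("utf-8")))    # character cut stays well-formed
```

In this sample the byte-wise cut ends on a lone lead byte of a two-byte character, which is the kind of fragment a database would reject as an incorrect string value.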

Comments

Has test. Without the patch the last 3 org tests fail with something like: [nativecode=1366 ** Incorrect string value: '\xD1' for column civicrm_contact.sort_name at row 1]. Some of the individual tests fail for similar or incorrect string length truncation reasons.

civibot · 2020-10-27T00:35:01Z

(Standard links)

If this is your first pull-request for CiviCRM, please browse CONTRIBUTING.md for information about the development and testing processes.
If you are reviewing this pull-request, you may wish to consult the test sites and the Review Standards (long template, short template).

demeritcowboy · 2020-10-27T01:14:14Z

D'oh, the call to Individual::format() puts it back again if it's Individual. Will update...

seamuslee001 · 2020-10-27T04:52:58Z

@demeritcowboy the test failures are related

demeritcowboy · 2020-10-27T05:09:42Z

Ok yeah I guess that's something that would need concept approval, whether names should be silently truncated (current), or have some indication it's been truncated (the change), or should it flat out fail to create the contact (other). I was keeping it consistent with copyValues but that's obviously not what's currently tested.

Maybe keeping same as current is best for now.

Relatedly, I did notice api v3 and v4 enforce Org name length differently. v3 fails, v4 truncates.
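The three options being weighed above (silent truncation, truncation with an indicator, outright failure) can be sketched as follows. This is a hypothetical Python illustration, not CiviCRM code: the function names, the `"..."` marker, and the exception class are all assumptions made for the example.

```python
class NameTooLongError(ValueError):
    """Hypothetical error for the 'fail to create the contact' option."""


def truncate_silently(name: str, limit: int) -> str:
    """Option 1: cut at `limit` characters with no indication."""
    return name[:limit]


def truncate_with_marker(name: str, limit: int, marker: str = "...") -> str:
    """Option 2: cut and append a marker, keeping the result within `limit`."""
    if len(name) <= limit:
        return name
    if limit < len(marker):
        raise ValueError("limit is too small to hold the marker")
    return name[: limit - len(marker)] + marker


def reject_if_too_long(name: str, limit: int) -> str:
    """Option 3: refuse names longer than `limit` characters."""
    if len(name) > limit:
        raise NameTooLongError(f"name exceeds {limit} characters")
    return name
```

All three operate on characters rather than bytes, so none of them can split a multibyte character; they differ only in what the user sees afterwards.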

demeritcowboy · 2020-10-27T14:22:00Z

I have updated the patch so that it keeps silent truncation for Individual, same as before the patch, but without the DB failure for long unicode names on both Individual and Organization.

seamuslee001 · 2020-10-27T22:11:37Z

This looks fine to me now, merging.

demeritcowboy · 2020-10-27T22:20:35Z

Thanks!

civibot bot added the master label Oct 27, 2020

demeritcowboy force-pushed the mb-substr branch from 891b9d9 to 5055dc8 Compare October 27, 2020 01:18

truncate unicode strings better

4952dbf

demeritcowboy force-pushed the mb-substr branch from 5055dc8 to 4952dbf Compare October 27, 2020 14:15

seamuslee001 merged commit 9b91d1b into civicrm:master Oct 27, 2020

demeritcowboy deleted the mb-substr branch October 27, 2020 22:20
