Skip to content

Conversation

@richard67
Copy link
Member

@richard67 richard67 commented Mar 28, 2020

Pull Request for Issue # .

Summary of Changes

This Pull Request (PR) adapts the utf8mb4 conversion to changes in J4 which have been forgotten to do in the conversion sql scripts.

It does not add the missing tables for action logs and the privacy suite to the utf8mb4 conversion script. This will be done with PR #28495 in staging and later be merged up into 4.0-dev with the regular upmerge. @wilsonge Whenever this is done, ping me, because then I have to do a small change in a new PR for 4.0-dev or (if still open then) in PR #28515 .

Testing Instructions

This PR is only relevant for MySQL databases. not for MariaDB or PostgreSQL databases.

Code review should be sufficient:

  • Check that each of the two sections "Step 2.2: Convert all tables to utf8mb4 chracter set ..." and "Step 2.4: Set default character set and collation for all tables" in file administrator/components/com_admin/sql/others/mysql/utf8mb4-conversion-02.sql handles all tables created in joomla.sql, i.e. all core tables, in alphabetical order, except of the tables for action logs and com_privacy, for which a PR is made in staging, see summary of changes above.
  • Check for every table added by this PR to the sections mentioned above that it doesn't contain any varchar columns which is longer than 191 characters in any key or index, so that it doesn't need any special index handling in the other parts of the utf8mb4 conversion. I.e. the changes in this PR are enough, indexes of tables newly added in J4 already are limited to the first 191 characters and so are already utf8mb4 compatible.
  • Check for every table added by this PR to the sections mentioned above that it doesn't contain any columns with binary collation (utf8mb4_bin or utf8_bin, depending on the utf8mb4 support of the database), which would also require additional special treatment at other places.

For a real test it would either need a version of MySQL older than 5.5.3, or it would need to temporarily patch the utf8mb4 support detection in the database driver and make an installation so that utf8 (without mb4) is used, and then revert those driver patches and use the database fixer to do the utf8mb4 conversion.

The utf8mb4 detection of the MySQLi driver can be patched by patching the private function serverClaimsUtf8mb4Support() in that driver so that it always returns false. You can find this function in file libraries/vendor/joomla/database/src/Mysqli/MysqliDriver.php beginning at line 1075.

For the MySQL (PDO) driver it would be more complicated. I recommend to test with the MySQLi driver, that's sufficient.

You should not test with MySQL 8.0.19 or later because you will run into another issue, see PR #28370 , which has nothing to do with this PR here but might be confusing when checking the database checker for errors.

Expected result

Utf8mb4 conversion handles all core tables except of the tables for action logs and com_privacy, for which a PR is made in staging, see summary of changes above.

Actual result

Utf8mb4 conversion doesn't handle all core tables. Tables for action logs, the privacy suite, com_csp, mail templates, webauthn and workflows are missing.

Additional information

The utf8mb4 conversion is needed for the migration or update of an old database (but with still supported version) which doesn't support utf8mb4 character set to a newer version which supports that. It runs on a joomla update, if necessary, or when you go to the database checker page (Extensions -> Manage -> Database) and see a message about missing utf8 conversion and use the "Fix" button after having migrated or updated your database server.

Theoretically it should not be necessary on J4 to have the utf8mb4 conversion, because the databases which fultill the minimum requirements for J4 all support utf8mb4, and updating an older database to such a version has to happen before updating to J4. But in practice we can't make sure that this is fulfilled. The site admin may just have not checked "Extensions -> Manage -> Database" after having updated the database version and before updating to J4, and so not have seen the message about missing utf8 cconversion and so not have used the "Fix" button to run the conversion. So we keep the conversion in J4 and run them like before at the end of an update if necessary, then we can be really sure that AFTER the update to J4 we really are on utf8mb4.

Documentation Changes Required

None.

- Add missing tables.
- Remove obsolete tables.
- Change com_finder tables to unicode collation.
@richard67
Copy link
Member Author

@wilsonge Should that be a release blocker or even a beta blocker?

@richard67
Copy link
Member Author

@wilsonge I plan to make a separate PR for removing some logic for utf8mb4 support detection and initialization of table #__utf8_conversion from the installation and to initialise that table with the right value from beginning on in joomla.sql because we can be sure that a new installation happens on a database which supports utf8mb4. Or would you say I should do it with tis PR here? Or with @Hackwar 's PR #28350 ?

@richard67
Copy link
Member Author

@wilsonge I've decided to make a separate PR, see #28515 .
@Hackwar If that gets merged before your PR #28350 I'll help with the conflicts.

@richard67
Copy link
Member Author

@alikon Could you test this one? Review as described should be sufficient. And any idea who else I could ping for 2nd tester?

@Hackwar
Copy link
Member

Hackwar commented Mar 30, 2020

Do we really need a separate DB table with just one column and one row in it to check if that conversion has run or not?

@richard67
Copy link
Member Author

Yes we still need that.

@richard67
Copy link
Member Author

richard67 commented Mar 30, 2020

@Hackwar We might have people having forgotten to use the "Fix" button after they have migrated their 3.10 to a new db server or updated their server and before updating to 4. That's the only reason why to keep this utf8mb4 conversion stuff in J4, otherwise we could drop it completely. I won't change now the way how it works completely, I'll only fix it so it works for that scenario, and so we still need that table.

@richard67
Copy link
Member Author

Because I have to make some changes in PR #28495 for J3, it will have to remove the part for the action logs and the privacy suite tables from this PR here, but it is too late now today for me. So I set the "Updates requested" label on GitHub to show that this PR is not ready for tests and will leave a comment telling the same at the top of the tesdting instructions.

I will post back here and remove label and hint in testing instructions when ready for test.

@richard67 richard67 added the Updates Requested Indicates that this pull request needs an update from the author and should not be tested. label Mar 30, 2020
Remove tables for action logs and the privacy suite for which a fix will be made in staging and later be merged up into 4.0-dev.
@richard67 richard67 removed the Updates Requested Indicates that this pull request needs an update from the author and should not be tested. label Mar 30, 2020
@richard67
Copy link
Member Author

Changes done and testing instructions adapted. Ready for review and test.

@richard67 richard67 requested a review from wilsonge March 30, 2020 22:32
@alikon
Copy link
Contributor

alikon commented Mar 31, 2020

I have tested this item ✅ successfully on 7d0b31c

code review


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/28500.

ALTER TABLE `#__content_types` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `#__contentitem_tag_map` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `#__core_log_searches` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `#__csp` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need CSP table here - if we do then we have the same issues as the 3.x PR. These new 4.x tables should be utf8mb4 out the box

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't have the same issue as on J3 because there is no release (or beta) out in the wild where some tables haven't been converted yet. And you are right, the tables new in J4 should be utf8mb4 already. I just was not really aware of that.

ALTER TABLE `#__user_usergroup_map` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `#__utf8_conversion` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `#__viewlevels` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `#__webauthn_credentials` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same with these ones

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I was not aware of that. So at the end it needs only the changes for the finder tables in this PR, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably and even then only on the existing tables/columns - not any new ones hannes has modified in his changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hannes' update sql 4.0.0-2018-07-29.sql indeed modifies all of them ... so this PR is useless, if not wrong. Will close with some comment below.

@richard67
Copy link
Member Author

Thanks to review it turned out that this PR is useless. The tables which are new in J4 are created already with utf8mb4 charset, and the finder tables are all modified with the update sql script 4.0.0-2018-07-29.sql, so there is no need to add them to the utf8mb4 conversion.

I should have been aware of that.

Sorry to all who reviewed for having wasted your time.

Closing as obsolete.

@richard67 richard67 closed this Apr 1, 2020
@richard67 richard67 deleted the 4.0-dev-fix-utf8mb4-conversion branch April 1, 2020 18:22
@richard67
Copy link
Member Author

richard67 commented Apr 2, 2020

@wilsonge I just see the deletion of table #__core_log_searches from file utf8mb4-conversion-02.sql will still be needed in J4, also other modifications and simplifications like you had suggested, e.g. removal of index manipulations from the conversion sql scripts, and more simpliciations like in my other PR #28515 .

Question: Shall I add all that to my other PR #28515 ? It will result in merge conflicts after PR #28495 has been merged into staging and 3.9.17 will be released and you merge up the stuff then, and you should ping me when that happens because I know how to fix that.

Alternatively I could close PR #28515 and prepare a new one and keep it draft as long as the upmerge of 3.9.17 has not happened yet, and then finish that new PR. I think it will still be in time before beta until I have finished all then.

Just let me know what you prefer.

@richard67
Copy link
Member Author

@Hackwar I thought a bit about it and think you are right, the table can be removed for new installations and be dropped on updated installations after a successful conversion. I will work on it either with PR #28515 or a new one. But maybe we should wait with all that until PR #28495 has been tested and merged into staging and then found the way up to 4.0-dev.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants