Support utf8mb4 in com_localise component#286
Support utf8mb4 in com_localise component#286andrepereiradasilva wants to merge 28 commits intojoomla-projects:developfrom andrepereiradasilva:utf8mb4
Conversation
|
oh a new PR has merged right now by @infograf768 with new sql tables ... talk about bad luck! Will have to do a revision to the PR. |
|
Well, what I saw looked good to me, same way as the core does it. The DEFAULT='' aded to one of the columns I am not sure if that is necessary, but should not do any harm. |
Nice. so i will just have to do the same process to the other tables merged in a PR an hour ago ... |
|
Yes. |
|
@andrepereiradasilva But one thing I should mention: The conversion script has to be run not only when updating to this particular version where utf8mb4 was introduced, it also has to be run when someone in future updates from a version before the one when utf8mb4 was introduced to a version after the one, means as long as you still expect old installations to be converted on update, you have to keep the conversion script in the manifest and run it, and it has to be adapted in future maybe, so it always fits to the data structures of the particular new version. |
|
@richard67 i think it's already doing that. |
|
Sure, I just say it has to stay like this also in future version of this component. The conversion then runs after each update, not doing any harm if already done with last update, but this then covers also the conversion if somone migrated from old utf8 to new utf8mb4 database. Those will be converted to utf8mb4 then as soon as they update this component to a new version, and for the meantime they will still run utf8_unicode_ci (which should not be a problem). |
|
yes i think so. |
|
The changes in this PR here after last commits still look good to me. So theoretically all is covered. |
|
so is the same process that is needed for weblinks too, right? |
|
Folks, I have already prepared a PR, but as it works fine as it is, both on staging and 3.4.8, I think it is not urgent to change. BTW, this is what it should be as otherwise it breaks (look at |
|
Yes, if they wanna convert + migrate data in this way. Another was which some more complicated extensions seem to go is to save old data, create new tables and then import back old data. |
|
@infograf768 Correct, utf8mb4_bin if path is to be used in join queries or as foreign key. |
|
So @infograf768 shall i make that changes (on create and on update) or remove the PR? |
|
Please do and let's talk on Glip tomorrow. :) |
|
ok |
# Conflicts: # component/admin/sql/install/mysql/install.sql
|
Conflicts fixed, All seems ok now, |
|
@richard67 @andrepereiradasilva |
|
so, for instance, if your solution was already working com_localise could work in all 3.x version with all supported mysql db versions with the xml posted in above comment and the necessary sql files. IMHO it would be a good improvement. updated the xml |
|
@andrepereiradasilva PR #9468 for the core is ready for code review, testing insructions will follow. If you want you can have a look on it and maybe even test with your test package for com_localise. |
also correct 4.0.4 filename (a space existed inthe beggining of the filename)
|
updated test instructions |
|
@infograf768 with @richard67 joomla-cms new PR (joomla/joomla-cms#9468) and the changes i made here this changes will work on 3.3+ versions and also on non utf8mb4 capable servers, so they there is no need to wait for 4.0 after. |
|
I tested this item ✅ successfully on fd79cb3 Tested all cases installation and then update to new version, uninstall this then, then install directly the new version, then uninstall this, with checking database after each step, for following Joomla! Version and utf8mb4 support combinations:
I checked data structures only, not if the localise component works well, because I am not familiar with the component. Result: All as expected, i.e. on 3.5.0 RC 4 + my PR with utf8mb4 support tables are in utf8mb4, other cases tables are in utf8. You should update your test instructions to make clear that on a 3.5.0 (+ my PR), the result is only utf8mb4_unicode_ci (and the 1 utf8mb4_bin column) when utf8mb4 is supported by the database server and client api, otherwise the result is like for the other case (3.4.8). @infograf768 How can I mark a test result here without having a fancy issue tracker? I never marked test results with GitHub UI only so I have no idea. I helped myself by creating the above test result check mark and link to the commit manually. |
updated
i think there is no issue tracker yet for com_localise |
|
For now, I have merged #289 When joomla/joomla-cms#9468 will be merged, we can test this again after updating it. |
|
btw, I do not think we need 191. It is used in core for some varchar. Nowhere in core do we use 191 for a KEY. It is always 100, |
|
Test OK |
|
@infograf768 You do not need 191, yes, but 191 is the correct value for max. possible. I would also have used that in core instead of the 100 if the decision was on me. But someone had a favour for even decimal numbers and so did not like the 191 before I started to work for the utf8mb4 stuff. Is a bit like someone not familiar with dual system says: No I want this buffer be 1000 long and not 1024 because 1024 does not look nice 😄 But for real programmers the 1024 looks perfect, and so does the 191 for people knowing about mysql and utf8mb4. |
|
First of all let me say that i'm by no mean an expert in database systems, just trying to help the project. Why the 191 index size? In this case (the language file path index), when i added a size of 100 i noticed some values in the index were being cropped. I was told that the maximum index length in utf8mb4 is 767 bytes (191 * 4 bytes per character = 764), so there is the reason for the 191. If anyone can confirm that having an index size of 100 will not cause issues, i will change it with no problems. BTW fixed the conflicts after latest merge. Now the check for code difference is more clear. |
|
As joomla/joomla-cms#9468 is not going to make it in 3.5.0, I am going to release a 4.0.15-dev version of com_localise without this PR. I am just waiting for this to be tested: |
|
@infograf768 I agree, better wait with moving to utf8mb4 ... I think we can make the core handle conversion scripts for extensions, so extensions not have to do it with their schema updates or own php scripts (but still have to provide sql script(s) for conversion). Am almost sure I can have something ready soon, maybe for 3.5.1. What Ande does with this PR is a good proof of concept and still needed, only the conversion to utf8mb4 would then in a future solution be moved from the schemapath to a new conversionpath or so, and be executed by the installer on update or after migration when is necessary. |
|
closed as does not makes sense anymore |
Description
This PR is for adding utf8mb4 support to com_localise.
You will need two packages for the tests:
com_localise-develop-utf8mb4.zip
This tests need to be made in a mysql db server with (5.5.3+) AND without (lower than 5.5.3) utf8mb4 support.
How to test
Note: For 3.5.0 tests (WITH or WITHOUT utf8mb4 support) first apply richard patch in joomla latest staging (joomla/joomla-cms#9468).
Database server WITH utf8mb4 support (Mysql 5.5.3+)
Joomla 3.5.0
New install
#__localiseand#__localise_revised_valuestables collations are now utf8mb4_unicode_ciUpdate
#__localisein utf8_general_ci#__localiseand#__localise_revised_valuestables collations are now utf8mb4_unicode_ciJoomla 3.4.8
Do exactly the same tests as for Joomla 3.5.0, the only difference will be the results in the Joomla 3.5.0 install and update tests: the result table collations will be utf8_general_ci, instead of utf8mb4_unicode_ci.
Database server WITHOUT utf8mb4 support (lower than Mysql 5.5.3)
Do exactly the same tests done in
Database server WITH utf8mb4 support (Mysql 5.5.3+), the only difference will be the results in the Joomla 3.5.0 install and update tests: the result table collations will be utf8_general_ci, instead of utf8mb4_unicode_ci.Observations
This test removes also duplicated sql instructions that are already in another PR, but for simplicity they were inserted in this PR too.