@infograf768
Member

As title says.
Completes #20711

Summary of Changes

In #20711, when a non-Latin language is used, the aliases for the category and article are only the language code, for example el-gr for Greek.

After this patch they will be unicode, e.g. for a Greek article:
Title: Άρθρο (el-gr) Alias: άρθρο-el-gr Category: Κατηγορία (el-gr)

For Greek Category
Title: Κατηγορία (el-gr) Alias: κατηγορία-el-gr

Testing Instructions

As the Greek pack has issues, install at least the Persian language (it is one of the languages installable in 4.0).
Patch and install the Multilingual sample data.

@infograf768 changed the title from "[4.0] Changing aliases to unicode for articles and categories for non latin languages." to "[4.0] Multilingual sample data: Changing aliases to unicode for articles and categories for non latin languages." on Jun 14, 2018
@infograf768
Member Author

@Bakual @laoneo

@brianteeman
Contributor

Setting unicodeslugs is something that is normally done in the global configuration. IF I read this code correctly, it will only set the slugs to unicode for the sample data. Won't that then lead to confusion for the user when they create content and don't get unicode slugs?

@infograf768
Member Author

... look at blog.php before commenting. thanks.

@brianteeman
Contributor

I said IF I read it correctly.

However, I have now read that file, which isn't used in this PR, and it also has the same bug.

@mbabker
Contributor

mbabker commented Jun 14, 2018

#17917 and related PR for context.

@brianteeman
Contributor

That just explains why it was done this way; it doesn't mean it is the correct way.

@Bakual
Contributor

Bakual commented Jun 14, 2018

As it is the same code we already have in the other sample data plugin, it should be fine.

@mbabker
Contributor

mbabker commented Jun 14, 2018

With J4 the transliteration comment pointed out before should no longer be a problem, so it can probably be done "right" now.

Won't that then lead to confusion for the user when they create content and don't get unicode slugs?

How were the slugs generated in the sample data SQL files? The same potential confusion point exists based on the state of whoever's database and global config that data was exported from.

@brianteeman
Contributor

Install the multilingual sample data with the Persian language (and this PR).

The article etc. are successfully created with unicode. Open the article and then save it.

Bye bye unicode alias, because the global configuration is set to no unicode alias.

See movie
uicode

@mbabker
Contributor

mbabker commented Jun 14, 2018

Serious question now, because I don't do multilanguage (other people do my dirty work on the sites I maintain that support it) and because I don't know the history or requirements of the feature.

With J4 and PHP 7, is there anything in transliterated versus unicode that really needs us to keep having this behavior toggle? Or what is the real gain to keeping this behavior toggle?

@infograf768
Member Author

infograf768 commented Jun 15, 2018

How were the slugs generated in the sample data SQL files? The same potential confusion point exists based on the state of whoever's database and global config that data was exported from.

In 3.x this is not solved: we get el-gr or fa-ir as the alias when installing Joomla with the optional multilingual setup and a non-Latin language. I guess that is because the multilingual setup was designed just to show how a basic multilingual site would work, so it did not really matter, or because we simply did not consider the problem, which I only found now.
After all, as we always got a readable alias, it looked OK.
Importing anything containing a unicode alias does not break a site at all.
Indeed, when such an item is edited and saved, the alias could be changed to a date if unicode aliases are not enabled. That has been the default behavior since 1.5 when the name/title can't be transliterated. If some parts of the alias are pure ASCII, those parts are kept, as explained.
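
The fallback described above can be sketched roughly as follows. This is an illustration in Python, not Joomla's PHP code: `ascii_alias` is a hypothetical helper, it does no transliteration at all (it simply drops non-ASCII characters), and the date format is an assumption.

```python
import re
from datetime import datetime


def ascii_alias(title: str, now: datetime) -> str:
    """Hypothetical stand-in for alias generation with unicode aliases off."""
    # Lowercase, then drop every non-ASCII character (no transliteration here).
    ascii_only = title.lower().encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single hyphen.
    alias = re.sub(r"[^a-z0-9]+", "-", ascii_only).strip("-")
    # Nothing usable survived: fall back to a date-based alias (assumed format).
    if not alias:
        alias = now.strftime("%Y-%m-%d-%H-%M-%S")
    return alias


saved_at = datetime(2018, 6, 15, 8, 0, 0)
partly_ascii = ascii_alias("Άρθρο (el-gr)", saved_at)  # only the ASCII part is kept
no_ascii = ascii_alias("Άρθρο", saved_at)              # falls back to the date
```

With a title such as "Άρθρο (el-gr)" only the ASCII part survives, which would explain the el-gr aliases mentioned above; a title with no ASCII part at all ends up with a date.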

If it bothers some people that much, we can decide, for this multilingual sample data plugin, to use a fixed alias instead of unicode, for example:
for category
'alias' => 'cat-' . strtolower($itemLanguage->language),
for article
'alias' => 'art-' . strtolower($itemLanguage->language),

EDIT: imho there is no way to switch to non-unicode for the blog sampledata plugin.

With J4 and PHP 7, is there anything in transliterated versus unicode that really needs us to keep having this behavior toggle? Or what is the real gain to keeping this behavior toggle?

Can you explain what you mean by behavior toggle? If you mean keeping the unicode alias feature switch, anyone using Wikipedia would immediately understand why we implemented the feature and should keep it. Transliteration via a PHP file just does not exist for Arabic, Chinese, etc., and it is now common to use unicode in URLs as well as IDN, which we also implemented.
Using PHP 7, as we have discussed recently, has nothing to do with the matter. We do know there is a specific PHP transliteration method, but it requires a module that is not enabled on all servers.
In any case, even if we forced that (which you said we can't), it would only solve our need for some ASCII transliteration in packs and in Joomla generally. Users should still be able to switch to pure unicode.

@infograf768
Member Author

Sorry, I did not think that you might mean the opposite, i.e. no switch but always unicode.
It was decided to leave the choice to the user, as copy/pasting unicode URLs (in a forum, in a mail, or in Glip for example) may not give the same results depending on the browser they were copied from.
For example, Firefox copies the URL with percent-encoding. In this case there is NO problem; the link is clickable.
Safari copies it as unicode, which may break the link.
Here are screenshots in Glip:

Safari

screen shot 2018-06-15 at 08 28 07

Firefox

screen shot 2018-06-15 at 08 27 57

Clicking on a link in the html does not create any issue.
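
As a rough illustration of the two copy behaviors, here is a small Python sketch using the standard library. The alias is the Greek example from the description; `percent_encode_segment` is a hypothetical helper name, and the sketch only shows the encoding itself, not what any browser actually does internally.

```python
from urllib.parse import quote, unquote


def percent_encode_segment(segment: str) -> str:
    """Percent-encode one URL path segment as UTF-8 (hypothetical helper)."""
    # safe="" means nothing except unreserved characters is left unescaped.
    return quote(segment, safe="")


alias = "άρθρο-el-gr"
encoded = percent_encode_segment(alias)  # pure ASCII, survives plain-text pasting
decoded = unquote(encoded)               # round-trips back to the unicode alias

print(encoded)
print(decoded == alias)
```

The percent-encoded form contains only ASCII characters, which is why a tool that does not understand unicode in URLs can still turn it into a clickable link.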

@mbabker
Contributor

mbabker commented Jun 15, 2018

Using PHP 7, as we have discussed recently, has nothing to do with the matter. We do know there is a specific PHP transliteration method, but it requires a module that is not enabled on all servers.

True, the Intl extension may not always be installed (it isn't by default). Another matter at the time was the PHP Transliterator wasn't available for PHP 5.3, that was the bigger hurdle to things with support in 3.x. Maybe we're in a state now where we can re-evaluate and decide if we can even conditionally support that or if we need to stick with the code we already have.

And thanks for the rest, it honestly clears up why the unicode config option is there for me.

@brianteeman
Contributor

Just a thought, but is there any reason that the unicode alias is a global setting as opposed to a language setting? Would someone in Greek want a non-unicode alias? I don't know. But perhaps if the unicode alias setting was moved to the language XML definition, and could then be overridden in the content language setting if needed, this problem would be resolved. As soon as the Greek content language was created, it would know that it should use a unicode alias, etc.

@infograf768
Member Author

infograf768 commented Jun 15, 2018

Unicode alias is a choice for every site and should never be enforced by the language pack.
And yes, some Greeks do not want unicode, which is why a transliteration method was enforced in the Greek pack. As content languages can be deleted, they obviously can't be used to define such a parameter.

@brianteeman
Contributor

I did NOT say enforced

@coolcat-creations
Contributor

Not sure how to test this. I installed Persian, applied the patch, created the sample data module, and clicked on Multilingual Sample Data. A Persian menu was then installed with a Home item and a hidden category list item, but I don't see any items in there.

@ghost

ghost commented Sep 8, 2018

I don't see any items in there.

@coolcat-creations That is because this is waiting for the decision on #21553.

@ghost ghost added the J4 Issue label Apr 5, 2019
@ghost ghost removed the J4 Issue label Apr 13, 2019
@roland-d
Contributor

Can this be tested now?

@ghost changed the title from "[4.0] Multilingual sample data: Changing aliases to unicode for articles and categories for non latin languages." to "[4.0] Multilingual sample data: Changing aliases to unicode for articles and categories for non latin languages" on Sep 1, 2019
@infograf768
Member Author

Drone relaunched

@infograf768
Member Author

@roland-d
system-test-mysql is failing. No idea why as the PR is unrelated to that.

@infograf768
Member Author

Can be tested again :)

@sebenns
Contributor

sebenns commented Sep 2, 2019

I have tested this item 🔴 unsuccessfully on 44a1f84

Just followed the steps as written in the testing instructions: clean installation of Joomla 4 a 12, installed the Persian language and created the samples. Then I edited a menu entry in the menu manager for fa-IR and got the same result as already described in the comments: #20759 (comment)


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/20759.

//Edit: I need to add some lines to this.

As written in the comment I've linked, it works if the unicode setting is set to true. The question is: shouldn't it also work if I just follow the test instructions, the way a normal user would install only the multilingual sample data without turning the setting on?

Looking at the changed code in multilang.php, one way to achieve that would be to turn the unicode setting on. The problem is that the call below does not save the changed setting.

Factory::getConfig()->set('unicodeslugs', $unicode);
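
To illustrate why such a call does not persist, here is a minimal Python sketch (not Joomla code) of a runtime-only override. It assumes the value is switched on only while the sample data is created and restored afterwards, which this thread implies but does not show in full; `temporary_setting` and `runtime_config` are hypothetical names.

```python
from contextlib import contextmanager


@contextmanager
def temporary_setting(config: dict, key: str, value):
    """Override config[key] in memory only and restore it afterwards."""
    missing = object()
    previous = config.get(key, missing)
    config[key] = value
    try:
        yield config
    finally:
        # Nothing is written to disk; the old in-memory value simply comes back.
        if previous is missing:
            config.pop(key, None)
        else:
            config[key] = previous


# Hypothetical runtime copy of the global configuration.
runtime_config = {"unicodeslugs": False}

with temporary_setting(runtime_config, "unicodeslugs", True):
    during_import = runtime_config["unicodeslugs"]  # unicode aliases while importing

after_import = runtime_config["unicodeslugs"]  # back to the stored default
```

Under that assumption the sample items get unicode aliases, but any later save runs with the stored setting again.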

@infograf768
Member Author

@datsepp
please look at #17930
I use here the same code as we did there for the blog sample data.

@infograf768
Member Author

@datsepp
As explained in private glip, the purpose of this PR is not to set unicodeslugs in global config after the multilingual sample data installation.

@roland-d
Contributor

roland-d commented Sep 5, 2019

@infograf768 I am not sure what the correct way is here either. However if you install the sample data and you get your Unicode slugs and after that edit an item and save it, the Unicode slug is gone. @brianteeman mentioned that as well in his comment ( #20759 (comment) ).

My reasoning is, people installing sample data are most likely doing this in either a test site or a development site, not in a live site. So I don't see changing the Unicode setting to true as a problem. Even if it is set to true, if the user doesn't use a language requiring it, they won't notice it.

Why wouldn't we set the Unicode to Yes when importing Unicode slugs?

@infograf768
Member Author

Why wouldn't we set the Unicode to Yes when importing Unicode slugs?

Because this is only sample data and the user may not want to use unicode slugs for the final site, including for non-Latin languages. As I explained already, some users of the Greek language may prefer to use the Greek default transliteration.

@roland-d
Contributor

roland-d commented Sep 5, 2019

As I explained already, some users of the Greek language may prefer to use the Greek default transliteration.

So we have 2 evils here :) Users whose language can be transliterated but don't want it and users who want to use it but their transliterated items are broken :/

@roland-d
Contributor

roland-d commented Sep 5, 2019

I have tested this item ✅ successfully on c17445e

After installing the sampledata I can see the sample items with Unicode slugs

@roland-d
Contributor

roland-d commented Sep 5, 2019

All clear, test is good as well.

@sebenns
Contributor

sebenns commented Sep 6, 2019

I have tested this item ✅ successfully on d27e53f

Installed a non-Latin language (Persian), applied the patch and installed the multilingual sample data. Everything went well; the aliases for category and article are in unicode as expected.

@ghost

ghost commented Sep 6, 2019

Status "Ready To Commit". 1 Build failing.

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label Sep 6, 2019
@wilsonge wilsonge merged commit 3499099 into joomla:4.0-dev Sep 9, 2019
@wilsonge
Contributor

wilsonge commented Sep 9, 2019

Thanks JM!

@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Sep 9, 2019
@wilsonge wilsonge added this to the Joomla 4.0 milestone Sep 9, 2019
@infograf768 infograf768 deleted the 4.0multilangsample2 branch September 9, 2019 11:05

9 participants