Do we allow multiple multi-select messages to nest inside one another? #103
Comments
The example seems incomplete. If I understand what the [] notation implies, I would have expected Was something like that intended ? |
Yeah, you're right, it needed fixing in the way that you suggested. Hopefully, I amended it correctly. |
@asmusf, I agree. The top example above would be a painful syntax for us to adopt, as it's basically a giant string concatenation. Translators adore (heavy sarcasm) having to work around ICU MessageFormat's syntax. I have a heavy preference for a syntax that enforces complete strings at the cost of repetition when nested and thus prefer something like @echeran's amended example. This is "ugly" in this case because of three levels of nesting making a lot of repetitious strings and we all get tired of typing them. CAT tools make short work of translating the variations and the results can be grammatically correct (ftw). FWIW, Android doesn't allow nesting and the Android team when I've talked to them cited the combinatorial explosion and complexity thereof in deciding to not support it. I think that's wrong, but do think that the third level of nesting approaches unsustainability, particularly if you're working with a language with more "slots". I think the inconvenience is enough of a discouragement to deep nesting so IMHO we should support multiple nesting levels. |
My position is that we should be very careful about what we limit on which level. The idea that something is hard to work with is a great reason to discourage users and ecosystems from using it, but I'm very concerned about the idea of using data model or syntax to impose limitations based on our preferences and our read of current situation. The set of linter rules, the set of high-level CAT tool features enabled/disabled, and the set of what each organization will allow for and disallow is a moving target. Data model is set in stone until another monumental effort like MFWG comes up, in no small part due to frustrations with the limits of the previous data model. It is my belief that every time we impose our "best practice" by tailoring data model, we're shortening the life span of the result of our work. |
I'd rather not make concatenation illegal, but rather recommend against it. If we follow the usually good rule of being lenient on input but strict on output, we could well support parsing either variant, but strongly recommend that any tool that's outputting MF2 source would use the second form. Provided that we require any and all function calls to be free of side effects and make sure that we remain free of loops and other complications, transforming a message to a canonical form is a pretty easy operation, no matter what the syntax is. |
I was definitely in the camp of enforcing full strings until I had to deal with a real example this week that would have required 6 levels of nesting. In extreme scenarios like this, concatenation would most likely work better, at the cost of the linguistic issues and TMS integration problems that come with it. So I would tend to agree that supporting both approaches, with clear best-practices documentation, might be a good approach. |
I think that allowing nesting only kicks the can down the road to the translation tools or the translators. To make the life of the developers easier, we make the life of the translators harder. Worse, messages go through segmentation, leveraging, etc., way before the translators get to touch them, and through validation (and TM updates, etc.) after the translator. All of these steps would need to be changed in all popular translation tools. This will not happen, and there will be no adoption. So allowing internal selectors all the way to translation tools will not work. Anyway... The two representations are 100% compatible:
vs (ignore syntax):
It is actually possible to do this algorithmically:
Take the prefix ("foo ") and add it as prefix to each "branch" of the selection:
Then take the suffix (" bar") and add it as a suffix to each "branch" of the selection:
It is like math: This works recursively for multiple selectors. What I'm trying to argue for:
Options:
We should of course look at pros and cons for each. My choice would be 2. We keep the data model simple, and we don't lose any flexibility.
Mihai
Note about 6 decisions in one message
Google restricts nesting to: I've seen a few cases where 2 plurals would have been handy, and I want to make a case for allowing that. So I am quite sure that 6 selectors can be refactored to be more localization friendly. |
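The prefix/suffix distribution described in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Select` class, the `flatten` function, and the list-of-parts message shape are hypothetical and are not part of any MessageFormat data model or syntax proposal. The sketch also assumes each selector appears only once in a message.

```python
from dataclasses import dataclass


@dataclass
class Select:
    """Hypothetical inline selector: a variable name plus its branches."""
    selector: str
    cases: dict  # case key -> list of parts (str or nested Select)


def flatten(parts):
    """Expand a pattern with inline selectors into complete strings.

    Returns a dict mapping a tuple of (selector, key) choices to the
    full text for that combination.
    """
    variants = {(): ""}
    for part in parts:
        if isinstance(part, str):
            # Literal text is appended to every variant built so far.
            variants = {keys: text + part for keys, text in variants.items()}
        else:
            expanded = {}
            for case_key, case_parts in part.cases.items():
                # Recurse so nested selectors are distributed the same way.
                for sub_keys, sub_text in flatten(case_parts).items():
                    for keys, text in variants.items():
                        new_keys = keys + ((part.selector, case_key),) + sub_keys
                        expanded[new_keys] = text + sub_text
            variants = expanded
    return variants


# "foo " is the prefix and " bar" the suffix, as in the comment above.
message = ["foo ", Select("count", {"one": ["item"], "other": ["items"]}), " bar"]
flat = flatten(message)
```

Here `flat` holds one complete string per branch ("foo item bar" and "foo items bar"), which is the message-level form. Going the other way (factoring out a common prefix and suffix) is the inverse of the same step.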
In addition to the round-tripping which was discussed at today's meeting, I would like to present at least one real-world sample message that would be rather horrible to work with if selectors could not be included within messages:
That is a natural-language summary for the results of a search among an event's programme items, as presented in a minimalist UI. It is literally the message that got me looking for a tool like MessageFormat to make it bearable to work with, and internationalisable. That single message contains seven selectors, each with two cases. As MF1, it's pretty complex, but still better than any alternative. If MF2 only supported message-level selectors, a total of 128 cases would need to be defined for it, and adding or removing a selector would become effectively impossible. |
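The figure of 128 follows from multiplying the number of cases of each independent selector. A minimal sketch of that arithmetic, where the function name `message_level_cases` is purely illustrative:

```python
from math import prod


def message_level_cases(cases_per_selector):
    """Number of complete variants needed if every selector is hoisted
    to the message level: the product of the per-selector case counts."""
    return prod(cases_per_selector)


# Seven selectors with two cases each, as described in the comment above.
seven_binary = message_level_cases([2] * 7)

# Adding one more two-case selector doubles the total.
eight_binary = message_level_cases([2] * 8)
```

With seven two-case selectors the product is 2^7 = 128, and each additional two-case selector doubles it.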
I've added this comment by email, with colors and fancy formatting, not realizing it comes from a GitHub issue.
The result here was somewhat messy.
And GitHub did not allow me to fix it (because "Email replies do not support Markdown", not clear why)
I have reformatted it below (which will make Long Ho's comment look out of order).
Sorry, my fault.
|
+1 to @mihnita Dropbox also has strict rules against nesting levels and enforces full sentences via linters as well, otherwise things got kicked back from our translation vendor as untranslatable. |
I've tried to translate it into Romanian, and I find that unreadable. And I am a developer. Let's take the longest possible combination:
And let's make N = 1 / 42:
I have no idea what The various substrings ( Using It will be a major pain for languages that don't use spaces (Chinese, Japanese, Thai, etc.) In Romanian (and other Romance languages)
So that part has to be somehow "dragged" inside the selection. By doing this I am already forced to do some "nasty nesting", as all the text from one/{ALL} and item(s) becomes single selector. Also, "item" should come before "current and future" Worse: "matching" should also match the plural of items:
I don't count 7 selectors, I only count 6 (N shows twice). TLDR: this makes the life of ONE English developer easier. I would rather translate 64 messages than this.
To translate this I really have to "untangle" it in my brain (or on paper), so I end up with the many combinations that we tried to avoid. Then I translate, and try to figure out how to "compress" it back (including the ugly nesting that the English does not need, but I do). So, if anything, I see this example is an argument for NOT supporting internal selectors. Mihai |
How would I handle the example above? This sounds like a pretty technical message, which does not sound very natural anyway.
And "build" the
And then I know it is not ideal... but I think the result is not that bad. Still not sure what Let's take "new", for example:
("new" goes after "items"): |
@mihnita's suggestion of splitting up the parts and then using ListFormat to collate them is probably how I'd solve it now, but that wasn't possible back in 2014 when I wrote the original. It's also a good argument for allowing messages to be built out of other partial messages. Summarising/rephrasing my points here:
|
I'm a strong proponent of message references. Either via
which allows
Super strong agreement. I believe that every time we reject something in the data model because "it's not a good practice", we shorten the longevity of our solution.
+1 |
Given the consensuses reached at last week's meeting, should we have another task force meeting to see if we could resolve this one? As relevant context, see also the conversation in #130 for what this decision will effectively imply. |
This issue is to start (continue) the discussion. Previous discussions have occurred in working group meetings and more recently zbraniecki/message-format-2.0-rs#6
One option proposed in the above linked issue shows the style of nested multi-select messages that is currently supported in ICU MessageFormat, Fluent, etc.
Example:
10 friends from 2 countries liked her profile.
The issue also links to this issue representing an alternative that has also been discussed during working group meetings: projectfluent/fluent#4. Applying the alternative to the above example might look like
Other sources, docs, and slide decks have been made for the working group in favor of these options -- feel free to include those here as part of the discussion.