Dynamic message references #80
Comments
In some languages there are grammatical cases (https://en.wikipedia.org/wiki/Grammatical_case). This feature can help with this problem. |
Grammatical cases are already well-supported by Fluent; see http://projectfluent.org/fluent/guide/variants.html. But you're right—these two features will synergize well :) |
I think we should do this, the use-cases look good enough. Localizers' life will be hard in these cases, but less hard than with the alternative. |
Thanks, @Pike. A few more examples which illustrate why it's useful to resolve the references on the localization side rather than in the code (and pass translated strings as arguments). Let's assume a game UI which logs what the player sees:
English Localization

-creature-fairy = fairy
-creature-elf = elf
    .StartsWith = vowel
you-see =
    You see { $object.StartsWith ->
        [vowel] an { $object }
       *[consonant] a { $object }
    }.

The

German Localization

-creature-fairy = Fee
    .Genus = Femininum
-creature-elf =
    {
       *[Nominativ] Elf
        [Akkusativ] Elfen
    }
    .Genus = Maskulinum
you-see =
    Du siehst { $object.Genus ->
       *[Maskulinum] einen { $object[Akkusativ] }
        [Femininum] eine { $object[Akkusativ] }
        [Neutrum] ein { $object[Akkusativ] }
    }.

The

PS. The examples above don't solve capitalization ( |
While working on https://bugzilla.mozilla.org/show_bug.cgi?id=1435915 I found a use case for this feature. There's an API there which constructs a description of the application handler. It can be a localizable term, like "Portable Document Format (PDF)" or "Video Podcast", it can be a generic description like I handle all three scenarios using a strategy from Gaia days - the API circulates an "l10n type" object:

// a string -> l10nId
// an object -> {id: l10nId, args: l10nArgs}
// an object -> {raw: string}
{id: "applications-type-video-podcast-feed"},
{id: "applications-file-ending", args: {extension: ".mp4"}},
{raw: "Windows Video File"}, // this one comes straight from the OS

Those strings are resolved in a loop and displayed in a table in Firefox Preferences in a column "type description". Now, the trick is that there's a place in the API which separates how this string is displayed in case there are two entries with the same description. This can happen because for example, there are two file types for "Video Podcast" or "Windows Video File". In that case, there's a special string in Fluent:

applications-type-description-with-type = { $description } ({ $type })

which is used to display With support for this UI I could use the I'm going to workaround it for now, but just thought it may be useful to know that we already encountered a use case in Firefox. |
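The "l10n type" strategy described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the `messages` table, its English strings, and the helper names `formatMessage`, `resolveDescription` and `describeAll` are made up for the example and are not the Firefox or Fluent APIs.

```javascript
// Toy message table standing in for a real Fluent bundle (assumption).
// The English strings are made up for illustration.
const messages = {
  "applications-type-video-podcast-feed": () => "Video Podcast",
  "applications-file-ending": (args) => `${args.extension} file`,
  "applications-type-description-with-type": (args) =>
    `${args.description} (${args.type})`,
};

// Hypothetical formatter; falls back to the id so a missing message is visible.
function formatMessage(id, args = {}) {
  const message = messages[id];
  return message ? message(args) : id;
}

// Resolve one "l10n type" value: a string id, {id, args}, or {raw}.
function resolveDescription(desc) {
  if (typeof desc === "string") return formatMessage(desc);
  if ("raw" in desc) return desc.raw;
  return formatMessage(desc.id, desc.args);
}

// Resolve all entries, then append the type where two descriptions collide.
function describeAll(entries) {
  const resolved = entries.map((entry) => resolveDescription(entry.description));
  const counts = new Map();
  for (const text of resolved) counts.set(text, (counts.get(text) || 0) + 1);
  return entries.map((entry, i) =>
    counts.get(resolved[i]) > 1
      ? formatMessage("applications-type-description-with-type", {
          description: resolved[i],
          type: entry.type,
        })
      : resolved[i]
  );
}
```

Note that the disambiguation step passes an already-translated string as `description`, which is exactly the pattern that dynamic references would let the localization side handle instead.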
This would also help with cases where a message value is used as an attribute in another element. could be:

pane-general-title = General
pane-search-title = Search
category =
    .tooltiptext = { $paneTitle }

Example2: the

item-type-description =
    .typeDescription = { $typeDescription }

Granted, I don't know how will we store it in |
Aaand another use case:
|
From https://bugzilla.mozilla.org/show_bug.cgi?id=1451450#c6: We'll need to support
Which I think is best solved by adding another level of nesting to the AST, unfortunately. Right now,

{
    "type": "VariantExpression",
    "id": {
        "type": "Identifier",
        "name": "-term"
    },
    "key": {
        "type": "VariantName",
        "name": "name"
    }
}

In order to support both

{
    "type": "VariantExpression",
    "of": {
        "type": "MessageReference",
        "id": {
            "type": "Identifier",
            "name": "-term"
        }
    },
    "key": {
        "type": "VariantName",
        "name": "name"
    }
}

This is best visualized with the spans of
|
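The extra level of nesting shown in the two JSON shapes above amounts to a mechanical rewrite of the node. A minimal sketch, assuming only the two shapes quoted in that comment; the function name `nestVariantExpression` is hypothetical and not part of any Fluent tooling:

```javascript
// Rewrite the flat shape { type, id, key } into the nested shape
// { type, of: { type: "MessageReference", id }, key }.
// Nodes of other types, or nodes that are already nested, pass through untouched.
function nestVariantExpression(node) {
  if (node.type !== "VariantExpression" || "of" in node) {
    return node;
  }
  return {
    type: "VariantExpression",
    of: { type: "MessageReference", id: node.id },
    key: node.key,
  };
}

const flat = {
  type: "VariantExpression",
  id: { type: "Identifier", name: "-term" },
  key: { type: "VariantName", name: "name" },
};
const nested = nestVariantExpression(flat);
```

With the reference wrapped in its own node, the `of` slot is no longer tied to a bare identifier, which is what leaves room for other kinds of parent expressions.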
In Django we have a use case for this feature, not just as a matter of convenience - without it we wouldn't be able to generate correct translations at all. We have exactly the We would also want some way for (Whether Django, with its current investment in gettext, would be able to move to fluent is another matter, but the point applies to other framework-like code, and the choices of frameworks can affect the choices of a lot of other things). |
I think I'm stuck with https://bugzilla.mozilla.org/show_bug.cgi?id=1435915#c15 until this lands. |
Some more explanation would help :) Do you mean something like the following?

applications-action-always-ask =
    .label = Always ask
applications-action-generic-label = {$menuitem.label}

And then in JS:

setAttributes(
  labelElement,
  "applications-action-generic-label",
  {
    menuitem: new FluentReference(menuitemElement.getAttribute("data-l10n-id")),
  }
);

It's still an open question for me whether we should allow dynamic references to messages. In fact, I'd prefer to start by allowing dynamic references to terms only. I have concerns about dynamic references being abused in scenarios where they're not about grammar. In bug 1435915 comment 16 I suggested a slightly more verbose alternative which will fix the problem outlined in the bug. IIUC, the real fix would be to encapsulate the variable shape of the translation with a WebComponent. |
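To make the `{$menuitem.label}` idea above concrete, here is a toy resolver in plain JavaScript. It is a sketch under loud assumptions: `FluentReference` is the hypothetical argument type proposed in this thread, not an existing Fluent API, and the `bundle` map and `resolveAttribute` helper are invented for the illustration.

```javascript
// Hypothetical argument type from this thread; NOT an existing Fluent API.
class FluentReference {
  constructor(id) {
    this.id = id;
  }
}

// Toy bundle (assumption): message id -> { value, attributes }.
const bundle = new Map([
  [
    "applications-action-always-ask",
    { value: null, attributes: { label: "Always ask" } },
  ],
]);

// Resolve a placeable of the form { $name.attr }:
// the argument names a message, and the attribute is read from that message.
function resolveAttribute(args, name, attr) {
  const arg = args[name];
  if (!(arg instanceof FluentReference)) {
    throw new TypeError(`$${name} is not a FluentReference`);
  }
  const message = bundle.get(arg.id);
  if (!message || !(attr in message.attributes)) {
    // Visible fallback instead of an exception, so one bad reference
    // does not break the whole string.
    return `{${arg.id}.${attr}}`;
  }
  return message.attributes[attr];
}

const label = resolveAttribute(
  { menuitem: new FluentReference("applications-action-always-ask") },
  "menuitem",
  "label"
);
```

The point of the sketch is that the lookup happens inside the resolver, against the same bundle that formats the outer message, rather than in application code.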
For some use cases - like the ones I mentioned for Django, which will apply to other frameworks - this solution is simply not an option, because we don't know the strings ahead of time, they are supplied by other developers. We'd be left with the kind of 'solution' that @zbraniecki has, which leaves you with broken translations for many use cases (inability to deal with case/gender agreement etc.). |
So, Zibi's example is actually interesting in two ways:

Firstly, it emulates message references. And message references are easy, and also kinda pointless as they're completely atomic.

Secondly, it adds fallback for missing message references. In his code example, messages are resolved on the Localization abstraction instead of the Bundle. Which solves a lot of problems we have with message references right now. Even just static ones.

I'd love to discuss how message references work as part of the resolver standardization. But I'm also realistic about not getting a fully sync and fully async resolver implemented for all impls that want both. None of JS, Python, or Rust has generic sync/async programming, right?

To Luke's comment: Terms are effectively language-dependent APIs. Messages referencing terms need to know the API, and all terms for that use need to implement the same API. With static term references, that's already nasty. With dynamic term references, it's an order of magnitude worse. And when you talk about different software packages ... . Say, the German team of the django localizers decides to change the Term API for

Also, to clarify, I'm just saying that Mozilla isn't the right org to drive this. That doesn't mean that we shouldn't build the Fluent ecosystem such that someone else can give this a shot. Their task is going to be to figure out these things, beyond writing down APIs and syntax. |
Are you saying that for a scenario like #80 (comment) we should generate |
I would just go for computed values and retranslations. |
It's true that allowing dynamic references would put more work on the translator. But that's the eternal balance that must be struck: more work for the developer to create hundreds of nearly identical strings that differ only by a word or two, or more work for the translator to understand the tech side. But I think the reality is that most strings are fairly basic, and only a handful would require the level of detail that would make a localizer need to look up some syntax. That's already somewhat expected, after all, given how Fluent is designed to help us move away from "Number of files: X" to "No files" but "1 file" or "2 files", etc.; that work falls to the translator. Rare would be an application that needs extremely complex logic requiring dynamic message references, but better to make it possible than preclude it entirely. I can completely understand Mozilla not wanting to be the driving force if it doesn't have an internal use case (although it sounds like it does), but defining a standard syntax and providing a baseline (even if suboptimal) implementation would do well to further the adoption outside of Mozilla and prevent splintering of the format. |
Another example of where this would be helpful in Firefox - https://bugzilla.mozilla.org/show_bug.cgi?id=1642725 Without that,
The last is an example of the first in this case - the developer causes all Having Dynamic References would make this code clean, intentional, and easy to optimize the DOM bindings around. |
Another potential case - https://phabricator.services.mozilla.com/D80944 In this case, we need to evaluate how the brand name of the product affects the structure of the sentence. It may not be possible to easily place the brand in nominative form, or it may be that we'll have to denote that the argument is in nominative form and ask localizers to adapt the sentence to it. A scenario I imagine might be the most flexible is: For locales where the sentence doesn't depend on any aspect of the variable, use dynamic references:
For locales where it does,
This would allow localizers to adapt sentences which they need and leave the generic form (potentially imperfect) as a |
The solution this patch settled on is also what I think is the best localization practice for a small number of variants:

autocomplete-import-logins-from-chrome = Import your login from Google Chrome
autocomplete-import-logins-from-ie = Import your login from Internet Explorer
autocomplete-import-logins-from-safari = Import your login from Safari
# etc.

This has the best chance of producing translations of good quality. The localizers have full context in each string, and are also free to introduce any changes to spelling, declension, and others, as they see fit, because each string is independent. |
If that's the best practice, how is Fluent better for this use case than a YAML file with key/value pairs? Sure, breaking out every possible variation into a separate key gives you absolute control, but then the burden on translators goes way up (and translation editing tooling gets tasked with trying to lighten the load through suggestions from similar strings etc., which turns into a mess when you start updating anything). |
That's a good question, thanks! I think the notion of the localizer's control is key. A simple key/value pair store takes away this control when we consider plurals, genders, or some forms of declensions. If the source language (often: English) doesn't support a grammatical feature required by the target language, the possibility of creating a well-sounding translation is limited. In the Import your login from… example, however, the reason for the variation is not language-specific: the list of supported browsers is known ahead of time and constant across languages. In this case, I think separate messages offer the most control to localizers, again. If a language requires declension or a different article of some browser names, the localizer can modify the relevant string inline. Does that answer your question? |
Hi, another use case example. It comes from Django. The snippet provides correct Polish translations for sentences consisting of a count of model objects and the model's name, like "1 user" or "5 groups" in English. In Polish, one of the grammatical genders, "masculine personal" (męskoosobowy), is an exception: it requires the genitive instead of the nominative in the plural form for one of the plural categories.

Polish localization

-user = użytkownik
    .gender = masculine personal
-users = {$case ->
   *[nominative] użytkownicy
    [genitive] użytkowników
}
-group = grupa
    .gender = feminine
-groups = {$case ->
   *[nominative] grupy
    [genitive] grup
}
number-of-model-objects = {$name.gender ->
    [masculine personal] {$count ->
        [one] { $count } { $name }
        [few] { $count } { $name-plural(case: "genitive") }
       *[many] { $count } { $name-plural(case: "genitive") }
    }
   *[other] { $count ->
        [one] { $count } { $name }
        [few] { $count } { $name-plural }
       *[many] { $count } { $name-plural(case: "genitive") }
    }
}
|
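The selection logic in `number-of-model-objects` can be traced in application-side JavaScript. This is a hedged sketch, not how Fluent resolves the message: it uses the standard `Intl.PluralRules` API for the Polish plural categories, while the `terms` table and the helpers `pickForm` and `countLabel` are hypothetical names introduced for the example.

```javascript
// Standard ECMAScript Intl API; Polish yields "one", "few", "many" or "other".
const plRules = new Intl.PluralRules("pl");

// Toy term table (assumption) mirroring the terms in the snippet above.
const terms = {
  user: {
    gender: "masculine personal",
    singular: "użytkownik",
    pluralNominative: "użytkownicy",
    pluralGenitive: "użytkowników",
  },
  group: {
    gender: "feminine",
    singular: "grupa",
    pluralNominative: "grupy",
    pluralGenitive: "grup",
  },
};

// Mirror of the nested selectors: which form of the name goes with the count.
function pickForm(gender, count) {
  const category = plRules.select(count);
  if (category === "one") return "singular";
  // Only non-masculine-personal names keep the nominative plural for "few".
  if (category === "few" && gender !== "masculine personal") {
    return "pluralNominative";
  }
  // "many" is the default variant in both branches of the message.
  return "pluralGenitive";
}

function countLabel(name, count) {
  const term = terms[name];
  return `${count} ${term[pickForm(term.gender, count)]}`;
}
```

For instance, `countLabel("group", 3)` picks the nominative plural, while `countLabel("user", 3)` picks the genitive, which is the exception the comment describes.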
Just a heads up, Linguist has Fluent support now, so you can mark your code blocks for highlighting:

```fluent
example = foo
```
|
These are experimental, not yet standardized constructs. The syntax for dynamic term references (-$term-var) follows the one chosen by an unofficial Perl 6 Fluent implementation. projectfluent/fluent#130 projectfluent/fluent#80 (comment)
So any news on implementing this? Or should we just implement our own? |
This is not currently being worked on. In large part progress here is blocked due to the uncertainty of how or whether MessageFormat 2 will be able to support dynamic message references, and not wanting to introduce new Fluent features that may be challenging to make compatible with MF2. |
@eemeli Even as an optional feature that has to be explicitly enabled in a given project (with red-alert-style warnings that it may break compatibility with MF2)? Not having it is a hard no-go for dynamic content, as stated above. I am actually investigating Fluent for a game, and the Heroes III example is my exact deal-breaker, numbers included. Paying a translator to translate 10,000 message variants of a single line is an awful idea... |
Status update: I was actually able to work around this. Relevant playground [Polish, following Heroes III example]: https://projectfluent.org/play/?id=dbf872642497ea2b98efe2afa7585dc1 Steps:
It is actually fairly easy to include this in application-side code, and it is obvious enough for translators to keep this as a viable solution |
The downside of your approach, which may not be really that relevant for your use case, is that you resolve your first message and second separately. It means that any locale change requires both calls to be re-run for the new locale, which is quirky (at least in the DOM scenario). The main value of dynamic references is that it folds this sequence into a single API call between the L10n system and the caller, securing locale consistency. It's a bit analogous to as if instead of having |
@zbraniecki This is why I call this a "workaround" 😄 But since a proper implementation is obviously hitting a wall (and there is no other good-quality alternative), I will take this one. It suits my needs well enough, and I do not use the DOM at all in my game. |
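The exact steps of the workaround are not reproduced in this thread, so the following is only a sketch of the two-pass shape that the reply above describes: resolve one message first, then pass its result as a plain string argument to the second. The `bundles` table, its strings, and the `format` and `battleLog` helpers are all made up for the illustration.

```javascript
// Toy per-locale message tables (assumption; strings are made up).
const bundles = {
  en: {
    "creature-imp": () => "Imp",
    "battle-log": (args) => `The ${args.attacker} attacks.`,
  },
  pl: {
    "creature-imp": () => "Chochlik",
    "battle-log": (args) => `${args.attacker} atakuje.`,
  },
};

// Hypothetical formatter over the toy tables.
function format(locale, id, args = {}) {
  return bundles[locale][id](args);
}

// Two-pass workaround: both passes must use the same locale,
// and both must be re-run whenever the locale changes.
function battleLog(locale, attackerId) {
  const attacker = format(locale, attackerId); // pass 1: resolve the name
  return format(locale, "battle-log", { attacker }); // pass 2: plain string argument
}
```

Wrapping both passes in one function that takes the locale is one way for application code to keep them consistent; a resolver with dynamic references would do the equivalent internally in a single call. Because the name arrives as a plain string, the second message cannot inspect its attributes or variants, which is the limitation the rest of this thread is about.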
It is sometimes desired to parametrize message references in placeables. In this issue I'd like to propose a new argument type, extending FluentType, which could be used to programmatically pass message references as arguments to messages.

Problem Statement
Redundancy is considered good for localization. It allows localizers to tailor the wording and the grammar of the translation of each particular case. Also see Fluent Good Practices.
In general, the pattern of having one message per item is preferred over factoring the action out to its own message (Delete This { $item }) and passing the translated item in some way.

In some cases, however, this pattern doesn't scale well.

Consider this example from Firefox (source):
Or the use-case @cruelbob gives in #79 (comment):
One of my favorite games, Heroes of Might and Magic III, pits armies consisting of over 140 different unit types in battles against each other. After every move, the battle log reads:
Or:
If we wanted to avoid concatenation of sentences (two sentences per creature: one for do X damage and one for X creatures perish), we'd end up with 141² = 19,881 different permutations of creature pairs.

This doesn't scale well.
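The arithmetic behind that figure, spelled out as a small sketch. The function name `messageCounts` is hypothetical; the two quantities are the ones named in the paragraph above (one message per attacker/defender pair versus two concatenated sentences per creature).

```javascript
// Count the messages each approach needs for a given number of unit types.
function messageCounts(unitTypes) {
  return {
    // One full sentence per ordered (attacker, defender) pair.
    perPair: unitTypes ** 2,
    // Two sentence fragments per creature, concatenated at runtime.
    concatenated: unitTypes * 2,
  };
}

const counts = messageCounts(141);
```

For 141 unit types, `counts.perPair` is 141 × 141 = 19,881, while `counts.concatenated` is 282, which is why the pressure to concatenate is so strong even though it hurts translation quality.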
Proposed Solution
Introducing some redundancy should still be preferred for small sets of items. For large sets leading to lots and lots of permutations, it should be possible to parametrize the translation of placeables.
I'll use the example of HoMM3 because the other two also require the List Formatting feature to make sense.
I'd like to make it possible to pass external arguments which resolve to message references. Given the following FTL:
…both $attacker_name and $defender_name would be arguments of type FluentReference (extending FluentType; same as FluentNumber and FluentDateTime). The developer would pass them like so:

This change mostly requires additions to the MessageContext resolution logic. Syntax-wise, the VariantExpression and the AttributeExpression should be changed to accept both message identifiers as well as external arguments as parent objects (like in the $attacker_name[singular] example above).

Open Questions
Sign-offs
@Pike
@stasm
@zbraniecki
Also CC @flodolo.