Add plurals support for CSV translation files #1291

dalexeev · 2020-08-01T14:30:09Z

Describe the project you are working on:

A game (main language is Russian).

Describe the problem or limitation you are having in your project:

CSV translation files do not support plurals. godotengine/godot#40443 adds plurals support for .po files, but CSV is overlooked.
.po files are designed to use English as the primary language, while CSV also allows identifiers.

Comments

Me:

It looks like the tr_n function is not well suited to an identifier-based system:
tr_n("MY_ID", "", n) # Or `tr_n("MY_ID", "MY_ID", n)`?
From the docs:
There are two approaches to generate multilingual language games and applications. Both are based on a key:value system. The first is to use one of the languages as the key (usually English), the second is to use a specific identifier. <...> In general, games use the second approach and a unique ID is used for each string.

@Calinou:

@dalexeev In my experience, gettext PO files are heavily centered around using English text as identifiers. On the other hand, custom formats (like Godot's CSV format) and XLIFF tend to recommend using keys as identifiers.

@pycbouh:

In my experience, gettext PO files are heavily centered around using English text as identifiers.

This is definitely how its creators intended it to be used for translating Linux, but the file format itself does not enforce this in any way. If you use keys as identifiers, some tools may warn you that your translation language is English (POEdit does that, for one), but handling that is up to the user. In this case, the user is the engine.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:

For CSV, we should also implement plurals support. For example like this:

KEY	en	ru
DAYS_AGO[0]	%d day ago	%d день назад
DAYS_AGO[1]	%d days ago	%d дня назад
DAYS_AGO[2]	-	%d дней назад

Usage:

var s = tr_n(n, "DAYS_AGO") % n

That is, we just have to make n the first argument, and it will be compatible with both systems.

Indeed, some cells remain empty. But there are relatively few of them. Note that strings without numeric substitution still require only one row:

KEY	en	ru
REGULAR_KEY	Regular key	Обычный ключ
...	...	...
SPECIAL_KEY[0]	%d key	%d ключ
SPECIAL_KEY[1]	%d keys	%d ключа
SPECIAL_KEY[2]		%d ключей
...	...	...
ANOTHER_KEY	Another key	Другой ключ
...	...	...

There is another option:

KEY	en[0]	en[1]	ru[0]	ru[1]	ru[2]
JUST_KEY	Just a key		Просто ключ
DAYS_AGO	%d day ago	%d days ago	%d день назад	%d дня назад	%d дней назад

But I like the first option better, because strings usually don't have numeric substitutions, and the second option requires multiple columns for every language. If we split the table into two files (one for tr() and one for tr_n()), there would be no empty cells at all, but that complicates the work (2 files instead of 1). Overall, the first option is the best compromise.
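
To make the second layout concrete, here is a minimal sketch of how an importer might read its header row. The function name parse_plural_column is hypothetical, and the "locale[index]" header shape is just the one shown in the table above, not an existing engine format.

```gdscript
# Hypothetical helper (not an engine API): splits a header cell such as "ru[2]"
# into its locale and plural index. Returns an empty Dictionary for cells that
# do not have the "locale[index]" shape, for example the "KEY" column.
func parse_plural_column(header: String) -> Dictionary:
    var regex := RegEx.new()
    # Group 1: locale name, group 2: plural index inside square brackets.
    regex.compile("^([A-Za-z_]+)\\[(\\d+)\\]$")
    var result := regex.search(header)
    if result == null:
        return {}
    return {"locale": result.get_string(1), "index": int(result.get_string(2))}
```

Under these assumptions, parse_plural_column("ru[2]") gives the locale "ru" with index 2, while parse_plural_column("KEY") gives an empty Dictionary.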

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:

It's not hard to implement. Here's an example to help you understand how this should work:

func tr_n(n: int, key: String) -> String:
    return tr("%s[%d]" % [key, f(n)])

func f(n: int) -> int:
    match TranslationServer.get_locale():
        "en_US":
            if n == 1:
                return 0
            else:
                return 1
        "ru_RU":
            if n % 10 == 1 && n % 100 != 11:
                return 0
            elif n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20):
                return 1
            else:
                return 2
        ...
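
To see which row each count selects, the Russian branch of f(n) can be run on its own. This is only a sketch: ru_plural_index is a hypothetical name, and the rule is copied from the example above rather than taken from the engine.

```gdscript
extends Node

# Russian plural rule from the example above, without the locale lookup,
# so it can be checked in isolation. Hypothetical helper, not an engine API.
func ru_plural_index(n: int) -> int:
    if n % 10 == 1 and n % 100 != 11:
        return 0
    elif n % 10 >= 2 and n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2

func _ready() -> void:
    # Print the adjusted CSV key that each sample count resolves to.
    for n in [1, 2, 5, 11, 21, 22, 25, 101, 111]:
        print("%d -> DAYS_AGO[%d]" % [n, ru_plural_index(n)])
```

With these inputs, 1, 21 and 101 select DAYS_AGO[0]; 2 and 22 select DAYS_AGO[1]; and 5, 11, 25 and 111 select DAYS_AGO[2].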

One caveat: the first option only works with identifiers, while the second option also works with English strings as the primary key.

If this enhancement will not be used often, can it be worked around with a few lines of script?:

This is a commonly used feature. In addition, there is currently no way to globally redefine the tr_n function.

Is there a reason why this should be core and not an add-on in the asset library?:

.po files are not a complete replacement for CSV (see above). Therefore, CSV should support plurals as well as .po files.

@akien-mga:

For CSV plurals, I would suggest opening a proposal indeed and doing research on how plurals are handled by other projects that support CSV translations.

From what I found, there are many different CSV translation workflows and the few that support plurals have it hacked in in a way as suggested e.g. here, but there's no common standard. It's a simple system so we can indeed design our own plurals logic, but if there was a somewhat "popular" way of doing plurals with CSV used e.g. in other game engines, it would be best for us to follow that.


Zylann · 2020-08-03T02:32:35Z

Side note:

.po files are designed to use English as the primary language

That might be a convention, but I don't think it's true. I do use identifiers in my game with .po files and it works fine in Godot. I dunno where these convention differences come from, but they're not enforced by the formats themselves.

dalexeev · 2020-08-03T07:38:14Z

@Zylann The API added in godotengine/godot#40443 assumes:

# tr_n(message, plural_message, n, context = "")
var s = tr_n("%d day ago", "%d days ago", n) % n

# ru.po
msgid "%d day ago"
msgid_plural "%d days ago"
msgstr[0] "%d день назад"
msgstr[1] "%d дня назад"
msgstr[2] "%d дней назад"

If using IDs:

# tr_n(message, plural_message, n, context = "")
var s = tr_n("DAYS_AGO", "", n) % n

# en.po
msgid "DAYS_AGO"
msgid_plural ""
msgstr[0] "%d day ago"
msgstr[1] "%d days ago"

I suggested changing the order of the arguments:

That is, we just have to make n the first argument, and it will be compatible with both systems.

# tr_n(n, message, plural_message = "", context = "")
var s = tr_n(n, "DAYS_AGO") % n
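
As a rough script-level sketch of how an n-first signature could serve both systems: tr_plural and plural_index are hypothetical names, only the English rule is filled in, and the fallback call assumes an engine build that already has tr_n(message, plural_message, n) from godotengine/godot#40443.

```gdscript
extends Node

# Hypothetical stand-in for a per-locale plural rule; only English is shown.
func plural_index(n: int) -> int:
    return 0 if n == 1 else 1

# Hypothetical wrapper (not an engine API) that takes n as the first argument,
# so plural_message can be omitted when `message` is an identifier.
func tr_plural(n: int, message: String, plural_message: String = "") -> String:
    if plural_message == "":
        # Identifier style: "DAYS_AGO" becomes "DAYS_AGO[0]", "DAYS_AGO[1]", ...
        return tr("%s[%d]" % [message, plural_index(n)])
    # Source-string style: defer to the engine API added in godotengine/godot#40443.
    return tr_n(message, plural_message, n)
```

Both tr_plural(n, "DAYS_AGO") % n and tr_plural(n, "%d day ago", "%d days ago") % n would then go through the same entry point.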

However, CSV still needs full plurals support, if only because CSV files can be opened in any spreadsheet application, while .po files are inconvenient to edit without special software.

SkyLucilfer · 2020-08-23T22:51:53Z

I have implemented this feature. It works as the proposal describes: calling tr_n(n, "DAYS_AGO") fetches the correct plural translation from the CSV using an adjusted key (DAYS_AGO[0], DAYS_AGO[1], etc.), depending on the locale and n.

The PR should be coming soon.

Calinou added the topic:core label Aug 1, 2020

SkyLucilfer mentioned this issue Aug 25, 2020

Add CSV plural support godotengine/godot#41519

Open

Calinou added this to the 4.0 milestone Sep 3, 2021

aaronfranke modified the milestones: 4.0, 4.x Feb 24, 2023

Vovkiv mentioned this issue Dec 22, 2023

Replace current CSV translation system with gettext MewPurPur/GodSVG#347

Closed
