-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathTutorial.Rmd
172 lines (116 loc) · 8.4 KB
/
Tutorial.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
---
title: "Tools for Developing Diagnostic Messages"
date: "`r Sys.Date()`"
output:
html_document:
fig_caption: false
toc: true
toc_float:
collapsed: false
smooth_scroll: false
toc_depth: 3
vignette: >
%\VignetteIndexEntry{Tools for Developing Diagnostic Messages}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
**msgtools** provides a simplified workflow for creating, maintaining, and updating translations of messages in both R and C code that improve upon those available in the **tools** package. For some context on message localization in R, see ["R Administration"](https://cran.r-project.org/doc/manuals/r-devel/R-admin.html#Localization-of-messages) and ["Translating R Messages, R >=3.0.0"](https://developer.r-project.org/Translations30.html)). A list of "translation teams" is available from https://developer.r-project.org/TranslationTeams.html. This vignette walks through the basic functionality of **msgtools**, in particular how to use the package to generate message translations.
## Intro to R message localization
To begin *localization* (l10n) of messages, a package developer simply needs to (1) write translations for these messages using a standardized file format and (2) install the translations into their package. Localization traditionally requires the manual creation of a `.pot` ("portable object template") file using the `xgettext` command line utility and the generation of language-specific .po ("portable object") translation files that specify the translation of each message string into the target language.
The .pot/.po file format is fairly straightforward, simply containing a series of messages and their translations, plus some metadata:
```
#: lib/error.c:116
msgid "Unknown system error"
msgstr "Error desconegut del sistema"
```
(The R source contains a number of examples of translations, for example in [the base package](https://svn.r-project.org/R/trunk/src/library/base/po/).)
**msgtools** negates the need to directly call the command-line utilities or interact with .pot/.po files by relying on R representations of templates and translations (provided by [poio](https://cran.r-project.org/package=poio)). This makes it possible to programmatically create, modify, and install translations, thus eliminating the need to manually open each translation file in a text editor.
## Message Translations (Localization)
Let's start by creating a simple package containing some translatable messages. If you are working on your own package, of course skip these steps. We'll use **devtools** to first create a simple package:
```{r}
library("devtools")
# Create a package in the temp directory
description <- list(
Title = "An Example Package",
Description = "Demonstrates 'msgtools' functionality",
BugReports = "https://github.com/RL10N/msgtools/issues",
License = "Unlimited"
)
pkg_dir <- file.path(tempdir(), "translateme")
dir.create(pkg_dir)
create(pkg_dir, description = description, rstudio = FALSE)
```
We will then add a function to this "dummy" package that contains a number of translatable messages:
```{r}
library("msgtools")
(ex <- msgtools:::translatable_messages)
dump("ex", file = file.path(pkg_dir, "R", "fns.R"))
```
We will just confirm that the function has been added to the package by using the `get_messages()` function, which extracts all translatable messages from a package's code:
```{r}
# check that GNU gettext is available (required for msgtools to work)
check_for_gettext()
get_messages(pkg_dir)
```
(Note: If your working directory is within the package, you do *not* need to explicitly specify the package directory, `pkg`.)
This shows that any call to `stop()`, `warning()`, `message()`, `gettext()`, `ngettext()`, or `gettextf()` is automatically internationalized. This means that you likely do not have to change any R code to make it ready for translation. All you have to do is *localize* the messages into whatever target languages you are interested in. If you translate messages into a given language, those translations will be shown to end-users when operating in a locale that uses that language; otherwise the original English language messages will be shown. (For this to work, R has to be installed with internationalization support (which is optional).) With the package setup, we can now start configuring the package for localization.
To do that, start by calling `use_localization()`, which will add a `/po` directory to the package and initialize a `.pot` ("portable object template") file that contains all translatable messages:
```{r}
use_localization(pkg_dir)
```
You can also use the `use_l10n()` function, which is an alias with a slightly shorter name.
### Translating Messages
Once that initial step is performed, you can start creating translations. Translations are stored in `.po` ("portable object") files, based on the .pot template file. The format is pretty simple, with each message and its translation set next to one another, along with some metadata:
```
#: lib/error.c:116
msgid "Unknown system error"
msgstr "Error desconegut del sistema"
```
(The R source contains a number of examples of translations, for example in [the base package](https://svn.r-project.org/R/trunk/src/library/base/po/).)
To do so, you simply call `make_translation()` with the language and contact information for the translator (possibly you or someone else). Here's the code to initialize a Spanish translation:
```{r}
es <- make_translation("es", translator = "Awesome Translator <[email protected]", pkg = pkg_dir)
```
This creates an empty translation `"po"` object in memory based on the template file.
We could also do this interactively using `edit_translation(es)`
Once you are satisfied with the translations, you can write them to the package directory using:
```{r}
write_translation(es, pkg = pkg_dir)
```
If you want, you can also edit the files manually using a text editor. The `make_translation()` file respects any existing translations, so calling it again will update the .po file against the template and load the updated file into memory:
```{r}
es <- make_translation("es", pkg = pkg_dir)
```
### Updating Translations
If you make changes to your code, you will need to update the .pot template file, as well as any translations. To do so, simply sync the template (updating it against your current code) and remake the translations:
```{r}
sync_template(pkg = pkg_dir)
es <- make_translation("es", pkg = pkg_dir)
```
### Installing Translations
Once we have the messages translated, we can check them for errors using `check_translations()` and then install them into the package by using `install_translations()`, which will create a `inst/po` directory and populate it with the translation `.mo` ("message object") binaries:
```{r}
check_translations(pkg_dir)
install_translations(pkg_dir)
```
That's it! The message translations will now be built and installed with the package, as you can see here:
```{r}
dir(file.path(pkg_dir, "inst", "po"), recursive = TRUE)
```
### Messages in Compiled Code
Note that R-level messages and messages in compiled code are handled separately; with separate .pot (template) and .po (translation) files for R and C code for each language. **msgtools** is primarily designed to work with R-level messages, but also supports working with C-level messages. Essentially all functions in the package will acept a `domain = "C"` argument to handle translations for messages in C code.
## Some Other Tools
Beyond support for message localization, **msgtools** also provides some basic diagnostic functions for examining and working with messages. One of these we have already seen: `get_message()` simply returns a data frame of messages. Other functionality includes spell-checking of messages:
```{r}
spell_check_msgs(pkg = pkg_dir)
```
This can assess spelling in other languages, as well.
Another simple function is provided to compare string edit distance between messages, to identify messages that are used multiple times throughout a package that may have slight inconsistencies.
```{r}
get_message_distances(pkg = pkg_dir)
```
Here you can see, for example, the close similarity of "Every wife had %d sacks" and "Every cat had %d kits" but clearly these two messages are meant to be distinct.
The plan is to add additional utilities, for example to document messages in package help files and to identify possible errors in message texts (e.g., pluralization).
```{r, echo=FALSE}
unlink(pkg_dir, recursive = TRUE)
```