We sometimes use libraries and components that are not widely used, and we may have to pay special attention to how well they are internationalized.
On the other hand, some libraries and components can help us improve internationalization. This chapter introduces such libraries and components.
GNU gettext is a tool for internationalizing the messages that software
outputs, according to the LC_MESSAGES locale category. Gettextized software
contains messages written in various languages (depending on the available
translators), and a user can choose among them using environment variables.
GNU gettext is part of the Debian system. Install the gettext package and read
its info pages for details.
Don't use non-ASCII characters in 'msgid'. Be careful, because it is tempting to use ISO-8859-1 characters. For example, '©' (the copyright sign; you may not be able to read it in this document right now) is a non-ASCII character (0xa9 in ISO-8859-1). If a msgid contains such characters, translators may have difficulty editing catalog files because the encoding of msgid conflicts with that of msgstr.
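As a minimal sketch of this rule, the following C function checks whether a candidate msgid consists only of 7-bit ASCII bytes. The name is_ascii_msgid is hypothetical and is not part of GNU gettext; it is only meant to illustrate the kind of check a developer could run over message strings.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of GNU gettext): returns true if every
   byte of s is 7-bit ASCII, i.e. the string is safe to use as a msgid
   under the rule above. */
bool is_ascii_msgid(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    for (; *p != '\0'; p++) {
        if (*p > 0x7f)
            return false;   /* e.g. 0xa9, the ISO-8859-1 copyright sign */
    }
    return true;
}
```

For instance, "Copyright (C) 2003" passes this check, while a string containing the byte 0xa9 does not.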
Be sure the message can be displayed in the assumed environment. In other
words, you have to read the chapter 'Output to Display' in this document and
internationalize the output mechanism of your software before gettextizing it.
EVEN FOR NON-ENGLISH-SPEAKING PEOPLE, ENGLISH MESSAGES ARE PREFERABLE TO
MEANINGLESS BROKEN MESSAGES.
The second (third, ...) byte of a multibyte character, or any byte of a non-ASCII character in a stateful encoding, can be 0x5c (the same as backslash in ASCII) or 0x22 (the same as double quote in ASCII). Such bytes have to be properly escaped, because the present version of GNU gettext does not take into account the 'charset' subitem of the 'Content-Type' item when handling 'msgstr'.
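The problem can be illustrated with a small C sketch. It scans a string byte by byte, the way a parser that ignores the charset would, and counts bytes that look like a backslash or a double quote. The function name count_quote_like_bytes is hypothetical. Shift_JIS is used as the example encoding here (an assumption for illustration; the text above does not name one): in Shift_JIS the kanji U+8868 is encoded as the two bytes 0x95 0x5c.

```c
#include <stddef.h>

/* Hypothetical helper: counts bytes equal to 0x5c or 0x22 in s, exactly
   as a byte-oriented parser that ignores the charset would see them. */
size_t count_quote_like_bytes(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (c == 0x5c || c == 0x22)
            n++;
    }
    return n;
}
```

Called on the two-byte Shift_JIS string "\x95\x5c", the function reports one backslash-like byte even though the text contains no backslash character. Such a byte is what has to be escaped in a msgstr.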
A gettexted message must not be used in multiple contexts, because a word may
have different meanings in different contexts. For example, in English a verb
expresses an order or a command if it appears at the beginning of a sentence.
However, different languages have different grammars. If a verb is gettexted
and used both in an ordinary sentence and in an imperative sentence, it cannot
be translated.
If a sentence is gettexted, never divide it. If a sentence is divided in the
original source code, join the pieces so that a single string contains the full
sentence. This is because word order differs among languages. For example, a
routine
    printf("There ");
    switch (num_of_files) {
    case 0:
        printf("are no files ");
        break;
    case 1:
        printf("is 1 file ");
        break;
    default:
        printf("are %d files ", num_of_files);
        break;
    }
    printf("in %s directory.\n", dir_name);
has to be rewritten as follows:

    switch (num_of_files) {
    case 0:
        printf("There are no files in %s directory.\n", dir_name);
        break;
    case 1:
        printf("There is 1 file in %s directory.\n", dir_name);
        break;
    default:
        printf("There are %d files in %s directory.\n", num_of_files, dir_name);
        break;
    }

before it is gettextized.
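Once every case holds a complete sentence, gettextizing the routine is mechanical. The following is a sketch only: the function name format_file_count is hypothetical, and it writes into a caller-supplied buffer instead of printing so that the result is easy to inspect. Each call passes one whole sentence to gettext(), so a translator can reorder the words freely.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: formats the file-count message into buf.
   Each case passes one complete sentence to gettext(). */
int format_file_count(char *buf, size_t size, int num_of_files,
                      const char *dir_name)
{
    switch (num_of_files) {
    case 0:
        return snprintf(buf, size,
                        gettext("There are no files in %s directory.\n"),
                        dir_name);
    case 1:
        return snprintf(buf, size,
                        gettext("There is 1 file in %s directory.\n"),
                        dir_name);
    default:
        return snprintf(buf, size,
                        gettext("There are %d files in %s directory.\n"),
                        num_of_files, dir_name);
    }
}
```

When no catalog is available, gettext() returns its argument unchanged, so the English sentences are used as they are.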
Software with gettexted messages should not depend on the length of the
messages. The messages may become longer in other languages.
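One way to avoid depending on message length is to size the output buffer from the formatted message itself rather than using a fixed-size array. The sketch below assumes a C99 snprintf(), which returns the required length when called with a NULL buffer and size 0. The function name format_greeting and the message text are hypothetical.

```c
#include <libintl.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated string holding the
   (possibly translated) greeting, or NULL on failure. The caller must
   free() the result. */
char *format_greeting(const char *user)
{
    const char *fmt = gettext("Welcome, %s!");
    int len = snprintf(NULL, 0, fmt, user);   /* measure first */
    char *buf;

    if (len < 0)
        return NULL;
    buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;
    snprintf(buf, (size_t)len + 1, fmt, user);
    return buf;
}
```

Because the buffer is sized after translation, a longer msgstr cannot overflow it or be truncated.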
When two or more '%' directives for formatted output functions such as
printf() appear in a message, the translation may need to change their order.
In such a case, the translator can specify the order. See the section 'Special
Comments preceding Keywords' in the info page of gettext for details.
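As an illustration of reordering, the printf() family on POSIX systems accepts positional directives such as %1$d and %2$s, which select an argument by number. The sketch below assumes such a printf() implementation. The function name format_count_message and the format strings are hypothetical, and the second format stands in for what a translated msgstr might contain.

```c
#include <stdio.h>

/* Hypothetical helper: formats a count and a directory name with a
   format string that may come from a message catalog. The argument
   order in the code is fixed; the format decides the output order. */
int format_count_message(char *buf, size_t size, const char *fmt,
                         int count, const char *dir)
{
    return snprintf(buf, size, fmt, count, dir);
}
```

With the format "%d files in %s", the count is printed first. With "%2$s: %1$d files", the directory name is printed first although the C code passes the arguments in the same order.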
There are now projects that translate messages for various software, for
example the Translation Project.
First, the software has to contain the following lines.
    int main(int argc, char **argv)
    {
        ...
        setlocale (LC_ALL, ""); /* This is not for gettext but all i18n software should have this line. */
        bindtextdomain (PACKAGE, LOCALEDIR);
        textdomain (PACKAGE);
        ...
    }
where PACKAGE is the name of the catalog file and LOCALEDIR is "/usr/share/locale" for Debian. PACKAGE and LOCALEDIR should be defined in a header file or Makefile.
It is convenient to prepare the following header file.
    #include <libintl.h>
    #define _(String) gettext((String))
and messages in source files should be written as _("message"), instead of "message".
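Putting these pieces together, the initialization might be sketched as follows. The package name "myapp" is a hypothetical placeholder, and the function name init_i18n is not part of gettext. In a real build, PACKAGE and LOCALEDIR would normally come from a header file or the Makefile, as described above.

```c
#include <libintl.h>
#include <locale.h>

#ifndef PACKAGE
#define PACKAGE "myapp"                 /* hypothetical package name */
#endif
#ifndef LOCALEDIR
#define LOCALEDIR "/usr/share/locale"   /* the Debian location given above */
#endif

#define _(String) gettext((String))

/* Hypothetical helper: performs the three initialization calls and
   returns the name of the message domain now in effect. */
const char *init_i18n(void)
{
    setlocale(LC_ALL, "");              /* needed by all i18n software */
    bindtextdomain(PACKAGE, LOCALEDIR);
    return textdomain(PACKAGE);
}
```

A program would call init_i18n() once at the start of main() and then write its messages as, for example, puts(_("Hello")).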
Next, catalog files have to be prepared.
First, a template for the catalog file is prepared using xgettext.
By default, a template file messages.po is created. [27]
Though gettextization of a software package is a one-time task, translation is
continuing work, because you have to translate new (or modified) messages when
(or before) a new version of the software is released.
***** Not written yet *****
The Readline library needs to be internationalized.
***** Not written yet *****
Ncurses is a free implementation of the curses library. Though this library is now maintained by the Free Software Foundation, it is not covered by the GNU General Public License.
The Ncurses library needs to be internationalized.
Introduction to i18n
14 February 2003 kubota@debian.org