X-Git-Url: https://git.saurik.com/wxWidgets.git/blobdiff_plain/4a9dba0e561d2485d9235eab7b51aac8729f1b10..10ff9c616e00e4074dfdc2ac9e354605cc129c22:/docs/html/gettext/gettext_3.html diff --git a/docs/html/gettext/gettext_3.html b/docs/html/gettext/gettext_3.html deleted file mode 100644 index 1311d7068f..0000000000 --- a/docs/html/gettext/gettext_3.html +++ /dev/null @@ -1,606 +0,0 @@ - - - - -GNU gettext utilities - Preparing Program Sources - - - - - - -



- - -

Preparing Program Sources

- -

-For the programmer, changes to the C source code fall into three -categories. First, you have to make the localization functions -known to all modules needing message translation. Second, you should -properly trigger the operation of GNU gettext when the program -initializes, usually from the main function. Last, you should -identify and especially mark all constant strings in your program -needing translation. - -

-

-Presuming that your set of programs, or package, has been adjusted -so all needed GNU gettext files are available, and your -`Makefile' files are adjusted (see section The Maintainer's View), each C module -having translated C strings should contain the line: - -

- -
-#include <libintl.h>
-
- -

-The remaining changes to your C sources are discussed in the further -sections of this chapter. - -

- - - -

Triggering gettext Operations

- -

-The initialization of locale data should be done with more or less -the same code in every program, as demonstrated below: - -

- -
-int
-main (argc, argv)
-     int argc;
-     char **argv;
-{
-  ...
-  setlocale (LC_ALL, "");
-  bindtextdomain (PACKAGE, LOCALEDIR);
-  textdomain (PACKAGE);
-  ...
-}
-
- -

-PACKAGE and LOCALEDIR should be provided either by -`config.h' or by the Makefile. For now consult the gettext -sources for more information. - -
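The initialization sequence above can be wrapped in a small helper, sketched here under stated assumptions. The function name `init_i18n` and its parameters are hypothetical and not part of GNU gettext; in a real package the two arguments would normally be the PACKAGE and LOCALEDIR macros.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, not part of GNU gettext: performs the three
   initialization calls and reports failure of the two gettext calls.  */
int
init_i18n (const char *package, const char *localedir)
{
  /* A NULL result here means the environment names a locale the system
     does not support; the program then stays in the "C" locale and
     simply shows untranslated messages, so it is not treated as fatal.  */
  setlocale (LC_ALL, "");

  /* Tell gettext where the message catalogs of this package live.  */
  if (bindtextdomain (package, localedir) == NULL)
    return -1;

  /* Make this package the default domain for later gettext calls.  */
  if (textdomain (package) == NULL)
    return -1;

  return 0;
}
```

A program would call something like `init_i18n (PACKAGE, LOCALEDIR)` once, early in `main`, before any translated message is produced.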

-

-The use of LC_ALL might not be appropriate for you. -LC_ALL includes all locale categories and especially -LC_CTYPE. This latter category is responsible for determining -character classes with the isalnum etc. functions from -`ctype.h', and its locale-dependent results could be wrong for programs -which process some kind of input language. For example, this would mean that -source code using the ç (c-cedilla) character is runnable in -France but not in the U.S. - 

-

-Some systems also have problems with parsing numbers using the -scanf functions if a locale other than the LC_ALL locale is used. -The standards say that additional formats besides the one known in the -"C" locale might be recognized. But some systems seem to reject -numbers in the "C" locale format. In some situations, it might -also be a problem with the notation itself, which makes it impossible to -recognize whether the number is in the "C" locale or the local -format. This can happen if thousands separator characters are used. -Some locales define this character according to the national -conventions to '.', which is the same character used in the -"C" locale to denote the decimal point. - 

-

-So it is sometimes necessary to replace the LC_ALL line in the -code above by a sequence of setlocale lines - -

- -
-{
-  ...
-  setlocale (LC_TIME, "");
-  setlocale (LC_MESSAGES, "");
-  ...
-}
-
- -

-or to switch back and forth to the character class in question. On all -POSIX conformant systems the locale categories LC_CTYPE, -LC_COLLATE, LC_MONETARY, LC_NUMERIC, and -LC_TIME are available. On some modern systems there is also a -locale category LC_MESSAGES, which is called LC_RESPONSES on some old, -XPG2 compliant systems. - 
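As a minimal sketch of the selective approach, the following sets only the two categories from the fragment above and leaves LC_NUMERIC and LC_CTYPE in the "C" locale, so numeric input keeps using '.' as the decimal point. The function names are hypothetical, and the `#ifdef` guard is an assumption for systems that do not provide LC_MESSAGES.

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: localize dates and messages only.  */
void
init_selected_categories (void)
{
  setlocale (LC_TIME, "");
#ifdef LC_MESSAGES
  /* LC_MESSAGES is not available everywhere, hence the guard.  */
  setlocale (LC_MESSAGES, "");
#endif
  /* LC_NUMERIC and LC_CTYPE are deliberately left untouched, so they
     keep their initial "C" locale behavior.  */
}

/* Hypothetical helper: parse a number written in "C" locale notation.
   This stays reliable because LC_NUMERIC was never changed.  */
double
parse_plain_double (const char *text)
{
  return strtod (text, NULL);
}
```

The trade-off is that character classification and number formatting stay unlocalized, which is what a program parsing its own input language usually wants.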

- - -

How Marks Appear in Sources

- -

-All strings requiring translation should be marked in the C sources. Marking -is done in such a way that each translatable string appears to be -the sole argument of some function or preprocessor macro. There are -only a few such possible functions or macros meant for translation, -and their names are said to be marking keywords. The marking is -attached to strings themselves, rather than to what we do with them. -This approach has more uses. A blatant example is an error message -produced by formatting. The format string needs translation, as -well as some strings inserted through some `%s' specification -in the format, while the result from sprintf may have so many -different instances that it is impractical to list them all in some -`error_string_out()' routine, say. - -

-

-This marking operation has two goals. The first goal of marking -is to trigger the retrieval of the translation at run time. -The keywords are possibly resolved into a routine able to dynamically -return the proper translation, as far as possible or wanted, for the -argument string. Most localizable strings are found in executable -positions, that is, attached to variables or given as parameters to -functions. But this is not universal usage, and some translatable -strings appear in structured initializations. See section Special Cases of Translatable Strings. - 

-

-The second goal of the marking operation is to help xgettext -properly extract all translatable strings when it scans a set -of program sources and produces PO file templates. - 

-

-The canonical keyword for marking translatable strings is -`gettext'; it gave its name to the whole GNU gettext -package. For packages making only light use of the `gettext' -keyword, macro or function, it is easily used as is. However, -for packages using the gettext interface more heavily, it -is usually more convenient to give the main keyword a shorter, less -obtrusive name. Indeed, the keyword might appear on a lot of strings -all over the package, and programmers usually neither want nor need -their program sources to remind them forcefully, all the time, that they -are internationalized. Further, a long keyword has the disadvantage -of using more horizontal space, forcing more indentation work on -sources for those trying to keep them within 79 or 80 columns. - 

-

-Many packages use `_' (a simple underline) as a keyword, -and write `_("Translatable string")' instead of `gettext -("Translatable string")'. Further, the coding rule from the GNU standards -that requires a space between the keyword and the opening -parenthesis is relaxed, in practice, for this particular usage. -So, the textual overhead per translatable string is reduced to -only three characters: the underline and the two parentheses. -However, even if GNU gettext uses this convention internally, -it does not offer it officially. The real, genuine keyword is truly -`gettext' indeed. It is fairly easy for those wanting to use -`_' instead of `gettext' to declare: - 

- -
-#include <libintl.h>
-#define _(String) gettext (String)
-
- -

-instead of merely using `#include <libintl.h>'. - -

-

-Later on, the maintenance is relatively easy. If, as a programmer, -you add or modify a string, you will have to ask yourself if the -new or altered string requires translation, and include it within -`_()' if you think it should be translated. `"%s: %d"' is -an example of a string not requiring translation! - 
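The distinction between strings that are marked and strings that are left alone can be sketched as follows. Both function names are hypothetical and chosen for illustration only; the `_` macro is the one declared above.

```c
#include <libintl.h>
#include <stdio.h>

#define _(String) gettext (String)

/* Hypothetical helper: user-visible words, so both strings are marked.  */
const char *
status_label (int ok)
{
  return ok ? _("Done") : _("Failed");
}

/* Hypothetical helper: "%s: %d" is pure layout with no natural-language
   content, so it is deliberately not wrapped in _().  */
int
format_pair (char *buf, size_t size, const char *name, int value)
{
  return snprintf (buf, size, "%s: %d", name, value);
}
```

When no message catalog is installed, `gettext` returns its argument unchanged, so a marked string costs nothing in correctness for untranslated locales.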

- - -

Marking Translatable Strings

- -

-In PO mode, one set of features is meant more for the programmer than -for the translator, and allows him to interactively mark which strings, -in a set of program sources, are translatable, and which are not. -Even if it is a fairly easy job for a programmer to find and mark -such strings by other means, using any editor of his choice, PO mode -makes this work more comfortable. Further, this gives translators -who feel a little like programmers, or programmers who feel a little -like translators, a tool letting them work at marking translatable -strings in the program sources, while simultaneously producing a set of -translations in some language, for the package being internationalized. - 

-

-The set of program sources targeted by the PO mode commands described -here should have an Emacs tags table constructed for your project, -prior to using these PO file commands. This is easy to do. In any -shell window, change the directory to the root of your project, then -execute a command resembling: - 

- -
-etags src/*.[hc] lib/*.[hc]
-
- -

-presuming here you want to process all `.h' and `.c' files -from the `src/' and `lib/' directories. This command will -explore all said files and create a `TAGS' file in your root -directory, somewhat summarizing the contents using a special file -format Emacs can understand. - -

-

-For packages following the GNU coding standards, there is -a make goal tags or TAGS which constructs the tag files in -all directories and for all files containing source code. - 

-

-Once your `TAGS' file is ready, the following commands assist -the programmer at marking translatable strings in his set of sources. -But these commands are necessarily driven from within a PO file -window, and it is likely that you do not even have such a PO file yet. -This is not a problem at all, as you may safely open a new, empty PO -file, mainly for using these commands. This empty PO file will slowly -fill in while you mark strings as translatable in your program sources. - -

-
- -
, -
-Search through program sources for a string which looks like a -candidate for translation. - -
M-, -
-Mark the last string found with `_()'. - -
M-. -
-Mark the last string found with a keyword taken from a set of possible -keywords. This command with a prefix allows some management of these -keywords. - -
- -

-The , (po-tags-search) command searches for the next -occurrence of a string which looks like a possible candidate for -translation, and displays the program source in another Emacs window, -positioned in such a way that the string is near the top of this other -window. If the string is too big to fit whole in this window, it is -positioned so only its end is shown. In any case, the cursor -is left in the PO file window. If the shown string would be better -presented differently in different native languages, you may mark it -using M-, or M-.. Otherwise, you might rather ignore it -and skip to the next string by merely repeating the , command. - 

-

-A string is a good candidate for translation if it contains a sequence -of three or more letters. A string containing at most two letters in -a row will be considered as a candidate if it has more letters than -non-letters. The command disregards strings containing no letters, -or isolated letters only. It also disregards strings within comments, -or strings already marked with some keyword PO mode knows (see below). - -
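The candidate rule described above can be restated as a small predicate. This is a sketch in C of the rule as worded here, not PO mode's actual Emacs Lisp code; the function name is hypothetical, and the check for comments and already-marked strings is left out.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical restatement of the candidate rule.  Returns 1 when the
   string looks translatable, 0 otherwise.  */
int
looks_translatable (const char *s)
{
  size_t letters = 0;   /* total number of letters */
  size_t others = 0;    /* total number of non-letters */
  size_t run = 0;       /* length of the current run of letters */
  size_t longest = 0;   /* longest run of letters seen so far */

  for (; *s != '\0'; s++)
    {
      if (isalpha ((unsigned char) *s))
        {
          letters++;
          if (++run > longest)
            longest = run;
        }
      else
        {
          others++;
          run = 0;
        }
    }

  if (longest >= 3)
    return 1;                   /* three or more letters in a row */
  if (longest == 2)
    return letters > others;    /* short runs: letters must dominate */
  return 0;                     /* no letters, or isolated letters only */
}
```

Under this reading, `"Hello"` and `"ab"` qualify, while `"%s: %d"` (isolated letters only) and `"ab: %d"` (three letters against three non-letters) do not.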

-

-If you have never told Emacs about some `TAGS' file to use, the -command will request that you specify one from the minibuffer, the -first time you use the command. You may later change your `TAGS' -file by using the regular Emacs command M-x visit-tags-table, -which will ask you to name the precise `TAGS' file you want -to use. See section `Tag Tables' in The Emacs Editor. - -

-

-Each time you use the , command, the search resumes from where it was -left by the previous search, and goes through all program sources, -obeying the `TAGS' file, until all sources have been processed. -However, by giving a prefix argument to the command (C-u -,), you may request that the search be restarted all over again -from the first program source; but in this case, strings that you -recently marked as translatable will be automatically skipped. - -

-

-Using this , command does not prevent the use of other regular -Emacs tags commands. For example, regular tags-search or -tags-query-replace commands may be used without disrupting the -independent , search sequence. However, as implemented, the -initial , command (or the , command used with a -prefix) might also reinitialize the regular Emacs tags searching to the -first tags file; this reinitialization might be considered spurious. - 

-

-The M-, (po-mark-translatable) command will mark the -recently found string with the `_' keyword. The M-. -(po-select-mark-and-mark) command will request that you type -one keyword from the minibuffer and use that keyword for marking -the string. Both commands will automatically create a new PO file -untranslated entry for the string being marked, and make it the -current entry (making it easy for you to immediately proceed to its -translation, if you feel like doing it right away). It is possible -that the modifications made to the program source by M-, or -M-. render some source line longer than 80 columns, forcing you -to break and re-indent this line differently. You may use the O -command from PO mode, or any other window changing command from -GNU Emacs, to break out into the program source window, and do any -needed adjustments. You will have to use some regular Emacs command -to return the cursor to the PO file window, if you want command -, for the next string, say. - -

-

-The M-. command has a few built-in speedups, so you do not -have to explicitly type all keywords all the time. The first such -speedup is that you are presented with a preferred keyword, -which you may accept by merely typing RET at the prompt. -The second speedup is that you may type any non-ambiguous prefix of the -keyword you really mean, and the command will complete it automatically -for you. This also means that PO mode has to know all -your possible keywords, and that it will not accept mistyped keywords. - -

-

-If you reply ? to the keyword request, the command gives a -list of all known keywords, from which you may choose. When the -command is prefixed by an argument (C-u M-.), it inhibits -updating any program source or PO file buffer, and does some simple -keyword management instead. In this case, the command asks for a -keyword, written in full, which becomes a new allowed keyword for -later M-. commands. Moreover, this new keyword automatically -becomes the preferred keyword for later commands. By typing -an already known keyword in response to C-u M-., one merely -changes the preferred keyword and does nothing more. - -

-

-All keywords known for M-. are recognized by the , command -when scanning for strings, and strings already marked by any of those -known keywords are automatically skipped. If many PO files are opened -simultaneously, each one has its own independent set of known keywords. -There is no provision in PO mode, currently, for deleting a known -keyword; you have to quit the file (maybe using q) and reopen -it afresh. When a PO file is newly brought up in an Emacs window, only -`gettext' and `_' are known as keywords, and `gettext' -is preferred for the M-. command. In fact, it is not useful to -prefer `_', as this one is already built into the M-, command. - 

- - -

Special Comments preceding Keywords

- -

-In C programs strings are often used within calls of functions from the -printf family. The special thing about these format strings is -that they can contain format specifiers introduced with %. Assume -we have the code - -

- -
-printf (gettext ("String `%s' has %d characters\n"), s, (int) strlen (s));
-
- -

-A possible German translation for the above string might be: - -

- -
-"%d Zeichen lang ist die Zeichenkette `%s'"
-
- -

-A C programmer, even if he cannot speak German, will recognize that -there is something wrong here. The order of the two format specifiers -has changed, but of course the order of the arguments in the printf call has not. -This will most probably lead to problems because now the length of the -string is regarded as the address. - 

-

-To prevent errors at runtime caused by translations, the msgfmt -tool can check statically whether the arguments in the original and the -translation string match in type and number. If this is not the case, a -warning will be given and the error cannot cause problems at runtime. - 

-

-For the word order in the above German translation to be correct, one -would have to write - 

- -
-"%2$d Zeichen lang ist die Zeichenkette `%1$s'"
-
- -

-The routines in msgfmt know about this special notation. - -
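The effect of the numbered notation can be checked with a short sketch. It assumes a C library whose printf family supports the POSIX `%n$` positional extension (it is not part of ISO C); the function name is hypothetical, and the German string is hard-coded here where a real program would obtain it from `gettext`.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the arguments are passed in the order of the
   original English message (string first, length second), while the
   format consumes them in the order the German sentence needs.  */
int
describe_de (char *buf, size_t size, const char *s)
{
  return snprintf (buf, size,
                   "%2$d Zeichen lang ist die Zeichenkette `%1$s'",
                   s, (int) strlen (s));
}
```

The calling code is identical to what the untranslated program uses; only the format string decides which argument goes where.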

-

-Because not all strings in a program must be format strings it is not -useful for msgfmt to test all the strings in the `.po' file. -This might cause problems because the string might contain what looks -like a format specifier, but the string is not used in printf. - -

-

-Therefore xgettext adds a special tag to those messages it -thinks might be a format string. There is no absolute rule for this, -only a heuristic. In the `.po' file the entry is marked using the -c-format flag in the #, comment line (see section The Format of PO Files). - 

-

-The careful reader now might say that this again can cause problems. -The heuristic might guess it wrong. This is true and therefore -xgettext knows about a special kind of comment which lets -the programmer take over the decision. If in the same line or -the immediately preceding line of the gettext keyword -the xgettext program finds a comment containing the words -xgettext:c-format it will mark the string in any case with -the c-format flag. This kind of comment should be used when -xgettext does not recognize the string as a format string but -it really is one and it should be tested. Please note that when the -comment is in the same line as the gettext keyword, it must be -before the string to be translated. - 

-

-This situation happens quite often. The printf function is often -called with strings which do not contain a format specifier. Of course -one would normally use fputs but it does happen. In this case -xgettext does not recognize this as a format string but what -happens if the translation introduces a valid format specifier? The -printf function will try to access one of the parameters but none -exists because the original code does not refer to any parameter. - 

-

-xgettext of course could make a wrong decision the other way -round: a string marked as a format string might not really be a format -string. In this case msgfmt might give too many warnings and -would prevent translating the `.po' file. The method to prevent -this wrong decision is similar to the one used above, only the comment -to use must contain the string xgettext:no-c-format. - 
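Both kinds of comment might be placed as in the following sketch. The function names and message texts are hypothetical; only the comment texts xgettext:c-format and xgettext:no-c-format come from the description above.

```c
#include <libintl.h>

/* Hypothetical example: this message carries no format specifier, so the
   heuristic would not flag it, yet the caller passes it to printf.  The
   comment on the preceding line forces the c-format flag.  */
const char *
banner_format (void)
{
  /* xgettext:c-format */
  return gettext ("Processing complete\n");
}

/* Hypothetical example: the '%' below is literal text written with
   fputs, never a format, so the comment on the preceding line
   suppresses the c-format flag.  */
const char *
progress_text (void)
{
  /* xgettext:no-c-format */
  return gettext ("100% finished");
}
```

In both functions the comment sits on the line immediately preceding the `gettext` keyword, which is one of the two placements the text allows.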

-

-If a string is marked with c-format and this is not correct the -user can find out who is responsible for the decision. See section Invoking the xgettext Program to see how the --debug option can be used for solving -this problem. - -

- - -

Special Cases of Translatable Strings

- -

-The attentive reader might now point out that it is not always possible -to mark translatable strings with gettext or something similar. -Consider the following case: - 

- -
-{
-  static const char *messages[] = {
-    "some very meaningful message",
-    "and another one"
-  };
-  const char *string;
-  ...
-  string
-    = index > 1 ? "a default message" : messages[index];
-
-  fputs (string, stdout);
-  ...
-}
-
- -

-While it is no problem to mark the string "a default message" it -is not possible to mark the string initializers for messages. -What is to be done? We have to fulfil two tasks. First we have to mark the -strings so that the xgettext program (see section Invoking the xgettext Program) -can find them, and second we have to translate the string at runtime -before printing them. - -

-

-The first task can be fulfilled by creating a new keyword, which names a -no-op. For the second we have to mark all access points to a string -from the array. So one solution can look like this: - -

- -
-#define gettext_noop(String) (String)
-
-{
-  static const char *messages[] = {
-    gettext_noop ("some very meaningful message"),
-    gettext_noop ("and another one")
-  };
-  const char *string;
-  ...
-  string
-    = index > 1 ? gettext ("a default message") : gettext (messages[index]);
-
-  fputs (string, stdout);
-  ...
-}
-
- -

-Please convince yourself that the string which is written by -fputs is translated in any case. How to make xgettext aware of -the additional keyword gettext_noop is explained in section Invoking the xgettext Program. - 
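The fragment above can be turned into a compilable unit along the following lines. This is a sketch: the function name `message_for` is hypothetical, and the extra check for a negative index is an addition that the original fragment does not have.

```c
#include <libintl.h>

/* No-op marker: lets xgettext find the string without translating it
   at the point of initialization.  */
#define gettext_noop(String) (String)

static const char *messages[] = {
  gettext_noop ("some very meaningful message"),
  gettext_noop ("and another one")
};

/* Hypothetical accessor: every path out of this function goes through
   gettext, so the caller always receives a translated string.  */
const char *
message_for (int index)
{
  /* The negative-index guard is an added assumption for safety.  */
  if (index < 0 || index > 1)
    return gettext ("a default message");
  return gettext (messages[index]);
}
```

Keeping the array access behind one accessor makes the "translated in any case" property easy to verify by reading a single function.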

-

-The above is of course not the only solution. You could also come up -with the following one: - 

- -
-#define gettext_noop(String) (String)
-
-{
-  static const char *messages[] = {
-    gettext_noop ("some very meaningful message"),
-    gettext_noop ("and another one")
-  };
-  const char *string;
-  ...
-  string
-    = index > 1 ? gettext_noop ("a default message") : messages[index];
-
-  fputs (gettext (string), stdout);
-  ...
-}
-
- -

-But this has some drawbacks. First the programmer has to take care that -he uses gettext_noop for the string "a default message". -A use of gettext could have in rare cases unpredictable results. -The second reason is found in the internals of the GNU gettext -Library which will make this solution less efficient. - -

-

-One advantage is that you need not perform control flow analysis to make -sure the output is really translated in any case. But this analysis is -generally not very difficult. If it should prove difficult in some situation, you can -use this second method there. - 

-


-
