X-Git-Url: https://git.saurik.com/wxWidgets.git/blobdiff_plain/69477ac44979d42e1139dc70a2be7c0390043246..066f3611df971be93b2ec46b82c2f05f3ff9a422:/docs/html/gettext/gettext_8.html diff --git a/docs/html/gettext/gettext_8.html b/docs/html/gettext/gettext_8.html deleted file mode 100644 index a028ce9f83..0000000000 --- a/docs/html/gettext/gettext_8.html +++ /dev/null @@ -1,896 +0,0 @@ - -
- - -
-One aim of the current message catalog implementation provided by
-GNU gettext
was to use the system's message catalog handling, if the
-installer wishes to do so. So we perhaps should first take a look at
-the solutions we know about. The people on the POSIX committee did not
-manage to agree on one of the semi-official standards which we'll
-describe below. In fact they couldn't agree on anything, so they
-decided only to include an example of an interface. The major Unix vendors
-are split in the usage of the two most important specifications: X/Open's
-catgets vs. Uniforum's gettext interface. We'll describe them both and
-later explain our solution of this dilemma.
-
-
catgets
-The catgets
implementation is defined in the X/Open Portability
-Guide, Volume 3, XSI Supplementary Definitions, Chapter 5. But the
-process of creating this standard seemed to be too slow for some of
-the Unix vendors so they created their implementations on preliminary
-versions of the standard. Of course this leads again to problems while
-writing platform independent programs: even the usage of catgets
-does not guarantee a unique interface.
-
-
-Another, personal comment on this: only a bunch of committee members
-could have made this interface. They never really tried to program
-using this interface. It is a fast, memory-saving implementation, and a
-user can happily live with it. But programmers hate it (at least I and
-some others do...)
-
-But we must not forget one point: after all the trouble with transferring
-the rights on Unix(tm) they at last came to X/Open, the very same who
-published this specification. This leads me to making the prediction
-that this interface will be in future Unix standards (e.g. Spec1170) and
-therefore part of all Unix implementations (implementations which are
-allowed to bear this name).
-
- - - -
-The interface to the catgets
implementation consists of three
-functions which correspond to those used in file access: catopen
-to open the catalog for using, catgets
for accessing the message
-tables, and catclose
for closing after work is done. Prototypes
-for the functions and the needed definitions are in the
-<nl_types.h>
header file.
-
-
-catopen
is used like this:
-
-
-nl_catd catd = catopen ("catalog_name", 0); -- -
-The function takes as the argument the name of the catalog. This usually
-refers to the name of the program or the package. The second parameter
-is not further specified in the standard. I don't even know whether it
-is implemented consistently among various systems. So the common advice
-is to use 0
as the value. The return value is a handle to the
-message catalog, equivalent to the file handles returned by open
.
-
-
-This handle is of course used in the catgets
function which can
-be used like this:
-
-
-char *translation = catgets (catd, set_no, msg_id, "original string"); -- -
-The first parameter is this catalog descriptor. The second parameter
-specifies the set of messages in this catalog from which the message
-described by msg_id
is obtained. catgets
therefore uses a
-three-stage addressing:
-
-
-catalog name => set number => message ID => translation -- -
-The fourth argument is not used to address the translation. It is given
-as a default value in case one of the addressing stages fails. One
-important thing to remember is that although the return type of catgets
-is char *
the resulting string must not be changed. It
-should better be const char *
, but the standard was published in
-1988, one year before ANSI C.
-
-
-The last of these functions is used and behaves as expected:
-
- --catclose (catd); -- -
-After this no catgets
call using the descriptor is legal anymore.
-
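The three calls can be combined as in the following minimal sketch. The helper name `copy_message` is an assumption made for illustration and is not part of X/Open; the sketch also assumes a system that ships `<nl_types.h>`. The message is copied into a caller-supplied buffer because the string returned by `catgets` should not be relied upon once the catalog has been closed.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical helper: copy message MSG_ID of set SET_NO from the
   catalog CATALOG_NAME into BUF, using DEFAULT_STRING when the
   catalog or the message cannot be found.  */
void
copy_message (const char *catalog_name, int set_no, int msg_id,
              const char *default_string, char *buf, size_t size)
{
  /* 0 is the commonly advised value for the second argument.  */
  nl_catd catd = catopen (catalog_name, 0);

  if (catd == (nl_catd) -1)
    {
      /* The catalog could not be opened: fall back to the default.  */
      snprintf (buf, size, "%s", default_string);
      return;
    }

  /* catgets returns DEFAULT_STRING if an addressing stage fails.  */
  snprintf (buf, size, "%s",
            catgets (catd, set_no, msg_id, default_string));

  /* After this call the descriptor must not be used anymore.  */
  catclose (catd);
}
```

A caller would pass a buffer such as `char buf[128]` together with `sizeof buf`; the set and message numbers have to match those used when the catalog was built.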
-
catgets
Interface?!
-Now that this description seemed to be really easy, where are the
-problems we speak of? In fact the interface could be used in a
-reasonable way, but constructing the message catalogs is a pain. The
-reason for this lies in the third argument of catgets
: the unique
-message ID. This has to be a numeric value for all messages in a single
-set. Perhaps you could imagine the problems keeping such a list while
-changing the source code. Add a new message here, remove one there. Of
-course a lot of tools have been developed to help organize this
-chaos but each of them fails in one aspect or another. We don't
-want to say that the other approach has no problems but they are far
-easier to manage.
-
-
gettext
-The definition of the gettext
interface comes from a Uniforum
-proposal and it is followed by at least one major Unix vendor
-(Sun) in its latest developments. It is not specified in any official
-standard, though.
-
-
-The main points about this solution are that it does not follow the
-method of normal file handling (open-use-close) and that it does not
-burden the programmer with so many tasks, especially the unique key handling.
-Of course a unique key is needed here too, but this key is the
-message itself (however long or short it is). See section Comparing the Two Interfaces for a
-more detailed comparison of the two methods.
-
-
-The following section contains a rather detailed description of the
-interface. We make it that detailed because this is the interface
-we chose for the GNU gettext
Library. Programmers interested
-in using this library will be interested in this description.
-
-
-The minimal functionality an interface must have is a) to select a
-domain the strings are coming from (a single domain for all programs is
-not reasonable because its construction and maintenance is difficult,
-perhaps impossible) and b) to access a string in a selected domain.
-
-
-This is principally the description of the gettext
interface. It
-has a global domain which unqualified usages reference. Of course this
-domain is selectable by the user.
-
-
-char *textdomain (const char *domain_name); -- -
-This provides the possibility to change or query the current status of
-the current global domain of the LC_MESSAGES
category. The
-argument is a null-terminated string, whose characters must be legal in
-the use in filenames. If the domain_name argument is NULL
,
-the function returns the current value. If no value has been set
-before, the name of the default domain is returned: messages.
-Please note that although the return value of textdomain
is of
-type char *
no changing is allowed. It is also important to know
-that no checks of the availability are made. If the name is not
-available you will see this by the fact that no translations are provided.
-
-
-To use a domain set by textdomain
the function
-
-
-char *gettext (const char *msgid); -- -
-is to be used. This is the simplest reasonable form one can imagine.
-The translation of the string msgid is returned if it is available
-in the current domain. If not available the argument itself is
-returned. If the argument is NULL
the result is undefined.
-
-
-One thing which should come to mind is that no explicit dependency on
-the used domain is given. The current value of the domain for the
-LC_MESSAGES
locale is used. If this changes between two
-executions of the same gettext
call in the program, both calls
-reference a different message catalog.
-
-
-For the easiest case, which is normally used in internationalized
-packages, once at the beginning of execution a call to textdomain
-is issued, setting the domain to a unique name, normally the package
-name. In the following code all strings which have to be translated are
-filtered through the gettext function. That's all, the package speaks
-your language.
-
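The easiest case described above might look like the following sketch. The helper names `init_messages` and `greeting` are assumptions made for illustration, and the call to `setlocale` is included on the assumption that the program wants to follow the user's environment.

```c
#include <libintl.h>
#include <locale.h>

/* Hypothetical start-up helper: adopt the user's locale and make
   PACKAGE the global domain.  Returns the domain now in effect,
   which must not be modified by the caller.  */
const char *
init_messages (const char *package)
{
  setlocale (LC_ALL, "");
  return textdomain (package);
}

/* Every user-visible string is filtered through gettext.  If no
   translation is available the argument itself comes back.  */
const char *
greeting (void)
{
  return gettext ("Hello world");
}
```

A program would call `init_messages` once at the beginning of execution, normally with its package name, and use `gettext` everywhere afterwards.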
-
-While this single name domain works well for most applications there
-might be the need to get translations from more than one domain. Of
-course one could switch between different domains with calls to
-textdomain
, but this is neither convenient nor fast. One
-possible situation was under discussion at the time of this writing: all
-error messages of functions in the set of commonly used functions should
-go into a separate domain error
. By this means we would only need
-to translate them once.
-
-
-For these reasons there are two more functions to retrieve strings:
-
-char *dgettext (const char *domain_name, const char *msgid);
-char *dcgettext (const char *domain_name, const char *msgid,
-                 int category);
-
-Both take an additional argument at the first place, which corresponds
-to the argument of textdomain
. The third argument of
-dcgettext
allows the use of a locale category other than LC_MESSAGES
.
-But I really don't know where this can be useful. If the
-domain_name is NULL
or category has a value other than
-the known ones, the result is undefined. It should also be noted that
-this function is not part of the second known implementation of this
-function family, the one found in Solaris.
-
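As a sketch of the separate `error` domain mentioned above, the following hypothetical wrappers look up a message without touching the global domain. The function names are assumptions made for illustration.

```c
#include <libintl.h>
#include <locale.h>

/* Hypothetical wrapper: translate MSGID using the "error" domain,
   leaving the domain set by textdomain untouched.  */
const char *
error_text (const char *msgid)
{
  return dgettext ("error", msgid);
}

/* The same lookup with the category spelled out explicitly.  */
const char *
error_text_explicit (const char *msgid)
{
  return dcgettext ("error", msgid, LC_MESSAGES);
}
```

Both functions return their argument unchanged when no catalog for the `error` domain is available.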
-
-A second ambiguity can arise from the fact that perhaps more than one
-domain has the same name. This can be solved by specifying where the
-needed message catalog files can be found.
-
-char *bindtextdomain (const char *domain_name,
-                      const char *dir_name);
-
-Calling this function binds the given domain to a file in the specified
-directory (how this file is determined follows below). In particular, a
-file in the system's default place is no longer favored over the specified
-file (as it would be by solely using textdomain
). A
-NULL
pointer for the dir_name parameter returns the binding
-associated with domain_name. If domain_name itself is
-NULL
nothing happens and a NULL
pointer is returned. Here
-again, as for all the other functions, none of the return
-values may be changed!
-
-
-It is important to remember that relative path names for the
-dir_name parameter can be trouble. Since the path is always
-computed relative to the current directory different results will be
-achieved when the program executes a chdir
command. Relative
-paths should always be avoided to avoid dependencies and
-unreliabilities.
-
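One way to follow this advice is to refuse relative directory names before binding, as in the sketch below. The helper name `bind_catalog_dir` and its return convention are assumptions made for illustration, and the check for a leading slash assumes Unix-style path names.

```c
#include <libintl.h>
#include <stddef.h>

/* Hypothetical helper: bind DOMAIN_NAME to DIR_NAME only if DIR_NAME
   is an absolute path.  Returns 0 on success and -1 if the path is
   relative or the binding could not be made.  */
int
bind_catalog_dir (const char *domain_name, const char *dir_name)
{
  if (dir_name == NULL || dir_name[0] != '/')
    return -1;
  return bindtextdomain (domain_name, dir_name) != NULL ? 0 : -1;
}
```

The binding in effect can be queried afterwards with `bindtextdomain (domain_name, NULL)`; the string returned must not be modified.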
-
-Because many different languages for many different packages have to be
-stored we need some way to add this information to the message catalog
-files. The way usually used in Unix environments is to have this encoding
-in the file name. This is also done here. The directory name given in
-bindtextdomain
's second argument (or the default directory),
-followed by the value and name of the locale and the domain name are
-concatenated:
-
-
-dir_name/locale/LC_category/domain_name.mo
-
-The default value for dir_name is system specific. For the GNU
-library, and for packages adhering to its conventions, it's:
-
-/usr/local/share/locale
-
-locale is the value of the locale whose name is this
-LC_category
. For gettext
and dgettext
this
-locale is always LC_MESSAGES
. dcgettext
specifies the
-locale by the third argument.(2) (3)
-
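The concatenation described above can be sketched as follows. The helper `catalog_path` is hypothetical and only illustrates the naming scheme; it is not how the library itself builds the name.

```c
#include <stdio.h>

/* Hypothetical helper: build dir_name/locale/LC_category/domain_name.mo
   in BUF.  Returns 0 on success and -1 if BUF is too small.  */
int
catalog_path (char *buf, size_t size, const char *dir_name,
              const char *locale, const char *category,
              const char *domain_name)
{
  int n = snprintf (buf, size, "%s/%s/%s/%s.mo",
                    dir_name, locale, category, domain_name);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

With the default directory, the locale `de`, the category `LC_MESSAGES` and a domain called `hello`, the result is `/usr/local/share/locale/de/LC_MESSAGES/hello.mo`.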
-
-At this point of the discussion we should talk about an advantage of the
-GNU gettext
implementation. Some readers might have pointed out
-that an internationalized program might have a poor performance if some
-string has to be translated in an inner loop. While this is unavoidable
-when the string varies from one run of the loop to the other it is
-simply a waste of time when the string is always the same. Take the
-following example:
-
-
-{
-  while (...)
-    {
-      puts (gettext ("Hello world"));
-    }
-}
-
-When the locale selection does not change between two runs the resulting
-string is always the same. One way to use this is:
-
-{
-  str = gettext ("Hello world");
-  while (...)
-    {
-      puts (str);
-    }
-}
-
-But this solution is not usable in all situations (e.g. when the locale
-selection changes) nor is it very readable.
-
-The GNU C compiler, version 2.7 and above, provides another solution for
-this. To describe this we show here some lines of the
-`intl/libgettext.h' file. For an explanation of the expression
-command block see section `Statements and Declarations in Expressions' in The GNU CC Manual.
-
-
-# if defined __GNUC__ && __GNUC__ == 2 && __GNUC_MINOR__ >= 7
-extern int _nl_msg_cat_cntr;
-# define dcgettext(domainname, msgid, category)                 \
-  (__extension__                                                \
-    ({                                                          \
-      char *result;                                             \
-      if (__builtin_constant_p (msgid))                         \
-        {                                                       \
-          static char *__translation__;                         \
-          static int __catalog_counter__;                       \
-          if (! __translation__                                 \
-              || __catalog_counter__ != _nl_msg_cat_cntr)       \
-            {                                                   \
-              __translation__ =                                 \
-                dcgettext__ ((domainname), (msgid), (category)); \
-              __catalog_counter__ = _nl_msg_cat_cntr;           \
-            }                                                   \
-          result = __translation__;                             \
-        }                                                       \
-      else                                                      \
-        result = dcgettext__ ((domainname), (msgid), (category)); \
-      result;                                                   \
-    }))
-# endif
-
-The interesting thing here is the __builtin_constant_p
predicate.
-This is evaluated at compile time and so optimization can take place
-immediately. Here two cases are distinguished: the argument to
-gettext
is not a constant value in which case simply the function
-dcgettext__
is called, the real implementation of the
-dcgettext
function.
-
-
-If the string argument is constant we can reuse the once gained
-translation when the locale selection has not changed. This is exactly
-what is done here. The _nl_msg_cat_cntr
variable is defined in
-the `loadmsgcat.c' which is available in `libintl.a' and is
-changed whenever a new message catalog is loaded.
-
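The caching idea can also be written out in plain C without the compiler extension, as in the sketch below. The variable `catalog_generation` is a hypothetical stand-in for `_nl_msg_cat_cntr`, and `lookup_count` exists only to make the saved calls visible; neither is part of the library.

```c
#include <libintl.h>
#include <stddef.h>

int catalog_generation;   /* hypothetical stand-in for _nl_msg_cat_cntr */
int lookup_count;         /* counts real lookups, for demonstration only */

static const char *
counted_gettext (const char *msgid)
{
  ++lookup_count;
  return gettext (msgid);
}

/* Return the translation of a constant string, repeating the lookup
   only when the generation counter has changed since the last call.  */
const char *
cached_hello (void)
{
  static const char *translation;
  static int seen_generation;

  if (translation == NULL || seen_generation != catalog_generation)
    {
      translation = counted_gettext ("Hello world");
      seen_generation = catalog_generation;
    }
  return translation;
}
```

Calling `cached_hello` repeatedly performs a single lookup until `catalog_generation` is incremented, which is the behavior the macro obtains for every constant string without any extra code at the call site.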
-
-The following discussion is perhaps a little bit colored. As said
-above we implemented GNU gettext
following the Uniforum
-proposal and this surely has its reasons. But it should show how we
-came to this decision.
-
-
-First we take a look at the developing process. When we write an
-application using NLS provided by gettext
we proceed as always.
-Only when we come to a string which might be seen by the users and thus
-has to be translated we use gettext("...")
instead of
-"..."
. At the beginning of each source file (or in a central
-header file) we define
-
-
-#define gettext(String) (String) -- -
-Even this definition can be avoided when the system supports the
-gettext
function in its C library. When we compile this code the
-result is the same as if no NLS code is used. When you take a look at
-the GNU gettext
code you will see that we use _("...")
-instead of gettext("...")
. This reduces the number of
-additional characters per translatable string to 3 (in words:
-three).
-
-
-When now a production version of the program is needed we simply replace
-the definition
-
-#define _(String) (String)
-
-by
-
-#include <libintl.h>
-#define _(String) gettext (String)
-
-Additionally we run the program `xgettext' on all source code files
-which contain translatable strings and that's it: we have a running
-program which does not depend on translations to be available, but which
-can use any that become available.
-
-
-The same procedure can be done for the gettext_noop
invocations
-(see section Special Cases of Translatable Strings). First you can define gettext_noop
to a
-no-op macro and later use the definition from `libintl.h'. Because
-this name is not used in Sun's implementation of `libintl.h',
-you should consider the following code for your project:
-
-
-#ifdef gettext_noop
-# define N_(String) gettext_noop (String)
-#else
-# define N_(String) (String)
-#endif
-
-N_
is a short form similar to _
. The `Makefile' in
-the `po/' directory of GNU gettext knows by default both of the
-mentioned short forms so you are invited to follow this proposal for
-your own ease.
-
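The two short forms are typically used together as in the following sketch: `N_` marks strings in a static initializer, where a function call is not allowed, and `_` translates them at the point of use. The array and the function `color_name` are assumptions made for illustration.

```c
#include <libintl.h>

#define _(String) gettext (String)
#define N_(String) (String)

/* The strings are only marked for extraction here; no function call
   may appear in a static initializer.  */
static const char *const color_names[] =
  { N_("red"), N_("green"), N_("blue") };

/* Hypothetical accessor: the translation happens at run time, when
   the string is actually needed.  INDEX must be 0, 1 or 2.  */
const char *
color_name (int index)
{
  return _(color_names[index]);
}
```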
-
-Now to catgets
. The main problem is the work for the
-programmer. Every time he comes to a translatable string he has to
-define a number (or a symbolic constant) which also has to be defined in
-the message catalog file. He also has to take care of duplicate
-entries, duplicate message IDs etc. If he wants to have the same
-quality in the message catalog as the GNU gettext
program
-provides he also has to put the descriptive comments for the strings and
-the location in all source code files in the message catalog. This is
-nearly a Mission: Impossible.
-
-
-But there are also some points people might call advantages speaking for
-catgets
. If you have a single word in a string and this string
-is used in different contexts it is likely that in one or the other
-language the word has different translations. Example:
-
-
-printf ("%s: %d", gettext ("number"), number_of_errors)
-
-printf ("you should see %d %s", number_count,
-        number_count == 1 ? gettext ("number") : gettext ("numbers"))
-
-Here we have to translate the string "number"
twice. Even
-if you do not speak a language besides English it might be possible to
-recognize that the two words have a different meaning. In German the
-first appearance has to be translated to "Anzahl"
and the second
-to "Zahl"
.
-
-
-Now you can say that this example is really esoteric. And you are
-right! This is exactly how we felt about this problem and decided that
-it does not weigh that much. The solution for the above problem could
-be very easy:
-
-printf ("%s %d", gettext ("number:"), number_of_errors)
-
-printf (number_count == 1 ? gettext ("you should see %d number")
-                          : gettext ("you should see %d numbers"),
-        number_count)
-
-We believe that we can solve all conflicts with this method. If it is
-difficult one can also consider changing one of the conflicting strings a
-little bit. But it is not impossible to overcome.
-
-Translator note: It is perhaps appropriate here to tell those English
-speaking programmers that the plural form of a noun cannot be formed by
-appending a single `s'. Most other languages use different methods.
-Even the above form is not general enough to cope with all languages.
-Rafal Maszkowski <rzm@mat.uni.torun.pl> reports:
-
-In Polish we use e.g. plik (file) this way:
-
-1 plik
-2,3,4 pliki
-5-21 pliko'w
-22-24 pliki
-25-31 pliko'w
-
-and so on (o' means 8859-2 oacute which should be rather okreska,
-similar to aogonek).
-
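The Polish table quoted above can be expressed as a small selection function, shown here as a sketch. The function names are assumptions made for illustration, and the behavior for numbers beyond 31 extrapolates the pattern in the table rather than anything stated in the report.

```c
/* Hypothetical helper: choose one of three Polish plural forms.
   0 selects "plik", 1 selects "pliki", 2 selects "pliko'w".  */
int
polish_plural_index (unsigned long n)
{
  if (n == 1)
    return 0;
  /* 2-4, 22-24, 32-34, ... but not 12-14.  */
  if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
    return 1;
  return 2;
}

const char *
polish_file_form (unsigned long n)
{
  static const char *const forms[] = { "plik", "pliki", "pliko'w" };
  return forms[polish_plural_index (n)];
}
```

A two-way test such as `number_count == 1` cannot express this rule, which is why the form shown earlier is not general enough.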
-A workable approach might be to consider methods like the one used for
-LC_TIME
in the POSIX.2 standard. The value of the
-alt_digits
field can be up to 100 strings which represent the
-numbers 1 to 100. Using this in a situation of an internationalized
-program means that an array of translatable strings should be indexed by
-the number it should represent. A small example:
-
-
-#include <langinfo.h>
-#include <stdio.h>
-
-/* MONTH is zero-based: 0 is the first month, 11 the twelfth.  */
-void
-print_month_info (int month)
-{
-  const char *month_pos[12] =
-  { N_("first"), N_("second"), N_("third"), N_("fourth"),
-    N_("fifth"), N_("sixth"), N_("seventh"), N_("eighth"),
-    N_("ninth"), N_("tenth"), N_("eleventh"), N_("twelfth") };
-  printf (_("%s is the %s month\n"), nl_langinfo (MON_1 + month),
-          _(month_pos[month]));
-}
-
-It should be obvious that this method is only reasonable for small
-ranges of numbers.
-
- - - -
-Starting with version 0.9.4 the library libintl.h
should be
-self-contained. I.e., you can use it in your own programs without
-providing additional functions. The `Makefile' will put the header
-and the library in directories selected using the $(prefix)
.
-
-
-One exception of the above is found on HP-UX systems. Here the C library
-does not contain the alloca
function (and the HP compiler does
-not generate it inlined). But it is not intended to rewrite the whole
-library just because of this dumb system. Instead include the
-alloca
function in all packages in which you use libintl.a.
-
-
gettext
grok
-To fully exploit the functionality of the GNU gettext
library it
-is surely helpful to read the source code. But for those who don't want
-to spend that much time in reading the (sometimes complicated) code here
-is a list of comments:
-
-
gettext
-function. The method which is presented here only works correctly
-with the GNU implementation of the gettext
functions. It is not
-possible with underlying catgets
functions or gettext
-functions from the system's C library. The exception is of course the
-GNU C Library which uses the GNU gettext
Library for message handling.
-
-In the function dcgettext
at every call the current setting of
-the highest priority environment variable is determined and used.
-Highest priority means here the following list with decreasing
-priority:
-
-
-LANGUAGE
-
-LC_ALL
-
-LC_xxx
, according to selected locale
-
-LANG
-
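The priority list can be sketched as a simple walk over the environment. This is a simplification made for illustration, not the library's actual code: the helper name `effective_language` is hypothetical, `LC_MESSAGES` is assumed as the selected category, and the fallback to `"C"` is an assumption.

```c
#include <stdlib.h>

/* Hypothetical helper: return the value of the highest priority
   environment variable that is set and non-empty.  */
const char *
effective_language (void)
{
  static const char *const names[] =
    { "LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG" };
  size_t i;

  for (i = 0; i < sizeof names / sizeof names[0]; ++i)
    {
      const char *value = getenv (names[i]);
      if (value != NULL && value[0] != '\0')
        return value;
    }
  return "C";   /* assumed fallback when nothing is set */
}
```

Because the variables are read again on each call, a change made with `setenv` is seen the next time the function runs.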
-LANGUAGE
changes. According
-to the process explained above the new value of this variable is found
-as soon as the dcgettext
function is called. But this also means
-the (perhaps) different message catalog file is loaded. In other
-words: the used language is changed.
-
-But there is one little catch. The code for gcc-2.7.0 and up provides
-some optimization. This optimization normally prevents the calling of
-the dcgettext
function as long as no new catalog is loaded. But
-if dcgettext
is not called the program also cannot notice that the
-LANGUAGE
variable has changed (see section Optimization of the *gettext functions). A
-solution for this is very easy. Include the following code in the
-language switching function.
-
-
-
-  /* Change language.  */
-  setenv ("LANGUAGE", "fr", 1);
-
-  /* Make change known.  */
-  {
-    extern int _nl_msg_cat_cntr;
-    ++_nl_msg_cat_cntr;
-  }
-
-The variable
_nl_msg_cat_cntr
is defined in `loadmsgcat.c'.
-The programmer will find himself in need for a construct like this only
-when developing programs which run for a longer time and allow the user to
-select the language at runtime. Non-interactive programs (like all
-these little Unix tools) should never need this.
-
-
-There are two competing methods for language independent messages:
-the X/Open catgets
method, and the Uniforum gettext
-method. The catgets
method indexes messages by integers; the
-gettext
method indexes them by their English translations.
-The catgets
method has been around longer and is supported
-by more vendors. The gettext
method is supported by Sun,
-and it has been heard that the COSE multi-vendor initiative is
-supporting it. Neither method is a POSIX standard; the POSIX.1
-committee had a lot of disagreement in this area.
-
-
-Neither one is in the POSIX standard. There was much disagreement
-in the POSIX.1 committee about using the gettext
routines
-vs. catgets
(XPG). In the end the committee couldn't
-agree on anything, so no messaging system was included as part
-of the standard. I believe the informative annex of the standard
-includes the XPG3 messaging interfaces, "...as an example of
-a messaging system that has been implemented..."
-
-
-They were very careful not to say anywhere that you should use one
-set of interfaces over the other. For more on this topic please
-see the Programming for Internationalization FAQ.
-
- - -catgets
-There have been a few discussions of late on the use of
-catgets
as a base. I think it important to present both
-sides of the argument and hence am opting to play devil's advocate
-for a little bit.
-
-
-I'll not deny the fact that catgets
could have been designed
-a lot better. It currently has quite a number of limitations and
-these have already been pointed out.
-
-
-However there is a great deal to be said for consistency and
-standardization. A common recurring problem when writing Unix
-software is the myriad portability problems across Unix platforms.
-It seems as if every Unix vendor had a look at the operating system
-and found parts they could improve upon. Undoubtedly, these
-modifications are probably innovative and solve real problems.
-However, software developers have a hard time keeping up with all
-these changes across so many platforms.
-
-And this has prompted the Unix vendors to begin to standardize their
-systems. Hence the impetus for Spec1170. Every major Unix vendor
-has committed to supporting this standard and every Unix software
-developer awaits with glee the day they can write software to this
-standard and simply recompile (without having to use autoconf)
-across different platforms.
-
-
-As I understand it, Spec1170 is roughly based upon version 4 of the
-X/Open Portability Guidelines (XPG4). Because catgets
and
-friends are defined in XPG4, I'm led to believe that catgets
-is a part of Spec1170 and hence will become a standardized component
-of all Unix systems.
-
-
-Now it seems kind of wasteful to me to have two different systems
-installed for accessing message catalogs. If we do want to remedy
-catgets
deficiencies why don't we try to expand catgets
-(in a compatible manner) rather than implement an entirely new system.
-Otherwise, we'll end up with two message catalog access systems installed
-with an operating system - one set of routines for packages using GNU
-gettext
for their internationalization, and another set of routines
-(catgets) for all other software. Bloated?
-
-
-Supposing another catalog access system is implemented. Which do
-we recommend? At least for Linux, we need to attract as many
-software developers as possible. Hence we need to make it as easy
-for them to port their software as possible. Which means supporting
-catgets
. We will be implementing the glocale
code
-within our libc
, but does this mean we also have to incorporate
-another message catalog access scheme within our libc
as well?
-And what about people who are going to be using the glocale
-+ non-catgets
routines. When they port their software to
-other platforms, they're now going to have to include the front-end
-(glocale
) code plus the back-end code (the non-catgets
-access routines) with their software instead of just including the
-glocale
code with their software.
-
-
-Message catalog support is however only the tip of the iceberg.
-What about the data for the other locale categories. They also have
-a number of deficiencies. Are we going to abandon them as well and
-develop another duplicate set of routines (should glocale
-expand beyond message catalog support)?
-
-
-Like many parts of Unix that can be improved upon, we're stuck with balancing
-compatibility with the past with useful improvements and innovations for
-the future.
-
-
-X/Open agreed very late on the standard form so that many
-implementations differ from the final form. Both of my systems (old
-Linux catgets and Ultrix-4) have a strange variation.
-
-
-OK. After incorporating the last changes I have to spend some time on
-making the GNU/Linux libc
gettext
functions. So in future
-Solaris is not the only system having gettext
.
-
-
-