X-Git-Url: https://git.saurik.com/wxWidgets.git/blobdiff_plain/1a7f306263595bff3b74e96e4c2bee6f0a008500..51dc95a4c8ccb00741be48f6353749ada3e9f39a:/docs/html/gettext/gettext.htm diff --git a/docs/html/gettext/gettext.htm b/docs/html/gettext/gettext.htm deleted file mode 100644 index c48fc3708b..0000000000 --- a/docs/html/gettext/gettext.htm +++ /dev/null @@ -1,4961 +0,0 @@ -<HTML> -<HEAD> -<!-- This HTML file has been created by texi2html 1.51 - from gettext.texi on 4 September 1998 --> - -<TITLE>GNU gettext utilities</TITLE> -</HEAD> -<BODY> -<H1>GNU gettext tools, version 0.10</H1> -<H2>Native Language Support Library and Tools</H2> -<H2>Edition 0.10, 26 November</H2> -<ADDRESS>Ulrich Drepper</ADDRESS> -<ADDRESS>Jim Meyering</ADDRESS> -<ADDRESS>Pinard</ADDRESS> -<P> -<P><HR><P> - -<P> -Copyright (C) 1995 Free Software Foundation, Inc. - -</P> -<P> -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. - -</P> -<P> -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided that the entire -resulting derived work is distributed under the terms of a permission -notice identical to this one. - -</P> -<P> -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for modified versions, -except that this permission notice may be stated in a translation approved -by the Foundation. - -</P> - - - -<H1><A NAME="SEC1" HREF="gettext_toc.html#TOC1">Introduction</A></H1> - - -<BLOCKQUOTE> -<P> -This manual is still in <EM>DRAFT</EM> state. Some sections are still -empty, or almost. We keep merging material from other sources -(essentially email folders) while the proper integration of this -material is delayed. 
-</BLOCKQUOTE> - -<P> -In this manual, we use <EM>he</EM> when speaking of the programmer or -maintainer, <EM>she</EM> when speaking of the translator, and <EM>they</EM> -when speaking of the installers or end users of the translated program. -This is only a convenience for clarifying the documentation. It is -absolutely not meant to imply that some roles are more appropriate -to males or females. Besides, as you might guess, GNU <CODE>gettext</CODE> -is meant to be useful for people using computers, whatever their sex, -race, religion or nationality! - -</P> -<P> -This chapter explains the goals that motivate the existence -of GNU <CODE>gettext</CODE>. Then, it explains a few broad concepts around -Native Language Support, and situates message translation with regard -to other aspects of national and cultural variance, as applicable -to programs. It also surveys the files used to convey -translations. It explains how the various tools interrelate in the -initial generation of these files, and later, how the maintenance -cycle usually operates. - -</P> - - - -<H2><A NAME="SEC2" HREF="gettext_toc.html#TOC2">The Purpose of GNU <CODE>gettext</CODE></A></H2> - -<P> -Usually, programs are written and documented in English, and use -English at execution time for interacting with users. This is true -not only within GNU, but also in a great deal of commercial -and free software. Using a common language is quite handy for -communication between developers, maintainers and users from all -countries. On the other hand, most people are less comfortable with -English than with their own native language, and would prefer -using their mother tongue for day-to-day work, as far as possible. -Many would simply <EM>love</EM> seeing their computer screen showing -a lot less English, and far more of their own spoken language.
- -</P> -<P> -However, to some people, this dream might appear so far-fetched that -they may believe it is not even worth spending time thinking about -it, and they have no confidence at all that the dream might ever -come true. Many have not lost hope yet, and have organized themselves. -The GNU Translation Project is a formalization of this hope into a -workable structure, which has a good chance to get all of us nearer -the achievement of a truly multi-lingual set of programs. - -</P> -<P> -GNU <CODE>gettext</CODE> is an important step for the GNU Translation -Project, as it is an asset on which we may build many other steps. -This package offers to programmers, translators and even users, a -well integrated set of tools and documentation. Specifically, the GNU -<CODE>gettext</CODE> utilities are a set of tools that provides a framework -to help other GNU packages produce multi-lingual messages. These tools -include a set of conventions about how programs should be written to -support message catalogs, a directory and file naming organization -for the message catalogs themselves, a runtime library supporting the -retrieval of translated messages, and a few stand-alone programs to -massage in various ways the sets of translatable strings, or already -translated strings. A special GNU Emacs mode also helps interested -parties in preparing these sets, or bringing them up to date. - -</P> -<P> -GNU <CODE>gettext</CODE> is designed so it minimizes the impact of -internationalization on program sources, keeping this impact as small -and hardly noticeable as possible. Internationalization has better -chances of succeeding if it is very lightweight, or at least, -appears to be so, when looking at program sources. - -</P> -<P> -The GNU Translation Project also uses the GNU <CODE>gettext</CODE> -distribution as a vehicle for documenting its structure and methods, -even if this goes beyond the technicalities of the GNU <CODE>gettext</CODE> -proper.
This way, translators will find in a single place, as -far as possible, all they need to know for properly doing their -translating work. This supplementary documentation might also -help programmers, and even curious users, understand how GNU -<CODE>gettext</CODE> is related to the remainder of the GNU Translation -Project, and consequently, get a glimpse of the <EM>big picture</EM>. - -</P> - - -<H2><A NAME="SEC3" HREF="gettext_toc.html#TOC3">I18n, L10n, and Such</A></H2> - -<P> -Two long words appear all the time when we discuss support of native -language in programs, and these words have a precise meaning, worth -being explained here, once and for all in this document. The words are -<EM>internationalization</EM> and <EM>localization</EM>. Many people, -tired of writing these long words over and over again, took up the -habit of writing <STRONG>i18n</STRONG> and <STRONG>l10n</STRONG> instead, quoting the first -and last letter of each word, and replacing the run of intermediate -letters by a number merely telling how many such letters there are. -But in this manual, for the sake of clarity, we will patiently write -the names in full, each time... - -</P> -<P> -By <STRONG>internationalization</STRONG>, one refers to the operation by which a -program, or a set of programs turned into a package, is made aware of and -able to support multiple languages. This is a generalization process, -by which the programs are untied from using only English strings or -other English specific habits, and connected to generic ways of doing -the same, instead. Program developers may use various techniques to -internationalize their programs, some of which have been standardized. -GNU <CODE>gettext</CODE> offers one of these standards. See section <A HREF="gettext.html#SEC36">The Programmer's View</A>.
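</P>
<P>
The abbreviations mentioned earlier in this section follow a purely mechanical rule, which the short Python sketch below illustrates. It is not part of GNU <CODE>gettext</CODE>; the function name <CODE>numeronym</CODE> is ours, chosen only for this example.
</P>

```python
def numeronym(word):
    """Abbreviate a word as first letter + count of inner letters + last letter.

    Hypothetical helper for illustration only; not a GNU gettext function.
    """
    # Words of three letters or fewer have no inner run worth abbreviating.
    if len(word) <= 3:
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


if __name__ == "__main__":
    for term in ("internationalization", "localization"):
        print(term, "->", numeronym(term))
```

<P>
Run as a script, the sketch prints each of the two long words next to its abbreviation; this manual nevertheless keeps writing both words in full.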
- -</P> -<P> -By <STRONG>localization</STRONG>, one means the operation by which, in a set -of programs already internationalized, one gives the program all -needed information so that it can bend itself to handle its input -and output in a fashion which is correct for some native language and -cultural habits. This is a particularisation process, by which generic -methods already implemented in an internationalized program are used -in specific ways. The programming environment puts several functions -at the programmer's disposal which allow this runtime configuration. -The formal description of a specific set of cultural habits for some -country, together with all associated translations targeted to the -same native language, is called the <STRONG>locale</STRONG> for this language -or country. Users achieve localization of programs by setting proper -values to special environment variables, prior to executing those -programs, identifying which locale should be used. - -</P> -<P> -In fact, locale message support is only one component of the cultural -data that makes up a particular locale. There is a whole host of -routines and functions provided to aid programmers in developing -internationalized software, which allow them to access the data -stored in a particular locale. When someone refers to a -particular locale, they are referring to the data stored -within that particular locale. Similarly, if a programmer is referring -to "accessing the locale routines", they are referring to the -complete suite of routines that access all of the locale's information. - -</P> -<P> -One uses the expression <STRONG>Native Language Support</STRONG>, or merely NLS, -for speaking of the overall activity or feature encompassing both -internationalization and localization, allowing for multi-lingual -interactions in a program. In a nutshell, one could say that -internationalization is the operation by which further localizations -are made possible.
- -</P> -<P> -Also, very roughly said, when it comes to multi-lingual messages, -internationalization is usually taken care of by programmers, and -localization is usually taken care of by translators. - -</P> - - -<H2><A NAME="SEC4" HREF="gettext_toc.html#TOC4">Aspects in Native Language Support</A></H2> - -<P> -For a totally multi-lingual distribution, there are many things to -translate beyond output messages. - -</P> - -<UL> -<LI> - -As of today, GNU <CODE>gettext</CODE> offers a complete toolset for -translating messages output by C programs. Perl scripts and shell -scripts also need to be translated. Even if there are some hooks -so this can be done, these hooks are not integrated as well as they -should be. - -<LI> - -Some programs, like <CODE>autoconf</CODE> or <CODE>bison</CODE>, are able -to produce other programs (or scripts). Even if the generating -programs themselves are internationalized, the generated programs they -produce may need internationalization on their own, and this indirect -internationalization could be automated right from the generating -program. In fact, quite usually, generating and generated programs -could be internationalized independently, as the effort needed is -fairly orthogonal. - -<LI> - -A few programs include textual tables which might need translation -themselves, independently of the strings contained in the program -itself. For example, RFC 1345 gives an English description for each -character which GNU <CODE>recode</CODE> is able to reconstruct at execution. -Since these descriptions are extracted from the RFC by mechanical means, -translating them properly would require a prior translation of the RFC -itself. - -<LI> - -Almost all programs accept options, which are often worded out so to -be descriptive for the English readers; one might want to consider -offering translated versions for program options as well. 
- -<LI> - -Many programs read, interpret, compile, or are somewhat driven by -input files which are texts containing keywords, identifiers, or -replies which are inherently translatable. For example, one may want -<CODE>gcc</CODE> to allow diacriticized characters in identifiers or use -translated keywords; <SAMP>`rm -i'</SAMP> might accept something other than -<SAMP>`y'</SAMP> or <SAMP>`n'</SAMP> for replies, etc. Even if the program will -eventually produce most of its output in foreign languages, one has -to decide whether the input syntax, option values, etc., are to be -localized or not. - -<LI> - -The manual accompanying a package, as well as all documentation files -in the distribution, could surely be translated, too. Translating a -manual, with the intent of later keeping up with updates, is a major -undertaking in itself, generally. - -</UL> - -<P> -As we already stressed, translation is only one aspect of locales. -Other internationalization aspects are not currently handled by GNU -<CODE>gettext</CODE>, but perhaps may be handled in future versions. There -are many attributes that are needed to define a country's cultural -conventions. These attributes include, besides the country's native -language, the formatting of the date and time, the representation of -numbers, the symbols for currency, etc. These local <STRONG>rules</STRONG> are -termed the country's locale. The locale represents the knowledge -needed to support the country's native attributes. - -</P> -<P> -There are a few major areas which may vary between countries and -hence, define what a locale must describe. The following list helps -put multi-lingual messages into the proper context of other tasks -related to locales, and also presents some other areas which GNU -<CODE>gettext</CODE> might eventually tackle, maybe, one of these days.
- -</P> -<DL COMPACT> - -<DT><EM>Characters and Codesets</EM> -<DD> -The codeset most commonly used throughout the USA and most English -speaking parts of the world is the ASCII codeset. However, there are -many characters needed by various locales that are not found within -this codeset. The 8-bit ISO 8859-1 codeset has most of the special -characters needed to handle the major European languages. However, in -many cases, the ISO 8859-1 codeset is not adequate. Hence each locale -will need to specify which codeset it needs to use and will need -to have the appropriate character handling routines to cope with -the codeset. - -<DT><EM>Currency</EM> -<DD> -The symbols used vary from country to country as does the position -of the symbol. Software needs to be able to transparently -display currency figures in the native mode for each locale. - -<DT><EM>Dates</EM> -<DD> -The format of dates varies between locales. For example, Christmas day -in 1994 is written as 12/25/94 in the USA and as 25/12/94 in Australia. -Other countries might use ISO 8601 dates, etc. - -Time of the day may be noted as <VAR>hh</VAR>:<VAR>mm</VAR>, <VAR>hh</VAR>.<VAR>mm</VAR>, -or otherwise. Some locales require time to be specified in 24-hour -mode rather than as AM or PM. Further, the nature and yearly extent -of the Daylight Saving correction vary widely between countries. - -<DT><EM>Numbers</EM> -<DD> -Numbers can be represented differently in different locales. -For example, the following numbers are all written correctly for -their respective locales: - - -<PRE>
-12,345.67       English
-12.345,67       French
-1,2345.67       Asia
-</PRE> - -Some programs could go further and use different unit systems, like -English units or Metric units, or even take into account variants -about how numbers are spelled in full. - -<DT><EM>Messages</EM> -<DD> -The most obvious area is the language support within a locale.
This is -where GNU <CODE>gettext</CODE> provides a means for developers and users to -easily change the language that the software uses to communicate with -the user. - -</DL> - -<P> -In the near future, we see no chance that components of locale other than -message handling will be made available for use in other -GNU packages. The reason for this is that most modern systems provide -more or less reasonable support for at least some of the missing -components. Another point is that the GNU libc and Linux will get -a new and complete implementation of the whole locale functionality -which could be adopted by systems lacking reasonable locale support. - -</P> - - -<H2><A NAME="SEC5" HREF="gettext_toc.html#TOC5">Files Conveying Translations</A></H2> - -<P> -The letters PO in <TT>`.po'</TT> files mean Portable Object, to -distinguish them from <TT>`.mo'</TT> files, where MO stands for Machine -Object. This paradigm, as well as the PO file format, is inspired -by the NLS standard developed by Uniforum, and implemented by Sun -in their Solaris system. - -</P> -<P> -PO files are meant to be read and edited by humans, and associate each -original, translatable string of a given package with its translation -in a particular target language. A single PO file is dedicated to -a single target language. If a package supports many languages, -there is one such PO file per language supported, and each package -has its own set of PO files. These PO files are best created by -the <CODE>xgettext</CODE> program, and later updated or refreshed through -the <CODE>tupdate</CODE> program. Program <CODE>xgettext</CODE> extracts all -marked messages from a set of C files and initializes a PO file with -empty translations. Program <CODE>tupdate</CODE> takes care of adjusting -PO files between releases of the corresponding sources, commenting -out obsolete entries, initializing new ones, and updating all source -line references.
Files ending with <TT>`.pot'</TT> are a kind of base -translation file found in distributions, in PO file format, and -<TT>`.pox'</TT> files are often temporary PO files. - -</P> -<P> -MO files are meant to be read by programs, and are binary in nature. -A few systems already offer tools for creating and handling MO files -as part of the Native Language Support coming with the system, but the -format of these MO files is often different from system to system, -and non-portable. They do not necessarily use <TT>`.mo'</TT> for file -extensions, but since system libraries are also used for accessing -these files, it works as long as the system is self-consistent about -it. If GNU <CODE>gettext</CODE> is able to interface with the tools already -provided with systems, it will consequently let these provided tools -take care of generating the MO files. Or else, if such tools are not -found or do not seem usable, GNU <CODE>gettext</CODE> will use its own ways -and its own format for MO files. Files ending with <TT>`.gmo'</TT> are -really MO files, when it is known that these files use the GNU format. - -</P> - - -<H2><A NAME="SEC6" HREF="gettext_toc.html#TOC6">Overview of GNU <CODE>gettext</CODE></A></H2> - -<P> -The following diagram summarizes the relation between the files -handled by GNU <CODE>gettext</CODE> and the tools acting on these files. -It is followed by a somewhat detailed explanation, which you should -read while keeping an eye on the diagram. Having a clear understanding -of these interrelations would surely help programmers, translators -and maintainers. - -</P> - -<PRE>
-Original C Sources ---> PO mode ---> Marked C Sources ---.
-                                                         |
-              .---------<--- GNU gettext Library         |
-.--- make <---+                                          |
-|             `---------<--------------------+-----------'
-|                                            |
-|   .-----<--- PACKAGE.pot <--- xgettext <---'   .---<--- PO Compendium
-|   |                                            |             ^
-|   |                                            `---.         |
-|   `---.                                            +---> PO mode ---.
-|       +----> tupdate -------> LANG.pox --->--------'                |
-|   .---'                                                             |
-|   |                                                                 |
-|   `-------------<---------------.                                   |
-|                                 +--- LANG.po <--- New LANG.pox <----'
-|   .--- LANG.gmo <--- msgfmt <---'
-|   |
-|   `---> install ---> /.../LANG/PACKAGE.mo ---.
-|                                              +---> "Hello world!"
-`-------> install ---> /.../bin/PROGRAM -------'
-</PRE> - -<P> -The indication <SAMP>`PO mode'</SAMP> appears in two places in this picture, -and you may safely read it as merely meaning "hand editing", using -any editor of your choice, really. However, for those of you who are -lucky users of GNU Emacs, PO mode has been specifically created -for providing a cosy environment for editing or modifying PO files. -While editing a PO file, PO mode allows for the easy browsing of -auxiliary and compendium PO files, as well as following references into -the set of C program sources from which PO files have been derived. -It has a few special features, among which are the interactive marking -of program strings as translatable, and the validation of PO files -with easy repositioning to PO file lines showing errors. - -</P> -<P> -As a programmer, the first step in bringing GNU <CODE>gettext</CODE> -into your package is identifying, right in the C sources, which -strings are meant to be translatable, and which are untranslatable. -This tedious job can be done a little more comfortably using PO -mode, but you can use any means familiar to you for modifying your -C sources. Some other simple, standard changes are also needed to -properly initialize the translation library. See section <A HREF="gettext.html#SEC13">Preparing Program Sources</A>, for -more information about all this. - -</P> -<P> -Once the C sources have been modified, the <CODE>xgettext</CODE> program -is used to find and extract all translatable strings, and create an -initial PO file out of all these.
This <TT>`<VAR>package</VAR>.pot'</TT> file -contains all original program strings; it has sets of pointers to -exactly where in the C sources each string is used, and all translations -are set to empty. The letter <KBD>t</KBD> in <TT>`.pot'</TT> marks that this is -a Template PO file, not yet oriented towards any particular language. -See section <A HREF="gettext.html#SEC19">Invoking the <CODE>xgettext</CODE> Program</A>, for more details about how one calls the -<CODE>xgettext</CODE> program. If you are <EM>really</EM> lazy, you might -be interested in working a lot more right away, and preparing the -whole distribution setup (see section <A HREF="gettext.html#SEC65">The Maintainer's View</A>). By doing so, you -are spared typing the <CODE>xgettext</CODE> command yourself, as <CODE>make</CODE> -should now generate the proper things automatically for you! - -</P> -<P> -The first time through, there is no <TT>`<VAR>lang</VAR>.po'</TT> yet, so the -<CODE>tupdate</CODE> step may be skipped and replaced by a mere copy of -<TT>`<VAR>package</VAR>.pot'</TT> to <TT>`<VAR>lang</VAR>.pox'</TT>, where <VAR>lang</VAR> -represents the target language. - -</P> -<P> -Then comes the initial translation of messages. Translation in -itself is a whole matter, still exclusively meant for humans, -and whose complexity far overwhelms the level of this manual. -Nevertheless, a few hints are given in some other chapter of this -manual (see section <A HREF="gettext.html#SEC54">The Translator's View</A>). You will also find there indications -about how to contact translating teams, or become part of them, -for sharing your translating concerns with others who target the same -native language.
- -</P> -<P> -While adding the translated messages into the <TT>`<VAR>lang</VAR>.pox'</TT> -PO file, if you do not have GNU Emacs handy, you are on your own -for ensuring that you fully respect the PO file format, and quoting -conventions (see section <A HREF="gettext.html#SEC9">The Format of PO Files</A>). This is surely not an impossible task, -as this is the way many people handled PO files already for Uniforum or -Solaris. On the other hand, using PO mode in GNU Emacs, most details -of PO file format are taken care of for you, but you have to acquire -some familiarity with PO mode itself. Besides main PO mode commands -(see section <A HREF="gettext.html#SEC10">Main Commands</A>), you should know how to move between entries -(see section <A HREF="gettext.html#SEC11">Entry Positioning</A>), and how to handle untranslated entries -(see section <A HREF="gettext.html#SEC24">Untranslated Entries</A>). - -</P> -<P> -If some common translations have already been saved into a compendium -PO file, translators may use PO mode for initializing untranslated -entries from the compendium, and also save selected translations into -the compendium, updating it (see section <A HREF="gettext.html#SEC21">Using Translation Compendiums</A>). Compendium files -are meant to be exchanged between members of a given translation team. - -</P> -<P> -Programs, or packages of programs, are dynamic in nature: users write -bug reports and suggestions for improvements, maintainers react by -modifying programs in various ways. The fact that a package has -already been internationalized should not make maintainers shy -of adding new strings, or modifying strings already translated. -They just do their job the best they can. For the GNU Translation -Project to work smoothly, it is important that maintainers do not -carry translation concerns on their already loaded shoulders, and that -translators be kept as free as possible of programmatic concerns.
- -</P> -<P> -The only concern maintainers should have is carefully marking new -strings as translatable, when they should be, and not otherwise -worrying about them being translated, as this will come in proper time. -Consequently, when programs and their strings are adjusted in various -ways by maintainers, and for matters usually unrelated to translation, -<CODE>xgettext</CODE> would construct <TT>`<VAR>package</VAR>.pot'</TT> files which are -evolving over time, so the translations carried by <TT>`<VAR>lang</VAR>.po'</TT> -are slowly fading out of date. - -</P> -<P> -It is important for translators (and even maintainers) to understand -that package translation is a continuous process in the lifetime of a -package, and not something which is done once and for all at the start. -After an initial burst of translation activity for a given package, -interventions are needed once in a while, because here and there, -translated entries become obsolete, and new untranslated entries -appear, needing translation. - -</P> -<P> -The <CODE>tupdate</CODE> program has the purpose of refreshing an already -existing <TT>`<VAR>lang</VAR>.po'</TT> file, by comparing it with a newer -<TT>`<VAR>package</VAR>.pot'</TT> template file, extracted by <CODE>xgettext</CODE> -out of recent C sources. The refreshing operation adjusts all -references to C source locations for strings, since these strings -move as programs are modified. Also, <CODE>tupdate</CODE> comments out as -obsolete, in <TT>`<VAR>lang</VAR>.pox'</TT>, those already translated entries -which are no longer used in the program sources (see section <A HREF="gettext.html#SEC25">Obsolete Entries</A>). It finally discovers new strings and inserts them in -the resulting PO file as untranslated entries (see section <A HREF="gettext.html#SEC24">Untranslated Entries</A>). See section <A HREF="gettext.html#SEC23">Invoking the <CODE>tupdate</CODE> Program</A>, for more information about what -<CODE>tupdate</CODE> really does.
- -</P> -<P> -Whatever route or means taken, the goal is obtaining an updated -<TT>`<VAR>lang</VAR>.pox'</TT> file offering translations for all strings. -When this is properly achieved, this file <TT>`<VAR>lang</VAR>.pox'</TT> may -take the place of the previous official <TT>`<VAR>lang</VAR>.po'</TT> file. - -</P> -<P> -The time mobility, or fluidity of PO files, is an integral part of -the translation game, and should be well understood, and accepted. -People resisting it will have a hard time participating in the GNU -Translation Project, or will give a hard time to other participants! -In particular, maintainers should relax and include all available PO -files in their distributions, even if these have not recently been -updated, without banging or otherwise trying to exert pressure on the -translator teams to get the job done. The pressure should rather -come from the community of users speaking a particular language, -and maintainers should consider themselves fairly relieved of any -concern about the adequacy of translation files. On the other hand, -translators should reasonably try updating the PO files they are -responsible for, while the package is undergoing pretest, prior to -an official distribution. - -</P> -<P> -Once the PO file is complete and dependable, the <CODE>msgfmt</CODE> program -is used for turning the PO file into a machine-oriented format, which -may yield efficient retrieval of translations by the programs of the -package, whenever needed at runtime (see section <A HREF="gettext.html#SEC31">The Format of GNU MO Files</A>). See section <A HREF="gettext.html#SEC30">Invoking the <CODE>msgfmt</CODE> Program</A>, for more information about all modalities of execution -for the <CODE>msgfmt</CODE> program. 
- -</P> -<P> -Finally, the modified and marked C sources are compiled and linked -with the GNU <CODE>gettext</CODE> library, usually through the operation of -<CODE>make</CODE>, provided a suitable <TT>`Makefile'</TT> exists for the project, -and the resulting executable is installed somewhere users will find it. -The MO files themselves should also be properly installed. Provided the -appropriate environment variables are set (see section <A HREF="gettext.html#SEC35">Magic for End Users</A>), the -program should localize itself automatically, whenever it executes. - -</P> -<P> -The remainder of this manual expands on the various -steps outlined in this section. - -</P> - - -<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1> - -<P> -The GNU <CODE>gettext</CODE> toolset helps programmers and translators -in producing, updating and using translation files, mainly those -PO files which are textual, editable files. This chapter focuses -on the format of PO files, and contains a PO mode starter. The PO mode -description is spread over this manual instead of being concentrated -in one place; this chapter presents only the basics of PO mode. - -</P> - - - -<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2> - -<P> -Once you have received, unpacked, configured and compiled the GNU -<CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in -place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and -<CODE>tupdate</CODE>, as well as their available message catalogs. For -completing a comfortable installation, you might also want to make the -PO mode available to your GNU Emacs users. - -</P> -<P> -To finish the installation of the PO mode, you might want to modify your -file <TT>`.emacs'</TT>, once and for all, so it contains a few lines looking -like: - -</P> - -<PRE>
-(setq auto-mode-alist
-      (cons '("\\.pox?\\'" . po-mode) auto-mode-alist))
-(autoload 'po-mode "po-mode")
-</PRE> - -<P> -Later, whenever you edit some <TT>`.po'</TT> or <TT>`.pox'</TT> file, Emacs -loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and -automatically activates PO mode commands for the associated buffer. -The string <EM>PO</EM> appears in the mode line for any buffer for -which PO mode is active. Many PO files may be active at once in a -single Emacs session. - -</P> - - -<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2> - -<P> -A PO file is made up of many entries, each entry holding the relation -between an original untranslated string and its corresponding -translation. All entries in a given PO file usually pertain -to a single project, and all translations are expressed in a single -target language. One PO file <STRONG>entry</STRONG> has the following schematic -structure: - -</P> - -<PRE>
-<VAR>white-space</VAR>
-# <VAR>translator-comments</VAR>
-#. <VAR>automatic-comments</VAR>
-#: <VAR>reference</VAR>...
-msgid <VAR>untranslated-string</VAR>
-msgstr <VAR>translated-string</VAR>
-</PRE> - -<P> -The general structure of a PO file should be well understood by -the translator. When using PO mode, very little has to be known -about the format details, as PO mode takes care of them for her. - -</P> -<P> -Entries begin with some optional white space. Usually, when generated -through GNU <CODE>gettext</CODE> tools, there is exactly one blank line -between entries. Then comments follow, on lines all starting with the -character <KBD>#</KBD>. There are two kinds of comments: those which have -some white space immediately following the <KBD>#</KBD>, which -are created and maintained exclusively by the translator, and those which -have some non-white character just after the <KBD>#</KBD>, which -are created and maintained automatically by GNU <CODE>gettext</CODE> tools. -All comments, of any kind, are optional.
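</P>
<P>
The comment rules above can be sketched as a small line classifier. The following Python fragment is only an illustration, not code from the GNU <CODE>gettext</CODE> distribution: the function name <CODE>classify_po_line</CODE> and its labels are ours, the source reference in the sample is made up, and treating a bare <KBD>#</KBD> as a translator comment is an assumption of this sketch.
</P>

```python
def classify_po_line(line):
    """Return a label for one line of a PO file, per the schematic entry.

    Labels (our own naming): "blank", "translator-comment",
    "automatic-comment", "reference", "msgid", "msgstr", "string", "other".
    """
    stripped = line.rstrip("\n")
    if not stripped.strip():
        return "blank"
    if stripped.startswith("#"):
        rest = stripped[1:]
        # '#' followed by white space: written by the translator.
        # A bare '#' is treated the same way (assumption of this sketch).
        if rest == "" or rest[0].isspace():
            return "translator-comment"
        # '#:' carries source references; it is tool-maintained as well,
        # but labeled separately here for convenience.
        if rest[0] == ":":
            return "reference"
        # Any other non-white character after '#': maintained by the tools.
        return "automatic-comment"
    if stripped.startswith("msgid"):
        return "msgid"
    if stripped.startswith("msgstr"):
        return "msgstr"
    if stripped.startswith('"'):
        return "string"  # continuation line of a multi-line string
    return "other"


if __name__ == "__main__":
    entry = [
        "# A note written by the translator\n",
        "#. A note extracted from the sources\n",
        "#: src/hello.c:42\n",  # hypothetical source reference
        'msgid "Hello world!"\n',
        'msgstr ""\n',
    ]
    for po_line in entry:
        print(classify_po_line(po_line), "|", po_line.rstrip())
```

<P>
Only the character right after the <KBD>#</KBD> decides the kind of comment, so the classifier never needs to look at the comment text itself.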
- -</P> -<P> -After white space and comments, entries show two strings, giving -first the untranslated string as it appears in the original program -sources, and then, the translation of this string. The original -string is introduced by the keyword <CODE>msgid</CODE>, and the translation, -by <CODE>msgstr</CODE>. The two strings, untranslated and translated, -are quoted in various ways in the PO file, using <KBD>"</KBD> -delimiters and <KBD>\</KBD> escapes, but the translator does not really -have to pay attention to the precise quoting format, as PO mode fully -intends to take care of quoting for her. - -</P> -<P> -The <CODE>msgid</CODE> strings, as well as automatic comments, are produced -and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not -provide means for the translator to alter these. The most she can -do is delete them, and only by deleting the whole entry. -On the other hand, the <CODE>msgstr</CODE> string, as well as translator -comments, are really meant for the translator, and PO mode gives her -the full control she needs. - -</P> -<P> -It happens that some lines, usually whitespace or comments, follow the -very last entry of a PO file. Such lines are not part of any entry, -and PO mode is unable to take action on those lines. By using the -PO mode function <KBD>M-x po-normalize</KBD>, the translator may get -rid of those spurious lines. See section <A HREF="gettext.html#SEC12">Normalizing Strings in Entries</A>. - -</P> -<P> -The remainder of this section may be safely skipped by those using -PO mode, yet it may be interesting for everybody to have a better -idea of the precise format of a PO file. On the other hand, those -not having GNU Emacs handy should carefully read on. - -</P> -<P> -Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects -the C syntax for a character string, including the surrounding quotes -and embedded backslashed escape sequences. 
When the time comes -to write multi-line strings, one should not use escaped newlines. -Instead, a closing quote should follow the last character on the -line to be continued, and an opening quote should resume the string -at the beginning of the following PO file line. For example: - -</P> - -<PRE> -msgid "" -"Here is an example of how one might continue a very long string\n" -"for the common case the string represents multi-line output.\n" -</PRE> - -<P> -In this example, the empty string is used on the first line, to -allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP> -over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>. In this example, the -<CODE>msgid</CODE> keyword is followed by three strings, which are meant -to be concatenated. Concatenating the empty string does not change -the resulting overall string, but it is a way for us to comply with -the requirement that <CODE>msgid</CODE> be followed by a string on the same -line, while keeping the multi-line presentation left-justified, as -we find this to be a cleaner layout. The empty string could have -been omitted, but only if the string starting with <SAMP>`Here'</SAMP> were -moved up to the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> It was not really necessary -either to switch between the two last quoted strings immediately after -the newline <SAMP>`\n'</SAMP>; the switch could have occurred after <EM>any</EM> -other character. We just did it this way because it is neater. - -</P> -<P> -One should carefully distinguish between ends of lines marked as -<SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented -string, and ends of lines in the PO file itself, outside string quotes, -which have no effect on the represented string. - -</P> -<P> -Outside strings, white lines and comments may be used freely. 
-Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend -until the end of the PO file line. Comments written by translators -should have the initial <SAMP>`#'</SAMP> immediately followed by some white -space. If the <SAMP>`#'</SAMP> is not immediately followed by white space, -this comment is most likely generated and managed by specialized GNU -tools, and might disappear or be replaced unexpectedly when the PO -file is given to <CODE>tupdate</CODE>. - -</P> - - -<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main Commands</A></H2> - -<P> -When Emacs finds a PO file in a window, PO mode is activated -for that window. This makes the window read-only and establishes a -po-mode-map; PO mode is a genuine Emacs mode, in the sense that it is -not derived from text mode in any way. - -</P> -<P> -The main PO commands are those which do not fit in the other categories in -subsequent sections; they allow for quitting PO mode or managing windows -in special ways. - -</P> -<DL COMPACT> - -<DT><KBD>u</KBD> -<DD> -Undo last modification to the PO file. - -<DT><KBD>q</KBD> -<DD> -Quit processing and save the PO file. - -<DT><KBD>o</KBD> -<DD> -Temporarily leave the PO file window. - -<DT><KBD>h</KBD> -<DD> -Show help about PO mode. - -<DT><KBD>=</KBD> -<DD> -Give some PO file statistics. - -<DT><KBD>v</KBD> -<DD> -Batch validate the format of the whole PO file. - -</DL> - -<P> -The command <KBD>u</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs -<EM>undo</EM> facility. See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>. Each time <KBD>u</KBD> is typed, modifications the translator -made to the PO file are undone a little more. For the purpose of -undoing, each PO mode command is atomic. This is especially true for -the <KBD><KBD>RET</KBD></KBD> command: the whole edit made by a single -use of this command is undone at once, even if the edit itself -implied several actions. 
However, while in the editing window, one -can undo the editing work in much smaller steps. - -</P> -<P> -The command <KBD>q</KBD> (<CODE>po-quit</CODE>) is used when the translator is -done with the PO file. If the file has been modified, it is saved -on disk first. However, prior to all this, the command checks if -some untranslated message remains in the PO file and, if so, the -translator is asked whether she really wants to stop working on this -PO file. This is the preferred way of getting rid of an Emacs PO -file buffer. Merely killing it through the usual command <KBD>C-x -k</KBD> (<CODE>kill-buffer</CODE>), say, has the unpleasant effect of leaving a PO -internal work buffer behind. - -</P> -<P> -The command <KBD>o</KBD> (<CODE>po-other-window</CODE>) is another, softer -way to leave PO mode, temporarily. It just moves the cursor into -some other Emacs window, popping one up if necessary. For example, if -the translator just got PO mode to show some source context in some -other window, she might discover some apparent bug in the program source -that needs correction. This command allows the translator to change -sex, become a programmer, and have the cursor right in the window -containing the program she (or rather <EM>he</EM>) wants to modify. -By later getting the cursor back in the PO file window, or by -asking Emacs to edit this file once again, PO mode is then recovered. - -</P> -<P> -The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all -available PO mode commands. The translator should then type any -character to resume normal PO mode operations. The command <KBD>?</KBD> -has the same effect as <KBD>h</KBD>. - -</P> -<P> -The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number -of entries in the PO file, the ordinal of the current entry -(counted from 1), the number of untranslated entries, the number of -obsolete entries, and displays all these numbers. 
- -</P> -<P> -The command <KBD>v</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in -verbose mode over the current PO file. This command first offers -to save the current PO file on disk. The <CODE>msgfmt</CODE> tool, from -GNU <CODE>gettext</CODE>, has the purpose of creating an MO file out of a -PO file, and PO mode uses the features of this program for checking -the overall format of a PO file, as well as all individual entries. - -</P> -<P> -The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so -the translator regains control immediately while her PO file -is being studied. Error output is collected in the GNU Emacs -<SAMP>`*compilation*'</SAMP> buffer, displayed in another window. The regular -GNU Emacs command <KBD>C-x`</KBD> (<CODE>next-error</CODE>), as well as other -usual compile commands, allow the translator to reposition quickly to -the offending parts of the PO file. Once the cursor is on the line in -error, the translator may decide on any PO mode action which would -help correct the error. - -</P> - - -<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2> - -<P> -The cursor in a PO file window is almost always part of -an entry. The only exceptions are the special case when the cursor -is after the last entry in the file, or when the PO file is -empty. The entry containing the cursor is called the -current entry. Many PO mode commands operate on the current entry, -so moving the cursor does more than allow the translator to browse -the PO file; it also selects the entry on which commands operate. - -</P> -<P> -Some PO mode commands alter the position of the cursor in a specialized -way. A few of those special-purpose positioning commands are described here; -the others are described in following sections. - -</P> -<DL COMPACT> - -<DT><KBD>.</KBD> -<DD> -Redisplay the current entry. - -<DT><KBD>n</KBD> -<DD> -<DT><KBD>SPC</KBD> -<DD> -Select the entry after the current one. 
- -<DT><KBD>p</KBD> -<DD> -<DT><KBD>DEL</KBD> -<DD> -Select the entry before the current one. - -<DT><KBD><</KBD> -<DD> -Select the first entry in the PO file. - -<DT><KBD>></KBD> -<DD> -Select the last entry in the PO file. - -<DT><KBD>m</KBD> -<DD> -Record the location of the current entry for later use. - -<DT><KBD>l</KBD> -<DD> -Return to a previously saved entry location. - -<DT><KBD>x</KBD> -<DD> -Exchange the current entry location with the previously saved one. - -</DL> - -<P> -Any GNU Emacs command able to reposition the cursor may be used -to select the current entry in PO mode, including commands which -move by characters, lines, paragraphs, screens or pages, and search -commands. However, there is a kind of standard way to display the -current entry in PO mode, which usual GNU Emacs commands moving -the cursor do not especially try to enforce. The command <KBD>.</KBD> -(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the -current entry properly, after the current entry has been changed by -means external to PO mode, or the Emacs screen otherwise altered. - -</P> -<P> -It is yet to be decided whether PO mode would help the translator, or otherwise -irritate her, by forcing a more fixed window disposition while she -is doing her work. We originally had quite precise ideas about -how windows should behave, but on the other hand, anyone used to -GNU Emacs is often happy to keep full control. Maybe a fixed window -disposition might be offered as a PO mode option that the translator -might activate or deactivate at will, so it could be offered on an -experimental basis. If nobody feels a real need for using it, or -a compulsion for writing it, we might as well drop this whole idea. -The incentive for doing it should come from translators rather than -programmers, as opinions from an experienced translator are surely -worth more to me than opinions from programmers <EM>thinking</EM> about -how <EM>others</EM> should do translation. 
- -</P> -<P> -The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD> -(<CODE>po-previous-entry</CODE>) move the cursor to the entry following, -or preceding, the current one. If <KBD>n</KBD> is given while the -cursor is on the last entry of the PO file, or if <KBD>p</KBD> -is given while the cursor is on the first entry, no move is done. -<KBD><KBD>SPC</KBD></KBD> and <KBD><KBD>DEL</KBD></KBD> are alternate keys for <KBD>n</KBD> and -<KBD>p</KBD>, respectively. - -</P> -<P> -The commands <KBD><</KBD> (<CODE>po-first-entry</CODE>) and <KBD>></KBD> -(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last -entry, of the PO file. When the cursor is located past the last -entry in a PO file, most PO mode commands will return an error saying -<SAMP>`After last entry'</SAMP>. However, the commands <KBD><</KBD> and <KBD>></KBD> -have the special property of being able to work even when the cursor -is not within any PO file entry, and you may use them for nicely -correcting this situation. But even these commands will fail on a -truly empty PO file. There are development plans for PO mode for it -to interactively fill an empty PO file from sources. See section <A HREF="gettext.html#SEC16">Marking Translatable Strings</A>. - -</P> -<P> -The translator may decide, before working at the translation of -a particular entry, that she needs to browse the remainder of the -PO file, maybe for finding the terminology or phraseology used -in related entries. She can of course use the standard Emacs idioms -for saving the current cursor location in some register, and use that -register for getting back, or else use the location ring. - -</P> -<P> -PO mode offers another approach, by which cursor locations may be saved -onto a special stack. The command <KBD>m</KBD> (<CODE>po-push-location</CODE>) -merely adds the location of current entry to the stack, pushing -the already saved locations under the new one. 
The command -<KBD>l</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and -repositions the cursor to the entry associated with that top element. -This position is then lost, for the next <KBD>l</KBD> will move the cursor -to the previously saved location, and so on until no locations remain -on the stack. - -</P> -<P> -If the translator wants the position to be kept on the location stack, -maybe for taking a mere look at the entry associated with the top -element, then go elsewhere with the intent of getting back later, she -ought to use <KBD>m</KBD> immediately after <KBD>l</KBD>. - -</P> -<P> -The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously -repositions the cursor to the entry associated with the top element of -the stack of saved locations, and replaces that top element with the -location of the current entry before the move. Consequently, repeating -the <KBD>x</KBD> command alternates between two entries. -For achieving this, the translator will position the cursor on the -first entry, use <KBD>m</KBD>, then position to the second entry, and -merely use <KBD>x</KBD> for making the switch. - -</P> - - -<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2> - -<P> -There are many different ways for encoding a particular string into a -PO file entry, because there are so many different ways to split and -quote multi-line strings, and even, to represent special characters -by backslashed escape sequences. Some features of PO mode rely on -the ability for PO mode to scan an already existing PO file for a -particular string encoded into the <CODE>msgid</CODE> field of some entry. -Even if PO mode has internally all the built-in machinery for -implementing this recognition easily, doing it fast is technically -difficult. For facilitating a solution to this efficiency problem, -we decided on a canonical representation for strings. 
- -</P> -<P> -A conventional representation of strings in a PO file is currently -under discussion, and PO mode experiments with a canonical representation. -Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform -way of representing equivalent strings would be useful, as the internal -normalization needed by PO mode could be automatically satisfied -when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>. An explicit -PO mode normalization should then be only necessary for PO files -imported from elsewhere, or for when the convention itself evolves. - -</P> -<P> -So, for achieving normalization of at least the strings of a given -PO file needing a canonical representation, the following PO mode -command is available: - -</P> -<DL COMPACT> - -<DT><KBD>M-x po-normalize</KBD> -<DD> -Tidy the whole PO file by making entries more uniform. - -</DL> - -<P> -The special command <KBD>M-x po-normalize</KBD>, which has no associated -key, revises all entries, ensuring that strings of both original -and translated entries use uniform internal quoting in the PO file. -It also removes any crumb after the last entry. This command may be -useful for PO files freshly imported from elsewhere, or if we ever -improve on the canonical quoting format we use. This canonical format -is not only meant for getting cleaner PO files, but also for greatly -speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands. - -</P> -<P> -<KBD>M-x po-normalize</KBD> presently makes three passes over the entries. -The first implements heuristics for converting PO files for GNU -<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE> -fields were using K&R style C string syntax for multi-line strings. -These heuristics may fail for comments not related to obsolete -entries and ending with a backslash; they also depend on subsequent -passes for finalizing the proper commenting of continued lines for -obsolete entries. 
This first pass might disappear once all older PO -files have been adjusted. The second and third passes normalize -all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively. They also -clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE> -for continued lines. - -</P> -<P> -Having such an explicit normalizing command allows for importing PO -files from other sources, but also eases the evolution of the current -convention, evolution driven mostly by aesthetic concerns, as of now. -It is quite easy to make suggested adjustments at a later time, as the -normalizing command and eventually, other GNU <CODE>gettext</CODE> tools -should greatly automate conformance. A description of the canonical -string format is given below, for the particular benefit of those not -having GNU Emacs handy, and who would nevertheless want to handcraft -their PO files in nice ways. - -</P> -<P> -Right now, in PO mode, strings are single line or multi-line. A string -goes multi-line if and only if it has <EM>embedded</EM> newlines, that -is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>. So, we would have: - -</P> - -<PRE> -msgstr "\n\nHello, world!\n\n\n" -</PRE> - -<P> -but, replacing the space by a newline, this becomes: - -</P> - -<PRE> -msgstr "" -"\n" -"\n" -"Hello,\n" -"world!\n" -"\n" -"\n" -</PRE> - -<P> -We are deliberately using an exaggerated example, here, to make the -point clearer. Usually, multi-line strings do not look that bad. -It is probable that we will implement the following suggestion. 
-We might lump together all initial newlines into the empty string, -and also all newlines introducing empty lines (that is, for <VAR>n</VAR> -> 1, the <VAR>n</VAR>-1'th last newlines would go together on a separate -string), so making the previous example appear: - -</P> - -<PRE> -msgstr "\n\n" -"Hello,\n" -"world!\n" -"\n\n" -</PRE> - -<P> -There are a few yet undecided little points about string normalization, -to be documented in this manual, once these questions settle. - -</P> - - -<H1><A NAME="SEC13" HREF="gettext_toc.html#TOC13">Preparing Program Sources</A></H1> - -<P> -For the programmer, changes to the C source code fall into three -categories. First, you have to make the localization functions -known to all modules needing message translation. Second, you should -properly trigger the operation of GNU <CODE>gettext</CODE> when the program -initializes, usually from the <CODE>main</CODE> function. Last, you should -identify and especially mark all constant strings in your program -needing translation. - -</P> -<P> -Presuming that your set of programs, or package, has been adjusted -so all needed GNU <CODE>gettext</CODE> files are available, and your -<TT>`Makefile'</TT> files are adjusted (see section <A HREF="gettext.html#SEC65">The Maintainer's View</A>), each C module -having translated C strings should contain the line: - -</P> - -<PRE> -#include <libintl.h> -</PRE> - -<P> -The remaining changes to your C sources are discussed in the further -sections of this chapter. - -</P> - - - -<H2><A NAME="SEC14" HREF="gettext_toc.html#TOC14">Triggering <CODE>gettext</CODE> Operations</A></H2> - -<P> -The initialization of locale data should be done with more or less -the same code in every program, as demonstrated below: - -</P> - -<PRE> -int -main (argc, argv) - int argc; - char **argv; -{ - ... - setlocale (LC_ALL, ""); - bindtextdomain (PACKAGE, LOCALEDIR); - textdomain (PACKAGE); - ... 
-} -</PRE> - -<P> -<VAR>PACKAGE</VAR> and <VAR>LOCALEDIR</VAR> should be provided either by -<TT>`config.h'</TT> or by the Makefile. For now consult the <CODE>gettext</CODE> -sources for more information. - -</P> -<P> -The use of <CODE>LC_ALL</CODE> might not be appropriate for you. -<CODE>LC_ALL</CODE> includes all locale categories and especially -<CODE>LC_CTYPE</CODE>. This latter category is responsible for determining -character classes with the <CODE>isalnum</CODE> etc. functions from -<TT>`ctype.h'</TT>, which could be wrong, especially for programs which process some -kind of input language. For example this would mean that a -source code using the ç (c-cedilla character) is runnable in -France but not in the U.S. - -</P> -<P> -So it is sometimes necessary to replace the <CODE>LC_ALL</CODE> line in the -code above by a sequence of <CODE>setlocale</CODE> lines - -</P> - -<PRE> -{ - ... - setlocale (LC_TIME, ""); - setlocale (LC_MESSAGES, ""); - ... -} -</PRE> - -<P> -or to switch forth and back to the character class in question. - -</P> - - -<H2><A NAME="SEC15" HREF="gettext_toc.html#TOC15">How Marks Appear in Sources</A></H2> - -<P> -The C sources should mark all strings requiring translation. Marking -is done in such a way that each translatable string appears to be -the sole argument of some function or preprocessor macro. There are -only a few such possible functions or macros meant for translation, -and their names are said to be marking keywords. The marking is -attached to strings themselves, rather than to what we do with them. -This approach has more uses. A blatant example is an error message -produced by formatting. The format string needs translation, as -well as some strings inserted through some <SAMP>`%s'</SAMP> specification -in the format, while the result from <CODE>sprintf</CODE> may have so many -different instances that it is impractical to list them all in some -<SAMP>`error_string_out()'</SAMP> routine, say. 
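</P>
<P>
The point about formatted error messages can be sketched as follows. This is
only a minimal illustration, assuming a system whose C library provides
<TT>`libintl.h'</TT> (as GNU libc does); <CODE>format_open_error</CODE> is a
hypothetical helper invented for this example, not part of GNU
<CODE>gettext</CODE>. Both the format string and the word inserted through
<SAMP>`%s'</SAMP> are marked, so each one is looked up on its own.
</P>

<PRE>
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper, not part of GNU gettext: format an error message
   into BUF (of SIZE bytes).  The format string and the word substituted
   for the second %s are each passed through gettext, so each is
   translated separately at run time.  FILE_NAME is inserted as is.  */
static int
format_open_error (char *buf, size_t size, int writing, const char *file_name)
{
  const char *mode = writing ? gettext ("writing") : gettext ("reading");

  return snprintf (buf, size, gettext ("cannot open %s for %s"),
                   file_name, mode);
}
</PRE>

<P>
When no message catalog is installed, <CODE>gettext</CODE> returns its
argument unchanged, so the helper then produces the plain English message.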
- -</P> -<P> -This marking operation has two goals. The first goal of marking -is for triggering the retrieval of the translation, at run time. -The keywords are possibly resolved into a routine able to dynamically -return the proper translation, as far as possible or wanted, for the -argument string. Most localizable strings are found in executable -positions, that is, assigned to variables or given as parameters to -functions. But this is not universal usage, and some translatable -strings appear in structured initializations. See section <A HREF="gettext.html#SEC17">Special Cases of Translatable Strings</A>. - -</P> -<P> -The second goal of the marking operation is to help <CODE>xgettext</CODE> -in properly extracting all translatable strings when it scans a set -of program sources and produces PO file templates. - -</P> -<P> -The canonical keyword for marking translatable strings is -<SAMP>`gettext'</SAMP>; it gave its name to the whole GNU <CODE>gettext</CODE> -package. For packages making only light use of the <SAMP>`gettext'</SAMP> -keyword, macro or function, it is easily used <EM>as is</EM>. However, -for packages using the <CODE>gettext</CODE> interface more heavily, it -is usually more convenient to give the main keyword a shorter, less -obtrusive name. Indeed, the keyword might appear on a lot of strings -all over the package, and programmers usually neither want nor need -their program sources to remind them loudly, all the time, that they -are internationalized. Further, a long keyword has the disadvantage -of using more horizontal space, forcing more indentation work on -sources for those trying to keep them within 79 or 80 columns. - -</P> -<P> -Many GNU packages use <SAMP>`_'</SAMP> (a simple underline) as a keyword, -and write <SAMP>`_("Translatable string")'</SAMP> instead of <SAMP>`gettext -("Translatable string")'</SAMP>. 
Further, the usual GNU coding rule -wanting that there is a space between the keyword and the opening -parenthesis is relaxed, in practice, for this particular usage. -So, the textual overhead per translatable string is reduced to -only three characters: the underline and the two parentheses. -However, even if GNU <CODE>gettext</CODE> uses this convention internally, -it does not offer it officially. The real, genuine keyword is truly -<SAMP>`gettext'</SAMP> indeed. It is fairly easy for those wanting to use -<SAMP>`_'</SAMP> instead of <SAMP>`gettext'</SAMP> to declare: - -</P> - -<PRE> -#include <libintl.h> -#define _(String) gettext (String) -</PRE> - -<P> -instead of merely using <SAMP>`#include <libintl.h>'</SAMP>. - -</P> -<P> -Later on, the maintenance is relatively easy. If, as a programmer, -you add or modify a string, you will have to ask yourself if the -new or altered string requires translation, and include it within -<SAMP>`_()'</SAMP> if you think it should be translated. <SAMP>`"%s: %d"'</SAMP> is -an example of string <EM>not</EM> requiring translation! - -</P> - - -<H2><A NAME="SEC16" HREF="gettext_toc.html#TOC16">Marking Translatable Strings</A></H2> - -<P> -In PO mode, one set of features is meant more for the programmer than -for the translator, and allows him to interactively mark which strings, -in a set of program sources, are translatable, and which are not. -Even if it is a fairly easy job for a programmer to find and mark -such strings by other means, using any editor of his choice, PO mode -makes this work more comfortable. Further, this gives translators -who feel a little like programmers, or programmers who feel a little -like translators, a tool letting them work at marking translatable -strings in the program sources, while simultaneously producing a set of -translation in some language, for the package being internationalized. 
- -</P> -<P> -The set of program sources targeted by the PO mode commands described -here should have an Emacs tags table constructed for your project, -prior to using these PO file commands. This is easy to do. In any -shell window, change the directory to the root of your project, then -execute a command resembling: - -</P> - -<PRE> -etags src/*.[hc] lib/*.[hc] -</PRE> - -<P> -presuming here you want to process all <TT>`.h'</TT> and <TT>`.c'</TT> files -from the <TT>`src/'</TT> and <TT>`lib/'</TT> directories. This command will -explore all said files and create a <TT>`TAGS'</TT> file in your root -directory, somewhat summarizing the contents using a special file -format Emacs can understand. - -</P> -<P> -For official GNU packages which follow the GNU coding standard there is -a make goal <CODE>tags</CODE> or <CODE>TAGS</CODE> which constructs the tag files in -all directories and for all files containing source code. - -</P> -<P> -Once your <TT>`TAGS'</TT> file is ready, the following commands assist -the programmer in marking translatable strings in his set of sources. -But these commands are necessarily driven from within a PO file -window, and it is likely that you do not even have such a PO file yet. -This is not a problem at all, as you may safely open a new, empty PO -file, mainly for using these commands. This empty PO file will slowly -fill in while you mark strings as translatable in your program sources. - -</P> -<DL COMPACT> - -<DT><KBD>,</KBD> -<DD> -Search through program sources for a string which looks like a -candidate for translation. - -<DT><KBD>M-,</KBD> -<DD> -Mark the last string found with <SAMP>`_()'</SAMP>. - -<DT><KBD>M-.</KBD> -<DD> -Mark the last string found with a keyword taken from a set of possible -keywords. This command with a prefix allows some management of these -keywords. 
- -</DL> - -<P> -The <KBD>,</KBD> (<CODE>po-tags-search</CODE>) command searches for the next -occurrence of a string which looks like a possible candidate for -translation, and displays the program source in another Emacs window, -positioned in such a way that the string is near the top of this other -window. If the string is too big to fit entirely in this window, it is -rather positioned so only its end is shown. In any case, the cursor -is left in the PO file window. If the shown string would be better -presented differently in different native languages, you may mark it -using <KBD>M-,</KBD> or <KBD>M-.</KBD>. Otherwise, you might rather ignore it -and skip to the next string by merely repeating the <KBD>,</KBD> command. - -</P> -<P> -A string is a good candidate for translation if it contains a sequence -of three or more letters. A string containing at most two letters in -a row will be considered as a candidate if it has more letters than -non-letters. The command disregards strings containing no letters, -or isolated letters only. It also disregards strings within comments, -or strings already marked with some keyword PO mode knows (see below). - -</P> -<P> -If you have never told Emacs about some <TT>`TAGS'</TT> file to use, the -command will request that you specify one from the minibuffer, the -first time you use the command. You may later change your <TT>`TAGS'</TT> -file by using the regular Emacs command <KBD>M-x visit-tags-table</KBD>, -which will ask you to name the precise <TT>`TAGS'</TT> file you want -to use. See section `Tag Tables' in <CITE>The Emacs Editor</CITE>. - -</P> -<P> -Each time you use the <KBD>,</KBD> command, the search resumes where it was -left off by the previous search, and goes through all program sources, -obeying the <TT>`TAGS'</TT> file, until all sources have been processed. 
-However, by giving a prefix argument to the command (<KBD>C-u -,)</KBD>, you may request that the search be restarted all over again -from the first program source; but in this case, strings that you -recently marked as translatable will be automatically skipped. - -</P> -<P> -Using this <KBD>,</KBD> command does not prevent the use of other regular -Emacs tags commands. For example, regular <CODE>tags-search</CODE> or -<CODE>tags-query-replace</CODE> commands may be used without disrupting the -independent <KBD>,</KBD> search sequence. However, as implemented, the -<EM>initial</EM> <KBD>,</KBD> command (or the <KBD>,</KBD> command used with a -prefix) might also reinitialize the regular Emacs tags searching to the -first tags file; this reinitialization might be considered spurious. - -</P> -<P> -The <KBD>M-,</KBD> (<CODE>po-mark-translatable</CODE>) command will mark the -recently found string with the <SAMP>`_'</SAMP> keyword. The <KBD>M-.</KBD> -(<CODE>po-select-mark-and-mark</CODE>) command will request that you type -one keyword from the minibuffer and use that keyword for marking -the string. Both commands will automatically create a new PO file -untranslated entry for the string being marked, and make it the -current entry (making it easy for you to immediately proceed to its -translation, if you feel like doing it right away). It is possible -that the modifications made to the program source by <KBD>M-,</KBD> or -<KBD>M-.</KBD> render some source line longer than 80 columns, forcing you -to break and re-indent this line differently. You may use the <KBD>o</KBD> -command from PO mode, or any other window changing command from -GNU Emacs, to break out into the program source window, and do any -needed adjustments. You will have to use some regular Emacs command -to return the cursor to the PO file window, if you want to use -<KBD>,</KBD> for the next string, say. 
- -</P> -<P> -The <KBD>M-.</KBD> command has a few built-in speedups, so you do not -have to explicitly type all keywords all the time. The first such -speedup is that you are presented with a <EM>preferred</EM> keyword, -which you may accept by merely typing <KBD><KBD>RET</KBD></KBD> at the prompt. -The second speedup is that you may type any non-ambiguous prefix of the -keyword you really mean, and the command will complete it automatically -for you. This also means that PO mode has to <EM>know</EM> all -your possible keywords, and that it will not accept mistyped keywords. - -</P> -<P> -If you reply <KBD>?</KBD> to the keyword request, the command gives a -list of all known keywords, from which you may choose. When the -command is prefixed by an argument (<KBD>C-u M-.</KBD>), it inhibits -updating any program source or PO file buffer, and does some simple -keyword management instead. In this case, the command asks for a -keyword, written in full, which becomes a new allowed keyword for -later <KBD>M-.</KBD> commands. Moreover, this new keyword automatically -becomes the <EM>preferred</EM> keyword for later commands. By typing -an already known keyword in response to <KBD>C-u M-.</KBD>, one merely -changes the <EM>preferred</EM> keyword and does nothing more. - -</P> -<P> -All keywords known for <KBD>M-.</KBD> are recognized by the <KBD>,</KBD> command -when scanning for strings, and strings already marked by any of those -known keywords are automatically skipped. If many PO files are opened -simultaneously, each one has its own independent set of known keywords. -There is no provision in PO mode, currently, for deleting a known -keyword, you have to quit the file (maybe using <KBD>q</KBD>) and reopen -it afresh. When a PO file is newly brought up in an Emacs window, only -<SAMP>`gettext'</SAMP> and <SAMP>`_'</SAMP> are known as keywords, and <SAMP>`gettext'</SAMP> -is preferred for the <KBD>M-.</KBD> command. 
In fact, it is not useful to -prefer <SAMP>`_'</SAMP>, as this one is already built into the <KBD>M-,</KBD> command. - -</P> - - -<H2><A NAME="SEC17" HREF="gettext_toc.html#TOC17">Special Cases of Translatable Strings</A></H2> - -<P> -The attentive reader might now point out that it is not always possible -to mark translatable strings with <CODE>gettext</CODE> or something like this. -Consider the following case: - -</P> - -<PRE> -{ - static const char *messages[] = { - "some very meaningful message", - "and another one" - }; - const char *string; - ... - string - = index > 1 ? "a default message" : messages[index]; - - fputs (string, stdout); - ... -} -</PRE> - -<P> -While it is no problem to mark the string <CODE>"a default message"</CODE>, it -is not possible to mark the string initializers for <CODE>messages</CODE>. -What is to be done? We have to fulfill two tasks. First we have to mark the -strings so that the <CODE>xgettext</CODE> program (see section <A HREF="gettext.html#SEC19">Invoking the <CODE>xgettext</CODE> Program</A>) -can find them, and second we have to translate the strings at runtime -before printing them. - -</P> -<P> -The first task can be fulfilled by creating a new keyword, which names a -no-op. For the second we have to mark all access points to a string -from the array. So one solution can look like this: - -</P> - -<PRE> -#define gettext_noop(String) (String) - -{ - static const char *messages[] = { - gettext_noop ("some very meaningful message"), - gettext_noop ("and another one") - }; - const char *string; - ... - string - = index > 1 ? gettext ("a default message") : gettext (messages[index]); - - fputs (string, stdout); - ... -} -</PRE> - -<P> -Please convince yourself that the string which is written by -<CODE>fputs</CODE> is translated in any case. How to make <CODE>xgettext</CODE> aware of -the additional keyword <CODE>gettext_noop</CODE> is explained in section <A HREF="gettext.html#SEC19">Invoking the <CODE>xgettext</CODE> Program</A>. 
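For readers who want to compile and try the pattern, here is a minimal self-contained sketch of the same idea. It makes several assumptions that are not part of the manual: demo_gettext is a hypothetical stand-in for gettext that looks messages up in a tiny built-in Italian table, and pick_message is a hypothetical helper wrapping the conditional expression shown above.

```c
#include <stddef.h>
#include <string.h>

/* No-op marker: an extraction tool can find the string,
   but nothing is translated at this point.  */
#define gettext_noop(String) (String)

/* Hypothetical stand-in for gettext(): looks MSGID up in a small
   built-in table and returns MSGID itself when nothing matches.  */
static const char *demo_gettext(const char *msgid)
{
    static const char *const table[][2] = {
        { "some very meaningful message", "un messaggio molto significativo" },
        { "and another one", "e un altro" },
        { "a default message", "un messaggio predefinito" }
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(msgid, table[i][0]) == 0)
            return table[i][1];
    return msgid;
}

/* The initializers are only marked here, never translated.  */
static const char *messages[] = {
    gettext_noop("some very meaningful message"),
    gettext_noop("and another one")
};

/* Every access point translates at run time, so the caller
   always receives a translated string.  */
static const char *pick_message(int index)
{
    if (index < 0 || index > 1)
        return demo_gettext("a default message");
    return demo_gettext(messages[index]);
}
```

A caller would then write, for instance, fputs (pick_message (0), stdout); and get the translated text whichever branch is taken.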
- -</P> -<P> -The above is of course not the only solution. You could also come up -with the following one: - -</P> - -<PRE> -#define gettext_noop(String) (String) - -{ - static const char *messages[] = { - gettext_noop ("some very meaningful message"), - gettext_noop ("and another one") - }; - const char *string; - ... - string - = index > 1 ? gettext_noop ("a default message") : messages[index]; - - fputs (gettext (string), stdout); - ... -} -</PRE> - -<P> -But this has some drawbacks. First, the programmer has to take care that -he uses <CODE>gettext_noop</CODE> for the string <CODE>"a default message"</CODE>. -Using <CODE>gettext</CODE> there could, in rare cases, have unpredictable results. -The second drawback lies in the internals of the GNU <CODE>gettext</CODE> -Library, which make this solution less efficient. - -</P> -<P> -One advantage is that you need not perform control flow analysis to make -sure the output is really translated in any case. But this analysis is -generally not very difficult. If it should be difficult in some situation, you can -use this second method there. - -</P> - - - -<H1><A NAME="SEC18" HREF="gettext_toc.html#TOC18">Making the Initial PO File</A></H1> - - - -<H2><A NAME="SEC19" HREF="gettext_toc.html#TOC19">Invoking the <CODE>xgettext</CODE> Program</A></H2> - - -<PRE> -xgettext [<VAR>option</VAR>] <VAR>inputfile</VAR> ... -</PRE> - -<DL COMPACT> - -<DT><SAMP>`-a'</SAMP> -<DD> -<DT><SAMP>`--extract-all'</SAMP> -<DD> -Extract all strings. - -<DT><SAMP>`-c [<VAR>tag</VAR>]'</SAMP> -<DD> -<DT><SAMP>`--add-comments[=<VAR>tag</VAR>]'</SAMP> -<DD> -Place comment block with <VAR>tag</VAR> (or those preceding keyword lines) -in output file. - -<DT><SAMP>`-C'</SAMP> -<DD> -<DT><SAMP>`--c++'</SAMP> -<DD> -Recognize C++ style comments. - -<DT><SAMP>`-d <VAR>name</VAR>'</SAMP> -<DD> -<DT><SAMP>`--default-domain=<VAR>name</VAR>'</SAMP> -<DD> -Use <TT>`<VAR>name</VAR>.po'</TT> for output (instead of <TT>`messages.po'</TT>). 
- -<DT><SAMP>`-D <VAR>directory</VAR>'</SAMP> -<DD> -<DT><SAMP>`--directory=<VAR>directory</VAR>'</SAMP> -<DD> -Change to <VAR>directory</VAR> before beginning to search and scan source -files. The resulting <TT>`.po'</TT> file will be written relative to the -original directory, though. - -<DT><SAMP>`-f <VAR>file</VAR>'</SAMP> -<DD> -<DT><SAMP>`--files-from=<VAR>file</VAR>'</SAMP> -<DD> -Read the names of the input files from <VAR>file</VAR> instead of getting -them from the command line. - -<DT><SAMP>`-h'</SAMP> -<DD> -<DT><SAMP>`--help'</SAMP> -<DD> -Display this help and exit. - -<DT><SAMP>`-I <VAR>list</VAR>'</SAMP> -<DD> -<DT><SAMP>`--input-path=<VAR>list</VAR>'</SAMP> -<DD> -List of directories searched for input files. - -<DT><SAMP>`-j'</SAMP> -<DD> -<DT><SAMP>`--join-existing'</SAMP> -<DD> -Join messages with existing file. - -<DT><SAMP>`-k <VAR>word</VAR>'</SAMP> -<DD> -<DT><SAMP>`--keyword[=<VAR>word</VAR>]'</SAMP> -<DD> -Additional keyword to be looked for (without <VAR>word</VAR>, the -default keywords are not used). - -The default keywords, which are always looked for if not explicitly -disabled, are <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE> and -<CODE>gettext_noop</CODE>. - -<DT><SAMP>`-m [<VAR>string</VAR>]'</SAMP> -<DD> -<DT><SAMP>`--msgstr-prefix[=<VAR>string</VAR>]'</SAMP> -<DD> -Use <VAR>string</VAR> or "" as prefix for msgstr entries. - -<DT><SAMP>`-M [<VAR>string</VAR>]'</SAMP> -<DD> -<DT><SAMP>`--msgstr-suffix[=<VAR>string</VAR>]'</SAMP> -<DD> -Use <VAR>string</VAR> or "" as suffix for msgstr entries. - -<DT><SAMP>`--no-location'</SAMP> -<DD> -Do not write <SAMP>`#: <VAR>filename</VAR>:<VAR>line</VAR>'</SAMP> lines. - -<DT><SAMP>`-n'</SAMP> -<DD> -<DT><SAMP>`--add-location'</SAMP> -<DD> -Generate <SAMP>`#: <VAR>filename</VAR>:<VAR>line</VAR>'</SAMP> lines (default). - -<DT><SAMP>`--omit-header'</SAMP> -<DD> -Don't write header with <SAMP>`msgid ""'</SAMP> entry. 
- -This is useful for testing purposes because it eliminates a source -of variance for generated <CODE>.gmo</CODE> files. We can ship some of -these files in the GNU <CODE>gettext</CODE> package, and the result of -regenerating them through <CODE>msgfmt</CODE> should yield the same values. - -<DT><SAMP>`-p <VAR>dir</VAR>'</SAMP> -<DD> -<DT><SAMP>`--output-dir=<VAR>dir</VAR>'</SAMP> -<DD> -Output files will be placed in directory <VAR>dir</VAR>. - -<DT><SAMP>`-s'</SAMP> -<DD> -<DT><SAMP>`--sort-output'</SAMP> -<DD> -Generate sorted output and remove duplicates. - -<DT><SAMP>`--strict'</SAMP> -<DD> -Write out strict Uniforum conforming PO file. - -<DT><SAMP>`-v'</SAMP> -<DD> -<DT><SAMP>`--version'</SAMP> -<DD> -Output version information and exit. - -<DT><SAMP>`-x <VAR>file</VAR>'</SAMP> -<DD> -<DT><SAMP>`--exclude-file=<VAR>file</VAR>'</SAMP> -<DD> -Entries from <VAR>file</VAR> are not extracted. - -</DL> - -<P> -Search path for supplementary PO files is: -<TT>`/usr/local/share/nls/src/'</TT>. - -</P> -<P> -If <VAR>inputfile</VAR> is <SAMP>`-'</SAMP>, standard input is read. - -</P> -<P> -This implementation of <CODE>xgettext</CODE> is able to process a few awkward -cases, like strings in preprocessor macros, ANSI concatenation of -adjacent strings, and escaped end of lines for continued strings. - -</P> - - -<H2><A NAME="SEC20" HREF="gettext_toc.html#TOC20">C Sources Context</A></H2> - -<P> -PO mode is particularly powerful when used with PO files -created through GNU <CODE>gettext</CODE> utilities, as those utilities -insert special comments in the PO files they generate. -Some of these special comments relate the PO file entry to -exactly where the untranslated string appears in the program sources. - -</P> -<P> -When the translator gets to an untranslated entry, she is fairly -often faced with an original string which is not as informative as -it normally should be, being succinct, cryptic, or otherwise ambiguous. 
-Before choosing how to translate the string, she needs to understand -better what the string really means and how tight the translation has -to be. Most of the time, when problems arise, the only way left to make -her judgment is looking at the true program sources from where this -string originated, searching for surrounding comments the programmer -might have put in there, and looking around for helpful clues of -<EM>any</EM> kind. - -</P> -<P> -Surely, when looking at program sources, the translator will receive -more help if she is a fluent programmer. However, even if she is -not versed in programming and feels a little lost in C code, the -translator should not be shy about taking a look, once in a while. -It is most probable that she will still be able to find some of the -hints she needs. She will learn quickly to not feel uncomfortable -in program code, paying more attention to programmer's comments, -variable and function names (if he dared choosing them well), and -overall organization, than to the programming itself. - -</P> -<P> -The following commands are meant to help the translator in getting -program source context for a PO file entry. - -</P> -<DL COMPACT> - -<DT><KBD>c</KBD> -<DD> -Resume the display of a program source context, or cycle through them. - -<DT><KBD>M-c</KBD> -<DD> -Display a program source context selected by menu. - -<DT><KBD>d</KBD> -<DD> -Add a directory to the search path for source files. - -<DT><KBD>M-d</KBD> -<DD> -Delete a directory from the search path for source files. - -</DL> - -<P> -The commands <KBD>c</KBD> (<CODE>po-cycle-reference</CODE>) and <KBD>M-c</KBD> -(<CODE>po-select-reference</CODE>) both open another window displaying -some source program file, and already positioned in such a way that -it shows an actual use of the current string to translate. By doing -so, the command gives source program context for the string. 
But if -the entry has no source context references, or if all references -are unresolved along the search path for program sources, then the -command diagnoses this as an error. - -</P> -<P> -Even if <KBD>c</KBD> (or <KBD>M-c</KBD>) opens a new window, the cursor stays -in the PO file window. If the translator really wants to -get into the program source window, she ought to do it explicitly, -maybe by using command <KBD>o</KBD>. - -</P> -<P> -When <KBD>c</KBD> is typed for the first time, or for a PO file entry which -is different from the last one used for getting source context, then the -command reacts by giving the first context available for this entry, -if any. If some context has already been recently displayed for the -current PO file entry, and the translator wandered off to do other -things, typing <KBD>c</KBD> again will merely resume, in another window, -the context last displayed. In particular, if the translator moved -the cursor away from the context in the source file, the command will -bring the cursor back to the context. By using <KBD>c</KBD> many times -in a row, with no other intervening commands, PO mode will cycle to -the next available contexts for this particular entry, getting back -to the first context once the last has been shown. - -</P> -<P> -The command <KBD>M-c</KBD> behaves differently. Instead of cycling through -references, it lets the translator choose a particular reference among -many, and displays that reference. It is best used with completion: -if the translator types <KBD>TAB</KBD> immediately after <KBD>M-c</KBD>, in -response to the question, she will be offered a menu of all possible -references, as a reminder of which are the acceptable answers. -This command is useful only where there are really many contexts -available for a single string to translate. - -</P> -<P> -Program source files are usually found relative to where the PO -file stands. 
As a special provision, when this fails, the file is -also looked for, but relative to the directory immediately above it. -Those two cases take proper care of most PO files. However, it might -happen that a PO file has been moved, or is edited in a different -place than its normal location. When this happens, the translator -should tell PO mode in which directory the genuine PO file normally -sits. Many such directories may be specified, and all together, they -constitute what is called the <STRONG>search path</STRONG> for program sources. -The command <KBD>d</KBD> (<CODE>po-add-path</CODE>) is used to interactively -enter a new directory at the front of the search path, and the command -<KBD>M-d</KBD> (<CODE>po-delete-path</CODE>) is used to select, with completion, -one of the directories she does not want anymore on the search path. - -</P> - - -<H2><A NAME="SEC21" HREF="gettext_toc.html#TOC21">Using Translation Compendiums</A></H2> - -<P> -Compendiums are yet to be implemented. - -</P> -<P> -An upcoming PO mode feature will let the translator maintain a -compendium of already achieved translations. A <STRONG>compendium</STRONG> -is a special PO file containing a set of translations recurring in -many different packages. The translator will be given commands for -adding entries to her compendium, and later initializing untranslated -entries, or updating already translated entries, from translations -kept in the compendium. For this to work, however, the compendium -would have to be normalized. See section <A HREF="gettext.html#SEC12">Normalizing Strings in Entries</A>. - -</P> - - - -<H1><A NAME="SEC22" HREF="gettext_toc.html#TOC22">Updating Existing PO Files</A></H1> - - - -<H2><A NAME="SEC23" HREF="gettext_toc.html#TOC23">Invoking the <CODE>tupdate</CODE> Program</A></H2> - - -<PRE> -tupdate --help -tupdate --version -tupdate <VAR>new</VAR> <VAR>old</VAR> -</PRE> - -<P> -File <VAR>new</VAR> is the most recently created PO file (generally by -<CODE>xgettext</CODE>). 
It need not contain any translations. File -<VAR>old</VAR> is the PO file including the old translations which will -be taken over to the newly created file as long as they still match. - -</P> -<P> -When English messages change in the programs, this is reflected in -the PO file as extracted by <CODE>xgettext</CODE>. In large messages, that -can be hard to detect, and will obviously result in an incomplete -translation. One of the virtues of <CODE>tupdate</CODE> is that it detects -such changes, saving the previous translation into a PO file comment, -so marking the entry as obsolete, and giving the modified string with -an empty translation, that is, marking the entry as untranslated. - -</P> - - -<H2><A NAME="SEC24" HREF="gettext_toc.html#TOC24">Untranslated Entries</A></H2> - -<P> -When <CODE>xgettext</CODE> originally creates a PO file, unless told -otherwise, it initializes the <CODE>msgid</CODE> field with the untranslated -string, and leaves the <CODE>msgstr</CODE> string to be empty. Such entries, -having an empty translation, are said to be <STRONG>untranslated</STRONG> entries. -Later, when the programmer slightly modifies some string right in -the program, this change is later reflected in the PO file -by the appearance of a new untranslated entry for the modified string. - -</P> -<P> -The usual commands moving from entry to entry consider untranslated -entries on the same level as active entries. Untranslated entries -are easily recognizable by the fact they end with <SAMP>`msgstr ""'</SAMP>. - -</P> -<P> -The work of the translator might be (quite naively) seen as the process -of seeking after an untranslated entry, editing a translation for -it, and repeating these actions until no untranslated entries remain. -Some commands are more specifically related to untranslated entry -processing. - -</P> -<DL COMPACT> - -<DT><KBD>e</KBD> -<DD> -Find the next untranslated entry. - -<DT><KBD>M-e</KBD> -<DD> -Find the previous untranslated entry. 
- -<DT><KBD>k</KBD> -<DD> -Turn the current entry into an untranslated one. - -</DL> - -<P> -The commands <KBD>e</KBD> (<CODE>po-next-empty-entry</CODE>) and <KBD>M-e</KBD> -(<CODE>po-previous-empty</CODE>) move forwards or backwards, looking for an -untranslated entry. If none is found, the search is extended and wraps -around in the PO file buffer. - -</P> -<P> -An entry can be turned back into an untranslated entry by -merely emptying its translation, using the command <KBD>k</KBD> -(<CODE>po-kill-msgstr</CODE>). See section <A HREF="gettext.html#SEC26">Modifying Translations</A>. - -</P> -<P> -Also, when the time comes to quit working on a PO file buffer -with the <KBD>q</KBD> command, the translator is asked for confirmation, -if some untranslated string still exists. - -</P> - - -<H2><A NAME="SEC25" HREF="gettext_toc.html#TOC25">Obsolete Entries</A></H2> - -<P> -By <STRONG>obsolete</STRONG> PO file entries, we mean those entries which are -commented out, usually by <CODE>tupdate</CODE> when it finds that the -translation is not needed anymore by the package being localized. - -</P> -<P> -The usual commands moving from entry to entry consider obsolete -entries on the same level as active entries. Obsolete entries are -easily recognizable by the fact that all their lines start with -<KBD>#</KBD>, even those lines containing <CODE>msgid</CODE> or <CODE>msgstr</CODE>. - -</P> -<P> -Commands exist for emptying the translation or reinitializing it -to the original untranslated string. Commands interfacing with the -kill ring may force some previously saved text into the translation. -The user may interactively edit the translation. All these commands -may apply to obsolete entries, carefully leaving the entry obsolete -after the fact. - -</P> -<P> -Moreover, some commands are more specifically related to obsolete -entry processing. - -</P> -<DL COMPACT> - -<DT><KBD>M-n</KBD> -<DD> -<DT><KBD>M-<KBD>SPC</KBD></KBD> -<DD> -Find the next obsolete entry. 
- -<DT><KBD>M-p</KBD> -<DD> -<DT><KBD>M-<KBD>DEL</KBD></KBD> -<DD> -Find the previous obsolete entry. - -<DT><KBD>z</KBD> -<DD> -Make an active entry obsolete, or zap out an obsolete entry. - -</DL> - -<P> -The commands <KBD>M-n</KBD> (<CODE>po-next-obsolete-entry</CODE>) and <KBD>M-p</KBD> -(<CODE>po-previous-obsolete-entry</CODE>) move forwards or backwards, -chasing for an obsolete entry. If none is found, the search is -extended and wraps around in the PO file buffer. The commands -<KBD>M-<KBD>SPC</KBD></KBD> and <KBD>M-<KBD>DEL</KBD></KBD> are synonymous to <KBD>M-n</KBD> -and <KBD>M-p</KBD>, respectively. - -</P> -<P> -PO mode does not provide ways for un-commenting an obsolete entry -and making it active, because this would reintroduce an original -untranslated string which does not correspond to any marked string -in the program sources. This goes with the philosophy of never -introducing useless <CODE>msgid</CODE> values. - -</P> -<P> -However, it is possible to comment out an active entry, so making -it obsolete. GNU <CODE>gettext</CODE> utilities will later react to the -disappearance of a translation by using the untranslated string. -The command <KBD>z</KBD> (<CODE>po-fade-out-entry</CODE>) pushes the current entry -a little further towards annihilation. If the entry is active, then -the entry is merely commented out. If the entry is already obsolete, -then it is completely deleted from the PO file. It is easy to recycle -the translation so deleted into some other PO file entry, usually -one which is untranslated. See section <A HREF="gettext.html#SEC26">Modifying Translations</A>. - -</P> -<P> -Here is a quite interesting problem to solve for later development of -PO mode, for those nights you are not sleepy. The idea would be that -PO mode might become bright enough, one of these days, to make good -guesses at retrieving the most probable candidate, among all obsolete -entries, for initializing the translation of a newly appeared string. 
-I think it might be a quite hard problem to do this algorithmically, as -we have to develop good and efficient measures of string similarity. -Right now, PO mode leaves the decision entirely to the translator -when the time comes to find the adequate obsolete translation; it -merely tries to provide handy tools for helping her to do so. - -</P> - - -<H2><A NAME="SEC26" HREF="gettext_toc.html#TOC26">Modifying Translations</A></H2> - -<P> -PO mode prevents direct editing of the PO file by the usual -means Emacs gives for altering a buffer's contents. By doing so, -it aims to help the translator avoid little clerical errors -about the overall file format, or the proper quoting of strings, -as those errors would be easily made. Other kinds of errors are -still possible, but some may be caught and diagnosed by the batch -validation process, which the translator may always trigger by the -<KBD>v</KBD> command. For all other errors, the translator has to rely on -her own judgment, and also on the linguistic reports submitted to her -by the users of the translated package, having the same mother tongue. - -</P> -<P> -When the time comes to create a translation, or to correct an error diagnosed -mechanically or reported by a user, the translator has to resort to -using the following commands for modifying the translations. - -</P> -<DL COMPACT> - -<DT><KBD>RET</KBD> -<DD> -Interactively edit the translation. - -<DT><KBD>TAB</KBD> -<DD> -Reinitialize the translation with the original, untranslated string. - -<DT><KBD>k</KBD> -<DD> -Save the translation on the kill ring, and delete it. - -<DT><KBD>w</KBD> -<DD> -Save the translation on the kill ring, without deleting it. - -<DT><KBD>y</KBD> -<DD> -Replace the translation, taking the new from the kill ring. 
- -</DL> - -<P> -The command <KBD>RET</KBD> (<CODE>po-edit-msgstr</CODE>) opens a new Emacs -window containing a copy of the translation taken from the current -PO file entry, all ready for editing, fully modifiable -and with the complete extent of GNU Emacs modifying commands. -The string is presented to the translator expunged of all quoting -marks, and she will modify the <EM>unquoted</EM> string in this -window to heart's content. Once done, the regular Emacs command -<KBD>M-C-c</KBD> (<CODE>exit-recursive-edit</CODE>) may be used to return the -edited translation into the PO file, replacing the original -translation. The keys <KBD>C-c C-c</KBD> are bound so they have the -same effect as <KBD>M-C-c</KBD>. - -</P> -<P> -If the translator becomes unsatisfied with her translation to the -extent she prefers keeping the translation which existed prior to -the <KBD>RET</KBD> command, she may use the regular Emacs command <KBD>C-]</KBD> -(<CODE>abort-recursive-edit</CODE>) to merely discard the edit, while -preserving the original translation. Another way would be for her -to exit normally with <KBD>C-c C-c</KBD>, then type <CODE>u</CODE> once for -undoing the whole effect of the last edit. - -</P> -<P> -While editing her translation, the translator should take care -not to insert unwanted <KBD><KBD>RET</KBD></KBD> (carriage returns) characters at -the end of the translated string if those are not meant to be there, -and not to remove such characters when they are required. Since these -characters are not visible in the editing buffer, they are easy to -introduce by mistake. To help her, <KBD><KBD>RET</KBD></KBD> automatically puts -the character <KBD>&#60;</KBD> at the end of the string being edited, but this -<KBD>&#60;</KBD> is not really part of the string. On exiting the editing -window with <KBD>C-c C-c</KBD>, PO mode automatically removes such -<KBD>&#60;</KBD> and all whitespace added after it. 
If the translator adds -characters after the terminating <KBD>&#60;</KBD>, it loses its delimiting -property and becomes an integral part of the string. If she removes -the delimiting <KBD>&#60;</KBD>, then the edited string is taken <EM>as -is</EM>, with all trailing newlines, even if invisible. Also, if the -translated string ought to end itself with a genuine <KBD>&#60;</KBD>, then the -delimiting <KBD>&#60;</KBD> may not be removed; so the string should appear, -in the editing window, as ending with two <KBD>&#60;</KBD> in a row. - -</P> -<P> -When a translation (or a comment) is being edited, the translator -may move the cursor back into the PO file buffer and freely -move to other entries, and browse at will. The edited entry will -be recovered as soon as the edit ceases, because only this entry -is being modified. If, with an edit still open, the -translator wanders in the PO file buffer, she cannot modify -any other entry. If she tries to, PO mode will react by suggesting -that she abort the current edit, or else, by inviting her to finish -the current edit prior to any other modification. - -</P> -<P> -The command <KBD>TAB</KBD> (<CODE>po-msgid-to-msgstr</CODE>) initializes, or -reinitializes the translation with the original string. This command -is normally used when the translator wants to redo a fresh translation -of the original string, disregarding any previous work. - -</P> -<P> -In fact, whether it is best to start a translation with an empty -string, or rather with a copy of the original string, is a matter of -taste or habit. Sometimes, the source language and the -target language are so different that it is simply best to start writing -on an empty page. At other times, the source and target languages -are so close that it would be a waste to retype a number of words -already written in the original string. 
A translator may also -like having the original string right under her eyes, as she will -progressively overwrite the original text with the translation, even -if this requires some extra editing work to get rid of the original. - -</P> -<P> -The command <KBD>k</KBD> (<CODE>po-kill-msgstr</CODE>) merely empties the -translation string, so turning the entry into an untranslated -one. But while doing so, its previous contents are set aside in -a special place, known as the kill ring. The command <KBD>w</KBD> -(<CODE>po-kill-ring-save-msgstr</CODE>) also has the effect of taking a -copy of the translation onto the kill ring, but it otherwise leaves -the entry alone, and does <EM>not</EM> remove the translation from the -entry. Both commands use exactly the Emacs kill ring, which is shared -between buffers, and which is well known already to GNU Emacs lovers. - -</P> -<P> -The translator may use <KBD>k</KBD> or <KBD>w</KBD> many times in the course -of her work, as the kill ring may hold several saved translations. -From the kill ring, strings may later be reinserted in various -Emacs buffers. In particular, the kill ring may be used for moving -translation strings between different entries of a single PO file -buffer, or if the translator is handling many such buffers at once, -even between PO files. - -</P> -<P> -To facilitate exchanges with buffers which are not in PO mode, the -translation string put on the kill ring by the <KBD>k</KBD> command is fully -unquoted before being saved: external quotes are removed, multi-line -strings are concatenated, and backslash escape sequences are turned -into their corresponding characters. In the special case of obsolete -entries, the translation is also uncommented prior to saving. - -</P> -<P> -The command <KBD>y</KBD> (<CODE>po-yank-msgstr</CODE>) completely replaces the -translation of the current entry with a string taken from the kill ring. 
-Following GNU Emacs terminology, we then say that the replacement -string is <STRONG>yanked</STRONG> into the PO file buffer. -See section `Yanking' in <CITE>The Emacs Editor</CITE>. -The first time <KBD>y</KBD> is used, the translation receives the value of -the most recent addition to the kill ring. If <KBD>y</KBD> is typed once -again, immediately, without intervening keystrokes, the translation -just inserted is taken away and replaced by the second most recent -addition to the kill ring. By repeating <KBD>y</KBD> many times in a row, -the translator may travel along the kill ring for saved strings, -until she finds the string she really wanted. - -</P> -<P> -When a string is yanked into a PO file entry, it is fully and -automatically requoted for complying with the format PO files should -have. Further, if the entry is obsolete, PO mode then appropriately -pushes the inserted string inside comments. Once again, translators -should not burden themselves with quoting considerations besides, of -course, what the translated string itself requires with respect to -the program using it. - -</P> -<P> -Note that <KBD>k</KBD> and <KBD>w</KBD> are not the only commands pushing strings -on the kill ring, as almost any PO mode command replacing translation -strings (or the translator comments) automatically saves the old string -on the kill ring. The main exceptions to this general rule are the -yanking commands themselves. - -</P> -<P> -To better illustrate the operation of killing and yanking, let's -use an actual example, taken from a common situation. When the -programmer slightly modifies some string right in the program, his -change is later reflected in the PO file by the appearance -of a new untranslated entry for the modified string, and the fact -that the entry translating the original or unmodified string becomes -obsolete. 
In many cases, the translator might spare herself some work -by retrieving the unmodified translation from the obsolete entry, -then initializing the untranslated entry <CODE>msgstr</CODE> field with -this retrieved translation. Once this is done, the obsolete entry is -not wanted anymore, and may be safely deleted. - -</P> -<P> -When the translator finds an untranslated entry and suspects that a -slight variant of the translation exists, she immediately uses <KBD>m</KBD> -to mark the current entry location, then starts chasing obsolete -entries with <KBD>M-SPC</KBD>, hoping to find some translation corresponding -to the unmodified string. Once found, she uses the <KBD>z</KBD> command -for deleting the obsolete entry, knowing that <KBD>z</KBD> also <EM>kills</EM> -the translation, that is, pushes the translation on the kill ring. -Then, <KBD>l</KBD> returns to the initial untranslated entry, <KBD>y</KBD> -then <EM>yanks</EM> the saved translation right into the <CODE>msgstr</CODE> -field. The translator is then free to use <KBD><KBD>RET</KBD></KBD> for fine -tuning the translation contents, and maybe to later use <KBD>e</KBD>, -then <KBD>m</KBD> again, for going on with the next untranslated string. - -</P> -<P> -When some sequence of keys has to be typed over and over again, the -translator may find it convenient to become better acquainted with the GNU -Emacs capability of learning these sequences and playing them back on -request. See section `Keyboard Macros' in <CITE>The Emacs Editor</CITE>. - -</P> - - -<H2><A NAME="SEC27" HREF="gettext_toc.html#TOC27">Modifying Comments</A></H2> - -<P> -Any translation work done seriously will raise many linguistic -difficulties, for which decisions have to be made, and the choices -further documented. These documents may be saved within the -PO file in the form of translator comments, which the translator -is free to create, delete, or modify at will. 
These comments may -be useful to her when she returns to this PO file after a while. -Memory forgets! - -</P> -<P> -The commands for modifying comments are somewhat similar to those for modifying translations, -so the general indications given for the latter apply here. See section <A HREF="gettext.html#SEC26">Modifying Translations</A>. - -</P> -<DL COMPACT> - -<DT><KBD>M-RET</KBD> -<DD> -Interactively edit the translator comments. - -<DT><KBD>M-k</KBD> -<DD> -Save the translator comments on the kill ring, and delete them. - -<DT><KBD>M-w</KBD> -<DD> -Save the translator comments on the kill ring, without deleting them. - -<DT><KBD>M-y</KBD> -<DD> -Replace the translator comments, taking the new ones from the kill ring. - -</DL> - -<P> -Those commands parallel PO mode commands for modifying the translation -strings, and behave in much the same way, except that they handle -the part of PO file comments meant for translator usage, rather -than the translation strings. So, the descriptions given below are -rather succinct, because the full details have already been given. -See section <A HREF="gettext.html#SEC26">Modifying Translations</A>. - -</P> -<P> -The command <KBD>M-RET</KBD> (<CODE>po-edit-comment</CODE>) opens a new Emacs -window containing a copy of the translator comments of the current -PO file entry. If there are no such comments, PO mode -understands that the translator wants to add a comment to the entry, -and she is presented with an empty window. Comment marks (<KBD>#</KBD>) and -the space following them are automatically removed before editing, -and reinstated after. For translator comments pertaining to obsolete -entries, the uncommenting and recommenting operations are done twice. -The command <KBD>#</KBD> also has the same effect as <KBD>M-RET</KBD>, and might -be easier to type. Once in the editing window, the keys <KBD>C-c -C-c</KBD> allow the translator to signal that she is finished with editing -the comment.
- -</P> -<P> -The command <KBD>M-k</KBD> (<CODE>po-kill-comment</CODE>) gets rid of all -translator comments, while saving those comments on the kill ring. -The command <KBD>M-w</KBD> (<CODE>po-kill-ring-save-comment</CODE>) puts -a copy of the translator comments on the kill ring, but leaves -them undisturbed in the current entry. The command <KBD>M-y</KBD> -(<CODE>po-yank-comment</CODE>) completely replaces the translator comments -by a string taken at the front of the kill ring. When this command -is immediately repeated, the comments just inserted are withdrawn, -and replaced by other strings taken along the kill ring. - -</P> -<P> -On the kill ring, all strings have the same nature. There is no -distinction between <EM>translation</EM> strings and <EM>translator -comments</EM> strings. So, for example, let's presume the translator -has just finished editing a translation, and wants to create a new -translator comment for documenting why the previous translation was -not good, just to remember what the problem was. Foreseeing that she -will do that in her documentation, the translator will want to quote -the previous translation in her translator comments. To do so, she -may initialize the translator comments with the previous translation, -still at the head of the kill ring. Because editing already pushed the -previous translation on the kill ring, she just has to type <KBD>M-w</KBD> -prior to <KBD>#</KBD>, and the previous translation will be right there, -all ready for being introduced by some explanatory text. - -</P> -<P> -On the other hand, presume there are some translator comments already -and that the translator wants to add to those comments, instead -of wholly replacing them. Then, she should edit the comment right -away with <KBD>#</KBD>. Once inside the editing window, she can use the -regular GNU Emacs commands <KBD>C-y</KBD> (<CODE>yank</CODE>) and <KBD>M-y</KBD> -(<CODE>yank-pop</CODE>) for getting the previous translation where she likes.
- -</P> - - -<H2><A NAME="SEC28" HREF="gettext_toc.html#TOC28">Consulting Auxiliary PO Files</A></H2> - -<P> -An incoming feature of PO mode should help the knowledgeable translator -to take advantage of translations already achieved in other languages -she just happens to know, by providing these other language translation -as additional context for her own work. Each PO file existing for -the same package the translator is working on, but targeted to a -different mother tongue language, is called an <STRONG>auxiliary</STRONG> PO file. -Commands will exist for declaring and handling auxiliary PO files, -and also for showing contexts for the entry under work. For this to -work fully, all auxiliary PO files will have to be normalized. - -</P> - - -<H1><A NAME="SEC29" HREF="gettext_toc.html#TOC29">Producing Binary MO Files</A></H1> - - - -<H2><A NAME="SEC30" HREF="gettext_toc.html#TOC30">Invoking the <CODE>msgfmt</CODE> Program</A></H2> - - -<PRE> -Usage: msgfmt [<VAR>option</VAR>] <VAR>filename</VAR>.po ... -</PRE> - -<DL COMPACT> - -<DT><SAMP>`-a <VAR>number</VAR>'</SAMP> -<DD> -<DT><SAMP>`--alignment=<VAR>number</VAR>'</SAMP> -<DD> -Align strings to <VAR>number</VAR> bytes (default: 1). - -<DT><SAMP>`-h'</SAMP> -<DD> -<DT><SAMP>`--help'</SAMP> -<DD> -Display this help and exit. - -<DT><SAMP>`-I <VAR>list</VAR>'</SAMP> -<DD> -<DT><SAMP>`--input-path=<VAR>list</VAR>'</SAMP> -<DD> -List of directories searched for input files. - -<DT><SAMP>`--no-hash'</SAMP> -<DD> -Binary file will not include the hash table. - -<DT><SAMP>`-o <VAR>file</VAR>'</SAMP> -<DD> -<DT><SAMP>`--output-file=<VAR>file</VAR>'</SAMP> -<DD> -Specify output file name as <VAR>file</VAR>. - -<DT><SAMP>`-v'</SAMP> -<DD> -<DT><SAMP>`--verbose'</SAMP> -<DD> -Detect and diagnose input file anomalies which might represent -translation errors. The <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings are -studied and compared. 
It is considered abnormal that one string -starts or ends with a newline while the other does not. Also, both -strings should have the same number of <SAMP>`%'</SAMP> format specifiers, -with matching types. For example, the check will diagnose using -<SAMP>`%.*s'</SAMP> against <SAMP>`%s'</SAMP>, or <SAMP>`%d'</SAMP> against <SAMP>`%s'</SAMP>, or -<SAMP>`%d'</SAMP> against <SAMP>`%x'</SAMP>. It can even handle positional parameters. - -<DT><SAMP>`-V'</SAMP> -<DD> -<DT><SAMP>`--version'</SAMP> -<DD> -Output version information and exit. - -</DL> - -<P> -If the input file is <SAMP>`-'</SAMP>, standard input is read. If the output file -is <SAMP>`-'</SAMP>, output is written to standard output. - -</P> -<P> -The search path for <CODE>msgfmt</CODE> is <TT>`/usr/local/share/nls/src/'</TT>, -by default. It represents the path to additional directories where -other PO files can be found. This feature could be used for some -PO files for standard libraries, to avoid -translating their strings over and over again. The <SAMP>`-x'</SAMP> option -could then exclude these strings from the generation. - -</P> - - -<H2><A NAME="SEC31" HREF="gettext_toc.html#TOC31">The Format of GNU MO Files</A></H2> - -<P> -The format of the generated MO files is best described by a picture, -which appears below. - -</P> -<P> -The first two words serve to identify the file. The magic -number will always signal GNU MO files. The number is stored in the -byte order of the generating machine, so the magic number really is -two numbers: <CODE>0x950412de</CODE> and <CODE>0xde120495</CODE>. The second -word describes the current revision of the file format. For now the -revision is 0. This might change in future versions, and ensures -that the readers of MO files can distinguish new formats from old -ones, so that both can be handled correctly.
The version is kept -separate from the magic number, instead of using different magic -numbers for different formats, mainly because <TT>`/etc/magic'</TT> is -not updated often. It might be better to have magic separated from -internal format version identification. - -</P> -<P> -Then follow a number of pointers to later tables in the file, allowing -for the extension of the prefix part of MO files without having to -recompile programs reading them. This might become useful for later -inserting a few flag bits, an indication of the charset used, new -tables, or other things. - -</P> -<P> -Then, at offset <VAR>O</VAR> and offset <VAR>T</VAR> in the picture, two tables -of string descriptors can be found. In both tables, each string -descriptor uses two 32-bit integers, one for the string length, -another for the offset of the string in the MO file, counting in bytes -from the start of the file. The first table contains descriptors -for the original strings, and is sorted so the original strings -are in increasing lexicographical order. The second table contains -descriptors for the translated strings, and is parallel to the first -table: to find the corresponding translation one has to access the -array slot in the second array with the same index. - -</P> -<P> -Having the original strings sorted enables the use of simple binary -search, for when the MO file does not contain a hashing table, or -for when it is not practical to use the hashing table provided in -the MO file. This also has another advantage, as the empty string -in a PO file GNU <CODE>gettext</CODE> is usually <EM>translated</EM> into -some system information attached to that particular MO file, and the -empty string necessarily becomes the first in both the original and -translated tables, making the system information very easy to find. - -</P> -<P> -The size <VAR>S</VAR> of the hash table can be zero. In this case, the -hash table itself is not contained in the MO file.
Some people might -prefer this because a precomputed hashing table takes disk space, and -does not win <EM>that</EM> much speed. The hash table contains indices -to the sorted array of strings in the MO file. Conflict resolution is -done by double hashing. The precise hashing algorithm used is fairly -dependent on the GNU <CODE>gettext</CODE> code, and is not documented here. - -</P> -<P> -As for the strings themselves, they follow the hash table, and each -is terminated with a <KBD>NUL</KBD>, and this <KBD>NUL</KBD> is not counted in -the length which appears in the string descriptor. The <CODE>msgfmt</CODE> -program has an option selecting the alignment for MO file strings. -With this option, each string is separately aligned so it starts at -an offset which is a multiple of the alignment value. On some RISC -machines, a correct alignment will speed things up. - -</P> -<P> -Nothing prevents an MO file from having embedded <KBD>NUL</KBD>s in strings. -However, the program interface currently used already presumes -that strings are <KBD>NUL</KBD> terminated, so embedded <KBD>NUL</KBD>s are -somewhat useless. But the MO file format is general enough so other -interfaces would be later possible, if for example, we ever want to -implement wide characters right in MO files, where <KBD>NUL</KBD> bytes may -accidentally appear. - -</P> -<P> -This particular issue has been strongly debated in the GNU -<CODE>gettext</CODE> development forum, and it is to be expected that the MO file -format will evolve or change over time. It is even possible that many -formats may later be supported concurrently. But surely, we have to -start somewhere, and the MO file format described here is a good start. -Nothing is cast in concrete, and the format may later evolve fairly -easily, so we should feel comfortable with the current approach.
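</P>
<P>
As a rough illustration of the layout described in this section, the
following sketch decodes the seven-word prefix and one string descriptor
from an MO file already loaded into memory. The names
<CODE>mo_header</CODE>, <CODE>mo_read_u32</CODE>, <CODE>mo_parse_header</CODE>
and <CODE>mo_descriptor</CODE> are invented for this example and are not
part of GNU <CODE>gettext</CODE>; the hash table is ignored entirely.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MO_MAGIC         0x950412deu  /* magic as seen in the expected byte order */
#define MO_MAGIC_SWAPPED 0xde120495u  /* magic as seen in the opposite byte order */

/* Hypothetical in-memory copy of the MO file prefix; not a gettext type. */
struct mo_header {
    int      big_endian;    /* 1 if words are stored most significant byte first */
    uint32_t revision;      /* file format revision, 0 for now */
    uint32_t nstrings;      /* N: number of strings */
    uint32_t orig_offset;   /* O: offset of table with original strings */
    uint32_t trans_offset;  /* T: offset of table with translation strings */
    uint32_t hash_size;     /* S: size of hashing table, may be zero */
    uint32_t hash_offset;   /* H: offset of hashing table */
};

/* Assemble one 32-bit word from four bytes in the given byte order. */
static uint32_t mo_read_u32(const unsigned char *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Decode the prefix.  Returns 0 on success, -1 if the buffer is too
   short or the magic number is not one of the two known values. */
static int mo_parse_header(const unsigned char *buf, size_t len,
                           struct mo_header *h)
{
    uint32_t magic;

    assert(buf != NULL && h != NULL);
    if (len < 28)
        return -1;

    /* Read the magic as little-endian; seeing the swapped value means
       the file was written most significant byte first. */
    magic = mo_read_u32(buf, 0);
    if (magic == MO_MAGIC)
        h->big_endian = 0;
    else if (magic == MO_MAGIC_SWAPPED)
        h->big_endian = 1;
    else
        return -1;

    h->revision     = mo_read_u32(buf + 4,  h->big_endian);
    h->nstrings     = mo_read_u32(buf + 8,  h->big_endian);
    h->orig_offset  = mo_read_u32(buf + 12, h->big_endian);
    h->trans_offset = mo_read_u32(buf + 16, h->big_endian);
    h->hash_size    = mo_read_u32(buf + 20, h->big_endian);
    h->hash_offset  = mo_read_u32(buf + 24, h->big_endian);
    return 0;
}

/* Fetch descriptor number i from the table starting at table_offset
   (O or T): a length word followed by an offset word.  Returns -1 if
   the index or the descriptor position is out of range. */
static int mo_descriptor(const unsigned char *buf, size_t len,
                         const struct mo_header *h, uint32_t table_offset,
                         uint32_t i, uint32_t *str_len, uint32_t *str_off)
{
    uint64_t pos = (uint64_t)table_offset + (uint64_t)i * 8;

    if (i >= h->nstrings || pos + 8 > (uint64_t)len)
        return -1;
    *str_len = mo_read_u32(buf + pos, h->big_endian);
    *str_off = mo_read_u32(buf + pos + 4, h->big_endian);
    return 0;
}
```

<P>
Given a buffer holding a complete MO file, a caller would first run
<CODE>mo_parse_header</CODE>, then use <CODE>mo_descriptor</CODE> with the
offset <VAR>O</VAR> or <VAR>T</VAR> to locate the original string or the
translation stored at a given index. The sketch only checks that the
descriptor lies inside the buffer; a careful reader of MO files would
also validate the string offsets it returns.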
- -</P> - -<PRE> - byte - +------------------------------------------+ - 0 | magic number = 0x950412de | - | | - 4 | file format revision = 0 | - | | - 8 | number of strings | == N - | | - 12 | offset of table with original strings | == O - | | - 16 | offset of table with translation strings | == T - | | - 20 | size of hashing table | == S - | | - 24 | offset of hashing table | == H - | | - . . - . (possibly more entries later) . - . . - | | - O | length & offset 0th string ----------------. - O + 8 | length & offset 1st string ------------------. - ... ... | | -O + ((N-1)*8)| length & offset (N-1)th string | | | - | | | | - T | length & offset 0th translation ---------------. - T + 8 | length & offset 1st translation -----------------. - ... ... | | | | -T + ((N-1)*8)| length & offset (N-1)th translation | | | | | - | | | | | | - H | start hash table | | | | | - ... ... | | | | - H + S * 4 | end hash table | | | | | - | | | | | | - | NUL terminated 0th string <----------------' | | | - | | | | | - | NUL terminated 1st string <------------------' | | - | | | | - ... ... | | - | | | | - | NUL terminated 0th translation <---------------' | - | | | - | NUL terminated 1st translation <-----------------' - | | - ... ... - | | - +------------------------------------------+ -</PRE> - - - -<H1><A NAME="SEC32" HREF="gettext_toc.html#TOC32">The User's View</A></H1> - -<P> -When GNU <CODE>gettext</CODE> will truly have reached is goal, average users -should feel some kind of astonished pleasure, seeing the effect of -that strange kind of magic that just makes their own native language -appear everywhere on their screens. As for naive users, they would -ideally have no special pleasure about it, merely taking their own -language for <EM>granted</EM>, and becoming rather unhappy otherwise. 
- -</P> -<P> -So, let's try to describe here how we would like the magic to operate, -as we want the users' view to be the simplest, among all ways one -could look at GNU <CODE>gettext</CODE>. All other software engineers: -programmers, translators, maintainers, should work together in such a -way that the magic becomes possible. This is a long and progressive -undertaking, and information is available about the progress of the -GNU Translation Project. - -</P> -<P> -When a package is distributed, there are two kinds of users: -<STRONG>installers</STRONG> who fetch the distribution, unpack it, configure -it, compile it and install it for themselves or others to use; and -<STRONG>end users</STRONG> who call programs of the package, once these have -been installed at their site. GNU <CODE>gettext</CODE> is offering magic -for both installers and end users. - -</P> - - - -<H2><A NAME="SEC33" HREF="gettext_toc.html#TOC33">The Current <TT>`NLS'</TT> Matrix for GNU</A></H2> - -<P> -Languages are not equally supported in all GNU packages. To know -if some GNU package uses GNU <CODE>gettext</CODE>, one may check -the distribution for the <TT>`NLS'</TT> information file, for some -<TT>`<VAR>ll</VAR>.po'</TT> files, often kept together in some <TT>`po/'</TT> -directory, or for an <TT>`intl/'</TT> directory. Internationalized -packages usually have many <TT>`<VAR>ll</VAR>.po'</TT> files, where <VAR>ll</VAR> -represents the language. See section <A HREF="gettext.html#SEC35">Magic for End Users</A> for a complete description -of the format for <VAR>ll</VAR>. - -</P> -<P> -More generally, a matrix is available for showing the current state -of GNU internationalization, listing which packages are prepared -for multi-lingual messages, and which languages are supported by each. -Because this information changes often, this matrix is not kept within -this GNU <CODE>gettext</CODE> manual.
This information is often found in -the file <TT>`NLS'</TT> from various GNU distributions, but is then as old -as the distribution itself. A recent copy of this <TT>`NLS'</TT> file, -containing up-to-date information, should generally be found on most -GNU archive sites. - -</P> - - -<H2><A NAME="SEC34" HREF="gettext_toc.html#TOC34">Magic for Installers</A></H2> - -<P> -By default, packages fully using GNU <CODE>gettext</CODE>, internally, -are installed in such a way that they allow translation of -messages. At <EM>configuration</EM> time, those packages should -automatically detect whether the underlying host system provides usable -<CODE>catgets</CODE> or <CODE>gettext</CODE> functions. If neither is present, -the GNU <CODE>gettext</CODE> library should be automatically prepared -and used. Installers may use special options at configuration -time for changing this behavior. The command <SAMP>`./configure ---with-gnu-gettext'</SAMP> bypasses system <CODE>catgets</CODE> or <CODE>gettext</CODE> to -use GNU <CODE>gettext</CODE> instead, while <SAMP>`./configure --disable-nls'</SAMP> -produces a program totally unable to translate messages. - -</P> -<P> -Internationalized packages usually have many <TT>`<VAR>ll</VAR>.po'</TT> -files. Unless -translations are disabled, all those available are installed together -with the package. However, the environment variable <CODE>LINGUAS</CODE> -may be set, prior to configuration, to limit the installed set. -<CODE>LINGUAS</CODE> should then contain a space-separated list of two-letter -codes, stating which languages are allowed. - -</P> - - -<H2><A NAME="SEC35" HREF="gettext_toc.html#TOC35">Magic for End Users</A></H2> - -<P> -We consider here those packages using GNU <CODE>gettext</CODE> internally, -and for which the installers did not disable translation at -<EM>configure</EM> time.
Then, users only have to set the <CODE>LANG</CODE> -environment variable to the appropriate <SAMP>`<VAR>ll</VAR>'</SAMP> prior to -using the programs in the package. See section <A HREF="gettext.html#SEC33">The Current <TT>`NLS'</TT> Matrix for GNU</A>. For example, -let's presume a German site. At the shell prompt, users merely have to -execute <SAMP>`setenv LANG de'</SAMP> (in <CODE>csh</CODE>) or <SAMP>`export -LANG; LANG=de'</SAMP> (in <CODE>sh</CODE>). They could even do this from their -<TT>`.login'</TT> or <TT>`.profile'</TT> file. - -</P> - - -<H1><A NAME="SEC36" HREF="gettext_toc.html#TOC36">The Programmer's View</A></H1> - -<P> -One aim of the current message catalog implementation provided by -GNU <CODE>gettext</CODE> was to use the system's message catalog handling, if the -installer wishes to do so. So we perhaps should first take a look at -the solutions we know about. The members of the POSIX committee did not -manage to agree on one of the semi-official standards which we'll -describe below. In fact they couldn't agree on anything, so they -decided only to include an example of an interface. The major Unix vendors -are split in the usage of the two most important specifications: X/Open's -catgets vs. Uniforum's gettext interface. We'll describe them both and -later explain our solution to this dilemma. - -</P> - - - -<H2><A NAME="SEC37" HREF="gettext_toc.html#TOC37">About <CODE>catgets</CODE></A></H2> - -<P> -The <CODE>catgets</CODE> implementation is defined in the X/Open Portability -Guide, Volume 3, XSI Supplementary Definitions, Chapter 5. But the -process of creating this standard seemed to be too slow for some of -the Unix vendors so they based their implementations on preliminary -versions of the standard. Of course this leads again to problems while -writing platform-independent programs: even the usage of <CODE>catgets</CODE> -does not guarantee a unique interface.
- -</P> -<P> -Another, personal comment on this: only a bunch of committee members -could have made this interface. They never really tried to program -using this interface. It is a fast, memory-saving implementation, and a -user can happily live with it. But programmers hate it (at least I and -some others do...) - -</P> -<P> -But we must not forget one point: after all the trouble with transferring -the rights on Unix(tm) they at last came to X/Open, the very same who -published this specification. This leads me to predict -that this interface will be in future Unix standards (e.g. Spec1170) and -therefore part of all Unix implementations (implementations, which are -<EM>allowed</EM> to bear this name). - -</P> - - - -<H3><A NAME="SEC38" HREF="gettext_toc.html#TOC38">The Interface</A></H3> - -<P> -The interface to the <CODE>catgets</CODE> implementation consists of three -functions which correspond to those used in file access: <CODE>catopen</CODE> -to open the catalog for use, <CODE>catgets</CODE> for accessing the message -tables, and <CODE>catclose</CODE> for closing after work is done. Prototypes -for the functions and the needed definitions are in the -<CODE><nl_types.h></CODE> header file. - -</P> -<P> -<CODE>catopen</CODE> is used like this: - -</P> - -<PRE> -nl_catd catd = catopen ("catalog_name", 0); -</PRE> - -<P> -The function takes as its first argument the name of the catalog. This usually -refers to the name of the program or the package. The second parameter -is not further specified in the standard. I don't even know whether it -is implemented consistently among various systems. So the common advice -is to use <CODE>0</CODE> as the value. The return value is a handle to the -message catalog, equivalent to the file handles returned by <CODE>open</CODE>.
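</P>
<P>
As a minimal sketch of how the three calls named above fit together, the
helper below opens a catalog, looks up one message and closes the catalog
again. The function <CODE>lookup_message</CODE> and its parameters are
invented for this example. It assumes the POSIX behavior that
<CODE>catopen</CODE> returns <CODE>(nl_catd) -1</CODE> on failure, and it
copies the result into a caller-supplied buffer because the string
obtained from the catalog should not be relied upon after the catalog is
closed.
</P>

```c
#include <assert.h>
#include <nl_types.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the message (set_no, msg_id) from the named
   catalog into out, or the fallback text if the catalog or the message
   cannot be found.  The result is truncated to fit outlen bytes. */
static void lookup_message(const char *catalog, int set_no, int msg_id,
                           const char *fallback, char *out, size_t outlen)
{
    nl_catd catd;
    const char *text = fallback;

    assert(catalog != NULL && fallback != NULL && out != NULL);
    if (outlen == 0)
        return;

    /* Second argument 0, following the common advice. */
    catd = catopen(catalog, 0);
    if (catd != (nl_catd) -1)
        text = catgets(catd, set_no, msg_id, fallback);

    /* Copy before closing; the returned string must not be modified. */
    strncpy(out, text, outlen - 1);
    out[outlen - 1] = '\0';

    if (catd != (nl_catd) -1)
        catclose(catd);
}
```

<P>
A caller would pass the original English string as the fallback, so that
the program still prints something sensible when no catalog is installed.
The individual calls are described in more detail in the rest of this
section.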
- -</P> -<P> -The handle returned by <CODE>catopen</CODE> is of course used in the <CODE>catgets</CODE> function, which can -be used like this: - -</P> - -<PRE> -char *translation = catgets (catd, set_no, msg_id, "original string"); -</PRE> - -<P> -The first parameter is this catalog descriptor. The second parameter -specifies the set of messages in this catalog, from which the message -identified by <CODE>msg_id</CODE> is obtained. <CODE>catgets</CODE> therefore uses -three-stage addressing: - -</P> - -<PRE> -catalog name => set number => message ID => translation -</PRE> - -<P> -The fourth argument is not used to address the translation. It is given -as a default value in case one of the addressing stages fails. One -important thing to remember is that although the return type of catgets -is <CODE>char *</CODE> the resulting string <EM>must not</EM> be changed. It -should rather be <CODE>const char *</CODE>, but the standard was published in -1988, one year before ANSI C. - -</P> -<P> -The last of these functions is used and behaves as expected: - -</P> - -<PRE> -catclose (catd); -</PRE> - -<P> -After this no <CODE>catgets</CODE> call using the descriptor is legal anymore. - -</P> - - -<H3><A NAME="SEC39" HREF="gettext_toc.html#TOC39">Problems with the <CODE>catgets</CODE> Interface?!</A></H3> - -<P> -Now that this description seems so easy, where are the -problems we speak of? In fact the interface could be used in a -reasonable way, but constructing the message catalogs is a pain. The -reason for this lies in the third argument of <CODE>catgets</CODE>: the unique -message ID. This has to be a numeric value for all messages in a single -set. Perhaps you could imagine the problems keeping such a list while -changing the source code. Add a new message here, remove one there. Of -course a lot of tools have been developed to help organize this -chaos but each of them fails in one aspect or another.
We don't -want to say that the other approach has no problems but they are far -easier to manage. - -</P> - - -<H2><A NAME="SEC40" HREF="gettext_toc.html#TOC40">About <CODE>gettext</CODE></A></H2> - -<P> -The definition of the <CODE>gettext</CODE> interface comes from a Uniforum -proposal and it is followed by at least one major Unix vendor -(Sun) in its latest developments. It is not specified in any official -standard, though. - -</P> -<P> -The main points about this solution are that it does not follow the -method of normal file handling (open-use-close) and that it does not -burden the programmer with so many tasks, especially the unique key handling. -Of course a unique key is needed here too, but this key is the -message itself (however long or short it is). See section <A HREF="gettext.html#SEC45">Comparing the Two Interfaces</A> for a -more detailed comparison of the two methods. - -</P> -<P> -The following section contains a rather detailed description of the -interface. We make it that detailed because this is the interface -we chose for the GNU <CODE>gettext</CODE> Library. Programmers interested -in using this library will be interested in this description. - -</P> - - - -<H3><A NAME="SEC41" HREF="gettext_toc.html#TOC41">The Interface</A></H3> - -<P> -The minimal functionality an interface must have is a) to select a -domain the strings are coming from (a single domain for all programs is -not reasonable because its construction and maintenance are difficult, -perhaps impossible) and b) to access a string in a selected domain. - -</P> -<P> -This is essentially the description of the <CODE>gettext</CODE> interface. It -has a global domain which unqualified usages reference. Of course this -domain is selectable by the user. - -</P> - -<PRE> -char *textdomain (const char *domain_name); -</PRE> - -<P> -This provides the possibility to change or query the current status of -the current global domain of the <CODE>LC_MESSAGES</CODE> category.
The -argument is a null-terminated string, whose characters must be legal -for use in filenames. If the <VAR>domain_name</VAR> argument is <CODE>NULL</CODE>, -the function returns the current value. If no value has been set -before, the name of the default domain is returned: <EM>messages</EM>. -Please note that although the return value of <CODE>textdomain</CODE> is of -type <CODE>char *</CODE> the string must not be changed. It is also important to know -that no check is made that the domain is available. If the name is not -available you will see this by the fact that no translations are provided. - -</P> -<P> -To use a domain set by <CODE>textdomain</CODE> the function - -</P> - -<PRE> -char *gettext (const char *msgid); -</PRE> - -<P> -is to be used. This is the simplest reasonable form one can imagine. -The translation of the string <VAR>msgid</VAR> is returned if it is available -in the current domain. If not available the argument itself is -returned. If the argument is <CODE>NULL</CODE> the result is undefined. - -</P> -<P> -One thing which should come to mind is that no explicit dependency on -the domain used is given. The current value of the domain for the -<CODE>LC_MESSAGES</CODE> locale is used. If this changes between two -executions of the same <CODE>gettext</CODE> call in the program, the two calls -reference different message catalogs. - -</P> -<P> -For the easiest case, which is normally used in internationalized GNU -packages, once at the beginning of execution a call to <CODE>textdomain</CODE> -is issued, setting the domain to a unique name, normally the package -name. In the code that follows, all strings which have to be translated are -filtered through the <CODE>gettext</CODE> function. That's all, the package speaks -your language. - -</P> - - -<H3><A NAME="SEC42" HREF="gettext_toc.html#TOC42">Solving Ambiguities</A></H3> - -<P> -While this single-name domain works well for most applications there -might be the need to get translations from more than one domain.
Of -course one could switch between different domains with calls to -<CODE>textdomain</CODE>, but this is really not convenient nor is it fast. A -possible situation is one being discussed at the time of this writing: all -error messages of functions in the set of commonly used functions should -go into a separate domain <CODE>error</CODE>. By this means we would only need -to translate them once. - -</P> -<P> -For this reason there are two more functions to retrieve strings: - -</P> - -<PRE> -char *dgettext (const char *domain_name, const char *msgid); -char *dcgettext (const char *domain_name, const char *msgid, - int category); -</PRE> - -<P> -Both take an additional argument in the first position, which corresponds -to the argument of <CODE>textdomain</CODE>. The third argument of -<CODE>dcgettext</CODE> allows the use of a locale category other than <CODE>LC_MESSAGES</CODE>. -But I really don't know where this can be useful. If the -<VAR>domain_name</VAR> is <CODE>NULL</CODE> or <VAR>category</VAR> has a value other than -the known ones, the result is undefined. It should also be noted that -this function is not part of the second known implementation of this -function family, the one found in Solaris. - -</P> -<P> -A second ambiguity can arise from the fact that perhaps more than one -domain has the same name. This can be solved by specifying where the -needed message catalog files can be found. - -</P> - -<PRE> -char *bindtextdomain (const char *domain_name, - const char *dir_name); -</PRE> - -<P> -Calling this function binds the given domain to a file in the specified -directory (how this file is determined follows below). In particular, a file in -the system's default place is no longer favored over the specified file -(as it would be by solely using <CODE>textdomain</CODE>). Passing a <CODE>NULL</CODE> -pointer for the <VAR>dir_name</VAR> parameter returns the binding associated -with <VAR>domain_name</VAR>.
If <VAR>domain_name</VAR> itself is <CODE>NULL</CODE> -nothing happens and a <CODE>NULL</CODE> pointer is returned. Here again, as -for all the other functions, the returned value must not -be changed! - -</P> - - -<H3><A NAME="SEC43" HREF="gettext_toc.html#TOC43">Locating Message Catalog Files</A></H3> - -<P> -Because many different languages for many different packages have to be -stored we need some way to attach this information to the message catalog -files. The way usually used in Unix environments is to encode it -in the file name. This is also done here. The directory name given in -<CODE>bindtextdomain</CODE>'s second argument (or the default directory), -followed by the value and name of the locale and the domain name are -concatenated: - -</P> - -<PRE> -<VAR>dir_name</VAR>/<VAR>locale</VAR>/LC_<VAR>category</VAR>/<VAR>domain_name</VAR>.mo -</PRE> - -<P> -The default value for <VAR>dir_name</VAR> is system-specific. For the GNU -library it's: - -<PRE> -/usr/local/share/locale -</PRE> - -<P> -<VAR>locale</VAR> is the value of the locale whose name is this -<CODE>LC_<VAR>category</VAR></CODE>. For <CODE>gettext</CODE> and <CODE>dgettext</CODE> this -locale is always <CODE>LC_MESSAGES</CODE>. <CODE>dcgettext</CODE> specifies the -locale by the third argument.<A NAME="DOCF2" HREF="gettext_foot.html#FOOT2">(2)</A> <A NAME="DOCF3" HREF="gettext_foot.html#FOOT3">(3)</A> - -</P> - - -<H3><A NAME="SEC44" HREF="gettext_toc.html#TOC44">Optimization of the *gettext functions</A></H3> - -<P> -At this point in the discussion we should talk about an advantage of the -GNU <CODE>gettext</CODE> implementation. Some readers might have pointed out -that an internationalized program might have poor performance if some -string has to be translated in an inner loop. While this is unavoidable -when the string varies from one run of the loop to the next, it is -simply a waste of time when the string is always the same.
Take the -following example: - -</P> - -<PRE> -{ - while (...) - { - puts (gettext ("Hello world")); - } -} -</PRE> - -<P> -When the locale selection does not change between two runs the resulting -string is always the same. One way to exploit this is: - -</P> - -<PRE> -{ - char *str = gettext ("Hello world"); - while (...) - { - puts (str); - } -} -</PRE> - -<P> -But this solution is not usable in all situations (e.g. when the locale -selection changes) nor is it very readable. - -</P> -<P> -The GNU C compiler, version 2.7 and above, provides another solution for -this. To describe this we show here some lines of the -<TT>`intl/libgettext.h'</TT> file. For an explanation of the expression -command block see section `Statements and Declarations in Expressions' in <CITE>The GNU CC Manual</CITE>. - -</P> - -<PRE> -# if defined __GNUC__ && __GNUC__ == 2 && __GNUC_MINOR__ >= 7 -# define dcgettext(domainname, msgid, category) \ - (__extension__ \ - ({ \ - char *result; \ - if (__builtin_constant_p (msgid)) \ - { \ - extern int _nl_msg_cat_cntr; \ - static char *__translation__; \ - static int __catalog_counter__; \ - if (! __translation__ \ - || __catalog_counter__ != _nl_msg_cat_cntr) \ - { \ - __translation__ = \ - dcgettext__ ((domainname), (msgid), (category)); \ - __catalog_counter__ = _nl_msg_cat_cntr; \ - } \ - result = __translation__; \ - } \ - else \ - result = dcgettext__ ((domainname), (msgid), (category)); \ - result; \ - })) -# endif -</PRE> - -<P> -The interesting thing here is the <CODE>__builtin_constant_p</CODE> predicate. -This is evaluated at compile time and so optimization can take place -immediately. Two cases are distinguished here. If the argument to -<CODE>gettext</CODE> is not a constant value, the function -<CODE>dcgettext__</CODE> is simply called; this is the real implementation of the -<CODE>dcgettext</CODE> function.
- -</P> -<P> -If the string argument <EM>is</EM> constant we can reuse the once gained -translation when the locale selection has not changed. This is exactly -what is done here. The <CODE>_nl_msg_cat_cntr</CODE> variable is defined in -the <TT>`loadmsgcat.c'</TT> which is available in <TT>`libintl.a'</TT> and is -changed whenever a new message catalog is loaded. - -</P> - - -<H2><A NAME="SEC45" HREF="gettext_toc.html#TOC45">Comparing the Two Interfaces</A></H2> - -<P> -The following discussion is perhaps a little bit colored. As said -above we implemented GNU <CODE>gettext</CODE> following the Uniforum -proposal and this surely has its reasons. But it should show how we -came to this decision. - -</P> -<P> -First we take a look at the developing process. When we write an -application using NLS provided by <CODE>gettext</CODE> we proceed as always. -Only when we come to a string which might be seen by the users and thus -has to be translated we use <CODE>gettext("...")</CODE> instead of -<CODE>"..."</CODE>. At the beginning of each source file (or in a central -header file) we define - -</P> - -<PRE> -#define gettext(String) (String) -</PRE> - -<P> -Even this definition can be avoided when the system supports the -<CODE>gettext</CODE> function in its C library. When we compile this code the -result is the same as if no NLS code is used. When you take a look at -the GNU <CODE>gettext</CODE> code you will see that we use <CODE>_("...")</CODE> -instead of <CODE>gettext("...")</CODE>. This reduces the number of -additional characters per translatable string to <EM>3</EM> (in words: -three). - -</P> -<P> -When now a production version of the program is needed we simply replace -the definition - -</P> - -<PRE> -#define _(String) (String) -</PRE> - -<P> -by - -</P> - -<PRE> -#include <libintl.h> -#define _(String) gettext (String) -</PRE> - -<P> -and include the header <TT>`libintl.h'</TT>. 
Additionally we run the -program <TT>`xgettext'</TT> on all source code files which contain -translatable strings and we are done. We have a running program which -does not depend on translations being available, but which can use any -that become available. - -</P> -<P> -The same procedure can be done for the <CODE>gettext_noop</CODE> invocations -(see section <A HREF="gettext.html#SEC17">Special Cases of Translatable Strings</A>). First you can define <CODE>gettext_noop</CODE> to a -no-op macro and later use the definition from <TT>`libintl.h'</TT>. Because -this name is not used in Sun's implementation of <TT>`libintl.h'</TT>, -you should consider the following code for your project: - -</P> - -<PRE> -#ifdef gettext_noop -# define N_(Str) gettext_noop (Str) -#else -# define N_(Str) (Str) -#endif -</PRE> - -<P> -<CODE>N_</CODE> is a short form similar to <CODE>_</CODE>. The <TT>`Makefile'</TT> in -the <TT>`po/'</TT> directory of GNU gettext knows by default both of the -mentioned short forms so you are invited to follow this proposal for -your own ease. - -</P> -<P> -Now to <CODE>catgets</CODE>. The main problem is the work for the -programmer. Every time he comes to a translatable string he has to -define a number (or a symbolic constant) which also has to be defined in -the message catalog file. He also has to take care of duplicate -entries, duplicate message IDs etc. If he wants to have the same -quality in the message catalog as the GNU <CODE>gettext</CODE> program -provides he also has to put the descriptive comments for the strings and -the location in all source code files in the message catalog. This is -nearly a Mission: Impossible. - -</P> -<P> -But there are also some points people might call advantages speaking for -<CODE>catgets</CODE>. If you have a single word in a string and this string -is used in different contexts it is likely that in one or the other -language the word has different translations.
Example: - -</P> - -<PRE> -printf ("%s: %d", gettext ("number"), number_of_errors); - -printf ("you should see %d %s", number_count, - number_count == 1 ? gettext ("number") : gettext ("numbers")); -</PRE> - -<P> -Here we have to translate the string <CODE>"number"</CODE> twice. Even -if you do not speak a language besides English it might be possible to -recognize that the two words have a different meaning. In German the -first appearance has to be translated to <CODE>"Anzahl"</CODE> and the second -to <CODE>"Zahl"</CODE>. - -</P> -<P> -Now you can say that this example is really esoteric. And you are -right! This is exactly how we felt about this problem and decided that -it does not weigh that much. The solution for the above problem could -be very easy: - -</P> - -<PRE> -printf (gettext ("number: %d"), number_of_errors); - -printf (number_count == 1 ? gettext ("you should see %d number") - : gettext ("you should see %d numbers"), - number_count); -</PRE> - -<P> -We believe that we can solve all conflicts with this method. If it is -difficult one can also consider changing one of the conflicting strings a -little bit. But it is not impossible to overcome. - -</P> -<P> -Translator note: It is perhaps appropriate here to tell those English -speaking programmers that the plural form of a noun cannot, in general, be formed by -appending a single `s'. Most other languages use different methods. So -you should at least use the method given in the above example. - -</P> -<P> -But I have been told that some languages have even more complex rules. -A good approach might be to consider methods like the one used for -<CODE>LC_TIME</CODE> in the POSIX.2 standard. - -</P> - - - -<H2><A NAME="SEC46" HREF="gettext_toc.html#TOC46">Using libintl.a in own programs</A></H2> - -<P> -Starting with version 0.9.4 the library <CODE>libintl.a</CODE> should be more -or less self-contained. I.e. you can use it in your own programs.
The -<TT>`Makefile'</TT> will put the header and the library in directories -selected using <CODE>$(prefix)</CODE>. - -</P> -<P> -One exception to the above is found on HP-UX systems. Here the C library -does not contain the <CODE>alloca</CODE> function (and the HP compiler does -not generate it inlined). But it is not intended to rewrite the whole -library just because of this dumb system. Instead include the -<CODE>alloca</CODE> function in all packages in which you use <CODE>libintl.a</CODE>. - -</P> - - - -<H2><A NAME="SEC47" HREF="gettext_toc.html#TOC47">Being a <CODE>gettext</CODE> grok</A></H2> - -<P> -To fully exploit the functionality of the GNU <CODE>gettext</CODE> library it -is surely helpful to read the source code. But for those who don't want -to spend that much time reading the (sometimes complicated) code here -is a list of comments: - -</P> - -<UL> -<LI>Changing the language at runtime - -For interactive programs it might be useful to offer a selection of the -used language at runtime. To understand how to do this one needs to know -how the used language is determined while executing the <CODE>gettext</CODE> -function. The method which is presented here only works correctly -with the GNU implementation of the <CODE>gettext</CODE> functions. It is not -possible with underlying <CODE>catgets</CODE> functions or <CODE>gettext</CODE> -functions from the system's C library. The exception is of course the -GNU C Library which uses the GNU gettext Library for message handling. - -In the function <CODE>dcgettext</CODE> at every call the current setting of -the highest priority environment variable is determined and used. -Highest priority means here the following list with decreasing -priority: - - -<OL> -<LI><CODE>LANGUAGE</CODE> - -<LI><CODE>LC_ALL</CODE> - -<LI><CODE>LC_xxx</CODE>, according to selected locale - -<LI><CODE>LANG</CODE> - -</OL> - -Afterwards the path is constructed using the found value and the -translation file is loaded if available.
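
The lookup just described can be sketched in C. This is only an illustration under stated assumptions, not the code of GNU <CODE>gettext</CODE> itself: the helper names <CODE>pick_locale_value</CODE> and <CODE>build_catalog_path</CODE> are hypothetical, the sketch ignores any further parsing the real library may do on the values it finds, and it builds the file name in the <VAR>dir_name</VAR>/<VAR>locale</VAR>/LC_<VAR>category</VAR>/<VAR>domain_name</VAR>.mo form given in the section on locating message catalog files.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libintl): return the value of the
   highest-priority environment variable that is set and non-empty,
   following the order LANGUAGE, LC_ALL, LC_xxx, LANG.  Returns NULL
   when none of them is set.  */
const char *
pick_locale_value (const char *category_name)
{
  const char *names[4];
  size_t i;

  names[0] = "LANGUAGE";
  names[1] = "LC_ALL";
  names[2] = category_name;     /* e.g. "LC_MESSAGES" */
  names[3] = "LANG";

  for (i = 0; i < 4; ++i)
    {
      const char *value = getenv (names[i]);
      if (value != NULL && value[0] != '\0')
        return value;
    }
  return NULL;
}

/* Hypothetical helper: write
   dir_name/locale/category_name/domain_name.mo into BUF.
   Returns 0 on success, -1 if no locale value was found or if BUF
   is too small.  */
int
build_catalog_path (char *buf, size_t size, const char *dir_name,
                    const char *category_name, const char *domain_name)
{
  const char *locale = pick_locale_value (category_name);
  int n;

  if (locale == NULL)
    return -1;

  n = snprintf (buf, size, "%s/%s/%s/%s.mo",
                dir_name, locale, category_name, domain_name);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Because the environment is consulted on every call, a later change of, say, <CODE>LANGUAGE</CODE> yields a different path the next time the helper runs, which mirrors the behaviour attributed to <CODE>dcgettext</CODE> above.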
- -What happens when the value of, say, <CODE>LANGUAGE</CODE> changes? According -to the process explained above the new value of this variable is found -as soon as the <CODE>dcgettext</CODE> function is called. But this also means -the (perhaps) different message catalog file is loaded. In other -words: the used language is changed. - -But there is one little catch. The code for gcc-2.7.0 and up provides -some optimization. This optimization normally prevents the calling of -the <CODE>dcgettext</CODE> function as long as no new catalog is loaded. But -if <CODE>dcgettext</CODE> is not called the program also cannot notice that the -<CODE>LANGUAGE</CODE> variable has changed (see section <A HREF="gettext.html#SEC44">Optimization of the *gettext functions</A>). But the -solution is very easy. Include the following code in the language -switching function. - - -<PRE> - /* Change language. */ - setenv ("LANGUAGE", "fr", 1); - - /* Make change known. */ - { - extern int _nl_msg_cat_cntr; - ++_nl_msg_cat_cntr; - } -</PRE> - -The variable <CODE>_nl_msg_cat_cntr</CODE> is defined in <TT>`loadmsgcat.c'</TT>. - -</UL> - - - -<H2><A NAME="SEC48" HREF="gettext_toc.html#TOC48">Temporary Notes for the Programmers Chapter</A></H2> - - - -<H3><A NAME="SEC49" HREF="gettext_toc.html#TOC49">Temporary - Two Possible Implementations</A></H3> - -<P> -There are two competing methods for language independent messages: -the X/Open <CODE>catgets</CODE> method, and the Uniforum <CODE>gettext</CODE> -method. The <CODE>catgets</CODE> method indexes messages by integers; the -<CODE>gettext</CODE> method indexes them by their original English strings. -The <CODE>catgets</CODE> method has been around longer and is supported -by more vendors. The <CODE>gettext</CODE> method is supported by Sun, -and it has been heard that the COSE multi-vendor initiative is -supporting it. Neither method is a POSIX standard; the POSIX.1 -committee had a lot of disagreement in this area.
- -</P> -<P> -Neither one is in the POSIX standard. There was much disagreement -in the POSIX.1 committee about using the <CODE>gettext</CODE> routines -vs. <CODE>catgets</CODE> (XPG). In the end the committee couldn't -agree on anything, so no messaging system was included as part -of the standard. I believe the informative annex of the standard -includes the XPG3 messaging interfaces, "...as an example of -a messaging system that has been implemented..." - -</P> -<P> -They were very careful not to say anywhere that you should use one -set of interfaces over the other. For more on this topic please -see the Programming for Internationalization FAQ. - -</P> - - -<H3><A NAME="SEC50" HREF="gettext_toc.html#TOC50">Temporary - About <CODE>catgets</CODE></A></H3> - -<P> -There have been a few discussions of late on the use of -<CODE>catgets</CODE> as a base. I think it important to present both -sides of the argument and hence am opting to play devil's advocate -for a little bit. - -</P> -<P> -I'll not deny the fact that <CODE>catgets</CODE> could have been designed -a lot better. It currently has quite a number of limitations and -these have already been pointed out. - -</P> -<P> -However there is a great deal to be said for consistency and -standardization. A common recurring problem when writing Unix -software is the myriad portability problems across Unix platforms. -It seems as if every Unix vendor had a look at the operating system -and found parts they could improve upon. Undoubtedly, these -modifications are probably innovative and solve real problems. -However, software developers have a hard time keeping up with all -these changes across so many platforms. - -</P> -<P> -And this has prompted the Unix vendors to begin to standardize their -systems. Hence the impetus for Spec1170. 
Every major Unix vendor -has committed to supporting this standard and every Unix software -developer waits with glee the day they can write software to this -standard and simply recompile (without having to use autoconf) -across different platforms. - -</P> -<P> -As I understand it, Spec1170 is roughly based upon version 4 of the -X/Open Portability Guidelines (XPG4). Because <CODE>catgets</CODE> and -friends are defined in XPG4, I'm led to believe that <CODE>catgets</CODE> -is a part of Spec1170 and hence will become a standardized component -of all Unix systems. - -</P> - - -<H3><A NAME="SEC51" HREF="gettext_toc.html#TOC51">Temporary - Why a single implementation</A></H3> - -<P> -Now it seems kind of wasteful to me to have two different systems -installed for accessing message catalogs. If we do want to remedy -<CODE>catgets</CODE> deficiencies why don't we try to expand <CODE>catgets</CODE> -(in a compatible manner) rather than implement an entirely new system. -Otherwise, we'll end up with two message catalog access systems -installed with an operating system - one set of routines for GNU -software, and another set of routines (catgets) for all other software. -Bloated? - -</P> -<P> -Supposing another catalog access system is implemented. Which do -we recommend? At least for Linux, we need to attract as many -software developers as possible. Hence we need to make it as easy -for them to port their software as possible. Which means supporting -<CODE>catgets</CODE>. We will be implementing the <CODE>glocale</CODE> code -within our <CODE>libc</CODE>, but does this mean we also have to incorporate -another message catalog access scheme within our <CODE>libc</CODE> as well? -And what about people who are going to be using the <CODE>glocale</CODE> -+ non-<CODE>catgets</CODE> routines. 
When they port their software to -other platforms, they're now going to have to include the front-end -(<CODE>glocale</CODE>) code plus the back-end code (the non-<CODE>catgets</CODE> -access routines) with their software instead of just including the -<CODE>glocale</CODE> code with their software. - -</P> -<P> -Message catalog support is however only the tip of the iceberg. -What about the data for the other locale categories. They also have -a number of deficiencies. Are we going to abandon them as well and -develop another duplicate set of routines (should <CODE>glocale</CODE> -expand beyond message catalog support)? - -</P> -<P> -Like many parts of Unix that can be improved upon, we're stuck with balancing -compatibility with the past with useful improvements and innovations for -the future. - -</P> - - -<H3><A NAME="SEC52" HREF="gettext_toc.html#TOC52">Temporary - Double layer solution</A></H3> - -<P> -GNU locale implements a <CODE>gettext</CODE>-style interface on top of a -<CODE>catgets</CODE>-style interface. - -</P> -<P> -This is not needless complexity. It is absolutely vital, because -it enables <CODE>gettext</CODE> to run on top of <CODE>catgets</CODE>, which -enables Linux International to recommend users use it <EM>today</EM>. - -</P> -<P> -Rewriting <CODE>gettext</CODE> so that it could use <EM>either</EM> -<CODE>catgets</CODE> <EM>or</EM> some simpler mechanism would not break -anything, but would not reduce complexity either. It might be -worth doing, but it isn't urgent. - -</P> -<P> -In general, simplicity is not enough of a reason to rewrite a -program that works. Simplicity is just one desirable thing. -It is not overridingly important. - -</P> - - -<H3><A NAME="SEC53" HREF="gettext_toc.html#TOC53">Temporary - Notes</A></H3> - -<P> -X/Open agreed very late on the standard form so that many -implementations differ from the final form. Both of my system (old -Linux catgets and Ultrix-4) have a strange variation. - -</P> -<P> -OK. 
After incorporating the last changes I have to spend some time on -making the GNU/Linux libc gettext functions. So in future Solaris is -not the only system having gettext. - -</P> - - -<H1><A NAME="SEC54" HREF="gettext_toc.html#TOC54">The Translator's View</A></H1> - - - -<H2><A NAME="SEC55" HREF="gettext_toc.html#TOC55">Introduction 0</A></H2> - -<P> -GNU is going international! The GNU Translation Project is a way -to get maintainers, translators and users all together, so GNU will -gradually become able to speak many native languages. - -</P> -<P> -The GNU <CODE>gettext</CODE> tool set contains <EM>everything</EM> maintainers -need for internationalizing their packages for messages. It also -contains quite useful tools for helping translators at localizing -messages to their native language, once a package has already been -internationalized. - -</P> -<P> -To achieve the GNU Translation Project, we need many interested -people who like their own language and write it well, and who are also -able to synergize with other translators speaking the same language. -If you'd like to volunteer to <EM>work</EM> at translating messages, -please send mail to your translating team. - -</P> -<P> -Each team has its own mailing list, courtesy of Linux -International. You may reach your translating team at the address -<TT>`<VAR>ll</VAR>@li.org'</TT>, replacing <VAR>ll</VAR> by the two-letter ISO 639 -code for your language. Language codes are <EM>not</EM> the same as -country codes given in ISO 3166. 
The following translating teams -exist: - -</P> - -<BLOCKQUOTE> -<P> -Chinese <CODE>zh</CODE>, Czech <CODE>cs</CODE>, Danish <CODE>da</CODE>, Dutch <CODE>nl</CODE>, -Esperanto <CODE>eo</CODE>, Finnish <CODE>fi</CODE>, French <CODE>fr</CODE>, Irish -<CODE>ga</CODE>, German <CODE>de</CODE>, Greek <CODE>el</CODE>, Italian <CODE>it</CODE>, -Japanese <CODE>ja</CODE>, Indonesian <CODE>in</CODE>, Norwegian <CODE>no</CODE>, Polish -<CODE>pl</CODE>, Portuguese <CODE>pt</CODE>, Russian <CODE>ru</CODE>, Spanish <CODE>es</CODE>, -Swedish <CODE>sv</CODE> and Turkish <CODE>tr</CODE>. -</BLOCKQUOTE> - -<P> -For example, you may reach the Chinese translating team by writing to -<TT>`zh@li.org'</TT>. When you become a member of the translating team -for your own language, you may subscribe to its list. For example, -Swedish people can send a message to <TT>`sv-request@li.org'</TT>, -having this message body: - -</P> - -<PRE> -subscribe -</PRE> - -<P> -Keep in mind that team members should be interested in <EM>working</EM> -at translations, or at solving translational difficulties, rather than -merely lurking around. If your team does not exist yet and you want to -start one, please write to <TT>`gnu-translation@prep.ai.mit.edu'</TT>; -you will then reach the GNU coordinator for all translator teams. - -</P> -<P> -A handful of GNU packages have already been adapted and provided -with message translations for several languages. Translation -teams have begun to organize, using these packages as a starting -point. But there are many more packages and many languages for -which we have no volunteer translators. If you would like to -volunteer to work at translating messages, please send mail to -<TT>`gnu-translation@prep.ai.mit.edu'</TT> indicating what language(s) -you can work on. - -</P> - - -<H2><A NAME="SEC56" HREF="gettext_toc.html#TOC56">Introduction 1</A></H2> - -<P> -This is now official, GNU is going international! 
Here is the -announcement submitted for the January 1995 GNU Bulletin: - -</P> - -<BLOCKQUOTE> -<P> -A handful of GNU packages have already been adapted and provided -with message translations for several languages. Translation -teams have begun to organize, using these packages as a starting -point. But there are many more packages and many languages -for which we have no volunteer translators. If you'd like to -volunteer to work at translating messages, please send mail to -<SAMP>`gnu-translation@prep.ai.mit.edu'</SAMP> indicating what language(s) -you can work on. -</BLOCKQUOTE> - -<P> -This document should answer many questions for those who are curious -about the process or would like to contribute. Please at least skim -over it; we hope this will cut down a little of the high volume of email -generated by this collective effort towards GNU internationalization. - -</P> -<P> -GNU programming is done in English, and currently, English is used -as the main communicating language between national communities -collaborating on the GNU project. This very document is written -in English. This will not change in the foreseeable future. - -</P> -<P> -However, there is a strong appetite from national communities for -having more software able to write using national language and habits, -and there is an on-going effort to modify GNU software in such a way -that it becomes able to do so. The experiments conducted so far raised -an enthusiastic response from pretesters, so we believe that GNU -internationalization is destined to succeed. - -</P> -<P> -To suggest clarifications, additions or corrections to this -document, please email <TT>`gnu-translation@prep.ai.mit.edu'</TT>. - -</P> - - -<H2><A NAME="SEC57" HREF="gettext_toc.html#TOC57">Discussions</A></H2> - -<P> -Facing this internationalization effort, a few users expressed their -concerns. Some of these doubts are presented and discussed here.
- -</P> - -<UL> -<LI>Smaller groups - -Some languages are not spoken by a very large number of people, -so people speaking them sometimes consider that there may not be -all that much demand for such versions of GNU packages. Moreover, many -people being <EM>into computers</EM>, in some countries, generally seem -to prefer English versions of their software. - -On the other hand, people might enjoy their own language a lot, and -be very motivated to give themselves the pleasure of having -their beloved GNU software speaking their mother tongue. They do -themselves a personal favor, and do not pay that much attention to -the number of people benefiting from their work. - -<LI>Misinterpretation - -Other users are shy to push forward their own language, seeing in this -some kind of misplaced propaganda. Someone thought there must be some -users of the language over the networks pestering other people with it. - -But any spoken language is worth localization, because there are -people behind the language for whom the language is important and -dear to their hearts. - -<LI>Odd translations - -The biggest problem is to find the right translations so that -everybody can understand the messages. Translations are usually a -little odd. Some people get used to English, to the extent they may -find translations into their own language "rather pushy, obnoxious -and sometimes even hilarious." As a French speaking man, I have -the experience of those instruction manuals for goods, so poorly -translated into French in Korea or Taiwan... - -The fact is that we sometimes have to create a kind of national -computer culture, and this is not easy without the collaboration of -many people liking their mother tongue. This is why translations are -better achieved by people knowing and loving their own language, and -ready to work together at improving the results they obtain.
- -<LI>Dependencies over the GPL - -Some people wonder if using GNU <CODE>gettext</CODE> necessarily brings their package -under the protective wing of the GNU General Public License, when they -do not want to make their program free, or want other kinds of freedom. -The simplest answer is yes. - -The mere marking of localizable strings in a package, or conditional -inclusion of a few lines for initialization, is not really including -GPL'ed code. However, the localization routines themselves are under -the GPL and would bring the remainder of the package under the GPL -if they were distributed with it. So, I presume that, for those -for which this is a problem, it could be circumvented by letting to -the end installers the burden of assembling a package prepared for -localization, but not providing the localization routines themselves. - -</UL> - - - -<H2><A NAME="SEC58" HREF="gettext_toc.html#TOC58">Organization</A></H2> - -<P> -On a larger scale, the true solution would be to organize some kind of -fairly precise set up in which volunteers could participate. I gave -some thought to this idea lately, and realize there will be some -touchy points. I thought of writing to Richard Stallman to launch -such a project, but feel it might be good to shake out the ideas -between ourselves first. Most probably that Linux International has -some experience in the field already, or would like to orchestrate -the volunteer work, maybe. Food for thought, in any case! - -</P> -<P> -I guess we have to setup something early, somehow, that will help -many possible contributors of the same language to interlock and avoid -work duplication, and further be put in contact for solving together -problems particular to their tongue (in most languages, there are many -difficulties peculiar to translating technical English). My Swedish -contributor acknowledged these difficulties, and I'm well aware of -them for French. 
- -</P> -<P> -This is surely not a technical issue, but we should manage so the -effort of locale contributors be maximally useful, despite the national -team layer interface between contributors and maintainers. - -</P> -<P> -GNU needs some setup for coordinating language coordinators. -Localizing evolving GNU programs will surely become a permanent -and continuous activity in GNU, once started. The setup should be -minimally completed and tested before GNU <CODE>gettext</CODE> becomes an official -reality. The email address <TT>`gnu-translation@prep.ai.mit.edu'</TT> -has been setup for receiving offers from volunteers and general -email on these topics. This address reaches the GNU Translation -Project coordinator. - -</P> - - - -<H3><A NAME="SEC59" HREF="gettext_toc.html#TOC59">Central Coordination</A></H3> - -<P> -I also think GNU will need sooner than it thinks, that someone setup -a way to organize and coordinate these groups. Some kind of group -of groups. My opinion is that it would be good that GNU delegate -this task to a small group of collaborating volunteers, shortly. -Perhaps in <TT>`gnu.announce'</TT> a list of this national committee's -can be published. - -</P> -<P> -My role as coordinator would simply be to refer to Ulrich any German -speaking volunteer interested to localization of GNU programs, and -maybe helping national groups to initially organize, while maintaining -national registries for until national groups are ready to take over. -In fact, the coordinator should ease volunteers to get in contact with -one another for creating national teams, which should then select -one coordinator per language, or country (regionalized language). -If well done, the coordination should be useful without being an -overwhelming task, the time to put delegations in place. - -</P> - - -<H3><A NAME="SEC60" HREF="gettext_toc.html#TOC60">National Teams</A></H3> - -<P> -I suggest we look for volunteer coordinators/editors for individual -languages. 
These people will scan contributions of translation files -for various programs, for their own languages, and will ensure high -and uniform standards of diction. - -</P> -<P> -From my recent experience with other people, those who -provide localizations are very enthusiastic about the process, and are -more interested in the localization process than in the program they -localize, and want to do many programs, not just one. This seems -to confirm that having a coordinator/editor for each language is a -good idea. - -</P> -<P> -We need to choose someone who is good at writing clear and concise -prose in the language in question. That is hard--we can't check -it ourselves. So we need to ask a few people to judge each other's -writing and select the one who is best. - -</P> -<P> -I announced my prerelease to a few dozen people, and you would not -believe all the discussions it has generated already. I shudder to think -what will happen when this is launched for real, officially, -worldwide. Who am I to arbitrate between two Czechoslovak users -contradicting each other, for example? - -</P> -<P> -I assume that your German is not much better than my French so that -I would not be able to judge these formulations. What I would -suggest is that for each language there is a group of people who -maintain the PO files and judge changes. I suspect there will -be cultural differences between how such groups of people will behave. -Some will have relaxed ways, reach consensus easily, and have anyone -of the group relate to the maintainers, while others will fight to -death, organize heavy administrations up to national standards, and -use strict channels. - -</P> -<P> -The German team is setting a good example. Right now, they are -maybe half a dozen people revising translations of each other and -discussing the linguistic issues. I do not even have all the names. -Ulrich Drepper is taking care of coordinating the German team.
-He subscribed to all my pretest lists, so I do not even have to warn -him specifically of incoming releases. - -</P> -<P> -I'm sure that it is a good idea to get teams for each language working -on translations. That will make the translations better and more -consistent. - -</P> - - - -<H4><A NAME="SEC61" HREF="gettext_toc.html#TOC61">Sub-Cultures</A></H4> - -<P> -Taking French for example, there are a few sub-cultures around -computers which developed diverging vocabularies. Picking volunteers -here and there without addressing this problem in an organized way, -early in the project, might produce a distasteful mix of GNU programs, -and possibly trigger endless quarrels among those who really care. - -</P> -<P> -Keeping some kind of unity in the way French localization of GNU -programs is achieved is a difficult (and delicate) job. Knowing the -Latin character of French people (:-), if we take this the wrong -way, we could end up nowhere, or waste a lot of energy. Maybe we -should begin to address this problem seriously <EM>before</EM> GNU -<CODE>gettext</CODE> becomes officially published. And I suspect that this -means soon! - -</P> - - -<H4><A NAME="SEC62" HREF="gettext_toc.html#TOC62">Organizational Ideas</A></H4> - -<P> -I expect the next big changes after the official release. Please note -that I use the German translation of the short GPL message. We need -to set a few good examples before the localization goes out for real -in GNU. Here are a few points to discuss: - -</P> - -<UL> -<LI> - -Each group should have one FTP server (at least one master). - -<LI> - -The files on the server should reflect the latest version (of -course!) and it should also contain an RCS directory with the -corresponding archives (I don't have this now). - -<LI> - -There should also be a ChangeLog file (this is more useful than the -RCS archive but can be generated automatically from the latter by -Emacs).
- -<LI> - -A <STRONG>core group</STRONG> should judge about questionable changes (for now -this group consists solely by me but I ask some others occasionally; -this also seems to work). - -</UL> - - - -<H3><A NAME="SEC63" HREF="gettext_toc.html#TOC63">Mailing Lists</A></H3> - -<P> -If we get any inquiries about GNU <CODE>gettext</CODE>, send them on to: - -</P> - -<PRE> -<TT>`gnu-translation@prep.ai.mit.edu'</TT> -</PRE> - -<P> -The <TT>`*-pretest'</TT> lists are quite useful to me, maybe the idea could -be generalized to all GNU packages. But each maintainer his/her way! - -</P> -<P> -, we have a mechanism in place here at -<TT>`gnu.ai.mit.edu'</TT> to track teams, support mailing lists for -them and log members. We have a slight preference that you use it. -If this is OK with you, I can get you clued in. - -</P> -<P> -Things are changing! A few years ago, when Daniel Fekete and I -asked for a mailing list for GNU localization, nested at the FSF, we -were politely invited to organize it anywhere else, and so did we. -For communicating with my pretesters, I later made a handful of -mailing lists located at iro.umontreal.ca and administrated by -<CODE>majordomo</CODE>. These lists have been <EM>very</EM> dependable -so far... - -</P> -<P> -I suspect that the German team will organize itself a mailing list -located in Germany, and so forth for other countries. But before they -organize for true, it could surely be useful to offer mailing lists -located at the FSF to each national team. So yes, please explain me -how I should proceed to create and handle them. - -</P> -<P> -We should create temporary mailing lists, one per country, to help -people organize. Temporary, because once regrouped and structured, it -would be fair the volunteers from country bring back <EM>their</EM> list -in there and manage it as they want. My feeling is that, in the long -run, each team should run its own list, from within their country. 
-There also should be some central list to which all teams could -subscribe as they see fit, as long as each team is represented in it. - -</P> - - -<H2><A NAME="SEC64" HREF="gettext_toc.html#TOC64">Information Flow</A></H2> - -<P> -There will surely be some discussion about these messages after the -packages are finally released. If people now send you some proposals -for better messages, how do you proceed? Jim, please note that -right now, as I put forward nearly a dozen localizable programs, I -receive both the translations and the coordination concerns about them. - -</P> -<P> -If I put one of my things to pretest, Ulrich receives the announcement -and passes it on to the German team, who make last minute revisions. -Then he submits the translation files to me <EM>as the maintainer</EM>. -For GNU packages I do not maintain, I would not even hear about it. -This scheme could be made to work GNU-wide, I think. For security -reasons, maybe Ulrich (national coordinators, in fact) should update -a central registry kept by GNU (Jim, me, or Len's recruits) once in -a while. - -</P> -<P> -In December/January, I was aggressively ready to internationalize -all of GNU, giving myself the duty of one small GNU package per week -or so, taking many weeks or months for bigger packages. But it does -not work this way. I first did all the things I'm responsible for. -I've nothing against some missionary work on other maintainers, but -I'm also losing a lot of energy over it--same debates over again. - -</P> -<P> -And when the first localized packages are released we'll get a lot of -responses about ugly translations :-). Surely, and we need to have -beforehand a fairly good idea about how to handle the information -flow between the national teams and the package maintainers. - -</P> -<P> -Please start saving somewhere a quick history of each PO file. I know -for sure that the file format will change, allowing for comments.
-It would be nice if each file had a kind of log, and references for -those who want to submit comments or gripes, or otherwise contribute. -I sent a proposal for a fast and flexible format, but it has not -yet been accepted by the GNU deciders. I'll tell you when I -have more information about this. - -</P> - - -<H1><A NAME="SEC65" HREF="gettext_toc.html#TOC65">The Maintainer's View</A></H1> - -<P> -The maintainer of a package has many responsibilities. One of them -is ensuring that the package will install easily on many platforms, -and that the magic we described earlier (see section <A HREF="gettext.html#SEC32">The User's View</A>) will work -for installers and end users. - -</P> -<P> -Of course, there are many possible ways by which GNU <CODE>gettext</CODE> -might be integrated in a distribution, and this chapter does not cover -them in all generality. Instead, it details one possible approach -which is especially adequate for many GNU distributions, because -GNU <CODE>gettext</CODE> is meant to help the internationalization -of the whole GNU project. So, the maintainer's view presented here -presumes that the package already has a <TT>`configure.in'</TT> file and -uses Autoconf. - -</P> -<P> -Nevertheless, GNU <CODE>gettext</CODE> may surely be useful for non-GNU -packages, but the maintainers of such packages might have to show -imagination and initiative in organizing their distributions so that -<CODE>gettext</CODE> works for them in all situations. There are surely -many, out there. - -</P> -<P> -Even if <CODE>gettext</CODE> methods are now stabilizing, slight adjustments -might be needed between successive <CODE>gettext</CODE> versions, so you -should ideally revise this chapter in subsequent releases, looking -for changes.
- -</P> - - - -<H2><A NAME="SEC66" HREF="gettext_toc.html#TOC66">Flat or Non-Flat Directory Structures</A></H2> - -<P> -Some GNU packages are distributed as <CODE>tar</CODE> files which unpack -in a single directory; these are said to be <STRONG>flat</STRONG> distributions. -Other GNU packages have a one level hierarchy of subdirectories, using -for example a subdirectory named <TT>`doc/'</TT> for the Texinfo manual and -man pages, another called <TT>`lib/'</TT> for holding functions meant to -replace or complement C libraries, and a subdirectory <TT>`src/'</TT> for -holding the proper sources for the package. These other distributions -are said to be <STRONG>non-flat</STRONG>. - -</P> -<P> -For now, we cannot say much about flat distributions. A flat -directory structure has the disadvantage of increasing the difficulty -of updating to a new version of GNU <CODE>gettext</CODE>. Also, if you have -many PO files, this could somewhat pollute your single directory. -In the GNU <CODE>gettext</CODE> distribution, the <TT>`misc/'</TT> directory -contains a shell script named <TT>`combine-sh'</TT>. That script may -be used for combining all the C files of the <TT>`intl/'</TT> directory -into a pair of C files (one <TT>`.c'</TT> and one <TT>`.h'</TT>). Those two -generated files would fit more easily in a flat directory structure, -and you will then have to add these two files to your project. - -</P> -<P> -Maybe because GNU <CODE>gettext</CODE> itself has a non-flat structure, -we have more experience with this approach, and this is what will be -described in the remainder of this chapter. Some maintainers might -use this as an opportunity to unflatten their package structure. -Only later, once we have gained more experience adapting GNU <CODE>gettext</CODE> -to flat distributions, we might add some notes about how to proceed -in flat situations.
- -</P> - - -<H2><A NAME="SEC67" HREF="gettext_toc.html#TOC67">Prerequisite Works</A></H2> - -<P> -There are some works which are required for using GNU <CODE>gettext</CODE> -in one of your packages. These works have some kind of generality -that escapes the point by point descriptions used in the remainder -of this chapter. So, we describe them here. - -</P> - -<UL> -<LI> - -Before attempting to use <CODE>gettextize</CODE> you should install some other packages first. -Ensure that recent versions of GNU <CODE>m4</CODE>, GNU Autoconf and GNU -<CODE>gettext</CODE> are already installed at your site, and if not, proceed -to do this first. If you have to install these things, beware that -GNU <CODE>m4</CODE> must be fully installed before GNU Autoconf is even -<EM>configured</EM>. - -Those three packages are only needed by you, as a maintainer; the -installers of your own package and end users do not really need any -of GNU <CODE>m4</CODE>, GNU Autoconf or GNU <CODE>gettext</CODE> for successfully -installing and running your package, with messages properly translated. -But this is not completely true if you provide internationalized -shell scripts within your own package: GNU <CODE>gettext</CODE> shall -then be installed at the user site if the end users want to see the -translation of shell script messages. - -<LI> - -Your package should use Autoconf and have a <TT>`configure.in'</TT> file. -If it does not, you have to learn how. The Autoconf documentation -is quite well written; it is a good idea that you print it and get -familiar with it. - -<LI> - -Your C sources should have already been modified according to -instructions given earlier in this manual. See section <A HREF="gettext.html#SEC13">Preparing Program Sources</A>. - -<LI> - -Your <TT>`po/'</TT> directory should receive all PO files submitted to you -by the translator teams, each having <TT>`<VAR>ll</VAR>.po'</TT> as a name.
-It is not usually easy to get translation -work done before your package gets internationalized and available! -Since the cycle has to start somewhere, the easiest for the maintainer -is to start with absolutely no PO files, and wait until various -translator teams get interested in your package, and submit PO files. - -</UL> - -<P> -It is worth adding here a few words about how the maintainer should -ideally behave with PO file submissions. As a maintainer, your -role is to authenticate the origin of the submission as being the -representative of the appropriate GNU translating team (forward the -submission to <TT>`gnu-translation@prep.ai.mit.edu'</TT> in case of -doubt), to ensure that the PO file format is not severely broken and -does not prevent successful installation, and for the rest, merely -to put these PO files in <TT>`po/'</TT> for distribution. - -</P> -<P> -As a maintainer, you do not have to take on your shoulders the -responsibility of checking if the translations are adequate or -complete, and should avoid diving into linguistic matters. Translation -teams drive themselves and are fully responsible for their linguistic -choices for GNU. Keep in mind that translator teams are <EM>not</EM> -driven by maintainers. You can help by carefully redirecting all -communications and reports from users about linguistic matters to the -appropriate translation team, or explain to users how to reach or join -their team. The simplest might be to send them the <TT>`NLS'</TT> file. - -</P> -<P> -Maintainers should <EM>never ever</EM> apply PO file bug reports -themselves, short-cutting translation teams. If some translator has -difficulty getting some of her points through her team, it should not be -an option for her to directly negotiate translations with maintainers. -Teams ought to settle their problems themselves, if any. If you, as -a maintainer, ever think there is a real problem with a team, please -never try to <EM>solve</EM> a team's problem on your own.
- -</P> - - -<H2><A NAME="SEC68" HREF="gettext_toc.html#TOC68">Invoking the <CODE>gettextize</CODE> Program</A></H2> - -<P> -Some files are consistently and identically needed in every package -internationalized through GNU <CODE>gettext</CODE>. As a matter of -convenience, the <CODE>gettextize</CODE> program puts all these files right -in your package. This program has the following synopsis: - -</P> - -<PRE> -gettextize [ <VAR>option</VAR>... ] [ <VAR>directory</VAR> ] -</PRE> - -<P> -and accepts the following options: - -</P> -<DL COMPACT> - -<DT><SAMP>`-f'</SAMP> -<DD> -<DT><SAMP>`--force'</SAMP> -<DD> -Force replacement of files which already exist. - -<DT><SAMP>`-h'</SAMP> -<DD> -<DT><SAMP>`--help'</SAMP> -<DD> -Display this help and exit. - -<DT><SAMP>`--version'</SAMP> -<DD> -Output version information and exit. - -</DL> - -<P> -If <VAR>directory</VAR> is given, this is the top level directory of a -package to prepare for using GNU <CODE>gettext</CODE>. If not given, it -is assumed that the current directory is the top level directory of -such a package. - -</P> -<P> -The program <CODE>gettextize</CODE> provides the following files. However, -no existing file will be replaced unless the option <CODE>--force</CODE> -(<CODE>-f</CODE>) is specified. - -</P> - -<OL> -<LI> - -The <TT>`NLS'</TT> file is copied in the main directory of your package, -the one being at the top level. This file gives the main indications -about how to install and use the Native Language Support features -of your program. You might elect to use a more recent copy of this -<TT>`NLS'</TT> file than the one provided through <CODE>gettextize</CODE>, if -you have one handy. You may also fetch a more recent copy of file -<TT>`NLS'</TT> from most GNU archive sites. - -<LI> - -A <TT>`po/'</TT> directory is created for eventually holding -all translation files, but initially only containing the file -<TT>`po/Makefile.in.in'</TT> from the GNU <CODE>gettext</CODE> distribution. 
-(beware the double <SAMP>`.in'</SAMP> in the file name). If the <TT>`po/'</TT> -directory already exists, it will be preserved along with the files -it contains, and only <TT>`Makefile.in.in'</TT> will be overwritten. - -<LI> - -A <TT>`intl/'</TT> directory is created and filled with most of the files -originally in the <TT>`intl/'</TT> directory of the GNU <CODE>gettext</CODE> -distribution. Also, if option <CODE>--force</CODE> (<CODE>-f</CODE>) is given, -the <TT>`intl/'</TT> directory is emptied first. - -</OL> - -<P> -If your site supports symbolic links, <CODE>gettextize</CODE> will not -actually copy the files into your package, but establish symbolic -links instead. This avoids duplicating the disk space needed in -all packages. Merely using the <SAMP>`-h'</SAMP> option while creating the -<CODE>tar</CODE> archive of your distribution will resolve each link by an -actual copy in the distribution archive. So, to insist, you really -should use the <SAMP>`-h'</SAMP> option with <CODE>tar</CODE> within the <CODE>dist</CODE> -goal of your main <TT>`Makefile.in'</TT>. - -</P> -<P> -It is interesting to understand that most new files for supporting -GNU <CODE>gettext</CODE> facilities in one package go in <TT>`intl/'</TT> -and <TT>`po/'</TT> subdirectories. One distinction between these two -directories is that <TT>`intl/'</TT> is meant to be completely identical -in all packages using GNU <CODE>gettext</CODE>, while all newly created -files, which have to be different, go into <TT>`po/'</TT>. There is a -common <TT>`Makefile.in.in'</TT> in <TT>`po/'</TT>, because the <TT>`po/'</TT> -directory needs its own <TT>`Makefile'</TT>, and it has been designed so -it can be identical in all packages.
- -</P> - - -<H2><A NAME="SEC69" HREF="gettext_toc.html#TOC69">Files You Must Create or Alter</A></H2> - -<P> -Besides files which are automatically added through <CODE>gettextize</CODE>, -there are many files needing revision for properly interacting with -GNU <CODE>gettext</CODE>. If you are closely following GNU standards for -Makefile engineering and auto-configuration, the adaptations should -be easier to achieve. Here is a point by point description of the -changes needed in each. - -</P> -<P> -So, here comes a list of files, each one followed by a description of -all alterations it needs. Many examples are taken from the GNU -<CODE>gettext</CODE> 0.10 distribution itself. You may indeed -refer to the source code of the GNU <CODE>gettext</CODE> package, as it -is intended to be a good example and master implementation for using -its own functionality. - -</P> - - - -<H3><A NAME="SEC70" HREF="gettext_toc.html#TOC70"><TT>`POTFILES.in'</TT> in <TT>`po/'</TT></A></H3> - -<P> -The <TT>`po/'</TT> directory should receive a file named -<TT>`POTFILES.in'</TT>. This file tells which files, among all program -sources, have marked strings needing translation. Here is an example -of such a file: - -</P> - -<PRE> -# List of source files containing translatable strings. -# Copyright (C) 1995 Free Software Foundation, Inc. - -# Common library files -lib/error.c -lib/getopt.c -lib/xmalloc.c - -# Package source files -src/gettextp.c -src/msgfmt.c -src/xgettext.c -</PRE> - -<P> -Hash-marked comments and white lines are ignored. All other lines -list those source files containing strings marked for translation -(see section <A HREF="gettext.html#SEC15">How Marks Appears in Sources</A>), in a notation relative to the top level -of your whole distribution, rather than the location of the -<TT>`POTFILES.in'</TT> file itself. - -</P> - - -<H3><A NAME="SEC71" HREF="gettext_toc.html#TOC71"><TT>`configure.in'</TT> at top level</A></H3> - - -<OL> -<LI>Declare the package and version.
- -This is done by a set of lines like these: - - -<PRE> -PACKAGE=gettext -VERSION=0.10 -AC_DEFINE_UNQUOTED(PACKAGE, "$PACKAGE") -AC_DEFINE_UNQUOTED(VERSION, "$VERSION") -AC_SUBST(PACKAGE) -AC_SUBST(VERSION) -</PRE> - -Of course, you replace <SAMP>`gettext'</SAMP> with the name of your package, -and <SAMP>`0.10'</SAMP> with its version number, exactly as they -should appear in the packaged <CODE>tar</CODE> file name of your distribution -(<TT>`gettext-0.10.tar.gz'</TT>, here). - -<LI>Declare the available translations. - -This is done by defining <CODE>ALL_LINGUAS</CODE> to the whitespace-separated, -quoted list of available languages, in a single line, like this: - - -<PRE> -ALL_LINGUAS="de fr" -</PRE> - -This example means that German and French PO files are available, so -that these languages are currently supported by your package. If you -want to further restrict, at installation time, the set of installed -languages, this should not be done by modifying <CODE>ALL_LINGUAS</CODE> in -<TT>`configure.in'</TT>, but rather by using the <CODE>LINGUAS</CODE> environment -variable (see section <A HREF="gettext.html#SEC34">Magic for Installers</A>). - -<LI>Check for internationalization support. - -Here is the main <CODE>m4</CODE> macro for triggering internationalization -support. Just add this line to <TT>`configure.in'</TT>: - - -<PRE> -ud_GNU_GETTEXT -</PRE> - -This call is purposely simple, even if it generates a lot of configure -time checking and actions. - -<LI>Obtain some <TT>`libintl.h'</TT> header file. - -Once you have called <CODE>ud_GNU_GETTEXT</CODE> in <TT>`configure.in'</TT>, use: - - -<PRE> -AC_LINK_FILES($nls_cv_header_libgt, $nls_cv_header_intl) -</PRE> - -This will create one header file <TT>`libintl.h'</TT>. The reason for -this has to do with the fact that some systems, using the Uniforum -message handling functions, already have a file of this name.
- -The <CODE>AC_LINK_FILES</CODE> call has not been integrated into the -<CODE>ud_GNU_GETTEXT</CODE> macro because there can be only one such call -in a <TT>`configure'</TT> file. If you already use it, you will have to -<EM>merge</EM> the needed <CODE>AC_LINK_FILES</CODE> within yours, by adding -the first argument at the end of the list of your first argument, -and adding the second argument at the end of the list of your second -argument. - -<LI>Have output files created. - -The <CODE>AC_OUTPUT</CODE> directive, at the end of your <TT>`configure.in'</TT> -file, needs to be modified in two ways: - - -<PRE> -AC_OUTPUT([<VAR>existing configuration files</VAR> intl/Makefile po/Makefile.in], -[sed -e "/POTFILES =/r po/POTFILES" po/Makefile.in > po/Makefile -<VAR>existing additional actions</VAR>]) -</PRE> - -The modification to the first argument to <CODE>AC_OUTPUT</CODE> asks -for substitution in the <TT>`intl/'</TT> and <TT>`po/'</TT> directories. -Note the <SAMP>`.in'</SAMP> suffix used for <TT>`po/'</TT> only. This is because -the distributed file is really <TT>`po/Makefile.in.in'</TT>. - -The modification to the second argument ensures that <TT>`po/Makefile'</TT> -gets generated out of the <TT>`po/Makefile.in'</TT> just created, including -in it the <TT>`po/POTFILES'</TT> produced by <CODE>ud_GNU_GETTEXT</CODE>. -Two steps are needed because <TT>`po/POTFILES'</TT> can get lengthy in -some packages, too lengthy in fact for being able to merely use an -Autoconf substituted variable, as many <CODE>sed</CODE>s cannot handle very -long lines. - -</OL> - - - -<H3><A NAME="SEC72" HREF="gettext_toc.html#TOC72"><TT>`aclocal.m4'</TT> at top level</A></H3> - -<P> -If you do not have an <TT>`aclocal.m4'</TT> file in your distribution, -the simplest is taking a copy of <TT>`aclocal.m4'</TT> from -GNU <CODE>gettext</CODE>. 
But to be precise, you only need macros -<CODE>ud_LC_MESSAGES</CODE>, <CODE>ud_WITH_NLS</CODE> and <CODE>ud_GNU_GETTEXT</CODE>, -so you may use an editor and remove macros you do not need. - -</P> -<P> -If you already have an <TT>`aclocal.m4'</TT> file, then you will have -to merge the said macros into your <TT>`aclocal.m4'</TT>. Note that if -you are upgrading from a previous release of GNU <CODE>gettext</CODE>, you -should most probably <EM>replace</EM> the said macros, as they usually -change a little from one release of GNU <CODE>gettext</CODE> to the next. -Their contents may vary as we get more experience with strange systems -out there. - -</P> -<P> -These macros check for the internationalization support functions -and related information. Hopefully, once stabilized, these macros -might be integrated in the standard Autoconf set, because this -piece of <CODE>m4</CODE> code will be the same for all projects using GNU -<CODE>gettext</CODE>. - -</P> - - -<H3><A NAME="SEC73" HREF="gettext_toc.html#TOC73"><TT>`acconfig.h'</TT> at top level</A></H3> - -<P> -If you do not have an <TT>`acconfig.h'</TT> file in your distribution, -the simplest is to take a copy of <TT>`acconfig.h'</TT> from -GNU <CODE>gettext</CODE>. But to be precise, you only need the -lines and comments for <CODE>ENABLE_NLS</CODE>, <CODE>HAVE_CATGETS</CODE>, -<CODE>HAVE_GETTEXT</CODE> and <CODE>HAVE_LC_MESSAGES</CODE>, so you may use -an editor and remove everything else. If you already have an -<TT>`acconfig.h'</TT> file, then you should merge the said definitions -into your <TT>`acconfig.h'</TT>. - -</P> - - -<H3><A NAME="SEC74" HREF="gettext_toc.html#TOC74"><TT>`Makefile.in'</TT> at top level</A></H3> - -<P> -Here are a few modifications you need to make to your main, top-level -<TT>`Makefile.in'</TT> file.
- -</P> - -<OL> -<LI> - -Add the following lines near the beginning of your <TT>`Makefile.in'</TT>, -so the <SAMP>`dist:'</SAMP> goal will work properly (as explained further down): - - -<PRE> -PACKAGE = @PACKAGE@ -VERSION = @VERSION@ -</PRE> - -<LI> - -Add file <TT>`NLS'</TT> to the <CODE>DISTFILES</CODE> definition, so the file gets -distributed. - -<LI> - -Wherever you process subdirectories in your <TT>`Makefile.in'</TT>, be -sure you also process <CODE>@INTLSUB@</CODE> and <CODE>@POSUB@</CODE>, which -are replaced respectively by <SAMP>`intl'</SAMP> and <SAMP>`po'</SAMP>, or empty -when the configuration process decides these directories should -not be processed. - -Here is an example of a canonical order of processing. In this -example, we also define <CODE>SUBDIRS</CODE> in <CODE>Makefile.in</CODE> for it -to be further used in the <SAMP>`dist:'</SAMP> goal. - - -<PRE> -SUBDIRS = doc lib @INTLSUB@ src @POSUB@ -</PRE> - -that you will have to adapt to your own package. - -<LI> - -A delicate point is the <SAMP>`dist:'</SAMP> goal, as both -<TT>`intl/Makefile'</TT> and <TT>`po/Makefile'</TT> will later assume that the -proper directory has been set up from the main <TT>`Makefile'</TT>.
Here is -an example of what the <SAMP>`dist:'</SAMP> goal might look like: - - -<PRE> -distdir = $(PACKAGE)-$(VERSION) -dist: Makefile - rm -fr $(distdir) - mkdir $(distdir) - chmod 777 $(distdir) - for file in $(DISTFILES); do \ - ln $$file $(distdir) 2>/dev/null || cp -p $$file $(distdir); \ - done - for subdir in $(SUBDIRS); do \ - mkdir $(distdir)/$$subdir || exit 1; \ - chmod 777 $(distdir)/$$subdir; \ - (cd $$subdir && $(MAKE) $@) || exit 1; \ - done - tar chozf $(distdir).tar.gz $(distdir) - rm -fr $(distdir) -</PRE> - -</OL> - - - -<H3><A NAME="SEC75" HREF="gettext_toc.html#TOC75"><TT>`Makefile.in'</TT> in <TT>`src/'</TT></A></H3> - -<P> -Some of the modifications made in the main <TT>`Makefile.in'</TT> will -also be needed in the <TT>`Makefile.in'</TT> from your package sources, -which we assume here to be in the <TT>`src/'</TT> subdirectory. Here are -all the modifications needed in <TT>`src/Makefile.in'</TT>: - -</P> - -<OL> -<LI> - -In view of the <SAMP>`dist:'</SAMP> goal, you should have these lines near the -beginning of <TT>`src/Makefile.in'</TT>: - - -<PRE> -PACKAGE = @PACKAGE@ -VERSION = @VERSION@ -</PRE> - -<LI> - -If not done already, you should guarantee that <CODE>top_srcdir</CODE> -gets defined. This will serve for <CODE>cpp</CODE> include files. Just add -the line: - - -<PRE> -top_srcdir = @top_srcdir@ -</PRE> - -<LI> - -You might also want to define <CODE>subdir</CODE> as <SAMP>`src'</SAMP>, later -allowing for almost uniform <SAMP>`dist:'</SAMP> goals in all your -<TT>`Makefile.in'</TT>. At least, the <SAMP>`dist:'</SAMP> goal below assumes that -you used: - - -<PRE> -subdir = src -</PRE> - -<LI> - -You should ensure that the final linking will use <CODE>@INTLLIBS@</CODE> as -a library.
An easy way to achieve this is to arrange for it to get into -<CODE>LIBS</CODE>, like this: - - -<PRE> -LIBS = @INTLLIBS@ @LIBS@ -</PRE> - -In most GNU packages one will find a directory <TT>`lib/'</TT> in which a -library containing some helper functions will be built. (You need at -least the few functions which the GNU <CODE>gettext</CODE> Library itself -needs.) However some of the functions in the <TT>`lib/'</TT> also give -messages to the user which of course should be translated, too. To take -care of this, it is not enough to place the support library (say -<TT>`libsupport.a'</TT>) just between the <CODE>@INTLLIBS@</CODE> and -<CODE>@LIBS@</CODE> in the above example. Instead one has to write this: - - -<PRE> -LIBS = ../lib/libsupport.a @INTLLIBS@ ../lib/libsupport.a @LIBS@ -</PRE> - -<LI> - -You should also ensure that directory <TT>`intl/'</TT> will be searched for -C preprocessor include files in all circumstances. So, you have to -ensure that both <SAMP>`-I../intl'</SAMP> and <SAMP>`-I$(top_srcdir)/intl'</SAMP> will -be given to the C compiler. - -<LI> - -Your <SAMP>`dist:'</SAMP> goal has to conform with others. Here is a -reasonable definition for it: - - -<PRE> -distdir = ../$(PACKAGE)-$(VERSION)/$(subdir) -dist: Makefile $(DISTFILES) - for file in $(DISTFILES); do \ - ln $$file $(distdir) 2>/dev/null || cp -p $$file $(distdir); \ - done -</PRE> - -</OL> - - - -<H1><A NAME="SEC76" HREF="gettext_toc.html#TOC76">Concluding Remarks</A></H1> - -<P> -We would like to conclude this GNU <CODE>gettext</CODE> manual by presenting -a history of the GNU Translation Project so far. We finally give -a few pointers for those who want to do further research or readings -about Native Language Support matters.
- -</P> - - - -<H2><A NAME="SEC77" HREF="gettext_toc.html#TOC77">History of GNU <CODE>gettext</CODE></A></H2> - -<P> -Internationalization concerns and algorithms have been informally -and casually discussed for years in GNU, sometimes around GNU -<CODE>libc</CODE>, maybe around the incoming <CODE>Hurd</CODE>, or otherwise -(nobody clearly remembers). And even then, when the work started for -real, this was somewhat independently of these previous discussions. - -</P> -<P> -This all began in July 1994, when Patrick D'Cruze had the idea and -initiative of internationalizing version 3.9.2 of GNU <CODE>fileutils</CODE>. -He then asked Jim Meyering, the maintainer, how to get those changes -folded into an official release. That first draft was full of -<CODE>#ifdef</CODE>s and somewhat disconcerting, and Jim wanted to find -nicer ways. Patrick and Jim shared some tries and experimentations -in this area. Then, feeling that this might eventually have a deeper -impact on GNU, Jim wanted to know what standards were, and contacted -Richard Stallman, who very quickly and verbally described an overall -design for what was meant to become <CODE>glocale</CODE>, at that time. - -</P> -<P> -Jim implemented <CODE>glocale</CODE> and got a lot of exhausting feedback -from Patrick and Richard, of course, but also from Mitchum DSouza -(who wrote a <CODE>catgets</CODE>-like package), Roland McGrath, maybe David -MacKenzie, Fran&ccedil;ois Pinard, and Paul Eggert, all pushing and -pulling in various directions, not always compatible, to the extent -that after a couple of test releases, <CODE>glocale</CODE> was torn apart. - -</P> -<P> -While Jim took some distance and time and became a dad for a second -time, Roland wanted to get GNU <CODE>libc</CODE> internationalized, and -got Ulrich Drepper involved in that project. Instead of starting -from <CODE>glocale</CODE>, Ulrich rewrote something from scratch, but -more conformant to the set of guidelines that emerged out of the -<CODE>glocale</CODE> effort.
Then, Ulrich got people from the previous -forum to involve themselves in this new project, and the switch -from <CODE>glocale</CODE> to what was first named <CODE>msgutils</CODE>, renamed -<CODE>nlsutils</CODE>, and later <CODE>gettext</CODE>, became officially accepted -by Richard in May 1995 or so. - -</P> -<P> -Let's summarize by saying that Ulrich Drepper wrote GNU <CODE>gettext</CODE> -in April 1995. The first official release of the package, including -PO mode, occurred in July 1995, and was numbered 0.7. Other people -contributed to the effort by providing a discussion forum around -Ulrich, writing little pieces of code, or testing. These are quoted -in the <CODE>THANKS</CODE> file which comes with the GNU <CODE>gettext</CODE> -distribution. - -</P> -<P> -While this was being done, Fran&ccedil;ois adapted half a dozen -GNU packages to <CODE>glocale</CODE> first, then later to <CODE>gettext</CODE>, -putting them in pretest, so providing along the way an effective -user environment for fine tuning the evolving tools. He also took -the responsibility of organizing and coordinating the GNU Translation -Project. After nearly a year of informal exchanges between people from -many countries, translator teams started to exist in May 1995, through -the creation and support by Patrick D'Cruze of twenty unmoderated -mailing lists for that many native languages, and two moderated -lists: one for reaching all teams at once, the other for reaching -all maintainers of internationalized packages in GNU. - -</P> -<P> -Fran&ccedil;ois also wrote PO mode in June 1995 with the collaboration -of Greg McGary, as a kind of contribution to Ulrich's package. -He also gave a hand with the GNU <CODE>gettext</CODE> Texinfo manual. - -</P> - - -<H2><A NAME="SEC78" HREF="gettext_toc.html#TOC78">Related Readings</A></H2> - -<P> -Eugene H.
Dorr (<TT>`dorre@well.com'</TT>) maintains an interesting -bibliography on internationalization matters, called -<CITE>Internationalization Reference List</CITE>, which is available as: - -<PRE> -ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/i18n-books.txt -</PRE> - -<P> -Michael Gschwind (<TT>`mike@vlsivie.tuwien.ac.at'</TT>) maintains a -Frequently Asked Questions (FAQ) list, entitled <CITE>Programming for -Internationalisation</CITE>. This FAQ discusses writing programs which -can handle different language conventions, character sets, etc.; -and is applicable to all character set encodings, with particular -emphasis on ISO 8859-1. It is regularly published in Usenet -groups <TT>`comp.unix.questions'</TT>, <TT>`comp.std.internat'</TT>, -<TT>`comp.software.international'</TT>, <TT>`comp.lang.c'</TT>, -<TT>`comp.windows.x'</TT>, <TT>`comp.std.c'</TT>, <TT>`comp.answers'</TT> -and <TT>`news.answers'</TT>. The home location of this document is: - -<PRE> -ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/ISO-programming -</PRE> - -<P> -Patrick D'Cruze (<TT>`pdcruze@li.org'</TT>) wrote a tutorial about NLS -matters, and Jochen Hein (<TT>`Hein@student.tu-clausthal.de'</TT>) took -over the responsibility of maintaining it. It may be found as: - -<PRE> -ftp://sunsite.unc.edu/pub/Linux/utils/nls/catalogs/Incoming/... - ...locale-tutorial-0.8.txt.gz -</PRE> - -<P> -This site is mirrored in: - -<PRE> -ftp://ftp.ibp.fr/pub/linux/sunsite/ -</PRE> - -<P> -A French version of the same tutorial should be findable at: - -<PRE> -ftp://ftp.ibp.fr/pub/linux/french/docs/ -</PRE> - -<P> -together with French translations of many Linux-related documents. - -</P> -<P><HR><P> -This document was generated on 4 September 1998 using the -<A HREF="http://wwwcn.cern.ch/dci/texi2html/">texi2html</A> -translator version 1.51.</P> -</BODY> -</HTML>