<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.54
     from gettext.texi on 25 January 1999 -->

<TITLE>GNU gettext utilities - PO Files and PO Mode Basics</TITLE>
<link href="gettext_3.html" rel=Next>
<link href="gettext_1.html" rel=Previous>
<link href="gettext_toc.html" rel=ToC>

</HEAD>
<BODY>
13<p>Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_12.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
14<P><HR><P>
15
16
17<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1>
18
19<P>
The GNU <CODE>gettext</CODE> toolset helps programmers and translators
produce, update and use translation files, mainly those PO files
which are textual, editable files. This chapter describes the format
of PO files and contains a PO mode starter. The description of PO mode
is spread throughout this manual rather than concentrated in one place;
here we present only the basics of PO mode.
26
27</P>
28
29
30
31<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2>
32
33<P>
34Once you have received, unpacked, configured and compiled the GNU
35<CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in
36place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and
37<CODE>msgmerge</CODE>, as well as their available message catalogs. To
38top off a comfortable installation, you might also want to make the
39PO mode available to your GNU Emacs users.
40
41</P>
42<P>
During the installation of PO mode, you might want to modify your
file <TT>`.emacs'</TT>, once and for all, so that it contains a few lines
looking like:
46
47</P>
48
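<PRE>
;; Use PO mode for .po, .pot and .pox files, and for names containing ".po.".
(setq auto-mode-alist
      (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist))
;; Load the po-mode library only when PO mode is first needed.
(autoload 'po-mode "po-mode")
</PRE>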
54
55<P>
56Later, whenever you edit some <TT>`.po'</TT>, <TT>`.pot'</TT> or <TT>`.pox'</TT>
57file, or any file having the string <SAMP>`.po.'</SAMP> within its name,
58Emacs loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and
59automatically activates PO mode commands for the associated buffer.
60The string <EM>PO</EM> appears in the mode line for any buffer for
61which PO mode is active. Many PO files may be active at once in a
62single Emacs session.
63
64</P>
65<P>
If you are using Emacs version 20 or later, and have already installed
the appropriate international fonts on your system, you may also arrange
for these fonts to be automatically loaded and used for displaying
the translations on your Emacs screen, whenever necessary. For this to
happen, you might want to add the lines:
71
72</P>
73
<PRE>
(autoload 'po-find-file-coding-system "po-mode")
(modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\."
                            'po-find-file-coding-system)
</PRE>
79
80<P>
81to your <TT>`.emacs'</TT> file.
82
83</P>
84
85
86<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2>
87
88<P>
89A PO file is made up of many entries, each entry holding the relation
90between an original untranslated string and its corresponding
91translation. All entries in a given PO file usually pertain
92to a single project, and all translations are expressed in a single
93target language. One PO file <STRONG>entry</STRONG> has the following schematic
94structure:
95
96</P>
97
<PRE>
<VAR>white-space</VAR>
#  <VAR>translator-comments</VAR>
#. <VAR>automatic-comments</VAR>
#: <VAR>reference</VAR>...
#, <VAR>flag</VAR>...
msgid <VAR>untranslated-string</VAR>
msgstr <VAR>translated-string</VAR>
</PRE>
107
108<P>
109The general structure of a PO file should be well understood by
110the translator. When using PO mode, very little has to be known
111about the format details, as PO mode takes care of them for her.
112
113</P>
114<P>
Entries begin with some optional white space. Usually, when generated
through GNU <CODE>gettext</CODE> tools, there is exactly one blank line
between entries. Then comments follow, on lines all starting with the
character <KBD>#</KBD>. There are two kinds of comments: those which have
some white space immediately following the <KBD>#</KBD>, which are
created and maintained exclusively by the translator, and those which
have some non-white character just after the <KBD>#</KBD>, which
are created and maintained automatically by GNU <CODE>gettext</CODE> tools.
All comments, of either kind, are optional.
124
125</P>
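<P>
As an illustration only, the distinction between the kinds of lines in
an entry can be sketched in a few lines of Python. The function name
<CODE>classify_po_line</CODE> and the labels it returns are hypothetical
and are not part of GNU <CODE>gettext</CODE>; the sketch assumes each line
is examined on its own, without looking at its neighbours.
</P>

```python
def classify_po_line(line):
    """Classify one PO file line by its leading marker (hypothetical helper)."""
    if line.strip() == "":
        return "white-space"
    if line.startswith("#"):
        # A '#' alone, or followed by white space, marks a translator comment.
        if len(line) == 1 or line[1].isspace():
            return "translator-comment"
        # Any other character right after '#' marks a tool-maintained comment.
        kinds = {".": "automatic-comment", ":": "reference", ",": "flag"}
        return kinds.get(line[1], "other-automatic-comment")
    if line.startswith("msgid"):
        return "msgid"
    if line.startswith("msgstr"):
        return "msgstr"
    if line.startswith('"'):
        return "string-continuation"
    return "unknown"


for sample in ["# checked with the author", "#: hello.c:12", "#, fuzzy",
               'msgid "Hello"', 'msgstr "Bonjour"']:
    print(classify_po_line(sample), "|", sample)
```

<P>
Only the character immediately after the <KBD>#</KBD> is inspected, which
mirrors the rule stated above: white space there means the comment belongs
to the translator, anything else means it belongs to the tools.
</P>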
126<P>
127After white space and comments, entries show two strings, giving
128first the untranslated string as it appears in the original program
129sources, and then, the translation of this string. The original
130string is introduced by the keyword <CODE>msgid</CODE>, and the translation,
131by <CODE>msgstr</CODE>. The two strings, untranslated and translated,
132are quoted in various ways in the PO file, using <KBD>"</KBD>
133delimiters and <KBD>\</KBD> escapes, but the translator does not really
have to pay attention to the precise quoting format, as PO mode fully
intends to take care of quoting for her.
136
137</P>
138<P>
139The <CODE>msgid</CODE> strings, as well as automatic comments, are produced
140and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not
provide means for the translator to alter these. The most she can
do is delete them, and only by deleting the whole entry.
143On the other hand, the <CODE>msgstr</CODE> string, as well as translator
144comments, are really meant for the translator, and PO mode gives her
145the full control she needs.
146
147</P>
148<P>
The comment lines beginning with <KBD>#,</KBD> are special because they are
not completely ignored by the programs as comments generally are. The
comma-separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE>
program to give the user better diagnostic messages. Currently
there are two forms of flags defined:
154
155</P>
156<DL COMPACT>
157
158<DT><KBD>fuzzy</KBD>
159<DD>
160This flag can be generated by the <CODE>msgmerge</CODE> program or it can be
161inserted by the translator herself. It shows that the <CODE>msgstr</CODE>
162string might not be a correct translation (anymore). Only the translator
163can judge if the translation requires further modification, or is
164acceptable as is. Once satisfied with the translation, she then removes
this <KBD>fuzzy</KBD> attribute. The <CODE>msgmerge</CODE> program inserts this
flag when it has combined the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries
after fuzzy search only. See section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>.
168
169<DT><KBD>c-format</KBD>
170<DD>
171<DT><KBD>no-c-format</KBD>
172<DD>
These flags should not be added by a human. Instead only the
<CODE>xgettext</CODE> program adds them. In an automated PO file processing
system as proposed here, the user's changes would be thrown away again as
soon as the <CODE>xgettext</CODE> program generates a new template file.

When the <KBD>c-format</KBD> flag is given for a string, the <CODE>msgfmt</CODE>
program performs some additional tests to check the validity of the translation.
See section <A HREF="gettext_6.html#SEC33">Invoking the <CODE>msgfmt</CODE> Program</A>.
181
182</DL>
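<P>
The following Python sketch shows one way a script might read the
comma-separated flag list of a <KBD>#,</KBD> line. It is not how
<CODE>msgfmt</CODE> or <CODE>msgmerge</CODE> are implemented; the helper
names <CODE>parse_flags</CODE>, <CODE>needs_review</CODE> and
<CODE>is_c_format</CODE> are made up for this illustration.
</P>

```python
def parse_flags(comment_line):
    """Return the flags listed on a '#,' comment line (hypothetical helper)."""
    if not comment_line.startswith("#,"):
        return []
    # Flags are separated by commas; surrounding white space is not significant.
    return [flag.strip() for flag in comment_line[2:].split(",") if flag.strip()]


def needs_review(flags):
    """A fuzzy entry still has to be checked by the translator."""
    return "fuzzy" in flags


def is_c_format(flags):
    """True when the entry is marked as a C format string."""
    return "c-format" in flags and "no-c-format" not in flags


flags = parse_flags("#, fuzzy, c-format")
print(flags, needs_review(flags), is_c_format(flags))
```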
183
184<P>
185It happens that some lines, usually whitespace or comments, follow the
186very last entry of a PO file. Such lines are not part of any entry,
187and PO mode is unable to take action on those lines. By using the
188PO mode function <KBD>M-x po-normalize</KBD>, the translator may get
189rid of those spurious lines. See section <A HREF="gettext_2.html#SEC12">Normalizing Strings in Entries</A>.
190
191</P>
192<P>
193The remainder of this section may be safely skipped by those using
194PO mode, yet it may be interesting for everybody to have a better
195idea of the precise format of a PO file. On the other hand, those
196not having GNU Emacs handy should carefully continue reading on.
197
198</P>
199<P>
200Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects
201the C syntax for a character string, including the surrounding quotes
and embedded backslashed escape sequences. When the time comes
203to write multi-line strings, one should not use escaped newlines.
204Instead, a closing quote should follow the last character on the
205line to be continued, and an opening quote should resume the string
206at the beginning of the following PO file line. For example:
207
208</P>
209
<PRE>
msgid ""
"Here is an example of how one might continue a very long string\n"
"for the common case the string represents multi-line output.\n"
</PRE>
215
216<P>
217In this example, the empty string is used on the first line, to
218allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP>
219over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>. In this example, the
220<CODE>msgid</CODE> keyword is followed by three strings, which are meant
221to be concatenated. Concatenating the empty string does not change
222the resulting overall string, but it is a way for us to comply with
223the necessity of <CODE>msgid</CODE> to be followed by a string on the same
224line, while keeping the multi-line presentation left-justified, as
we find this to be a cleaner disposition. The empty string could have
been omitted, but only if the string starting with <SAMP>`Here'</SAMP> was
promoted to the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> Nor was it really necessary
to switch between the last two quoted strings immediately after
the newline <SAMP>`\n'</SAMP>; the switch could have occurred after <EM>any</EM>
other character. We just did it this way because it is neater.
231
232</P>
233<P>
One should carefully distinguish between end of lines marked as
<SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented
string, and end of lines in the PO file itself, outside string quotes,
which have no effect on the represented string.
238
239</P>
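<P>
To make the concatenation rule concrete, here is a minimal Python sketch
that rebuilds the represented string from the lines of a
<CODE>msgid</CODE> field. The function <CODE>join_po_strings</CODE> is a
hypothetical helper, not a GNU <CODE>gettext</CODE> interface, and it
assumes the strings only use simple escapes whose meaning is the same in
C and in Python.
</P>

```python
import ast


def join_po_strings(lines):
    """Concatenate adjacent quoted PO strings into the string they represent."""
    parts = []
    for line in lines:
        text = line.strip()
        # Drop a leading keyword such as msgid or msgstr, if present.
        for keyword in ("msgid", "msgstr"):
            if text.startswith(keyword):
                text = text[len(keyword):].strip()
        # Assumption: only simple escapes (newline, tab, quote, backslash)
        # occur, so Python's literal syntax decodes them like C would.
        parts.append(ast.literal_eval(text))
    return "".join(parts)


entry = [
    'msgid ""',
    '"Here is an example of how one might continue a very long string\\n"',
    '"for the common case the string represents multi-line output.\\n"',
]
print(join_po_strings(entry))
```

<P>
The line breaks of the PO file itself never reach the result: the only
newlines in the joined string are the two written as <SAMP>`\n'</SAMP>
inside the quotes.
</P>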
240<P>
241Outside strings, white lines and comments may be used freely.
242Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend
243until the end of the PO file line. Comments written by translators
244should have the initial <SAMP>`#'</SAMP> immediately followed by some white
245space. If the <SAMP>`#'</SAMP> is not immediately followed by white space,
246this comment is most likely generated and managed by specialized GNU
247tools, and might disappear or be replaced unexpectedly when the PO
248file is given to <CODE>msgmerge</CODE>.
249
250</P>
251
252
253<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main PO mode Commands</A></H2>
254
255<P>
After setting up Emacs with something similar to the lines in
section <A HREF="gettext_2.html#SEC8">Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a
PO file in that window. This makes the window read-only and establishes a
<CODE>po-mode-map</CODE> keymap. PO mode is a genuine Emacs mode, which is not
derived from text mode in any way. Functions found on <CODE>po-mode-hook</CODE>,
if any, will be executed.
262
263</P>
264<P>
When PO mode is active in a window, the letters <SAMP>`PO'</SAMP> appear
in the mode line for that window. The mode line also displays how
many entries of each kind are held in the PO file. For example,
the string <SAMP>`132t+3f+10u+2o'</SAMP> would tell the translator that the
PO file contains 132 translated entries (see section <A HREF="gettext_5.html#SEC25">Translated Entries</A>),
3 fuzzy entries (see section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>), 10 untranslated entries
(see section <A HREF="gettext_5.html#SEC27">Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_5.html#SEC28">Obsolete Entries</A>). Kinds with a zero count are not shown. So, in this example, if
the fuzzy entries were unfuzzied, the untranslated entries were translated
and the obsolete entries were deleted, the mode line would merely display
<SAMP>`145t'</SAMP> for the counters.
275
276</P>
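<P>
The arithmetic behind the counter string can be checked with a small
Python sketch. The functions <CODE>parse_counters</CODE> and
<CODE>format_counters</CODE> are hypothetical and only imitate the
notation described above; they are not taken from the PO mode sources.
</P>

```python
import re

# Letter used in the mode line for each kind of entry.
KINDS = {"t": "translated", "f": "fuzzy", "u": "untranslated", "o": "obsolete"}


def parse_counters(text):
    """Turn a string such as '132t+3f+10u+2o' into a dictionary of counts."""
    counts = {name: 0 for name in KINDS.values()}
    for number, letter in re.findall(r"(\d+)([tfuo])", text):
        counts[KINDS[letter]] = int(number)
    return counts


def format_counters(counts):
    """Build the counter string, leaving out kinds whose count is zero."""
    items = [f"{counts[name]}{letter}"
             for letter, name in KINDS.items() if counts[name] != 0]
    return "+".join(items)


counts = parse_counters("132t+3f+10u+2o")
# Unfuzzy the fuzzy entries, translate the untranslated, delete the obsolete.
counts["translated"] += counts["fuzzy"] + counts["untranslated"]
counts["fuzzy"] = counts["untranslated"] = counts["obsolete"] = 0
print(format_counters(counts))  # prints 145t
```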
277<P>
278The main PO commands are those which do not fit into the other categories of
279subsequent sections. These allow for quitting PO mode or for managing windows
280in special ways.
281
282</P>
283<DL COMPACT>
284
285<DT><KBD>U</KBD>
286<DD>
287Undo last modification to the PO file.
288
289<DT><KBD>Q</KBD>
290<DD>
291Quit processing and save the PO file.
292
293<DT><KBD>q</KBD>
294<DD>
295Quit processing, possibly after confirmation.
296
297<DT><KBD>O</KBD>
298<DD>
Temporarily leave the PO file window.
300
301<DT><KBD>?</KBD>
302<DD>
303<DT><KBD>h</KBD>
304<DD>
305Show help about PO mode.
306
307<DT><KBD>=</KBD>
308<DD>
309Give some PO file statistics.
310
311<DT><KBD>V</KBD>
312<DD>
313Batch validate the format of the whole PO file.
314
315</DL>
316
317<P>
The command <KBD>U</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs
<EM>undo</EM> facility. See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>. Each time <KBD>U</KBD> is typed, modifications which the translator
made to the PO file are undone a little more. For the purpose of
undoing, each PO mode command is atomic. This is especially true for
the <KBD>RET</KBD> command: the whole edit made by a single
use of this command is undone at once, even if the edit itself
implied several actions. However, while in the editing window, one
can undo the editing work in much smaller steps.
326
327</P>
328<P>
329The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD>
330(<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the
331PO file. The former is a bit less verbose than the latter. If the file
332has been modified, it is saved to disk first. In both cases, and prior to
333all this, the commands check if some untranslated message remains in the
334PO file and, if yes, the translator is asked if she really wants to leave
335off working with this PO file. This is the preferred way of getting rid
336of an Emacs PO file buffer. Merely killing it through the usual command
337<KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed.
338
339</P>
340<P>
341The command <KBD>O</KBD> (<CODE>po-other-window</CODE>) is another, softer way,
342to leave PO mode, temporarily. It just moves the cursor to some other
343Emacs window, and pops one if necessary. For example, if the translator
just got PO mode to show some source context in some other window, she might
345discover some apparent bug in the program source that needs correction.
346This command allows the translator to change sex, become a programmer,
347and have the cursor right into the window containing the program she
348(or rather <EM>he</EM>) wants to modify. By later getting the cursor back
349in the PO file window, or by asking Emacs to edit this file once again,
350PO mode is then recovered.
351
352</P>
353<P>
354The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO
355mode commands. The translator should then type any character to resume
356normal PO mode operations. The command <KBD>?</KBD> has the same effect
357as <KBD>h</KBD>.
358
359</P>
360<P>
361The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of
362entries in the PO file, the ordinal of the current entry (counted from
3631), the number of untranslated entries, the number of obsolete entries,
364and displays all these numbers.
365
366</P>
367<P>
368The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in verbose
369mode over the current PO file. This command first offers to save the
370current PO file on disk. The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>,
371has the purpose of creating a MO file out of a PO file, and PO mode uses
372the features of this program for checking the overall format of a PO file,
373as well as all individual entries.
374
375</P>
376<P>
377The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the
378translator regains control immediately while her PO file is being studied.
379Error output is collected in the GNU Emacs <SAMP>`*compilation*'</SAMP> buffer,
380displayed in another window. The regular GNU Emacs command <KBD>C-x`</KBD>
381(<CODE>next-error</CODE>), as well as other usual compile commands, allow the
382translator to reposition quickly to the offending parts of the PO file.
383Once the cursor is on the line in error, the translator may decide on
any PO mode action which would help correct the error.
385
386</P>
387
388
389<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2>
390
391<P>
392The cursor in a PO file window is almost always part of
393an entry. The only exceptions are the special case when the cursor
394is after the last entry in the file, or when the PO file is
empty. The entry containing the cursor is said to be the
current entry. Many PO mode commands operate on the current entry,
so moving the cursor does more than allow the translator to browse
the PO file; it also selects the entry on which commands operate.
399
400</P>
401<P>
Some PO mode commands alter the position of the cursor in a specialized
way. A few of these special-purpose positioning commands are described
here; the others are described in following sections.
405
406</P>
407<DL COMPACT>
408
409<DT><KBD>.</KBD>
410<DD>
411Redisplay the current entry.
412
<DT><KBD>n</KBD>
<DD>
Select the entry after the current one.

<DT><KBD>p</KBD>
<DD>
Select the entry before the current one.
424
425<DT><KBD>&#60;</KBD>
426<DD>
427Select the first entry in the PO file.
428
429<DT><KBD>&#62;</KBD>
430<DD>
431Select the last entry in the PO file.
432
433<DT><KBD>m</KBD>
434<DD>
435Record the location of the current entry for later use.
436
<DT><KBD>r</KBD>
438<DD>
439Return to a previously saved entry location.
440
441<DT><KBD>x</KBD>
442<DD>
443Exchange the current entry location with the previously saved one.
444
445</DL>
446
447<P>
448Any GNU Emacs command able to reposition the cursor may be used
449to select the current entry in PO mode, including commands which
450move by characters, lines, paragraphs, screens or pages, and search
451commands. However, there is a kind of standard way to display the
452current entry in PO mode, which usual GNU Emacs commands moving
453the cursor do not especially try to enforce. The command <KBD>.</KBD>
454(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the
455current entry properly, after the current entry has been changed by
456means external to PO mode, or the Emacs screen otherwise altered.
457
458</P>
459<P>
460It is yet to be decided if PO mode helps the translator, or otherwise
461irritates her, by forcing a rigid window disposition while she
462is doing her work. We originally had quite precise ideas about
463how windows should behave, but on the other hand, anyone used to
464GNU Emacs is often happy to keep full control. Maybe a fixed window
465disposition might be offered as a PO mode option that the translator
466might activate or deactivate at will, so it could be offered on an
467experimental basis. If nobody feels a real need for using it, or
468a compulsion for writing it, we should drop this whole idea.
The incentive for doing it should come from translators rather than
programmers, as opinions from an experienced translator are surely
worth more to me than opinions from programmers <EM>thinking</EM> about
how <EM>others</EM> should do translation.
473
474</P>
475<P>
The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD>
(<CODE>po-previous-entry</CODE>) move the cursor to the entry following,
or preceding, the current one. If <KBD>n</KBD> is given while the
cursor is on the last entry of the PO file, or if <KBD>p</KBD>
is given while the cursor is on the first entry, no move is done.
481
482</P>
483<P>
484The commands <KBD>&#60;</KBD> (<CODE>po-first-entry</CODE>) and <KBD>&#62;</KBD>
485(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last
486entry, of the PO file. When the cursor is located past the last
487entry in a PO file, most PO mode commands will return an error saying
<SAMP>`After last entry'</SAMP>. Moreover, the commands <KBD>&#60;</KBD> and <KBD>&#62;</KBD>
have the special property of being able to work even when the cursor
is not within any PO file entry, and one may use them to correct
this situation nicely. But even these commands will fail on a
truly empty PO file. There are development plans for PO mode
to interactively fill an empty PO file from sources. See section <A HREF="gettext_3.html#SEC16">Marking Translatable Strings</A>.
494
495</P>
496<P>
497The translator may decide, before working at the translation of
498a particular entry, that she needs to browse the remainder of the
499PO file, maybe for finding the terminology or phraseology used
500in related entries. She can of course use the standard Emacs idioms
501for saving the current cursor location in some register, and use that
502register for getting back, or else, use the location ring.
503
504</P>
505<P>
506PO mode offers another approach, by which cursor locations may be saved
507onto a special stack. The command <KBD>m</KBD> (<CODE>po-push-location</CODE>)
508merely adds the location of current entry to the stack, pushing
509the already saved locations under the new one. The command
<KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and
repositions the cursor to the entry associated with that top element.
512This position is then lost, for the next <KBD>r</KBD> will move the cursor
513to the previously saved location, and so on until no locations remain
514on the stack.
515
516</P>
517<P>
518If the translator wants the position to be kept on the location stack,
519maybe for taking a look at the entry associated with the top
520element, then go elsewhere with the intent of getting back later, she
521ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>.
522
523</P>
524<P>
The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously
repositions the cursor to the entry associated with the top element of
the stack of saved locations, and replaces that top element with the
location of the current entry before the move. Consequently, repeating
the <KBD>x</KBD> command toggles between two entries.
To achieve this, the translator positions the cursor on the
first entry, uses <KBD>m</KBD>, then moves to the second entry, and
merely uses <KBD>x</KBD> to make the switch.
533
534</P>
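<P>
The behaviour of the three location commands can be modelled with an
ordinary stack. The Python class below is only a sketch of the semantics
described above, with entry locations reduced to plain integers;
<CODE>LocationStack</CODE> and its method names are hypothetical, and
leaving the cursor in place when the stack is empty is an assumption
about a case the text does not describe.
</P>

```python
class LocationStack:
    """Toy model of the saved locations used by the m, r and x commands."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """m: remember the location of the current entry."""
        self._saved.append(current)

    def pop(self, current):
        """r: go to the most recently saved location, which is then forgotten."""
        if not self._saved:
            return current  # assumption: nothing saved, so the cursor stays put
        return self._saved.pop()

    def exchange(self, current):
        """x: go to the top location and save the one just left in its place."""
        if not self._saved:
            return current  # assumption: nothing saved, so the cursor stays put
        target = self._saved[-1]
        self._saved[-1] = current
        return target


stack = LocationStack()
stack.push(5)                    # m typed while on entry 5
cursor = 9                       # the translator then moves to entry 9
cursor = stack.exchange(cursor)  # x: now on entry 5, entry 9 is saved
cursor = stack.exchange(cursor)  # x again: back on entry 9, entry 5 is saved
print(cursor)
```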
535
536
537<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2>
538
539<P>
540There are many different ways for encoding a particular string into a
541PO file entry, because there are so many different ways to split and
quote multi-line strings, and even to represent special characters
by backslashed escape sequences. Some features of PO mode rely on
544the ability for PO mode to scan an already existing PO file for a
545particular string encoded into the <CODE>msgid</CODE> field of some entry.
546Even if PO mode has internally all the built-in machinery for
547implementing this recognition easily, doing it fast is technically
548difficult. To facilitate a solution to this efficiency problem,
549we decided on a canonical representation for strings.
550
551</P>
552<P>
553A conventional representation of strings in a PO file is currently
554under discussion, and PO mode experiments with a canonical representation.
555Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform
556way of representing equivalent strings would be useful, as the internal
557normalization needed by PO mode could be automatically satisfied
558when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>. An explicit
559PO mode normalization should then be only necessary for PO files
560imported from elsewhere, or for when the convention itself evolves.
561
562</P>
563<P>
564So, for achieving normalization of at least the strings of a given
565PO file needing a canonical representation, the following PO mode
566command is available:
567
568</P>
569<DL COMPACT>
570
571<DT><KBD>M-x po-normalize</KBD>
572<DD>
573Tidy the whole PO file by making entries more uniform.
574
575</DL>
576
577<P>
The special command <KBD>M-x po-normalize</KBD>, which has no associated
key, revises all entries, ensuring that strings of both original
and translated entries use uniform internal quoting in the PO file.
It also removes any stray lines after the last entry. This command may be
582useful for PO files freshly imported from elsewhere, or if we ever
583improve on the canonical quoting format we use. This canonical format
584is not only meant for getting cleaner PO files, but also for greatly
585speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands.
586
587</P>
588<P>
589<KBD>M-x po-normalize</KBD> presently makes three passes over the entries.
590The first implements heuristics for converting PO files for GNU
591<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE>
592fields were using K&#38;R style C string syntax for multi-line strings.
593These heuristics may fail for comments not related to obsolete
594entries and ending with a backslash; they also depend on subsequent
595passes for finalizing the proper commenting of continued lines for
obsolete entries. This first pass might disappear once all older PO
files have been adjusted. The second and third passes normalize
all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively. They also
clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE>
for continued lines.
601
602</P>
603<P>
604Having such an explicit normalizing command allows for importing PO
605files from other sources, but also eases the evolution of the current
606convention, evolution driven mostly by aesthetic concerns, as of now.
607It is easy to make suggested adjustments at a later time, as the
608normalizing command and eventually, other GNU <CODE>gettext</CODE> tools
609should greatly automate conformance. A description of the canonical
610string format is given below, for the particular benefit of those not
611having GNU Emacs handy, and who would nevertheless want to handcraft
612their PO files in nice ways.
613
614</P>
615<P>
616Right now, in PO mode, strings are single line or multi-line. A string
617goes multi-line if and only if it has <EM>embedded</EM> newlines, that
618is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>. So, we would have:
619
620</P>
621
<PRE>
msgstr "\n\nHello, world!\n\n\n"
</PRE>
625
626<P>
627but, replacing the space by a newline, this becomes:
628
629</P>
630
<PRE>
msgstr ""
"\n"
"\n"
"Hello,\n"
"world!\n"
"\n"
"\n"
</PRE>
640
641<P>
We are deliberately using a caricatural example here, to make the
point clearer. Usually, multi-line strings do not look that bad.
644It is probable that we will implement the following suggestion.
645We might lump together all initial newlines into the empty string,
646and also all newlines introducing empty lines (that is, for <VAR>n</VAR>
647&#62; 1, the <VAR>n</VAR>-1'th last newlines would go together on a separate
648string), so making the previous example appear:
649
650</P>
651
<PRE>
msgstr "\n\n"
"Hello,\n"
"world!\n"
"\n\n"
</PRE>
658
659<P>
660There are a few yet undecided little points about string normalization,
661to be documented in this manual, once these questions settle.
662
663</P>
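<P>
The rule currently in use (a string goes multi-line if and only if it has
embedded newlines) can be sketched in Python as follows. This is not the
Emacs Lisp code of PO mode: <CODE>po_quote</CODE> and
<CODE>format_field</CODE> are hypothetical names, only a few escapes are
handled, and the suggested lumping of consecutive newlines is deliberately
not implemented.
</P>

```python
import re

# A newline is "embedded" when it has ordinary characters on both sides.
EMBEDDED_NEWLINE = re.compile(r"[^\n]\n+[^\n]")


def po_quote(piece):
    """Quote one piece of text using C string syntax (few escapes handled)."""
    escaped = (piece.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\t", "\\t"))
    return '"' + escaped + '"'


def format_field(keyword, text):
    """Return the PO file lines representing one msgid or msgstr field."""
    if EMBEDDED_NEWLINE.search(text) is None:
        # No embedded newline: the whole string stays on the keyword line.
        return [keyword + " " + po_quote(text)]
    # Otherwise start with the empty string and give each line its own piece.
    pieces = re.findall(r"[^\n]*\n|[^\n]+", text)
    return [keyword + ' ""'] + [po_quote(piece) for piece in pieces]


print("\n".join(format_field("msgstr", "\n\nHello, world!\n\n\n")))
print("\n".join(format_field("msgstr", "\n\nHello,\nworld!\n\n\n")))
```

<P>
The first call keeps everything on one line, because the leading and
trailing newlines are not embedded; the second call has a newline between
<SAMP>`Hello,'</SAMP> and <SAMP>`world!'</SAMP> and so is split into one
quoted piece per line.
</P>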
</BODY>
</HTML>