<!-- This HTML file has been created by texi2html 1.54
     from gettext.texi on 25 January 1999 -->
<TITLE>GNU gettext utilities - PO Files and PO Mode Basics</TITLE>
<link href="gettext_3.html" rel=Next>
<link href="gettext_1.html" rel=Previous>
<link href="gettext_toc.html" rel=ToC>
<p>Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_12.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
 
<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1>
<P>
The GNU <CODE>gettext</CODE> toolset helps programmers and translators
produce, update and use translation files, mainly PO files, which are
textual, editable files.  This chapter describes the format of PO files
and contains a PO mode starter.  The description of PO mode is spread
throughout this manual rather than concentrated in one place; only the
basics of PO mode are presented here.
 
<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2>
<P>
Once you have received, unpacked, configured and compiled the GNU
<CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in
place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and
<CODE>msgmerge</CODE>, as well as their available message catalogs.  To
top off a comfortable installation, you might also want to make the
PO mode available to your GNU Emacs users.
 
<P>
During the installation of the PO mode, you might want to modify your
file <TT>`.emacs'</TT>, once and for all, so it contains a few lines looking
like:

<PRE>
(setq auto-mode-alist
      (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist))
(autoload 'po-mode "po-mode")
</PRE>
 
<P>
Later, whenever you edit some <TT>`.po'</TT>, <TT>`.pot'</TT> or <TT>`.pox'</TT>
file, or any file having the string <SAMP>`.po.'</SAMP> within its name,
Emacs loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and
automatically activates PO mode commands for the associated buffer.
The string <EM>PO</EM> appears in the mode line for any buffer for
which PO mode is active.  Many PO files may be active at once in a
single Emacs session.
 
<P>
If you are using Emacs version 20 or later, and have already installed
the appropriate international fonts on your system, you may also arrange
for these fonts to be automatically loaded and used for displaying
the translations on your Emacs screen, whenever necessary.  For this to
happen, you might want to add the lines:

<PRE>
(autoload 'po-find-file-coding-system "po-mode")
(modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\."
                            'po-find-file-coding-system)
</PRE>

<P>
to your <TT>`.emacs'</TT> file.
 
<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2>
<P>
A PO file is made up of many entries, each entry holding the relation
between an original untranslated string and its corresponding
translation.  All entries in a given PO file usually pertain
to a single project, and all translations are expressed in a single
target language.  One PO file <STRONG>entry</STRONG> has the following schematic
structure:

<PRE>
<VAR>white-space</VAR>
#  <VAR>translator-comments</VAR>
#. <VAR>automatic-comments</VAR>
#: <VAR>reference</VAR>...
#, <VAR>flag</VAR>...
msgid <VAR>untranslated-string</VAR>
msgstr <VAR>translated-string</VAR>
</PRE>
<P>
The general structure of a PO file should be well understood by
the translator.  When using PO mode, very little has to be known
about the format details, as PO mode takes care of them for her.
 
<P>
Entries begin with some optional white space.  Usually, when generated
through GNU <CODE>gettext</CODE> tools, there is exactly one blank line
between entries.  Then comments follow, on lines all starting with the
character <KBD>#</KBD>.  There are two kinds of comments: those which have
some white space immediately following the <KBD>#</KBD>, which are
created and maintained exclusively by the translator, and those which
have some non-white character just after the <KBD>#</KBD>, which
are created and maintained automatically by GNU <CODE>gettext</CODE> tools.
All comments, of either kind, are optional.
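<P>
The distinction between the comment kinds can be sketched in a few lines
of Python.  This is only an illustration, not part of GNU <CODE>gettext</CODE>:
the function name <CODE>classify_comment</CODE> and the returned labels are
made up for this example, and treating a bare <SAMP>`#'</SAMP> line as a
translator comment is an assumption.

<PRE>
```python
def classify_comment(line):
    """Classify one PO file line by the character that follows '#'."""
    if not line.startswith("#"):
        return "not-a-comment"
    marker = line[1:2]
    # White space (or nothing at all, by assumption) right after '#':
    # the comment belongs to the translator.
    if marker == "" or marker.isspace():
        return "translator"
    # A non-white character after '#': maintained by the gettext tools.
    kinds = {".": "automatic", ":": "reference", ",": "flag"}
    return kinds.get(marker, "other-tool-generated")


for sample in ["#  ask the author", "#. extracted note", "#: main.c:42", "#, fuzzy"]:
    print(sample, "=", classify_comment(sample))
```
</PRE>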
 
<P>
After white space and comments, entries show two strings, giving
first the untranslated string as it appears in the original program
sources, and then the translation of this string.  The original
string is introduced by the keyword <CODE>msgid</CODE>, and the translation
by <CODE>msgstr</CODE>.  The two strings, untranslated and translated,
are quoted in various ways in the PO file, using <KBD>"</KBD>
delimiters and <KBD>\</KBD> escapes, but the translator does not really
have to pay attention to the precise quoting format, as PO mode fully
intends to take care of quoting for her.
<P>
The <CODE>msgid</CODE> strings, as well as automatic comments, are produced
and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not
provide means for the translator to alter these.  The most she can
do is delete them, and only by deleting the whole entry.
On the other hand, the <CODE>msgstr</CODE> string, as well as translator
comments, are really meant for the translator, and PO mode gives her
the full control she needs.
<P>
The comment lines beginning with <KBD>#,</KBD> are special because they are
not completely ignored by the programs as comments generally are.  The
comma-separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE>
program to give the user better diagnostic messages.  Currently
there are two forms of flags defined:

<DL>

<DT><KBD>fuzzy</KBD>
<DD>
This flag can be generated by the <CODE>msgmerge</CODE> program or it can be
inserted by the translator herself.  It shows that the <CODE>msgstr</CODE>
string might not be a correct translation (anymore).  Only the translator
can judge if the translation requires further modification, or is
acceptable as is.  Once satisfied with the translation, she then removes
this <KBD>fuzzy</KBD> attribute.  The <CODE>msgmerge</CODE> program inserts this
flag when it has combined the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries after a fuzzy
search only.  See section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>.

<DT><KBD>c-format</KBD>
<DT><KBD>no-c-format</KBD>
<DD>
These flags should not be added by a human.  Instead, only the
<CODE>xgettext</CODE> program adds them.  In an automated PO file processing
system as proposed here, user changes would be thrown away again as
soon as the <CODE>xgettext</CODE> program generates a new template file.

<P>
When the <KBD>c-format</KBD> flag is given for a string, the <CODE>msgfmt</CODE>
program does some more tests to check the validity of the translation.
See section <A HREF="gettext_6.html#SEC33">Invoking the <CODE>msgfmt</CODE> Program</A>.

</DL>
<P>
It happens that some lines, usually whitespace or comments, follow the
very last entry of a PO file.  Such lines are not part of any entry,
and PO mode is unable to take action on those lines.  By using the
PO mode function <KBD>M-x po-normalize</KBD>, the translator may get
rid of those spurious lines.  See section <A HREF="gettext_2.html#SEC12">Normalizing Strings in Entries</A>.
<P>
The remainder of this section may be safely skipped by those using
PO mode, yet it may be interesting for everybody to have a better
idea of the precise format of a PO file.  On the other hand, those
not having GNU Emacs handy should read on carefully.
<P>
Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects
the C syntax for a character string, including the surrounding quotes
and embedded backslashed escape sequences.  When the time comes
to write multi-line strings, one should not use escaped newlines.
Instead, a closing quote should follow the last character on the
line to be continued, and an opening quote should resume the string
at the beginning of the following PO file line.  For example:

<PRE>
msgid ""
"Here is an example of how one might continue a very long string\n"
"for the common case the string represents multi-line output.\n"
</PRE>
 
<P>
In this example, the empty string is used on the first line, to
allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP>
over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>.  In this example, the
<CODE>msgid</CODE> keyword is followed by three strings, which are meant
to be concatenated.  Concatenating the empty string does not change
the resulting overall string, but it is a way for us to comply with
the necessity of <CODE>msgid</CODE> to be followed by a string on the same
line, while keeping the multi-line presentation left-justified, as
we find this to be a cleaner disposition.  The empty string could have
been omitted, but only if the string starting with <SAMP>`Here'</SAMP> was
moved up to the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> Nor was it really necessary
to switch between the last two quoted strings immediately after
the newline <SAMP>`\n'</SAMP>; the switch could have occurred after <EM>any</EM>
other character.  We just did it this way because it is neater.
 
<P>
One should carefully distinguish between end of lines marked as
<SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented
string, and end of lines in the PO file itself, outside string quotes,
which have no effect on the represented string.
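<P>
The way adjacent quoted strings combine can be sketched in Python as follows.
This is an illustration under stated assumptions, not the parser used by the
GNU <CODE>gettext</CODE> tools: the name <CODE>decode_po_string</CODE> is
hypothetical, and only four escapes (newline, tab, backslash and double quote)
are handled rather than the full C syntax.

<PRE>
```python
# Only a few common C escapes are handled; this table is an assumption
# made for the sketch, not the complete C escape syntax.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def decode_po_string(lines):
    """Concatenate the quoted pieces found on consecutive PO file lines."""
    result = []
    for line in lines:
        piece = line.strip()
        if len(piece) == 1 or not (piece.startswith('"') and piece.endswith('"')):
            raise ValueError("not a quoted string: " + line)
        body = piece[1:-1]
        i = 0
        while i != len(body):
            char = body[i]
            if char == "\\":
                # A backslash introduces an escape inside the quotes.
                i += 1
                if i == len(body) or body[i] not in ESCAPES:
                    raise ValueError("unsupported escape in: " + line)
                char = ESCAPES[body[i]]
            result.append(char)
            i += 1
    # Line breaks between the pieces never reach the result.
    return "".join(result)


text = decode_po_string([
    '""',
    '"Here is an example of how one might continue a very long string\\n"',
    '"for the common case the string represents multi-line output.\\n"',
])
print(repr(text))
```
</PRE>

<P>
Only the two <SAMP>`\n'</SAMP> escapes written inside the quotes end up as
newlines in the decoded text; the three-line layout of the input does not.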
 
<P>
Outside strings, blank lines and comments may be used freely.
Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend
until the end of the PO file line.  Comments written by translators
should have the initial <SAMP>`#'</SAMP> immediately followed by some white
space.  If the <SAMP>`#'</SAMP> is not immediately followed by white space,
this comment is most likely generated and managed by specialized GNU
tools, and might disappear or be replaced unexpectedly when the PO
file is given to <CODE>msgmerge</CODE>.
 
<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main PO mode Commands</A></H2>
<P>
After setting up Emacs with something similar to the lines in
section <A HREF="gettext_2.html#SEC8">Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a
PO file in that window.  This makes the window read-only and establishes
<CODE>po-mode-map</CODE>.  PO mode is a genuine Emacs mode, which is not derived
from text mode in any way.  Functions found on <CODE>po-mode-hook</CODE>,
if any, will be executed.
 
<P>
When PO mode is active in a window, the letters <SAMP>`PO'</SAMP> appear
in the mode line for that window.  The mode line also displays how
many entries of each kind are held in the PO file.  For example,
the string <SAMP>`132t+3f+10u+2o'</SAMP> would tell the translator that the
PO file contains 132 translated entries (see section <A HREF="gettext_5.html#SEC25">Translated Entries</A>),
3 fuzzy entries (see section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>), 10 untranslated entries
(see section <A HREF="gettext_5.html#SEC27">Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_5.html#SEC28">Obsolete Entries</A>).  Items with a zero count are not shown.  So, in this example, if
the fuzzy entries were unfuzzied, the untranslated entries were translated
and the obsolete entries were deleted, the mode line would merely display
<SAMP>`145t'</SAMP> for the counters.
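<P>
The counter format just described can be reproduced with a short Python
sketch.  The function name <CODE>mode_line_counters</CODE> is hypothetical and
the code is not taken from PO mode; returning an empty string when every
count is zero is an assumption.

<PRE>
```python
def mode_line_counters(translated, fuzzy, untranslated, obsolete):
    """Build a counter string such as '132t+3f+10u+2o', skipping zero counts."""
    items = [(translated, "t"), (fuzzy, "f"), (untranslated, "u"), (obsolete, "o")]
    return "+".join(str(count) + letter for count, letter in items if count != 0)


print(mode_line_counters(132, 3, 10, 2))
# Unfuzzy 3 entries, translate 10 more and delete the obsolete ones:
print(mode_line_counters(132 + 3 + 10, 0, 0, 0))
```
</PRE>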
 
<P>
The main PO commands are those which do not fit into the other categories of
subsequent sections.  These allow for quitting PO mode or for managing windows
in special ways.

<DL>
<DT><KBD>U</KBD>
<DD>Undo last modification to the PO file.
<DT><KBD>Q</KBD>
<DD>Quit processing and save the PO file.
<DT><KBD>q</KBD>
<DD>Quit processing, possibly after confirmation.
<DT><KBD>O</KBD>
<DD>Temporarily leave the PO file window.
<DT><KBD>h</KBD>
<DT><KBD>?</KBD>
<DD>Show help about PO mode.
<DT><KBD>=</KBD>
<DD>Give some PO file statistics.
<DT><KBD>V</KBD>
<DD>Batch validate the format of the whole PO file.
</DL>
 
<P>
The command <KBD>U</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs
<EM>undo</EM> facility.  See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>.  Each time <KBD>U</KBD> is typed, modifications which the translator
made to the PO file are undone a little more.  For the purpose of
undoing, each PO mode command is atomic.  This is especially true for
the <KBD><KBD>RET</KBD></KBD> command: the whole edit made by a single
use of this command is undone at once, even if the edit itself
involved several actions.  However, while in the editing window, one
can undo the editing work in much finer steps.
 
<P>
The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD>
(<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the
PO file.  The former is a bit less verbose than the latter.  If the file
has been modified, it is saved to disk first.  In both cases, and prior to
all this, the commands check if some untranslated message remains in the
PO file and, if so, the translator is asked if she really wants to stop
working with this PO file.  This is the preferred way of getting rid
of an Emacs PO file buffer.  Merely killing it through the usual command
<KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed.
 
<P>
The command <KBD>O</KBD> (<CODE>po-other-window</CODE>) is another, softer way
to leave PO mode, temporarily.  It just moves the cursor to some other
Emacs window, and pops one up if necessary.  For example, if the translator
just got PO mode to show some source context in some other window, she might
discover some apparent bug in the program source that needs correction.
This command allows the translator to switch roles, become a programmer,
and have the cursor right in the window containing the program she
wants to modify.  By later getting the cursor back
in the PO file window, or by asking Emacs to edit this file once again,
PO mode is then recovered.
 
<P>
The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO
mode commands.  The translator should then type any character to resume
normal PO mode operations.  The command <KBD>?</KBD> has the same effect
as <KBD>h</KBD>.
 
<P>
The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of
entries in the PO file, the ordinal of the current entry (counted from
1), the number of untranslated entries, the number of obsolete entries,
and displays all these numbers.
 
<P>
The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in verbose
mode over the current PO file.  This command first offers to save the
current PO file on disk.  The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>,
has the purpose of creating a MO file out of a PO file, and PO mode uses
the features of this program for checking the overall format of a PO file,
as well as all individual entries.
 
<P>
The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the
translator regains control immediately while her PO file is being studied.
Error output is collected in the GNU Emacs <SAMP>`*compilation*'</SAMP> buffer,
displayed in another window.  The regular GNU Emacs command <KBD>C-x`</KBD>
(<CODE>next-error</CODE>), as well as other usual compile commands, allow the
translator to reposition quickly to the offending parts of the PO file.
Once the cursor is on the line in error, the translator may decide on
any PO mode action which would help correct the error.
 
<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2>
<P>
The cursor in a PO file window is almost always part of
an entry.  The only exceptions are the special case when the cursor
is after the last entry in the file, or when the PO file is
empty.  The entry where the cursor is found is said to be the
current entry.  Many PO mode commands operate on the current entry,
so moving the cursor does more than allow the translator to browse
the PO file: it also selects the entry on which commands operate.
 
<P>
Some PO mode commands alter the position of the cursor in a specialized
way.  A few of these special-purpose positioning commands are described here;
the others are described in following sections.

<DL>
<DT><KBD>.</KBD>
<DD>Redisplay the current entry.
<DT><KBD>n</KBD>
<DD>Select the entry after the current one.
<DT><KBD>p</KBD>
<DD>Select the entry before the current one.
<DT><KBD>&lt;</KBD>
<DD>Select the first entry in the PO file.
<DT><KBD>&gt;</KBD>
<DD>Select the last entry in the PO file.
<DT><KBD>m</KBD>
<DD>Record the location of the current entry for later use.
<DT><KBD>r</KBD>
<DD>Return to a previously saved entry location.
<DT><KBD>x</KBD>
<DD>Exchange the current entry location with the previously saved one.
</DL>
 
<P>
Any GNU Emacs command able to reposition the cursor may be used
to select the current entry in PO mode, including commands which
move by characters, lines, paragraphs, screens or pages, and search
commands.  However, there is a kind of standard way to display the
current entry in PO mode, which the usual GNU Emacs cursor-movement
commands do not especially try to enforce.  The command <KBD>.</KBD>
(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the
current entry properly, after the current entry has been changed by
means external to PO mode, or the Emacs screen otherwise altered.
 
<P>
It is yet to be decided if PO mode helps the translator, or otherwise
irritates her, by forcing a rigid window disposition while she
is doing her work.  We originally had quite precise ideas about
how windows should behave, but on the other hand, anyone used to
GNU Emacs is often happy to keep full control.  Maybe a fixed window
disposition might be offered as a PO mode option that the translator
might activate or deactivate at will, so it could be offered on an
experimental basis.  If nobody feels a real need for using it, or
a compulsion for writing it, we should drop this whole idea.
The incentive for doing it should come from translators rather than
programmers, as opinions from an experienced translator are surely
worth more to me than opinions from programmers <EM>thinking</EM> about
how <EM>others</EM> should do translation.
 
<P>
The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD>
(<CODE>po-previous-entry</CODE>) move the cursor to the entry following,
or preceding, the current one.  If <KBD>n</KBD> is given while the
cursor is on the last entry of the PO file, or if <KBD>p</KBD>
is given while the cursor is on the first entry, no move is done.
 
<P>
The commands <KBD>&lt;</KBD> (<CODE>po-first-entry</CODE>) and <KBD>&gt;</KBD>
(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last
entry, of the PO file.  When the cursor is located past the last
entry in a PO file, most PO mode commands will return an error saying
<SAMP>`After last entry'</SAMP>.  However, the commands <KBD>&lt;</KBD> and <KBD>&gt;</KBD>
have the special property of being able to work even when the cursor
is not inside any PO file entry, and one may use them for neatly
correcting this situation.  But even these commands will fail on a
truly empty PO file.  There are development plans for PO mode
to interactively fill an empty PO file from sources.  See section <A HREF="gettext_3.html#SEC16">Marking Translatable Strings</A>.
 
<P>
The translator may decide, before working at the translation of
a particular entry, that she needs to browse the remainder of the
PO file, maybe for finding the terminology or phraseology used
in related entries.  She can of course use the standard Emacs idioms
for saving the current cursor location in some register, and use that
register for getting back, or else, use the location ring.
 
<P>
PO mode offers another approach, by which cursor locations may be saved
onto a special stack.  The command <KBD>m</KBD> (<CODE>po-push-location</CODE>)
merely adds the location of the current entry to the stack, pushing
the already saved locations under the new one.  The command
<KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and
repositions the cursor to the entry associated with that top element.
This position is then lost, for the next <KBD>r</KBD> will move the cursor
to the previously saved location, and so on until no locations remain
on the stack.
 
<P>
If the translator wants the position to be kept on the location stack,
maybe for taking a look at the entry associated with the top
element, then go elsewhere with the intent of getting back later, she
ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>.
 
<P>
The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously
repositions the cursor to the entry associated with the top element of
the stack of saved locations, and replaces that top element with the
location of the current entry before the move.  Consequently, repeating
the <KBD>x</KBD> command alternates between two entries.
To achieve this, the translator will position the cursor on the
first entry, use <KBD>m</KBD>, then position to the second entry, and
merely use <KBD>x</KBD> for making the switch.
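<P>
The behavior of <KBD>m</KBD>, <KBD>r</KBD> and <KBD>x</KBD> can be modeled
with a small Python class.  This is a sketch of the semantics described above,
not PO mode's own code: the class <CODE>LocationStack</CODE> and its method
names are hypothetical, locations are plain strings here, and raising an
error on an empty stack is an assumption.

<PRE>
```python
class LocationStack:
    """Model of the saved-location stack driven by m, r and x."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """Like m: save the current entry location on top of the stack."""
        self._saved.append(current)

    def pop(self):
        """Like r: remove the top location and return it as the new position."""
        if not self._saved:
            raise IndexError("no saved location")
        return self._saved.pop()

    def exchange(self, current):
        """Like x: return the top location and store the current one in its place."""
        if not self._saved:
            raise IndexError("no saved location")
        target = self._saved[-1]
        self._saved[-1] = current
        return target


stack = LocationStack()
stack.push("entry 12")        # m while on the first entry
here = "entry 40"             # the translator moves to the second entry
here = stack.exchange(here)   # x: cursor on "entry 12", "entry 40" is saved
here = stack.exchange(here)   # x again: cursor back on "entry 40"
print(here)
```
</PRE>

<P>
Because <CODE>exchange</CODE> never changes the depth of the stack, repeating
it keeps alternating between the same two locations, whereas each
<CODE>pop</CODE> discards the location it returns.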
 
<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2>
<P>
There are many different ways for encoding a particular string into a
PO file entry, because there are so many different ways to split and
quote multi-line strings, and even to represent special characters
by backslashed escape sequences.  Some features of PO mode rely on
the ability for PO mode to scan an already existing PO file for a
particular string encoded into the <CODE>msgid</CODE> field of some entry.
Even if PO mode has internally all the built-in machinery for
implementing this recognition easily, doing it fast is technically
difficult.  To facilitate a solution to this efficiency problem,
we decided on a canonical representation for strings.
 
<P>
A conventional representation of strings in a PO file is currently
under discussion, and PO mode experiments with a canonical representation.
Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform
way of representing equivalent strings would be useful, as the internal
normalization needed by PO mode could be automatically satisfied
when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>.  An explicit
PO mode normalization should then be only necessary for PO files
imported from elsewhere, or for when the convention itself evolves.
 
<P>
So, for achieving normalization of at least the strings of a given
PO file needing a canonical representation, the following PO mode
command is available:

<DL>
<DT><KBD>M-x po-normalize</KBD>
<DD>Tidy the whole PO file by making entries more uniform.
</DL>
 
<P>
The special command <KBD>M-x po-normalize</KBD>, which has no associated
key, revises all entries, ensuring that strings of both original
and translated entries use uniform internal quoting in the PO file.
It also removes any stray lines after the last entry.  This command may be
useful for PO files freshly imported from elsewhere, or if we ever
improve on the canonical quoting format we use.  This canonical format
is not only meant for getting cleaner PO files, but also for greatly
speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands.
 
<P>
<KBD>M-x po-normalize</KBD> presently makes three passes over the entries.
The first implements heuristics for converting PO files for GNU
<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE>
fields were using K&amp;R style C string syntax for multi-line strings.
These heuristics may fail for comments not related to obsolete
entries and ending with a backslash; they also depend on subsequent
passes for finalizing the proper commenting of continued lines for
obsolete entries.  This first pass might disappear once all old PO
files have been adjusted.  The second and third passes normalize
all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively.  They also
clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE>
for continued lines.
<P>
Having such an explicit normalizing command allows for importing PO
files from other sources, but also eases the evolution of the current
convention, an evolution driven mostly by aesthetic concerns, as of now.
It is easy to make suggested adjustments at a later time, as the
normalizing command and, eventually, other GNU <CODE>gettext</CODE> tools
should greatly automate conformance.  A description of the canonical
string format is given below, for the particular benefit of those not
having GNU Emacs handy, and who would nevertheless want to handcraft
their PO files in nice ways.
 
<P>
Right now, in PO mode, strings are single line or multi-line.  A string
goes multi-line if and only if it has <EM>embedded</EM> newlines, that
is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>.  So, we would have:

<PRE>
msgstr "\n\nHello, world!\n\n\n"
</PRE>

<P>
but, replacing the space by a newline, this becomes:

<PRE>
msgstr ""
"\n"
"\n"
"Hello,\n"
"world!\n"
"\n"
"\n"
</PRE>
 
<P>
We are deliberately using an exaggerated example, here, to make the
point clearer.  Usually, multi-line strings do not look that bad.
It is probable that we will implement the following suggestion.
We might lump together all initial newlines into the empty string,
and also all newlines introducing empty lines (that is, for <VAR>n</VAR>
&gt; 1, the last <VAR>n</VAR>-1 newlines would go together on a separate
string), so making the previous example appear:

<PRE>
msgstr "\n\n"
"Hello,\n"
"world!\n"
"\n\n"
</PRE>
 
<P>
There are a few yet undecided little points about string normalization,
to be documented in this manual, once these questions settle.
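<P>
The multi-line rule stated earlier in this section, a match of
<SAMP>`[^\n]\n+[^\n]'</SAMP>, can be tried out with Python's <CODE>re</CODE>
module.  The sketch below is not PO mode's implementation: the function names
are hypothetical, and splitting a multi-line string right after each newline
is an assumption about the layout, not a statement of the final convention.

<PRE>
```python
import re

# A newline run with a non-newline character on both sides.
EMBEDDED_NEWLINE = re.compile(r"[^\n]\n+[^\n]")


def is_multi_line(text):
    """Tell whether the string has embedded newlines."""
    return EMBEDDED_NEWLINE.search(text) is not None


def split_after_newlines(text):
    """Cut the string right after each newline (assumed layout)."""
    return re.findall(r"[^\n]*\n|[^\n]+", text)


print(is_multi_line("\n\nHello, world!\n\n\n"))
print(is_multi_line("\n\nHello,\nworld!\n\n\n"))
print(split_after_newlines("\n\nHello,\nworld!\n\n\n"))
```
</PRE>

<P>
Leading and trailing newlines alone never make a string multi-line under this
rule, since the pattern needs an ordinary character on each side of the
newline run.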
 