| 1 | <HTML> |
| 2 | <HEAD> |
| 3 | <!-- This HTML file has been created by texi2html 1.54 |
| 4 | from gettext.texi on 25 January 1999 --> |
| 5 | |
| 6 | <TITLE>GNU gettext utilities - PO Files and PO Mode Basics</TITLE> |
| 7 | <link href="gettext_3.html" rel=Next> |
| 8 | <link href="gettext_1.html" rel=Previous> |
| 9 | <link href="gettext_toc.html" rel=ToC> |
| 10 | |
| 11 | </HEAD> |
| 12 | <BODY> |
| 13 | <p>Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_12.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>. |
| 14 | <P><HR><P> |
| 15 | |
| 16 | |
| 17 | <H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1> |
| 18 | |
| 19 | <P> |
| 20 | The GNU <CODE>gettext</CODE> toolset helps programmers and translators |
| 21 | produce, update and use translation files, mainly PO files, which |
| 22 | are textual, editable files. This chapter focuses on |
| 23 | the format of PO files, and contains a PO mode starter. The description |
| 24 | of PO mode is spread throughout this manual instead of being concentrated |
| 25 | in one place. Here we present only the basics of PO mode. |
| 26 | |
| 27 | </P> |
| 28 | |
| 29 | |
| 30 | |
| 31 | <H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2> |
| 32 | |
| 33 | <P> |
| 34 | Once you have received, unpacked, configured and compiled the GNU |
| 35 | <CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in |
| 36 | place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and |
| 37 | <CODE>msgmerge</CODE>, as well as their available message catalogs. To |
| 38 | top off a comfortable installation, you might also want to make the |
| 39 | PO mode available to your GNU Emacs users. |
| 40 | |
| 41 | </P> |
| 42 | <P> |
| 43 | During the installation of the PO mode, you might want to modify your |
| 44 | file <TT>`.emacs'</TT>, once and for all, so it contains a few lines looking |
| 45 | like: |
| 46 | |
| 47 | </P> |
| 48 | |
| 49 | <PRE> |
| 50 | (setq auto-mode-alist |
| 51 | (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist)) |
| 52 | (autoload 'po-mode "po-mode") |
| 53 | </PRE> |
| 54 | |
| 55 | <P> |
| 56 | Later, whenever you edit some <TT>`.po'</TT>, <TT>`.pot'</TT> or <TT>`.pox'</TT> |
| 57 | file, or any file having the string <SAMP>`.po.'</SAMP> within its name, |
| 58 | Emacs loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and |
| 59 | automatically activates PO mode commands for the associated buffer. |
| 60 | The string <EM>PO</EM> appears in the mode line for any buffer for |
| 61 | which PO mode is active. Many PO files may be active at once in a |
| 62 | single Emacs session. |
| 63 | |
| 64 | </P> |
| 65 | <P> |
| 66 | If you are using Emacs version 20 or later, and have already installed |
| 67 | the appropriate international fonts on your system, you may also arrange |
| 68 | for these fonts to be automatically loaded and used for displaying |
| 69 | the translations on your Emacs screen, whenever necessary. For this to |
| 70 | happen, you might want to add the lines: |
| 71 | |
| 72 | </P> |
| 73 | |
| 74 | <PRE> |
| 75 | (autoload 'po-find-file-coding-system "po-mode") |
| 76 | (modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\." |
| 77 | 'po-find-file-coding-system) |
| 78 | </PRE> |
| 79 | |
| 80 | <P> |
| 81 | to your <TT>`.emacs'</TT> file. |
| 82 | |
| 83 | </P> |
| 84 | |
| 85 | |
| 86 | <H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2> |
| 87 | |
| 88 | <P> |
| 89 | A PO file is made up of many entries, each entry holding the relation |
| 90 | between an original untranslated string and its corresponding |
| 91 | translation. All entries in a given PO file usually pertain |
| 92 | to a single project, and all translations are expressed in a single |
| 93 | target language. One PO file <STRONG>entry</STRONG> has the following schematic |
| 94 | structure: |
| 95 | |
| 96 | </P> |
| 97 | |
| 98 | <PRE> |
| 99 | <VAR>white-space</VAR> |
| 100 | # <VAR>translator-comments</VAR> |
| 101 | #. <VAR>automatic-comments</VAR> |
| 102 | #: <VAR>reference</VAR>... |
| 103 | #, <VAR>flag</VAR>... |
| 104 | msgid <VAR>untranslated-string</VAR> |
| 105 | msgstr <VAR>translated-string</VAR> |
| 106 | </PRE> |
| 107 | |
| 108 | <P> |
| 109 | The general structure of a PO file should be well understood by |
| 110 | the translator. When using PO mode, very little has to be known |
| 111 | about the format details, as PO mode takes care of them for her. |
| 112 | |
| 113 | </P> |
| 114 | <P> |
| 115 | Entries begin with some optional white space. Usually, when generated |
| 116 | through GNU <CODE>gettext</CODE> tools, there is exactly one blank line |
| 117 | between entries. Then comments follow, on lines all starting with the |
| 118 | character <KBD>#</KBD>. There are two kinds of comments: those which have |
| 119 | some white space immediately following the <KBD>#</KBD>, which are |
| 120 | created and maintained exclusively by the translator, and those which |
| 121 | have some non-white character just after the <KBD>#</KBD>, which |
| 122 | are created and maintained automatically by GNU <CODE>gettext</CODE> tools. |
| 123 | All comments, of either kind, are optional. |
| 124 | |
| 125 | </P> |
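<P>
The distinction between the two kinds of comments can be sketched in a few
lines of Python. This is only an illustration, not part of GNU
<CODE>gettext</CODE>: the function name <CODE>classify_comment</CODE> and the
returned labels are hypothetical, and treating a bare <KBD>#</KBD> as a
translator comment is an assumption of this sketch.
</P>

```python
def classify_comment(line):
    """Return the kind of a PO comment line, or None if it is not a comment."""
    if not line.startswith("#"):
        return None
    # '#' followed by white space belongs to the translator.
    # Treating a bare '#' the same way is an assumption of this sketch.
    if len(line) == 1 or line[1].isspace():
        return "translator"
    # Markers taken from the schematic entry structure shown earlier.
    kinds = {".": "automatic", ":": "reference", ",": "flag"}
    # Any other non-white character is still a tool-managed comment.
    return kinds.get(line[1], "other-automatic")


if __name__ == "__main__":
    for sample in ["# checked twice", "#. note for translators",
                   "#: src/main.c:42", "#, fuzzy"]:
        print(sample, "->", classify_comment(sample))
```

<P>
Only the first character after the <KBD>#</KBD> is inspected, which mirrors
the rule stated above.
</P>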
| 126 | <P> |
| 127 | After white space and comments, entries show two strings, giving |
| 128 | first the untranslated string as it appears in the original program |
| 129 | sources, and then, the translation of this string. The original |
| 130 | string is introduced by the keyword <CODE>msgid</CODE>, and the translation, |
| 131 | by <CODE>msgstr</CODE>. The two strings, untranslated and translated, |
| 132 | are quoted in various ways in the PO file, using <KBD>"</KBD> |
| 133 | delimiters and <KBD>\</KBD> escapes, but the translator does not really |
| 134 | have to pay attention to the precise quoting format, as PO mode fully |
| 135 | intends to take care of quoting for her. |
| 136 | |
| 137 | </P> |
| 138 | <P> |
| 139 | The <CODE>msgid</CODE> strings, as well as automatic comments, are produced |
| 140 | and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not |
| 141 | provide means for the translator to alter these. The most she can |
| 142 | do is delete them, and only by deleting the whole entry. |
| 143 | On the other hand, the <CODE>msgstr</CODE> string, as well as translator |
| 144 | comments, are really meant for the translator, and PO mode gives her |
| 145 | the full control she needs. |
| 146 | |
| 147 | </P> |
| 148 | <P> |
| 149 | The comment lines beginning with <KBD>#,</KBD> are special because they are |
| 150 | not completely ignored by the programs as comments generally are. The |
| 151 | comma-separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE> |
| 152 | program to give the user some better diagnostic messages. Currently |
| 153 | there are two forms of flags defined: |
| 154 | |
| 155 | </P> |
| 156 | <DL COMPACT> |
| 157 | |
| 158 | <DT><KBD>fuzzy</KBD> |
| 159 | <DD> |
| 160 | This flag can be generated by the <CODE>msgmerge</CODE> program or it can be |
| 161 | inserted by the translator herself. It shows that the <CODE>msgstr</CODE> |
| 162 | string might not be a correct translation (anymore). Only the translator |
| 163 | can judge if the translation requires further modification, or is |
| 164 | acceptable as is. Once satisfied with the translation, she then removes |
| 165 | this <KBD>fuzzy</KBD> attribute. The <CODE>msgmerge</CODE> program inserts this |
| 166 | flag when it has combined the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries after a fuzzy |
| 167 | search only. See section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>. |
| 168 | |
| 169 | <DT><KBD>c-format</KBD> |
| 170 | <DD> |
| 171 | <DT><KBD>no-c-format</KBD> |
| 172 | <DD> |
| 173 | These flags should not be added by a human. Instead, only the |
| 174 | <CODE>xgettext</CODE> program adds them. In an automated PO file processing |
| 175 | system as proposed here, the user's changes would be thrown away again as |
| 176 | soon as the <CODE>xgettext</CODE> program generates a new template file. |
| 177 | |
| 178 | If the <KBD>c-format</KBD> flag is given for a string, the <CODE>msgfmt</CODE> |
| 179 | program does some more tests to check the validity of the translation. |
| 180 | See section <A HREF="gettext_6.html#SEC33">Invoking the <CODE>msgfmt</CODE> Program</A>. |
| 181 | |
| 182 | </DL> |
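<P>
As a rough illustration of how a tool might read such a flag line, here is a
small Python sketch. The helper names <CODE>parse_flags</CODE> and
<CODE>needs_review</CODE> are hypothetical and do not correspond to anything
in <CODE>msgfmt</CODE> or <CODE>msgmerge</CODE>; the sketch only assumes the
comma-separated layout described above.
</P>

```python
def parse_flags(line):
    """Return the set of flags found on a '#,' comment line."""
    if not line.startswith("#,"):
        return set()
    # Flags are separated by commas; surrounding blanks are not significant here.
    return {flag.strip() for flag in line[2:].split(",") if flag.strip()}


def needs_review(line):
    """True when the entry carries the fuzzy flag."""
    return "fuzzy" in parse_flags(line)


if __name__ == "__main__":
    print(sorted(parse_flags("#, fuzzy, c-format")))
    print(needs_review("#, no-c-format"))
```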
| 183 | |
| 184 | <P> |
| 185 | It happens that some lines, usually whitespace or comments, follow the |
| 186 | very last entry of a PO file. Such lines are not part of any entry, |
| 187 | and PO mode is unable to take action on those lines. By using the |
| 188 | PO mode function <KBD>M-x po-normalize</KBD>, the translator may get |
| 189 | rid of those spurious lines. See section <A HREF="gettext_2.html#SEC12">Normalizing Strings in Entries</A>. |
| 190 | |
| 191 | </P> |
| 192 | <P> |
| 193 | The remainder of this section may be safely skipped by those using |
| 194 | PO mode, yet it may be interesting for everybody to have a better |
| 195 | idea of the precise format of a PO file. On the other hand, those |
| 196 | not having GNU Emacs handy should continue reading carefully. |
| 197 | |
| 198 | </P> |
| 199 | <P> |
| 200 | Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects |
| 201 | the C syntax for a character string, including the surrounding quotes |
| 202 | and embedded backslashed escape sequences. When the time comes |
| 203 | to write multi-line strings, one should not use escaped newlines. |
| 204 | Instead, a closing quote should follow the last character on the |
| 205 | line to be continued, and an opening quote should resume the string |
| 206 | at the beginning of the following PO file line. For example: |
| 207 | |
| 208 | </P> |
| 209 | |
| 210 | <PRE> |
| 211 | msgid "" |
| 212 | "Here is an example of how one might continue a very long string\n" |
| 213 | "for the common case the string represents multi-line output.\n" |
| 214 | </PRE> |
| 215 | |
| 216 | <P> |
| 217 | In this example, the empty string is used on the first line, to |
| 218 | allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP> |
| 219 | over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>. In this example, the |
| 220 | <CODE>msgid</CODE> keyword is followed by three strings, which are meant |
| 221 | to be concatenated. Concatenating the empty string does not change |
| 222 | the resulting overall string, but it is a way for us to comply with |
| 223 | the necessity of <CODE>msgid</CODE> to be followed by a string on the same |
| 224 | line, while keeping the multi-line presentation left-justified, as |
| 225 | we find this to be a cleaner layout. The empty string could have |
| 226 | been omitted, but only if the string starting with <SAMP>`Here'</SAMP> was |
| 227 | moved up to the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> It was not really necessary |
| 228 | either to switch between the two last quoted strings immediately after |
| 229 | the newline <SAMP>`\n'</SAMP>; the switch could have occurred after <EM>any</EM> |
| 230 | other character. We just did it this way because it is neater. |
| 231 | |
| 232 | </P> |
| 233 | <P> |
| 234 | One should carefully distinguish between end of lines marked as |
| 235 | <SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented |
| 236 | string, and end of lines in the PO file itself, outside string quotes, |
| 237 | which have no effect on the represented string. |
| 238 | |
| 239 | </P> |
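<P>
The concatenation rule can be made concrete with a short Python sketch that
turns the PO file lines of one field into the string they represent. The
function name <CODE>decode_po_string</CODE> is hypothetical, and the sketch
handles only a small subset of C escapes (<SAMP>`\n'</SAMP>,
<SAMP>`\t'</SAMP>, <SAMP>`\"'</SAMP> and <SAMP>`\\'</SAMP>); it is not how
<CODE>msgfmt</CODE> itself parses strings.
</P>

```python
import re

# Subset of C escapes understood by this sketch (an assumption, not the full C syntax).
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}
# A quoted string: any run of ordinary characters or backslash escapes.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def decode_po_string(lines):
    """Concatenate every quoted string found on the given PO file lines."""
    parts = []
    for line in lines:
        for body in _QUOTED.findall(line):
            parts.append(re.sub(r"\\(.)",
                                lambda m: _ESCAPES.get(m.group(1), m.group(1)),
                                body))
    return "".join(parts)


if __name__ == "__main__":
    entry = ['msgid ""', '"first line\\n"', '"second line\\n"']
    print(repr(decode_po_string(entry)))
```

<P>
Line breaks between the quoted pieces never reach the result; only the
<SAMP>`\n'</SAMP> escapes inside the quotes do.
</P>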
| 240 | <P> |
| 241 | Outside strings, white lines and comments may be used freely. |
| 242 | Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend |
| 243 | until the end of the PO file line. Comments written by translators |
| 244 | should have the initial <SAMP>`#'</SAMP> immediately followed by some white |
| 245 | space. If the <SAMP>`#'</SAMP> is not immediately followed by white space, |
| 246 | this comment is most likely generated and managed by specialized GNU |
| 247 | tools, and might disappear or be replaced unexpectedly when the PO |
| 248 | file is given to <CODE>msgmerge</CODE>. |
| 249 | |
| 250 | </P> |
| 251 | |
| 252 | |
| 253 | <H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main PO mode Commands</A></H2> |
| 254 | |
| 255 | <P> |
| 256 | After setting up Emacs with something similar to the lines in |
| 257 | section <A HREF="gettext_2.html#SEC8">Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a |
| 258 | PO file in that window. This makes the window read-only and establishes a |
| 259 | po-mode-map. PO mode is a genuine Emacs mode, and is not derived |
| 260 | from text mode in any way. Functions found on <CODE>po-mode-hook</CODE>, |
| 261 | if any, will be executed. |
| 262 | |
| 263 | </P> |
| 264 | <P> |
| 265 | When PO mode is active in a window, the letters <SAMP>`PO'</SAMP> appear |
| 266 | in the mode line for that window. The mode line also displays how |
| 267 | many entries of each kind are held in the PO file. For example, |
| 268 | the string <SAMP>`132t+3f+10u+2o'</SAMP> would tell the translator that the |
| 269 | PO file contains 132 translated entries (see section <A HREF="gettext_5.html#SEC25">Translated Entries</A>), |
| 270 | 3 fuzzy entries (see section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>), 10 untranslated entries |
| 271 | (see section <A HREF="gettext_5.html#SEC27">Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_5.html#SEC28">Obsolete Entries</A>). Items with a zero count are not shown. So, in this example, if |
| 272 | the fuzzy entries were unfuzzied, the untranslated entries were translated |
| 273 | and the obsolete entries were deleted, the mode line would merely display |
| 274 | <SAMP>`145t'</SAMP> for the counters. |
| 275 | |
| 276 | </P> |
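<P>
The way the counter string is assembled can be sketched as follows. This is
an illustration in Python rather than the Emacs Lisp used by PO mode, and
the function name <CODE>mode_line_counters</CODE> is hypothetical.
</P>

```python
def mode_line_counters(translated, fuzzy, untranslated, obsolete):
    """Build a counter string such as '132t+3f+10u+2o', skipping zero counts."""
    items = [(translated, "t"), (fuzzy, "f"), (untranslated, "u"), (obsolete, "o")]
    return "+".join(f"{count}{suffix}" for count, suffix in items if count)


if __name__ == "__main__":
    print(mode_line_counters(132, 3, 10, 2))
    # After unfuzzying 3 entries, translating 10 and deleting the 2 obsolete ones:
    print(mode_line_counters(132 + 3 + 10, 0, 0, 0))
```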
| 277 | <P> |
| 278 | The main PO commands are those which do not fit into the other categories of |
| 279 | subsequent sections. These allow for quitting PO mode or for managing windows |
| 280 | in special ways. |
| 281 | |
| 282 | </P> |
| 283 | <DL COMPACT> |
| 284 | |
| 285 | <DT><KBD>U</KBD> |
| 286 | <DD> |
| 287 | Undo last modification to the PO file. |
| 288 | |
| 289 | <DT><KBD>Q</KBD> |
| 290 | <DD> |
| 291 | Quit processing and save the PO file. |
| 292 | |
| 293 | <DT><KBD>q</KBD> |
| 294 | <DD> |
| 295 | Quit processing, possibly after confirmation. |
| 296 | |
| 297 | <DT><KBD>O</KBD> |
| 298 | <DD> |
| 299 | Temporarily leave the PO file window. |
| 300 | |
| 301 | <DT><KBD>?</KBD> |
| 302 | <DD> |
| 303 | <DT><KBD>h</KBD> |
| 304 | <DD> |
| 305 | Show help about PO mode. |
| 306 | |
| 307 | <DT><KBD>=</KBD> |
| 308 | <DD> |
| 309 | Give some PO file statistics. |
| 310 | |
| 311 | <DT><KBD>V</KBD> |
| 312 | <DD> |
| 313 | Batch validate the format of the whole PO file. |
| 314 | |
| 315 | </DL> |
| 316 | |
| 317 | <P> |
| 318 | The command <KBD>U</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs |
| 319 | <EM>undo</EM> facility. See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>. Each time <KBD>U</KBD> is typed, modifications which the translator |
| 320 | made to the PO file are undone a little more. For the purpose of |
| 321 | undoing, each PO mode command is atomic. This is especially true for |
| 322 | the <KBD>RET</KBD> command: the whole edit made by a single |
| 323 | use of this command is undone at once, even if the edit itself |
| 324 | implied several actions. However, while in the editing window, one |
| 325 | can undo the editing work in much finer steps. |
| 326 | |
| 327 | </P> |
| 328 | <P> |
| 329 | The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD> |
| 330 | (<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the |
| 331 | PO file. The former is a bit less verbose than the latter. If the file |
| 332 | has been modified, it is saved to disk first. In both cases, and prior to |
| 333 | all this, the commands check if some untranslated message remains in the |
| 334 | PO file and, if yes, the translator is asked if she really wants to leave |
| 335 | off working with this PO file. This is the preferred way of getting rid |
| 336 | of an Emacs PO file buffer. Merely killing it through the usual command |
| 337 | <KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed. |
| 338 | |
| 339 | </P> |
| 340 | <P> |
| 341 | The command <KBD>O</KBD> (<CODE>po-other-window</CODE>) is another, softer way |
| 342 | to leave PO mode, temporarily. It just moves the cursor to some other |
| 343 | Emacs window, and pops one up if necessary. For example, if the translator |
| 344 | just got PO mode to show some source context in some other window, she might |
| 345 | discover some apparent bug in the program source that needs correction. |
| 346 | This command allows the translator to switch roles, become a programmer, |
| 347 | and have the cursor right in the window containing the program she |
| 348 | wants to modify. By later getting the cursor back |
| 349 | in the PO file window, or by asking Emacs to edit this file once again, |
| 350 | PO mode is then recovered. |
| 351 | |
| 352 | </P> |
| 353 | <P> |
| 354 | The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO |
| 355 | mode commands. The translator should then type any character to resume |
| 356 | normal PO mode operations. The command <KBD>?</KBD> has the same effect |
| 357 | as <KBD>h</KBD>. |
| 358 | |
| 359 | </P> |
| 360 | <P> |
| 361 | The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of |
| 362 | entries in the PO file, the ordinal of the current entry (counted from |
| 363 | 1), the number of untranslated entries, the number of obsolete entries, |
| 364 | and displays all these numbers. |
| 365 | |
| 366 | </P> |
| 367 | <P> |
| 368 | The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in verbose |
| 369 | mode over the current PO file. This command first offers to save the |
| 370 | current PO file on disk. The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>, |
| 371 | has the purpose of creating a MO file out of a PO file, and PO mode uses |
| 372 | the features of this program for checking the overall format of a PO file, |
| 373 | as well as all individual entries. |
| 374 | |
| 375 | </P> |
| 376 | <P> |
| 377 | The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the |
| 378 | translator regains control immediately while her PO file is being studied. |
| 379 | Error output is collected in the GNU Emacs <SAMP>`*compilation*'</SAMP> buffer, |
| 380 | displayed in another window. The regular GNU Emacs command <KBD>C-x`</KBD> |
| 381 | (<CODE>next-error</CODE>), as well as other usual compile commands, allow the |
| 382 | translator to reposition quickly to the offending parts of the PO file. |
| 383 | Once the cursor is on the line in error, the translator may decide on |
| 384 | any PO mode action which would help correct the error. |
| 385 | |
| 386 | </P> |
| 387 | |
| 388 | |
| 389 | <H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2> |
| 390 | |
| 391 | <P> |
| 392 | The cursor in a PO file window is almost always part of |
| 393 | an entry. The only exceptions are the special case when the cursor |
| 394 | is after the last entry in the file, or when the PO file is |
| 395 | empty. The entry containing the cursor is called the |
| 396 | current entry. Many PO mode commands operate on the current entry, |
| 397 | so moving the cursor does more than allow the translator to browse |
| 398 | the PO file; it also selects the entry on which commands operate. |
| 399 | |
| 400 | </P> |
| 401 | <P> |
| 402 | Some PO mode commands alter the position of the cursor in a specialized |
| 403 | way. A few of these special-purpose positioning commands are described here; |
| 404 | the others are described in following sections. |
| 405 | |
| 406 | </P> |
| 407 | <DL COMPACT> |
| 408 | |
| 409 | <DT><KBD>.</KBD> |
| 410 | <DD> |
| 411 | Redisplay the current entry. |
| 412 | |
| 413 | <DT><KBD>n</KBD> |
| 414 | <DD> |
| 417 | Select the entry after the current one. |
| 418 | |
| 419 | <DT><KBD>p</KBD> |
| 420 | <DD> |
| 423 | Select the entry before the current one. |
| 424 | |
| 425 | <DT><KBD>&lt;</KBD> |
| 426 | <DD> |
| 427 | Select the first entry in the PO file. |
| 428 | |
| 429 | <DT><KBD>&gt;</KBD> |
| 430 | <DD> |
| 431 | Select the last entry in the PO file. |
| 432 | |
| 433 | <DT><KBD>m</KBD> |
| 434 | <DD> |
| 435 | Record the location of the current entry for later use. |
| 436 | |
| 437 | <DT><KBD>r</KBD> |
| 438 | <DD> |
| 439 | Return to a previously saved entry location. |
| 440 | |
| 441 | <DT><KBD>x</KBD> |
| 442 | <DD> |
| 443 | Exchange the current entry location with the previously saved one. |
| 444 | |
| 445 | </DL> |
| 446 | |
| 447 | <P> |
| 448 | Any GNU Emacs command able to reposition the cursor may be used |
| 449 | to select the current entry in PO mode, including commands which |
| 450 | move by characters, lines, paragraphs, screens or pages, and search |
| 451 | commands. However, there is a kind of standard way to display the |
| 452 | current entry in PO mode, which usual GNU Emacs commands moving |
| 453 | the cursor do not especially try to enforce. The command <KBD>.</KBD> |
| 454 | (<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the |
| 455 | current entry properly, after the current entry has been changed by |
| 456 | means external to PO mode, or the Emacs screen otherwise altered. |
| 457 | |
| 458 | </P> |
| 459 | <P> |
| 460 | It is yet to be decided if PO mode helps the translator, or otherwise |
| 461 | irritates her, by forcing a rigid window disposition while she |
| 462 | is doing her work. We originally had quite precise ideas about |
| 463 | how windows should behave, but on the other hand, anyone used to |
| 464 | GNU Emacs is often happy to keep full control. Maybe a fixed window |
| 465 | disposition might be offered as a PO mode option that the translator |
| 466 | might activate or deactivate at will, so it could be offered on an |
| 467 | experimental basis. If nobody feels a real need for using it, or |
| 468 | a compulsion for writing it, we should drop this whole idea. |
| 469 | The incentive for doing it should come from translators rather than |
| 470 | programmers, as opinions from an experienced translator are surely |
| 471 | worth more to me than opinions from programmers <EM>thinking</EM> about |
| 472 | how <EM>others</EM> should do translation. |
| 473 | |
| 474 | </P> |
| 475 | <P> |
| 476 | The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD> |
| 477 | (<CODE>po-previous-entry</CODE>) move the cursor to the entry following, |
| 478 | or preceding, the current one. If <KBD>n</KBD> is given while the |
| 479 | cursor is on the last entry of the PO file, or if <KBD>p</KBD> |
| 480 | is given while the cursor is on the first entry, no move is done. |
| 481 | |
| 482 | </P> |
| 483 | <P> |
| 484 | The commands <KBD>&lt;</KBD> (<CODE>po-first-entry</CODE>) and <KBD>&gt;</KBD> |
| 485 | (<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last |
| 486 | entry, of the PO file. When the cursor is located past the last |
| 487 | entry in a PO file, most PO mode commands will return an error saying |
| 488 | <SAMP>`After last entry'</SAMP>. However, the commands <KBD>&lt;</KBD> and <KBD>&gt;</KBD> |
| 489 | have the special property of being able to work even when the cursor |
| 490 | is not within any PO file entry, and one may use them for nicely |
| 491 | correcting this situation. But even these commands will fail on a |
| 492 | truly empty PO file. There are development plans for PO mode |
| 493 | to interactively fill an empty PO file from sources. See section <A HREF="gettext_3.html#SEC16">Marking Translatable Strings</A>. |
| 494 | |
| 495 | </P> |
| 496 | <P> |
| 497 | The translator may decide, before working at the translation of |
| 498 | a particular entry, that she needs to browse the remainder of the |
| 499 | PO file, maybe for finding the terminology or phraseology used |
| 500 | in related entries. She can of course use the standard Emacs idioms |
| 501 | for saving the current cursor location in some register, and use that |
| 502 | register for getting back, or else, use the location ring. |
| 503 | |
| 504 | </P> |
| 505 | <P> |
| 506 | PO mode offers another approach, by which cursor locations may be saved |
| 507 | onto a special stack. The command <KBD>m</KBD> (<CODE>po-push-location</CODE>) |
| 508 | merely adds the location of current entry to the stack, pushing |
| 509 | the already saved locations under the new one. The command |
| 510 | <KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and |
| 511 | repositions the cursor to the entry associated with that top element. |
| 512 | This position is then lost, for the next <KBD>r</KBD> will move the cursor |
| 513 | to the previously saved location, and so on until no locations remain |
| 514 | on the stack. |
| 515 | |
| 516 | </P> |
| 517 | <P> |
| 518 | If the translator wants the position to be kept on the location stack, |
| 519 | maybe for taking a look at the entry associated with the top |
| 520 | element, then go elsewhere with the intent of getting back later, she |
| 521 | ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>. |
| 522 | |
| 523 | </P> |
| 524 | <P> |
| 525 | The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously |
| 526 | repositions the cursor to the entry associated with the top element of |
| 527 | the stack of saved locations, and replaces that top element with the |
| 528 | location of the current entry before the move. Consequently, repeating |
| 529 | the <KBD>x</KBD> command alternates between two entries. |
| 530 | To achieve this, the translator will position the cursor on the |
| 531 | first entry, use <KBD>m</KBD>, then position to the second entry, and |
| 532 | merely use <KBD>x</KBD> for making the switch. |
| 533 | |
| 534 | </P> |
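<P>
The behavior of the three location commands can be modeled with a plain
stack. The Python class below is only a sketch of the semantics described
above, not the PO mode implementation; the class and method names are
hypothetical, entries are represented by arbitrary values, and raising
<CODE>IndexError</CODE> on an empty stack is an assumption of this sketch.
</P>

```python
class LocationStack:
    """Model of the saved-location stack behind the m, r and x commands."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """m: save the location of the current entry on top of the stack."""
        self._saved.append(current)

    def pop(self):
        """r: consume the top location and return it as the new position."""
        return self._saved.pop()  # IndexError if nothing was saved

    def exchange(self, current):
        """x: jump to the top location and store the old position in its place."""
        top = self._saved[-1]  # IndexError if nothing was saved
        self._saved[-1] = current
        return top


if __name__ == "__main__":
    stack = LocationStack()
    stack.push("entry 12")                 # m while on entry 12
    cursor = "entry 40"                    # translator moves elsewhere
    cursor = stack.exchange(cursor)        # x: back on entry 12
    cursor = stack.exchange(cursor)        # x again: back on entry 40
    print(cursor)
```

<P>
Because <KBD>x</KBD> leaves the stack depth unchanged, repeating it keeps
swapping the same two locations, whereas <KBD>r</KBD> shrinks the stack each
time.
</P>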
| 535 | |
| 536 | |
| 537 | <H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2> |
| 538 | |
| 539 | <P> |
| 540 | There are many different ways for encoding a particular string into a |
| 541 | PO file entry, because there are so many different ways to split and |
| 542 | quote multi-line strings, and even, to represent special characters |
| 543 | by backslashed escape sequences. Some features of PO mode rely on |
| 544 | the ability for PO mode to scan an already existing PO file for a |
| 545 | particular string encoded into the <CODE>msgid</CODE> field of some entry. |
| 546 | Even if PO mode has internally all the built-in machinery for |
| 547 | implementing this recognition easily, doing it fast is technically |
| 548 | difficult. To facilitate a solution to this efficiency problem, |
| 549 | we decided on a canonical representation for strings. |
| 550 | |
| 551 | </P> |
| 552 | <P> |
| 553 | A conventional representation of strings in a PO file is currently |
| 554 | under discussion, and PO mode experiments with a canonical representation. |
| 555 | Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform |
| 556 | way of representing equivalent strings would be useful, as the internal |
| 557 | normalization needed by PO mode could be automatically satisfied |
| 558 | when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>. An explicit |
| 559 | PO mode normalization should then be only necessary for PO files |
| 560 | imported from elsewhere, or for when the convention itself evolves. |
| 561 | |
| 562 | </P> |
| 563 | <P> |
| 564 | So, for achieving normalization of at least the strings of a given |
| 565 | PO file needing a canonical representation, the following PO mode |
| 566 | command is available: |
| 567 | |
| 568 | </P> |
| 569 | <DL COMPACT> |
| 570 | |
| 571 | <DT><KBD>M-x po-normalize</KBD> |
| 572 | <DD> |
| 573 | Tidy the whole PO file by making entries more uniform. |
| 574 | |
| 575 | </DL> |
| 576 | |
| 577 | <P> |
| 578 | The special command <KBD>M-x po-normalize</KBD>, which has no associated |
| 579 | keys, revises all entries, ensuring that strings of both original |
| 580 | and translated entries use uniform internal quoting in the PO file. |
| 581 | It also removes any stray lines after the last entry. This command may be |
| 582 | useful for PO files freshly imported from elsewhere, or if we ever |
| 583 | improve on the canonical quoting format we use. This canonical format |
| 584 | is not only meant for getting cleaner PO files, but also for greatly |
| 585 | speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands. |
| 586 | |
| 587 | </P> |
| 588 | <P> |
| 589 | <KBD>M-x po-normalize</KBD> presently makes three passes over the entries. |
| 590 | The first implements heuristics for converting PO files for GNU |
| 591 | <CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE> |
| 592 | fields were using K&amp;R style C string syntax for multi-line strings. |
| 593 | These heuristics may fail for comments not related to obsolete |
| 594 | entries and ending with a backslash; they also depend on subsequent |
| 595 | passes for finalizing the proper commenting of continued lines for |
| 596 | obsolete entries. This first pass might disappear once all older PO |
| 597 | files have been adjusted. The second and third passes normalize |
| 598 | all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively. They also |
| 599 | clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE> |
| 600 | for continued lines. |
| 601 | |
| 602 | </P> |
| 603 | <P> |
| 604 | Having such an explicit normalizing command allows for importing PO |
| 605 | files from other sources, but also eases the evolution of the current |
| 606 | convention, evolution driven mostly by aesthetic concerns, as of now. |
| 607 | It is easy to make suggested adjustments at a later time, as the |
| 608 | normalizing command and, eventually, other GNU <CODE>gettext</CODE> tools |
| 609 | should greatly automate conformance. A description of the canonical |
| 610 | string format is given below, for the particular benefit of those not |
| 611 | having GNU Emacs handy, and who would nevertheless want to handcraft |
| 612 | their PO files in nice ways. |
| 613 | |
| 614 | </P> |
| 615 | <P> |
| 616 | Right now, in PO mode, strings are single line or multi-line. A string |
| 617 | goes multi-line if and only if it has <EM>embedded</EM> newlines, that |
| 618 | is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>. So, we would have: |
| 619 | |
| 620 | </P> |
| 621 | |
| 622 | <PRE> |
| 623 | msgstr "\n\nHello, world!\n\n\n" |
| 624 | </PRE> |
| 625 | |
| 626 | <P> |
| 627 | but, replacing the space by a newline, this becomes: |
| 628 | |
| 629 | </P> |
| 630 | |
| 631 | <PRE> |
| 632 | msgstr "" |
| 633 | "\n" |
| 634 | "\n" |
| 635 | "Hello,\n" |
| 636 | "world!\n" |
| 637 | "\n" |
| 638 | "\n" |
| 639 | </PRE> |
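<P>
The rule illustrated by these two examples can be sketched in Python. The
regular expression is the one quoted above; everything else (the function
names <CODE>is_multiline</CODE>, <CODE>quote</CODE> and <CODE>layout</CODE>,
and the limited set of escapes) is a hypothetical illustration and not the
code of <KBD>M-x po-normalize</KBD>.
</P>

```python
import re


def is_multiline(text):
    """A string goes multi-line only if it has embedded newlines."""
    return re.search(r"[^\n]\n+[^\n]", text) is not None


def quote(piece):
    """Quote one piece for a PO file (only backslash, quote and newline are escaped)."""
    escaped = piece.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def layout(keyword, text):
    """Return the PO file lines for one field, in the layout shown above."""
    if not is_multiline(text):
        return [keyword + " " + quote(text)]
    # Cut after every newline; a trailing piece without newline is kept as is.
    pieces = re.findall(r"[^\n]*\n|[^\n]+", text)
    return [keyword + ' ""'] + [quote(piece) for piece in pieces]


if __name__ == "__main__":
    print("\n".join(layout("msgstr", "\n\nHello, world!\n\n\n")))
    print("\n".join(layout("msgstr", "\n\nHello,\nworld!\n\n\n")))
```

<P>
Leading and trailing newlines alone never trigger the multi-line layout,
since the pattern requires a non-newline character on both sides of the run.
</P>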
| 640 | |
| 641 | <P> |
| 642 | We are deliberately using an exaggerated example, here, to make the |
| 643 | point clearer. Usually, multi-line strings do not look that bad. |
| 644 | It is probable that we will implement the following suggestion. |
| 645 | We might lump together all initial newlines into the first, otherwise empty string, |
| 646 | and also all newlines introducing empty lines (that is, for a run of <VAR>n</VAR> |
| 647 | &gt; 1 newlines, the last <VAR>n</VAR>-1 newlines would go together on a separate |
| 648 | string), so making the previous example appear: |
| 649 | |
| 650 | </P> |
| 651 | |
| 652 | <PRE> |
| 653 | msgstr "\n\n" |
| 654 | "Hello,\n" |
| 655 | "world!\n" |
| 656 | "\n\n" |
| 657 | </PRE> |
| 658 | |
| 659 | <P> |
| 660 | There are a few yet undecided little points about string normalization, |
| 661 | to be documented in this manual, once these questions settle. |
| 662 | |
| 663 | </P> |
| 664 | <P><HR><P> |
| 666 | </BODY> |
| 667 | </HTML> |