Commit | Line | Data |
---|---|---|
f6bcfd97 BP |
1 | <HTML> |
2 | <HEAD> | |
3 | <!-- This HTML file has been created by texi2html 1.54 | |
4 | from gettext.texi on 25 January 1999 --> | |
5 | ||
6 | <TITLE>GNU gettext utilities - PO Files and PO Mode Basics</TITLE> | |
7 | <link href="gettext_3.html" rel=Next> | |
8 | <link href="gettext_1.html" rel=Previous> | |
9 | <link href="gettext_toc.html" rel=ToC> | |
10 | ||
11 | </HEAD> | |
12 | <BODY> | |
13 | <p>Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_12.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>. | |
14 | <P><HR><P> | |
15 | ||
16 | ||
17 | <H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1> | |
18 | ||
19 | <P> | |
20 | The GNU <CODE>gettext</CODE> toolset helps programmers and translators | |
21 | produce, update and use translation files, mainly PO files, | |
22 | which are textual, editable files. This chapter stresses | |
23 | the format of PO files, and contains a PO mode starter. PO mode | |
24 | description is spread throughout this manual instead of being concentrated | |
25 | in one place. Here we present only the basics of PO mode. | |
26 | ||
27 | </P> | |
28 | ||
29 | ||
30 | ||
31 | <H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2> | |
32 | ||
33 | <P> | |
34 | Once you have received, unpacked, configured and compiled the GNU | |
35 | <CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in | |
36 | place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and | |
37 | <CODE>msgmerge</CODE>, as well as their available message catalogs. To | |
38 | top off a comfortable installation, you might also want to make the | |
39 | PO mode available to your GNU Emacs users. | |
40 | ||
41 | </P> | |
42 | <P> | |
43 | During the installation of the PO mode, you might want to modify your | |
44 | file <TT>`.emacs'</TT>, once and for all, so it contains a few lines looking | |
45 | like: | |
46 | ||
47 | </P> | |
48 | ||
49 | <PRE> | |
50 | (setq auto-mode-alist | |
51 | (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist)) | |
52 | (autoload 'po-mode "po-mode") | |
53 | </PRE> | |
54 | ||
55 | <P> | |
56 | Later, whenever you edit some <TT>`.po'</TT>, <TT>`.pot'</TT> or <TT>`.pox'</TT> | |
57 | file, or any file having the string <SAMP>`.po.'</SAMP> within its name, | |
58 | Emacs loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and | |
59 | automatically activates PO mode commands for the associated buffer. | |
60 | The string <EM>PO</EM> appears in the mode line for any buffer for | |
61 | which PO mode is active. Many PO files may be active at once in a | |
62 | single Emacs session. | |
63 | ||
64 | </P> | |
65 | <P> | |
66 | If you are using Emacs version 20 or better, and have already installed | |
67 | the appropriate international fonts on your system, you may also manage | |
68 | for these fonts to be automatically loaded and used for displaying | |
69 | the translations on your Emacs screen, whenever necessary. For this to | |
70 | happen, you might want to add the lines: | |
71 | ||
72 | </P> | |
73 | ||
74 | <PRE> | |
75 | (autoload 'po-find-file-coding-system "po-mode") | |
76 | (modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\." | |
77 | 'po-find-file-coding-system) | |
78 | </PRE> | |
79 | ||
80 | <P> | |
81 | to your <TT>`.emacs'</TT> file. | |
82 | ||
83 | </P> | |
84 | ||
85 | ||
86 | <H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2> | |
87 | ||
88 | <P> | |
89 | A PO file is made up of many entries, each entry holding the relation | |
90 | between an original untranslated string and its corresponding | |
91 | translation. All entries in a given PO file usually pertain | |
92 | to a single project, and all translations are expressed in a single | |
93 | target language. One PO file <STRONG>entry</STRONG> has the following schematic | |
94 | structure: | |
95 | ||
96 | </P> | |
97 | ||
98 | <PRE> | |
99 | <VAR>white-space</VAR> | |
100 | # <VAR>translator-comments</VAR> | |
101 | #. <VAR>automatic-comments</VAR> | |
102 | #: <VAR>reference</VAR>... | |
103 | #, <VAR>flag</VAR>... | |
104 | msgid <VAR>untranslated-string</VAR> | |
105 | msgstr <VAR>translated-string</VAR> | |
106 | </PRE> | |
107 | ||
108 | <P> | |
109 | The general structure of a PO file should be well understood by | |
110 | the translator. When using PO mode, very little has to be known | |
111 | about the format details, as PO mode takes care of them for her. | |
112 | ||
113 | </P> | |
114 | <P> | |
115 | Entries begin with some optional white space. Usually, when generated | |
116 | through GNU <CODE>gettext</CODE> tools, there is exactly one blank line | |
117 | between entries. Then comments follow, on lines all starting with the | |
118 | character <KBD>#</KBD>. There are two kinds of comments: those which have | |
119 | some white space immediately following the <KBD>#</KBD>, which comments are | |
120 | created and maintained exclusively by the translator, and those which | |
121 | have some non-white character just after the <KBD>#</KBD>, which comments | |
122 | are created and maintained automatically by GNU <CODE>gettext</CODE> tools. | |
123 | All comments, of either kind, are optional. | |
124 | ||
125 | </P> | |
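<P>
As an illustration of the distinction between the kinds of comments, the
following Python sketch labels the lines of an entry according to the
schematic structure shown above. It is not part of GNU
<CODE>gettext</CODE>; the function name <CODE>classify_po_line</CODE> and
the label strings are hypothetical.
</P>

```python
def classify_po_line(line):
    """Return a rough label for one line of a PO file entry."""
    if not line.strip():
        return "white-space"
    if line.startswith("#"):
        marker = line[1:2]
        # White space (or nothing) after '#' marks a translator comment.
        if marker == "" or marker.isspace():
            return "translator-comment"
        # Any other character marks a comment maintained by the tools.
        return {
            ".": "automatic-comment",
            ":": "reference",
            ",": "flag",
        }.get(marker, "other-automatic-comment")
    if line.startswith("msgid"):
        return "msgid"
    if line.startswith("msgstr"):
        return "msgstr"
    if line.startswith('"'):
        return "string-continuation"
    return "unknown"


labels = [classify_po_line(text) for text in (
    "# checked with the author",
    "#: src/main.c:42",
    "#, fuzzy",
    'msgid "File not found"',
    'msgstr ""',
)]
```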
126 | <P> | |
127 | After white space and comments, entries show two strings, giving | |
128 | first the untranslated string as it appears in the original program | |
129 | sources, and then, the translation of this string. The original | |
130 | string is introduced by the keyword <CODE>msgid</CODE>, and the translation, | |
131 | by <CODE>msgstr</CODE>. The two strings, untranslated and translated, | |
132 | are quoted in various ways in the PO file, using <KBD>"</KBD> | |
133 | delimiters and <KBD>\</KBD> escapes, but the translator does not really | |
134 | have to pay attention to the precise quoting format, as PO mode fully | |
135 | intends to take care of quoting for her. | |
136 | ||
137 | </P> | |
138 | <P> | |
139 | The <CODE>msgid</CODE> strings, as well as automatic comments, are produced | |
140 | and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not | |
141 | provide means for the translator to alter these. The most she can | |
142 | do is delete them, and only by deleting the whole entry. | |
143 | On the other hand, the <CODE>msgstr</CODE> string, as well as translator | |
144 | comments, are really meant for the translator, and PO mode gives her | |
145 | the full control she needs. | |
146 | ||
147 | </P> | |
148 | <P> | |
149 | The comment lines beginning with <KBD>#,</KBD> are special because they are | |
150 | not completely ignored by the programs as comments generally are. The | |
151 | comma-separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE> | |
152 | program to give the user some better diagnostic messages. Currently | |
153 | there are two forms of flags defined: | |
154 | ||
155 | </P> | |
156 | <DL COMPACT> | |
157 | ||
158 | <DT><KBD>fuzzy</KBD> | |
159 | <DD> | |
160 | This flag can be generated by the <CODE>msgmerge</CODE> program or it can be | |
161 | inserted by the translator herself. It shows that the <CODE>msgstr</CODE> | |
162 | string might not be a correct translation (anymore). Only the translator | |
163 | can judge if the translation requires further modification, or is | |
164 | acceptable as is. Once satisfied with the translation, she then removes | |
165 | this <KBD>fuzzy</KBD> attribute. The <CODE>msgmerge</CODE> program inserts this | |
166 | when it combines the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries after fuzzy | |
167 | search only. See section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>. | |
168 | ||
169 | <DT><KBD>c-format</KBD> | |
170 | <DD> | |
171 | <DT><KBD>no-c-format</KBD> | |
172 | <DD> | |
173 | These flags should not be added by a human. Instead only the | |
174 | <CODE>xgettext</CODE> program adds them. In an automated PO file processing | |
175 | system as proposed here, the user changes would be thrown away again as | |
176 | soon as the <CODE>xgettext</CODE> program generates a new template file. | |
177 | ||
178 | When the <KBD>c-format</KBD> flag is given for a string, the <CODE>msgfmt</CODE> | |
179 | program does some more tests to check the validity of the translation. | |
180 | See section <A HREF="gettext_6.html#SEC33">Invoking the <CODE>msgfmt</CODE> Program</A>. | |
181 | ||
182 | </DL> | |
183 | ||
184 | <P> | |
185 | It happens that some lines, usually whitespace or comments, follow the | |
186 | very last entry of a PO file. Such lines are not part of any entry, | |
187 | and PO mode is unable to take action on those lines. By using the | |
188 | PO mode function <KBD>M-x po-normalize</KBD>, the translator may get | |
189 | rid of those spurious lines. See section <A HREF="gettext_2.html#SEC12">Normalizing Strings in Entries</A>. | |
190 | ||
191 | </P> | |
192 | <P> | |
193 | The remainder of this section may be safely skipped by those using | |
194 | PO mode, yet it may be interesting for everybody to have a better | |
195 | idea of the precise format of a PO file. On the other hand, those | |
196 | not having GNU Emacs handy should carefully continue reading on. | |
197 | ||
198 | </P> | |
199 | <P> | |
200 | Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects | |
201 | the C syntax for a character string, including the surrounding quotes | |
202 | and embedded backslashed escape sequences. When the time comes | |
203 | to write multi-line strings, one should not use escaped newlines. | |
204 | Instead, a closing quote should follow the last character on the | |
205 | line to be continued, and an opening quote should resume the string | |
206 | at the beginning of the following PO file line. For example: | |
207 | ||
208 | </P> | |
209 | ||
210 | <PRE> | |
211 | msgid "" | |
212 | "Here is an example of how one might continue a very long string\n" | |
213 | "for the common case the string represents multi-line output.\n" | |
214 | </PRE> | |
215 | ||
216 | <P> | |
217 | In this example, the empty string is used on the first line, to | |
218 | allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP> | |
219 | over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>. In this example, the | |
220 | <CODE>msgid</CODE> keyword is followed by three strings, which are meant | |
221 | to be concatenated. Concatenating the empty string does not change | |
222 | the resulting overall string, but it is a way for us to comply with | |
223 | the necessity of <CODE>msgid</CODE> to be followed by a string on the same | |
224 | line, while keeping the multi-line presentation left-justified, as | |
225 | we find this to be a cleaner disposition. The empty string could have | |
226 | been omitted, but only if the string starting with <SAMP>`Here'</SAMP> was | |
227 | promoted on the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> It was not really necessary | |
228 | either to switch between the two last quoted strings immediately after | |
229 | the newline <SAMP>`\n'</SAMP>; the switch could have occurred after <EM>any</EM> | |
230 | other character. We just did it this way because it is neater. | |
231 | ||
232 | </P> | |
233 | <P> | |
234 | One should carefully distinguish between end of lines marked as | |
235 | <SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented | |
236 | string, and end of lines in the PO file itself, outside string quotes, | |
237 | which have no effect on the represented string. | |
238 | ||
239 | </P> | |
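<P>
The concatenation rule described above can be sketched in a few lines of
Python. This is only an illustration, not part of GNU <CODE>gettext</CODE>:
the names <CODE>unquote</CODE> and <CODE>join_po_strings</CODE> are
hypothetical, and only the escapes <SAMP>`\n'</SAMP>, <SAMP>`\t'</SAMP>,
<SAMP>`\"'</SAMP> and <SAMP>`\\'</SAMP> are handled, whereas real PO files
accept the full C syntax.
</P>

```python
# Hypothetical helpers: only a small subset of the C escapes is decoded.
ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unquote(quoted):
    """Decode one double-quoted PO string such as '"Hello\\n"'."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError(f"not a quoted string: {quoted!r}")
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Unknown escapes are kept verbatim rather than guessed at.
            out.append(ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def join_po_strings(lines):
    """Concatenate the quoted strings of one msgid or msgstr field."""
    parts = []
    for line in lines:
        text = line.strip()
        for keyword in ("msgid", "msgstr"):
            if text.startswith(keyword):
                text = text[len(keyword):].lstrip()
                break
        parts.append(unquote(text))
    return "".join(parts)


entry = [
    'msgid ""',
    '"Here is an example of how one might continue a very long string\\n"',
    '"for the common case the string represents multi-line output.\\n"',
]
message = join_po_strings(entry)
```

<P>
Line breaks between the quoted pieces do not appear in
<CODE>message</CODE>; only the two <SAMP>`\n'</SAMP> escapes inside the
quotes become newlines, and the leading empty string contributes nothing.
</P>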
240 | <P> | |
241 | Outside strings, white lines and comments may be used freely. | |
242 | Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend | |
243 | until the end of the PO file line. Comments written by translators | |
244 | should have the initial <SAMP>`#'</SAMP> immediately followed by some white | |
245 | space. If the <SAMP>`#'</SAMP> is not immediately followed by white space, | |
246 | this comment is most likely generated and managed by specialized GNU | |
247 | tools, and might disappear or be replaced unexpectedly when the PO | |
248 | file is given to <CODE>msgmerge</CODE>. | |
249 | ||
250 | </P> | |
251 | ||
252 | ||
253 | <H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main PO mode Commands</A></H2> | |
254 | ||
255 | <P> | |
256 | After setting up Emacs with something similar to the lines in | |
257 | section <A HREF="gettext_2.html#SEC8">Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a | |
258 | PO file in that window. This makes the window read-only and establishes a | |
259 | po-mode-map. PO mode is a genuine Emacs mode, not derived | |
260 | from text mode in any way. Functions found on <CODE>po-mode-hook</CODE>, | |
261 | if any, will be executed. | |
262 | ||
263 | </P> | |
264 | <P> | |
265 | When PO mode is active in a window, the letters <SAMP>`PO'</SAMP> appear | |
266 | in the mode line for that window. The mode line also displays how | |
267 | many entries of each kind are held in the PO file. For example, | |
268 | the string <SAMP>`132t+3f+10u+2o'</SAMP> would tell the translator that the | |
269 | PO file contains 132 translated entries (see section <A HREF="gettext_5.html#SEC25">Translated Entries</A>), | |
270 | 3 fuzzy entries (see section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>), 10 untranslated entries | |
271 | (see section <A HREF="gettext_5.html#SEC27">Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_5.html#SEC28">Obsolete Entries</A>). Items with a zero count are not shown. So, in this example, if | |
272 | the fuzzy entries were unfuzzied, the untranslated entries were translated | |
273 | and the obsolete entries were deleted, the mode line would merely display | |
274 | <SAMP>`145t'</SAMP> for the counters. | |
275 | ||
276 | </P> | |
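<P>
The arithmetic behind the counters can be checked with a short Python
sketch. It is not how PO mode computes its mode line; the function names
<CODE>parse_counters</CODE> and <CODE>format_counters</CODE> are
hypothetical, and the only assumption taken from the text is the
<SAMP>`132t+3f+10u+2o'</SAMP> notation with zero items omitted.
</P>

```python
import re

# Letter used in the mode line, mapped to the kind of entry it counts.
KINDS = {"t": "translated", "f": "fuzzy", "u": "untranslated", "o": "obsolete"}


def parse_counters(text):
    """Turn a string like '132t+3f+10u+2o' into a dictionary of counts."""
    counts = {name: 0 for name in KINDS.values()}
    for item in text.split("+"):
        match = re.fullmatch(r"(\d+)([tfuo])", item)
        if match is None:
            raise ValueError(f"unrecognized counter: {item!r}")
        counts[KINDS[match.group(2)]] = int(match.group(1))
    return counts


def format_counters(counts):
    """Rebuild the mode line string, leaving out items whose count is zero."""
    items = [f"{counts[name]}{letter}" for letter, name in KINDS.items() if counts[name]]
    return "+".join(items)


counts = parse_counters("132t+3f+10u+2o")
# Unfuzzy the fuzzy entries, translate the untranslated, delete the obsolete.
counts["translated"] += counts["fuzzy"] + counts["untranslated"]
counts["fuzzy"] = counts["untranslated"] = counts["obsolete"] = 0
print(format_counters(counts))  # 145t
```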
277 | <P> | |
278 | The main PO commands are those which do not fit into the other categories of | |
279 | subsequent sections. These allow for quitting PO mode or for managing windows | |
280 | in special ways. | |
281 | ||
282 | </P> | |
283 | <DL COMPACT> | |
284 | ||
285 | <DT><KBD>U</KBD> | |
286 | <DD> | |
287 | Undo last modification to the PO file. | |
288 | ||
289 | <DT><KBD>Q</KBD> | |
290 | <DD> | |
291 | Quit processing and save the PO file. | |
292 | ||
293 | <DT><KBD>q</KBD> | |
294 | <DD> | |
295 | Quit processing, possibly after confirmation. | |
296 | ||
297 | <DT><KBD>O</KBD> | |
298 | <DD> | |
299 | Temporarily leave the PO file window. | |
300 | ||
301 | <DT><KBD>?</KBD> | |
302 | <DD> | |
303 | <DT><KBD>h</KBD> | |
304 | <DD> | |
305 | Show help about PO mode. | |
306 | ||
307 | <DT><KBD>=</KBD> | |
308 | <DD> | |
309 | Give some PO file statistics. | |
310 | ||
311 | <DT><KBD>V</KBD> | |
312 | <DD> | |
313 | Batch validate the format of the whole PO file. | |
314 | ||
315 | </DL> | |
316 | ||
317 | <P> | |
318 | The command <KBD>U</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs | |
319 | <EM>undo</EM> facility. See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>. Each time <KBD>U</KBD> is typed, modifications which the translator | |
320 | did to the PO file are undone a little more. For the purpose of | |
321 | undoing, each PO mode command is atomic. This is especially true for | |
322 | the <KBD>RET</KBD> command: the whole edit made by a single | |
323 | use of this command is undone at once, even if the edit itself | |
324 | implied several actions. However, while in the editing window, one | |
325 | can undo the editing work in smaller steps. | |
326 | ||
327 | </P> | |
328 | <P> | |
329 | The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD> | |
330 | (<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the | |
331 | PO file. The former is a bit less verbose than the latter. If the file | |
332 | has been modified, it is saved to disk first. In both cases, and prior to | |
333 | all this, the commands check if some untranslated message remains in the | |
334 | PO file and, if yes, the translator is asked if she really wants to leave | |
335 | off working with this PO file. This is the preferred way of getting rid | |
336 | of an Emacs PO file buffer. Merely killing it through the usual command | |
337 | <KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed. | |
338 | ||
339 | </P> | |
340 | <P> | |
341 | The command <KBD>O</KBD> (<CODE>po-other-window</CODE>) is another, softer way | |
342 | to leave PO mode temporarily. It just moves the cursor to some other | |
343 | Emacs window, and pops one if necessary. For example, if the translator | |
344 | just got PO mode to show some source context in some other window, she might | |
345 | discover some apparent bug in the program source that needs correction. | |
346 | This command allows the translator to change sex, become a programmer, | |
347 | and have the cursor right into the window containing the program she | |
348 | (or rather <EM>he</EM>) wants to modify. By later getting the cursor back | |
349 | in the PO file window, or by asking Emacs to edit this file once again, | |
350 | PO mode is then recovered. | |
351 | ||
352 | </P> | |
353 | <P> | |
354 | The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO | |
355 | mode commands. The translator should then type any character to resume | |
356 | normal PO mode operations. The command <KBD>?</KBD> has the same effect | |
357 | as <KBD>h</KBD>. | |
358 | ||
359 | </P> | |
360 | <P> | |
361 | The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of | |
362 | entries in the PO file, the ordinal of the current entry (counted from | |
363 | 1), the number of untranslated entries, the number of obsolete entries, | |
364 | and displays all these numbers. | |
365 | ||
366 | </P> | |
367 | <P> | |
368 | The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in verbose | |
369 | mode over the current PO file. This command first offers to save the | |
370 | current PO file on disk. The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>, | |
371 | has the purpose of creating a MO file out of a PO file, and PO mode uses | |
372 | the features of this program for checking the overall format of a PO file, | |
373 | as well as all individual entries. | |
374 | ||
375 | </P> | |
376 | <P> | |
377 | The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the | |
378 | translator regains control immediately while her PO file is being studied. | |
379 | Error output is collected in the GNU Emacs <SAMP>`*compilation*'</SAMP> buffer, | |
380 | displayed in another window. The regular GNU Emacs command <KBD>C-x`</KBD> | |
381 | (<CODE>next-error</CODE>), as well as other usual compile commands, allow the | |
382 | translator to reposition quickly to the offending parts of the PO file. | |
383 | Once the cursor is on the line in error, the translator may decide on | |
384 | any PO mode action which would help correcting the error. | |
385 | ||
386 | </P> | |
387 | ||
388 | ||
389 | <H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2> | |
390 | ||
391 | <P> | |
392 | The cursor in a PO file window is almost always part of | |
393 | an entry. The only exceptions are the special case when the cursor | |
394 | is after the last entry in the file, or when the PO file is | |
395 | empty. The entry where the cursor is found to be is said to be the | |
396 | current entry. Many PO mode commands operate on the current entry, | |
397 | so moving the cursor does more than allow the translator to browse | |
398 | the PO file; it also selects the entry on which commands operate. | |
399 | ||
400 | </P> | |
401 | <P> | |
402 | Some PO mode commands alter the position of the cursor in a specialized | |
403 | way. A few of those special-purpose positioning commands are described here; | |
404 | the others are described in following sections. | |
405 | ||
406 | </P> | |
407 | <DL COMPACT> | |
408 | ||
409 | <DT><KBD>.</KBD> | |
410 | <DD> | |
411 | Redisplay the current entry. | |
412 | ||
413 | <DT><KBD>n</KBD> | |
414 | <DD> | |
415 | <DT><KBD>n</KBD> | |
416 | <DD> | |
417 | Select the entry after the current one. | |
418 | ||
419 | <DT><KBD>p</KBD> | |
420 | <DD> | |
421 | <DT><KBD>p</KBD> | |
422 | <DD> | |
423 | Select the entry before the current one. | |
424 | ||
425 | <DT><KBD><</KBD> | |
426 | <DD> | |
427 | Select the first entry in the PO file. | |
428 | ||
429 | <DT><KBD>></KBD> | |
430 | <DD> | |
431 | Select the last entry in the PO file. | |
432 | ||
433 | <DT><KBD>m</KBD> | |
434 | <DD> | |
435 | Record the location of the current entry for later use. | |
436 | ||
437 | <DT><KBD>r</KBD> | |
438 | <DD> | |
439 | Return to a previously saved entry location. | |
440 | ||
441 | <DT><KBD>x</KBD> | |
442 | <DD> | |
443 | Exchange the current entry location with the previously saved one. | |
444 | ||
445 | </DL> | |
446 | ||
447 | <P> | |
448 | Any GNU Emacs command able to reposition the cursor may be used | |
449 | to select the current entry in PO mode, including commands which | |
450 | move by characters, lines, paragraphs, screens or pages, and search | |
451 | commands. However, there is a kind of standard way to display the | |
452 | current entry in PO mode, which usual GNU Emacs commands moving | |
453 | the cursor do not especially try to enforce. The command <KBD>.</KBD> | |
454 | (<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the | |
455 | current entry properly, after the current entry has been changed by | |
456 | means external to PO mode, or the Emacs screen otherwise altered. | |
457 | ||
458 | </P> | |
459 | <P> | |
460 | It is yet to be decided if PO mode helps the translator, or otherwise | |
461 | irritates her, by forcing a rigid window disposition while she | |
462 | is doing her work. We originally had quite precise ideas about | |
463 | how windows should behave, but on the other hand, anyone used to | |
464 | GNU Emacs is often happy to keep full control. Maybe a fixed window | |
465 | disposition might be offered as a PO mode option that the translator | |
466 | might activate or deactivate at will, so it could be offered on an | |
467 | experimental basis. If nobody feels a real need for using it, or | |
468 | a compulsion for writing it, we should drop this whole idea. | |
469 | The incentive for doing it should come from translators rather than | |
470 | programmers, as opinions from an experienced translator are surely | |
471 | worth more to me than opinions from programmers <EM>thinking</EM> about | |
472 | how <EM>others</EM> should do translation. | |
473 | ||
474 | </P> | |
475 | <P> | |
476 | The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD> | |
477 | (<CODE>po-previous-entry</CODE>) move the cursor to the entry following, | |
478 | or preceding, the current one. If <KBD>n</KBD> is given while the | |
479 | cursor is on the last entry of the PO file, or if <KBD>p</KBD> | |
480 | is given while the cursor is on the first entry, no move is done. | |
481 | ||
482 | </P> | |
483 | <P> | |
484 | The commands <KBD><</KBD> (<CODE>po-first-entry</CODE>) and <KBD>></KBD> | |
485 | (<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last | |
486 | entry, of the PO file. When the cursor is located past the last | |
487 | entry in a PO file, most PO mode commands will return an error saying | |
488 | <SAMP>`After last entry'</SAMP>. Moreover, the commands <KBD><</KBD> and <KBD>></KBD> | |
489 | have the special property of being able to work even when the cursor | |
490 | is not within any PO file entry, and one may use them for nicely | |
491 | correcting this situation. But even these commands will fail on a | |
492 | truly empty PO file. There are development plans for the PO mode for it | |
493 | to interactively fill an empty PO file from sources. See section <A HREF="gettext_3.html#SEC16">Marking Translatable Strings</A>. | |
494 | ||
495 | </P> | |
496 | <P> | |
497 | The translator may decide, before working at the translation of | |
498 | a particular entry, that she needs to browse the remainder of the | |
499 | PO file, maybe for finding the terminology or phraseology used | |
500 | in related entries. She can of course use the standard Emacs idioms | |
501 | for saving the current cursor location in some register, and use that | |
502 | register for getting back, or else, use the location ring. | |
503 | ||
504 | </P> | |
505 | <P> | |
506 | PO mode offers another approach, by which cursor locations may be saved | |
507 | onto a special stack. The command <KBD>m</KBD> (<CODE>po-push-location</CODE>) | |
508 | merely adds the location of current entry to the stack, pushing | |
509 | the already saved locations under the new one. The command | |
510 | <KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and | |
511 | repositions the cursor to the entry associated with that top element. | |
512 | This position is then lost, for the next <KBD>r</KBD> will move the cursor | |
513 | to the previously saved location, and so on until no locations remain | |
514 | on the stack. | |
515 | ||
516 | </P> | |
517 | <P> | |
518 | If the translator wants the position to be kept on the location stack, | |
519 | maybe for taking a look at the entry associated with the top | |
520 | element, then go elsewhere with the intent of getting back later, she | |
521 | ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>. | |
522 | ||
523 | </P> | |
524 | <P> | |
525 | The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously | |
526 | repositions the cursor to the entry associated with the top element of | |
527 | the stack of saved locations, and replaces that top element with the | |
528 | location of the current entry before the move. Consequently, repeating | |
529 | the <KBD>x</KBD> command alternates between two entries. | |
530 | For achieving this, the translator will position the cursor on the | |
531 | first entry, use <KBD>m</KBD>, then position to the second entry, and | |
532 | merely use <KBD>x</KBD> for making the switch. | |
533 | ||
534 | </P> | |
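<P>
The behaviour of the three location commands can be modelled with an
ordinary stack. The Python sketch below is an illustration only: the class
<CODE>LocationStack</CODE> and its method names are hypothetical, entries
are represented by plain integers, and the actual PO mode code is written
in Emacs Lisp.
</P>

```python
class LocationStack:
    """Toy model of the saved-location stack used by m, r and x."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """Model of m: remember the current entry on top of the stack."""
        self._saved.append(current)

    def pop(self):
        """Model of r: return the top location and forget it."""
        if not self._saved:
            raise IndexError("no saved location")
        return self._saved.pop()

    def exchange(self, current):
        """Model of x: jump to the top location, saving the current one."""
        if not self._saved:
            raise IndexError("no saved location")
        target = self._saved[-1]
        self._saved[-1] = current
        return target


stack = LocationStack()
stack.push(3)                     # m while on entry 3
cursor = 7                        # the translator moves to entry 7
cursor = stack.exchange(cursor)   # x: now on entry 3, entry 7 is saved
cursor = stack.exchange(cursor)   # x again: back on entry 7, entry 3 is saved
```

<P>
Calling <CODE>pop</CODE> and then <CODE>push</CODE> with the returned value
corresponds to using <KBD>m</KBD> immediately after <KBD>r</KBD>, which
keeps the location on the stack.
</P>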
535 | ||
536 | ||
537 | <H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2> | |
538 | ||
539 | <P> | |
540 | There are many different ways for encoding a particular string into a | |
541 | PO file entry, because there are so many different ways to split and | |
542 | quote multi-line strings, and even, to represent special characters | |
543 | by backslashed escape sequences. Some features of PO mode rely on | |
544 | the ability for PO mode to scan an already existing PO file for a | |
545 | particular string encoded into the <CODE>msgid</CODE> field of some entry. | |
546 | Even if PO mode has internally all the built-in machinery for | |
547 | implementing this recognition easily, doing it fast is technically | |
548 | difficult. To facilitate a solution to this efficiency problem, | |
549 | we decided on a canonical representation for strings. | |
550 | ||
551 | </P> | |
552 | <P> | |
553 | A conventional representation of strings in a PO file is currently | |
554 | under discussion, and PO mode experiments with a canonical representation. | |
555 | Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform | |
556 | way of representing equivalent strings would be useful, as the internal | |
557 | normalization needed by PO mode could be automatically satisfied | |
558 | when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>. An explicit | |
559 | PO mode normalization should then be only necessary for PO files | |
560 | imported from elsewhere, or for when the convention itself evolves. | |
561 | ||
562 | </P> | |
563 | <P> | |
564 | So, for achieving normalization of at least the strings of a given | |
565 | PO file needing a canonical representation, the following PO mode | |
566 | command is available: | |
567 | ||
568 | </P> | |
569 | <DL COMPACT> | |
570 | ||
571 | <DT><KBD>M-x po-normalize</KBD> | |
572 | <DD> | |
573 | Tidy the whole PO file by making entries more uniform. | |
574 | ||
575 | </DL> | |
576 | ||
577 | <P> | |
578 | The special command <KBD>M-x po-normalize</KBD>, which has no associated | |
579 | keys, revises all entries, ensuring that strings of both original | |
580 | and translated entries use uniform internal quoting in the PO file. | |
581 | It also removes any crumb after the last entry. This command may be | |
582 | useful for PO files freshly imported from elsewhere, or if we ever | |
583 | improve on the canonical quoting format we use. This canonical format | |
584 | is not only meant for getting cleaner PO files, but also for greatly | |
585 | speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands. | |
586 | ||
587 | </P> | |
588 | <P> | |
589 | <KBD>M-x po-normalize</KBD> presently makes three passes over the entries. | |
590 | The first implements heuristics for converting PO files for GNU | |
591 | <CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE> | |
592 | fields were using K&R style C string syntax for multi-line strings. | |
593 | These heuristics may fail for comments not related to obsolete | |
594 | entries and ending with a backslash; they also depend on subsequent | |
595 | passes for finalizing the proper commenting of continued lines for | |
596 | obsolete entries. This first pass might disappear once all old PO | |
597 | files have been adjusted. The second and third passes normalize | |
598 | all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively. They also | |
599 | clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE> | |
600 | for continued lines. | |
601 | ||
602 | </P> | |
603 | <P> | |
604 | Having such an explicit normalizing command allows for importing PO | |
605 | files from other sources, but also eases the evolution of the current | |
606 | convention, evolution driven mostly by aesthetic concerns, as of now. | |
607 | It is easy to make suggested adjustments at a later time, as the | |
608 | normalizing command and eventually, other GNU <CODE>gettext</CODE> tools | |
609 | should greatly automate conformance. A description of the canonical | |
610 | string format is given below, for the particular benefit of those not | |
611 | having GNU Emacs handy, and who would nevertheless want to handcraft | |
612 | their PO files in nice ways. | |
613 | ||
614 | </P> | |
615 | <P> | |
616 | Right now, in PO mode, strings are single line or multi-line. A string | |
617 | goes multi-line if and only if it has <EM>embedded</EM> newlines, that | |
618 | is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>. So, we would have: | |
619 | ||
620 | </P> | |
621 | ||
622 | <PRE> | |
623 | msgstr "\n\nHello, world!\n\n\n" | |
624 | </PRE> | |
625 | ||
626 | <P> | |
627 | but, replacing the space by a newline, this becomes: | |
628 | ||
629 | </P> | |
630 | ||
631 | <PRE> | |
632 | msgstr "" | |
633 | "\n" | |
634 | "\n" | |
635 | "Hello,\n" | |
636 | "world!\n" | |
637 | "\n" | |
638 | "\n" | |
639 | </PRE> | |
640 | ||
641 | <P> | |
642 | We are deliberately using a caricatural example, here, to make the | |
643 | point clearer. Usually, multi-lines are not that bad looking. | |
644 | It is probable that we will implement the following suggestion. | |
645 | We might lump together all initial newlines into the empty string, | |
646 | and also all newlines introducing empty lines (that is, for <VAR>n</VAR> | |
647 | > 1, the last <VAR>n</VAR>-1 newlines would go together on a separate | |
648 | string), so making the previous example appear: | |
649 | ||
650 | </P> | |
651 | ||
652 | <PRE> | |
653 | msgstr "\n\n" | |
654 | "Hello,\n" | |
655 | "world!\n" | |
656 | "\n\n" | |
657 | </PRE> | |
658 | ||
659 | <P> | |
660 | There are a few yet undecided little points about string normalization, | |
661 | to be documented in this manual, once these questions settle. | |
662 | ||
663 | </P> | |
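<P>
The rule currently described in this section (a string goes multi-line if
and only if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>) can be sketched in
Python as follows. This is a minimal illustration, not the PO mode
implementation: <CODE>po_escape</CODE> and <CODE>format_field</CODE> are
hypothetical names, and only backslash, double quote and newline are
escaped.
</P>

```python
import re


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_field(text, keyword="msgstr"):
    """Return the PO file lines for one field, following the rule above."""
    # Single line unless some newline run sits between two other characters.
    if not re.search(r"[^\n]\n+[^\n]", text):
        return [f'{keyword} "{po_escape(text)}"']
    lines = [f'{keyword} ""']
    # Cut after every newline; a last piece without newline is kept as is.
    for piece in re.findall(r"[^\n]*\n|[^\n]+", text):
        lines.append(f'"{po_escape(piece)}"')
    return lines


single = format_field("\n\nHello, world!\n\n\n")
multi = format_field("\n\nHello,\nworld!\n\n\n")
```

<P>
With these inputs, <CODE>single</CODE> holds one line and
<CODE>multi</CODE> holds seven, matching the two examples given earlier in
this section. The suggested lumping of initial and empty-line newlines is
not implemented in this sketch.
</P>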
664 | <P><HR><P> | |
666 | </BODY> | |
667 | </HTML> |