X-Git-Url: https://git.saurik.com/wxWidgets.git/blobdiff_plain/fc2171bd4c660b8554dae2a1cbf34ff09f3032a6..d68a2a24d1d25542974045f0bff3f035c192e5bb:/docs/html/gettext/gettext_6.html diff --git a/docs/html/gettext/gettext_6.html b/docs/html/gettext/gettext_6.html deleted file mode 100644 index b106043570..0000000000 --- a/docs/html/gettext/gettext_6.html +++ /dev/null @@ -1,258 +0,0 @@ - - - - -GNU gettext utilities - Producing Binary MO Files - - - - - - -



- - -

Producing Binary MO Files

- - - -

Invoking the msgfmt Program

- - -
-Usage: msgfmt [option] filename.po ...
-
- -
- -
`-a number' -
-
`--alignment=number' -
-Align strings to number bytes (default: 1). - -
`-h' -
-
`--help' -
-Display this help and exit. - -
`--no-hash' -
-Binary file will not include the hash table. - -
`-o file' -
-
`--output-file=file' -
-Specify output file name as file. - -
`--strict' -
-Direct the program to work strictly following the Uniforum/Sun -implementation. Currently this only affects the naming of the output -file. If this option is not given, the name of the output file is the -same as the domain name. If the strict Uniforum mode is enabled, the -suffix `.mo' is added to the file name if it is not already -present. - -We find this behaviour of Sun's implementation rather silly and so by -default this mode is not selected. - -
`-v' -
-
`--verbose' -
-Detect and diagnose input file anomalies which might represent -translation errors. The msgid and msgstr strings are -studied and compared. It is considered abnormal that one string -starts or ends with a newline while the other does not. - -Also, if the string represents a format string used in a -printf-like function, both strings should have the same number of -`%' format specifiers, with matching types. If the flag -c-format or possible-c-format appears in the special -comment #, for this entry, a check is performed. For example, the -check will diagnose using `%.*s' against `%s', or `%d' -against `%s', or `%d' against `%x'. It can even handle -positional parameters. - -Normally the xgettext program automatically decides whether a -string is a format string or not. This algorithm is not perfect, -though. It might regard a string as a format string though it is not -used in a printf-like function, and so msgfmt might report -errors where there are none. Or the other way round: a string is not -regarded as a format string but it is used in a printf-like -function. - -To solve this problem the programmer can dictate the decision to the -xgettext program (see section Special Comments preceding Keywords). The translator should not -consider removing the flag from the #, line. This "fix" would be -reversed again as soon as msgmerge is called the next time. - -
`-V' -
-
`--version' -
-Output version information and exit. - -
- -

-If input file is `-', standard input is read. If output file -is `-', output is written to standard output. - -
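The two anomaly checks described under `--verbose' above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm msgfmt actually uses: the names format_signature, formats_match and newline_mismatch are hypothetical, the regular expression covers only common printf conversions, and positional parameters are not handled.

```python
import re

# Simplified printf conversion: flags, width, precision, length modifier, conversion.
# This pattern is an assumption made for the sketch, not msgfmt's own parser.
_SPEC = re.compile(
    r"%[-+ #0]*(\*|\d+)?(?:\.(\*|\d+))?(hh|h|ll|l|L|z|j|t)?([diouxXeEfgGcsp%])"
)

def format_signature(text):
    """List the argument-consuming conversions found in text."""
    signature = []
    for width, precision, length, conversion in _SPEC.findall(text):
        if conversion == "%":  # a literal percent sign consumes no argument
            continue
        signature.append((width == "*", precision == "*", length, conversion))
    return signature

def formats_match(msgid, msgstr):
    """True when both strings use the same conversions in the same order."""
    return format_signature(msgid) == format_signature(msgstr)

def newline_mismatch(msgid, msgstr):
    """True when only one of the strings starts or ends with a newline."""
    return (msgid.startswith("\n") != msgstr.startswith("\n")
            or msgid.endswith("\n") != msgstr.endswith("\n"))
```

With these helpers, formats_match("%d files", "%d fichiers") holds, while the pairs `%.*s' and `%s', `%d' and `%s', and `%d' and `%x' are all reported as different.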

- - -

The Format of GNU MO Files

- -

-The format of the generated MO files is best described by a picture, -which appears below. - -

-

-The first two words serve to identify the file. The magic -number will always signal GNU MO files. The number is stored in the -byte order of the generating machine, so the magic number really is -two numbers: 0x950412de and 0xde120495. The second -word describes the current revision of the file format. For now the -revision is 0. This might change in future versions, and ensures -that the readers of MO files can distinguish new formats from old -ones, so that both can be handled correctly. The version is kept -separate from the magic number, instead of using different magic -numbers for different formats, mainly because `/etc/magic' is -not updated often. It might be better to have magic separated from -internal format version identification. - -
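Because the magic number is written in the byte order of the generating machine, a reader can try both byte orders and keep the one that yields 0x950412de. The following Python fragment is a minimal sketch of that idea; the function name read_mo_prefix is hypothetical and the code is not taken from GNU gettext.

```python
import struct

MO_MAGIC = 0x950412DE  # magic number as given in the text above

def read_mo_prefix(data):
    """Return (byte_order, revision) for the first two words of MO data.

    byte_order is "<" (little-endian) or ">" (big-endian), in the
    notation of the struct module.
    """
    if len(data) < 8:
        raise ValueError("too short to be an MO file")
    for byte_order in ("<", ">"):
        magic, revision = struct.unpack_from(byte_order + "II", data, 0)
        if magic == MO_MAGIC:
            return byte_order, revision
    raise ValueError("magic number not found")
```

A file written on a machine of the opposite byte order shows up as 0xde120495 on the first attempt and is then recognised on the second.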

-

-A number of pointers to later tables in the file follow, allowing -for the extension of the prefix part of MO files without having to -recompile programs reading them. This might become useful for later -inserting a few flag bits, an indication of the charset used, new -tables, or other things. - -

-

-Then, at offset O and offset T in the picture, two tables -of string descriptors can be found. In both tables, each string -descriptor uses two 32-bit integers, one for the string length, -another for the offset of the string in the MO file, counting in bytes -from the start of the file. The first table contains descriptors -for the original strings, and is sorted so the original strings -are in increasing lexicographical order. The second table contains -descriptors for the translated strings, and is parallel to the first -table: to find the corresponding translation one has to access the -array slot in the second array with the same index. - -
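The two parallel descriptor tables can be walked with a handful of lines. The sketch below assumes the header layout shown in the picture (N at byte 8, O at byte 12, T at byte 16) and a byte order already determined from the magic number; read_catalog is a hypothetical name, not part of any gettext API.

```python
import struct

def read_catalog(data, byte_order="<"):
    """Return a list of (original, translation) byte-string pairs."""
    magic, revision, n, o, t = struct.unpack_from(byte_order + "5I", data, 0)
    if magic != 0x950412DE:
        raise ValueError("not an MO file in this byte order")

    def read_table(table_offset):
        strings = []
        for i in range(n):
            # Each descriptor is two 32-bit integers: length, then file offset.
            length, offset = struct.unpack_from(
                byte_order + "II", data, table_offset + 8 * i)
            strings.append(data[offset:offset + length])
        return strings

    # Slot i of the second table holds the translation of slot i of the first.
    return list(zip(read_table(o), read_table(t)))
```

The terminating NUL is not part of the slice, since the descriptor length does not count it.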

-

-Having the original strings sorted enables the use of a simple binary -search when the MO file does not contain a hashing table, or -when it is not practical to use the hashing table provided in -the MO file. This has another advantage: GNU gettext usually -translates the empty string of a PO file into -some system information attached to that particular MO file, and the -empty string necessarily becomes the first in both the original and -translated tables, making the system information very easy to find. - -
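The binary search mentioned above might look as follows in Python, assuming the original and translated strings have already been loaded into two parallel lists of byte strings. The function name lookup is hypothetical; the standard bisect module does the searching.

```python
from bisect import bisect_left

def lookup(originals, translations, msgid):
    """Find the translation of msgid in two parallel lists.

    originals must be sorted in increasing lexicographical order,
    as the first descriptor table of an MO file is.
    """
    index = bisect_left(originals, msgid)
    if index < len(originals) and originals[index] == msgid:
        return translations[index]
    return None  # no translation available
```

Looking up the empty string returns the entry in slot 0, which is where the system information ends up.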

-

-The size S of the hash table can be zero. In this case, the -hash table itself is not contained in the MO file. Some people might -prefer this because a precomputed hashing table takes disk space, and -does not win that much speed. The hash table contains indices -to the sorted array of strings in the MO file. Conflict resolution is -done by double hashing. The precise hashing algorithm used is fairly -dependent on GNU gettext code, and is not documented here. - -

-

-As for the strings themselves, they follow the hash table, and each -is terminated with a NUL, and this NUL is not counted in -the length which appears in the string descriptor. The msgfmt -program has an option selecting the alignment for MO file strings. -With this option, each string is separately aligned so it starts at -an offset which is a multiple of the alignment value. On some RISC -machines, a correct alignment will speed things up. - -
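The effect of the alignment option on string offsets can be illustrated with a short calculation. The sketch below is an assumption about how such a layout could be computed, not a transcription of msgfmt; align_up and layout_strings are hypothetical names.

```python
def align_up(offset, alignment=1):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def layout_strings(strings, start, alignment=1):
    """Return (length, offset) descriptors for strings stored from start.

    Each string is followed by a NUL that is not counted in its length.
    """
    descriptors = []
    position = start
    for s in strings:
        position = align_up(position, alignment)
        descriptors.append((len(s), position))
        position += len(s) + 1  # skip the string and its terminating NUL
    return descriptors
```

With the default alignment of 1 the strings are packed back to back; with an alignment of 4 every offset is a multiple of 4 and a few padding bytes appear between strings.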

-

-Nothing prevents an MO file from having embedded NULs in strings. -However, the program interface currently used already presumes -that strings are NUL terminated, so embedded NULs are -somewhat useless. But the MO file format is general enough that other -interfaces would be possible later, if for example, we ever want to -implement wide characters right in MO files, where NUL bytes may -accidentally appear. - -

-

-This particular issue has been strongly debated in the GNU -gettext development forum, and it is to be expected that the MO file -format will evolve or change over time. It is even possible that many -formats may later be supported concurrently. But surely, we have to -start somewhere, and the MO file format described here is a good start. -Nothing is cast in concrete, and the format may later evolve fairly -easily, so we should feel comfortable with the current approach. - -

- -
-        byte
-             +------------------------------------------+
-          0  | magic number = 0x950412de                |
-             |                                          |
-          4  | file format revision = 0                 |
-             |                                          |
-          8  | number of strings                        |  == N
-             |                                          |
-         12  | offset of table with original strings    |  == O
-             |                                          |
-         16  | offset of table with translation strings |  == T
-             |                                          |
-         20  | size of hashing table                    |  == S
-             |                                          |
-         24  | offset of hashing table                  |  == H
-             |                                          |
-             .                                          .
-             .    (possibly more entries later)         .
-             .                                          .
-             |                                          |
-          O  | length & offset 0th string  ----------------.
-      O + 8  | length & offset 1st string  ------------------.
-              ...                                    ...   | |
-O + ((N-1)*8)| length & offset (N-1)th string           |  | |
-             |                                          |  | |
-          T  | length & offset 0th translation  ---------------.
-      T + 8  | length & offset 1st translation  -----------------.
-              ...                                    ...   | | | |
-T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
-             |                                          |  | | | |
-          H  | start hash table                         |  | | | |
-              ...                                    ...   | | | |
-  H + S * 4  | end hash table                           |  | | | |
-             |                                          |  | | | |
-             | NUL terminated 0th string  <----------------' | | |
-             |                                          |    | | |
-             | NUL terminated 1st string  <------------------' | |
-             |                                          |      | |
-              ...                                    ...       | |
-             |                                          |      | |
-             | NUL terminated 0th translation  <---------------' |
-             |                                          |        |
-             | NUL terminated 1st translation  <-----------------'
-             |                                          |
-              ...                                    ...
-             |                                          |
-             +------------------------------------------+
-
- -


-
