Commit | Line | Data |
---|---|---|
5b2abdfb | 1 | .\" Copyright (c) 1993 |
e9ce8d39 A |
2 | .\" The Regents of the University of California. All rights reserved. |
3 | .\" | |
5b2abdfb A |
4 | .\" This code is derived from software contributed to Berkeley by |
5 | .\" Paul Borman at Krystal Technologies. | |
6 | .\" | |
e9ce8d39 A |
7 | .\" Redistribution and use in source and binary forms, with or without |
8 | .\" modification, are permitted provided that the following conditions | |
9 | .\" are met: | |
10 | .\" 1. Redistributions of source code must retain the above copyright | |
11 | .\" notice, this list of conditions and the following disclaimer. | |
12 | .\" 2. Redistributions in binary form must reproduce the above copyright | |
13 | .\" notice, this list of conditions and the following disclaimer in the | |
14 | .\" documentation and/or other materials provided with the distribution. | |
e9ce8d39 A |
15 | .\" 4. Neither the name of the University nor the names of its contributors |
16 | .\" may be used to endorse or promote products derived from this software | |
17 | .\" without specific prior written permission. | |
18 | .\" | |
19 | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND | |
20 | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | |
21 | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | |
22 | .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE | |
23 | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | |
24 | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS | |
25 | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) | |
26 | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | |
27 | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY | |
28 | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF | |
29 | .\" SUCH DAMAGE. | |
30 | .\" | |
3d9156a7 | 31 | .\" @(#)euc.4 8.1 (Berkeley) 6/4/93 |
1f2f436a | 32 | .\" $FreeBSD: src/lib/libc/locale/euc.5,v 1.13 2007/01/09 00:28:00 imp Exp $ |
e9ce8d39 | 33 | .\" |
3d9156a7 A |
34 | .Dd November 8, 2003 |
35 | .Dt EUC 5 | |
e9ce8d39 A |
36 | .Os |
37 | .Sh NAME | |
3d9156a7 A |
38 | .Nm euc |
39 | .Nd EUC encoding of wide characters | |
e9ce8d39 | 40 | .Sh SYNOPSIS |
5b2abdfb | 41 | .Nm ENCODING |
3d9156a7 A |
42 | .Qq EUC |
43 | .Pp | |
44 | .Nm VARIABLE | |
45 | .Ar len1 | |
46 | .Ar mask1 | |
47 | .Ar len2 | |
48 | .Ar mask2 | |
49 | .Ar len3 | |
50 | .Ar mask3 | |
51 | .Ar len4 | |
52 | .Ar mask4 | |
53 | .Ar mask | |
e9ce8d39 | 54 | .Sh DESCRIPTION |
3d9156a7 A |
55 | .\"The |
56 | .\".Nm EUC | |
57 | .\"encoding is provided for compatibility with | |
58 | .\".Ux | |
59 | .\"based systems. | |
60 | .\"See | |
61 | .\".Xr mklocale 1 | |
62 | .\"for a complete description of the | |
63 | .\".Ev LC_CTYPE | |
64 | .\"source file format. | |
65 | .\".Pp | |
66 | .Nm EUC | |
67 | implements a system of 4 multibyte codesets. | |
68 | A multibyte character in the first codeset consists of | |
69 | .Ar len1 | |
70 | bytes starting with a byte in the range of 0x00 to 0x7f. | |
71 | To allow use of | |
72 | .Tn ASCII , | |
73 | .Ar len1 | |
74 | is always 1. | |
75 | A multibyte character in the second codeset consists of | |
76 | .Ar len2 | |
77 | bytes starting with a byte in the range of 0x80 to 0xff, excluding 0x8e and 0x8f. | |
78 | A multibyte character in the third codeset consists of | |
79 | .Ar len3 | |
80 | bytes starting with the byte 0x8e. | |
81 | A multibyte character in the fourth codeset consists of | |
82 | .Ar len4 | |
83 | bytes starting with the byte 0x8f. | |
9385eb3d | 84 | .Pp |
e9ce8d39 | 85 | The |
3d9156a7 A |
86 | .Vt wchar_t |
87 | encoding of | |
88 | .Nm EUC | |
89 | multibyte characters depends on the | |
90 | .Ar len | |
91 | and | |
92 | .Ar mask | |
93 | arguments. | |
94 | First, the bytes are moved into a | |
95 | .Vt wchar_t | |
96 | as follows: | |
e9ce8d39 | 97 | .Bd -literal |
3d9156a7 | 98 | byte0 << ((\fIlen\fPN-1) * 8) | byte1 << ((\fIlen\fPN-2) * 8) | ... | byte\fIlen\fPN-1 |
e9ce8d39 | 99 | .Ed |
5b2abdfb | 100 | .Pp |
3d9156a7 A |
101 | The result is then ANDed with |
102 | .Ar ~mask | |
103 | and ORed with | |
104 | .Ar maskN . | |
105 | Codesets 3 and 4 are special in that the leading byte (0x8e or 0x8f) is | |
106 | first removed and the | |
107 | .Ar lenN | |
108 | argument is reduced by 1. | |
5b2abdfb | 109 | .Pp |
3d9156a7 A |
110 | For example, the |
111 | .Li ja_JP.eucJP | |
112 | locale has the following | |
113 | .Va VARIABLE | |
114 | line: | |
5b2abdfb | 115 | .Bd -literal |
3d9156a7 | 116 | VARIABLE 1 0x0000 2 0x8080 2 0x0080 3 0x8000 0x8080 |
5b2abdfb A |
117 | .Ed |
118 | .Pp | |
3d9156a7 A |
119 | Codeset 1 consists of the values 0x0000 - 0x007f. |
120 | .Pp | |
121 | Codeset 2 consists of the values that have both of the 0x8080 bits set. | |
122 | .Pp | |
123 | Codeset 3 consists of the values 0x0080 - 0x00ff. | |
124 | .Pp | |
125 | Codeset 4 consists of the values 0x8000 - 0xff7f excluding the values | |
126 | which have the 0x0080 bit set. | |
127 | .Pp | |
128 | Notice that the global | |
129 | .Ar mask | |
130 | is set to 0x8080; this implies that the codeset can be determined | |
131 | from those 2 bits. | |
132 | .Sh SEE ALSO | |
5b2abdfb | 133 | .Xr mklocale 1 , |
3d9156a7 | 134 | .Xr setlocale 3 |
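
The conversion described in the DESCRIPTION section can be sketched in a few lines of Python. This is only an illustration of the rules as the manual page states them, using the parameters from the `ja_JP.eucJP` `VARIABLE` line; it is not the libc implementation, and the names `euc_to_wchar`, `codeset_of`, `LENS`, `MASKS`, and `MASK` are hypothetical.

```python
# Parameters taken from the ja_JP.eucJP VARIABLE line quoted in the page:
#   VARIABLE 1 0x0000 2 0x8080 2 0x0080 3 0x8000 0x8080
# All names below are illustrative assumptions, not part of any libc API.
LENS = (1, 2, 2, 3)                        # len1 .. len4
MASKS = (0x0000, 0x8080, 0x0080, 0x8000)   # mask1 .. mask4
MASK = 0x8080                              # global mask


def euc_to_wchar(data):
    """Convert the EUC character at the start of `data`.

    Returns (wide character value, number of bytes consumed).
    """
    lead = data[0]
    if lead < 0x80:
        n = 0            # first codeset: 0x00 to 0x7f
    elif lead == 0x8E:
        n = 2            # third codeset: lead byte 0x8e
    elif lead == 0x8F:
        n = 3            # fourth codeset: lead byte 0x8f
    else:
        n = 1            # second codeset: any other byte 0x80 to 0xff

    length = LENS[n]
    if len(data) < length:
        raise ValueError("truncated EUC character")

    payload = data[:length]
    if n >= 2:
        # The 0x8e/0x8f lead byte is removed first, which reduces the
        # effective length by 1.
        payload = payload[1:]

    # byte0 << ((len-1) * 8) | byte1 << ((len-2) * 8) | ...
    value = int.from_bytes(payload, "big")

    # AND with ~mask, then OR with the codeset's own mask.
    return (value & ~MASK) | MASKS[n], length


def codeset_of(wc):
    """Recover the codeset number (1 to 4) from the two global mask bits."""
    return MASKS.index(wc & MASK) + 1
```

Under these assumptions, `euc_to_wchar(b"\x8f\xb0\xa1")` drops the 0x8f lead byte, packs the remaining bytes into 0xb0a1, clears the 0x8080 bits to get 0x3021, and sets 0x8000, giving 0xb021; `codeset_of(0xb021)` then reports codeset 4 because only the 0x8000 bit of the global mask is set.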