Commit | Line | Data |
---|---|---|
3d9156a7 | 1 | .\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved. |
5b2abdfb | 2 | .\" Copyright (c) 1993 |
3 | .\" The Regents of the University of California. All rights reserved. | |
4 | .\" | |
5 | .\" This code is derived from software contributed to Berkeley by | |
6 | .\" Donn Seeley of BSDI. | |
7 | .\" | |
8 | .\" Redistribution and use in source and binary forms, with or without | |
9 | .\" modification, are permitted provided that the following conditions | |
10 | .\" are met: | |
11 | .\" 1. Redistributions of source code must retain the above copyright | |
12 | .\" notice, this list of conditions and the following disclaimer. | |
13 | .\" 2. Redistributions in binary form must reproduce the above copyright | |
14 | .\" notice, this list of conditions and the following disclaimer in the | |
15 | .\" documentation and/or other materials provided with the distribution. | |
16 | .\" 3. All advertising materials mentioning features or use of this software | |
17 | .\" must display the following acknowledgement: | |
18 | .\" This product includes software developed by the University of | |
19 | .\" California, Berkeley and its contributors. | |
20 | .\" 4. Neither the name of the University nor the names of its contributors | |
21 | .\" may be used to endorse or promote products derived from this software | |
22 | .\" without specific prior written permission. | |
23 | .\" | |
24 | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND | |
25 | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | |
26 | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | |
27 | .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE | |
28 | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | |
29 | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS | |
30 | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) | |
31 | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | |
32 | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY | |
33 | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF | |
34 | .\" SUCH DAMAGE. | |
35 | .\" | |
36 | .\" @(#)multibyte.3 8.1 (Berkeley) 6/4/93 | |
3d9156a7 | 37 | .\" $FreeBSD: src/lib/libc/locale/multibyte.3,v 1.27 2004/10/17 02:29:15 tjr Exp $ |
5b2abdfb | 38 | .\" |
3d9156a7 | 39 | .Dd April 8, 2004 |
5b2abdfb | 40 | .Dt MULTIBYTE 3 |
41 | .Os | |
42 | .Sh NAME | |
3d9156a7 | 43 | .Nm multibyte |
44 | .Nd multibyte and wide character manipulation functions | |
5b2abdfb | 45 | .Sh LIBRARY |
46 | .Lb libc | |
47 | .Sh SYNOPSIS | |
3d9156a7 | 48 | .In limits.h |
5b2abdfb | 49 | .In stdlib.h |
3d9156a7 | 50 | .In wchar.h |
5b2abdfb | 51 | .Sh DESCRIPTION |
3d9156a7 | 52 | The basic elements of some written natural languages, such as Chinese, |
5b2abdfb | 53 | cannot be represented uniquely with single C |
3d9156a7 | 54 | .Vt char Ns s . |
5b2abdfb | 55 | The C standard supports two different ways of dealing with |
3d9156a7 | 56 | extended natural language encodings: |
57 | wide characters and | |
58 | multibyte characters. | |
5b2abdfb | 59 | Wide characters are an internal representation |
60 | which allows each basic element to map | |
61 | to a single object of type | |
3d9156a7 | 62 | .Vt wchar_t . |
5b2abdfb | 63 | Multibyte characters are used for input and output |
64 | and code each basic element as a sequence of C | |
3d9156a7 | 65 | .Vt char Ns s . |
5b2abdfb | 66 | Individual basic elements may map into one or more |
67 | (up to | |
9385eb3d | 68 | .Dv MB_LEN_MAX ) |
5b2abdfb | 69 | bytes in a multibyte character. |
70 | .Pp | |
71 | The current locale | |
72 | .Pq Xr setlocale 3 | |
73 | governs the interpretation of wide and multibyte characters. | |
74 | The locale category | |
75 | .Dv LC_CTYPE | |
76 | specifically controls this interpretation. | |
77 | The | |
3d9156a7 | 78 | .Vt wchar_t |
5b2abdfb | 79 | type is wide enough to hold the largest value |
80 | in the wide character representations for all locales. | |
81 | .Pp | |
82 | Multibyte strings may contain | |
83 | .Sq shift | |
84 | indicators to switch to and from | |
85 | particular modes within the given representation. | |
86 | If explicit bytes are used to signal shifting, | |
87 | these are not recognized as separate characters | |
88 | but are lumped with a neighboring character. | |
89 | There is always a distinguished | |
90 | .Sq initial | |
91 | shift state. | |
3d9156a7 | 92 | Some functions (e.g., |
93 | .Xr mblen 3 , | |
94 | .Xr mbtowc 3 | |
5b2abdfb | 95 | and |
3d9156a7 | 96 | .Xr wctomb 3 ) |
97 | maintain static shift state internally, whereas | |
98 | others store it in an | |
99 | .Vt mbstate_t | |
100 | object passed by the caller. | |
101 | Shift states are undefined after a call to | |
102 | .Xr setlocale 3 | |
5b2abdfb | 103 | with the |
104 | .Dv LC_CTYPE | |
105 | or | |
106 | .Dv LC_ALL | |
107 | categories. | |
108 | .Pp | |
109 | For convenience in processing, | |
110 | the wide character with value 0 | |
111 | (the null wide character) | |
112 | is recognized as the wide character string terminator, | |
113 | and the character with value 0 | |
114 | (the null byte) | |
115 | is recognized as the multibyte character string terminator. | |
116 | Null bytes are not permitted within multibyte characters. | |
117 | .Pp | |
3d9156a7 | 118 | The C library provides the following functions for dealing with |
119 | multibyte characters: | |
120 | .Bl -column "Description" | |
121 | .It Sy "Function" Ta Sy "Description" | |
122 | .It Xr mblen 3 Ta "get number of bytes in a character" | |
123 | .It Xr mbrlen 3 Ta "get number of bytes in a character (restartable)" | |
124 | .It Xr mbrtowc 3 Ta "convert a character to a wide-character code (restartable)" | |
125 | .It Xr mbsrtowcs 3 Ta "convert a character string to a wide-character string (restartable)" | |
126 | .It Xr mbstowcs 3 Ta "convert a character string to a wide-character string" | |
127 | .It Xr mbtowc 3 Ta "convert a character to a wide-character code" | |
128 | .It Xr wcrtomb 3 Ta "convert a wide-character code to a character (restartable)" | |
129 | .It Xr wcstombs 3 Ta "convert a wide-character string to a character string" | |
130 | .It Xr wcsrtombs 3 Ta "convert a wide-character string to a character string (restartable)" | |
131 | .It Xr wctomb 3 Ta "convert a wide-character code to a character" | |
132 | .El | |
9385eb3d | 133 | .Sh SEE ALSO |
3d9156a7 | 134 | .Xr mklocale 1 , |
5b2abdfb | 135 | .Xr setlocale 3 , |
3d9156a7 | 136 | .Xr stdio 3 , |
137 | .Xr big5 5 , | |
138 | .Xr euc 5 , | |
139 | .Xr gb18030 5 , | |
140 | .Xr gb2312 5 , | |
141 | .Xr gbk 5 , | |
142 | .Xr mskanji 5 , | |
9385eb3d | 143 | .Xr utf8 5 |
5b2abdfb | 144 | .Sh STANDARDS |
3d9156a7 | 145 | These functions conform to |
146 | .St -isoC-99 . |
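
The DESCRIPTION above distinguishes functions that keep shift state internally from restartable ones that store it in a caller-supplied `mbstate_t` object. The following is a minimal sketch of the restartable style, not code from this manual page: `count_mb_chars` is a hypothetical helper name, and the sketch assumes the caller has already selected the desired `LC_CTYPE` locale with `setlocale(3)`. It walks a multibyte string with `mbrtowc(3)`, starting from the initial shift state, and counts the characters it decodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libc): count the characters in the
 * null-terminated multibyte string s, interpreted under the current
 * LC_CTYPE locale.  Returns (size_t)-1 if s contains an invalid or
 * incomplete multibyte sequence.
 */
size_t
count_mb_chars(const char *s)
{
	mbstate_t st;
	wchar_t wc;
	size_t left, n, count = 0;

	assert(s != NULL);
	left = strlen(s);		/* bytes before the terminating null byte */
	memset(&st, 0, sizeof(st));	/* a zeroed mbstate_t is the initial shift state */

	while (left > 0) {
		/* Decode one character; the shift state is carried in st. */
		n = mbrtowc(&wc, s, left, &st);
		if (n == (size_t)-1 || n == (size_t)-2)
			return ((size_t)-1);	/* invalid or truncated sequence */
		if (n == 0)
			break;			/* null wide character: end of string */
		s += n;
		left -= n;
		count++;
	}
	return (count);
}
```

Because the conversion state lives in the local `st` object, this helper does not disturb, and is not disturbed by, the internal state used by `mblen(3)`, `mbtowc(3)` and `wctomb(3)`. In the default "C" locale, where each byte is one character, `count_mb_chars("hello")` returns 5; in a multibyte locale the result depends on the encoding selected by `LC_CTYPE`.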