Commit | Line | Data |
---|---|---|
eec47cc6 VZ |
1 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
2 | %% Name: mbconv.tex | |
3 | %% Purpose: wxMBConv documentation | |
4 | %% Author: Ove Kaaven, Vadim Zeitlin | |
5 | %% Created: 2000-03-25 | |
6 | %% RCS-ID: $Id$ | |
7 | %% Copyright: (c) 2000 Ove Kaaven | |
8 | %% (c) 2003-2006 Vadim Zeitlin | |
9 | %% License: wxWindows license | |
10 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% | |
11 | ||
f6bcfd97 BP |
12 | |
13 | \section{\class{wxMBConv}}\label{wxmbconv} | |
14 | ||
15 | This class is the base class of a hierarchy of classes capable of converting | |
eec47cc6 VZ |
16 | text strings between multibyte (SBCS or DBCS) encodings and Unicode. |
17 | ||
18 | In the documentation for this and related classes, please note that the | |
19 | \emph{length} of a string refers to the number of characters in the string, | |
20 | not counting the terminating \NUL, if any, while the \emph{size} of a string | |
5e51fb4c | 21 | is the total number of bytes in the string, including any trailing \NUL. |
eec47cc6 VZ |
22 | Thus, the length of the wide character string \texttt{L"foo"} is $3$ while its size can |
23 | be either $8$ or $16$ depending on whether \texttt{wchar\_t} is $2$ bytes (as | |
24 | under Windows) or $4$ (Unix). | |
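The distinction between length and size can be checked with plain C++, independently of wxWidgets. The following is a minimal sketch using only the standard library; the helper names LengthOf and SizeOf are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cwchar>

// Length: number of characters, not counting the terminating NUL.
std::size_t LengthOf(const wchar_t* s)
{
    return std::wcslen(s);
}

// Size: total number of bytes, including the terminating NUL.
std::size_t SizeOf(const wchar_t* s)
{
    return (std::wcslen(s) + 1) * sizeof(wchar_t);
}
```

For \texttt{L"foo"}, LengthOf returns $3$ and SizeOf returns four times \texttt{sizeof(wchar\_t)}, which gives the $8$ or $16$ mentioned above.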
25 | ||
26 | \wxheading{Global variables} | |
27 | ||
28 | There are several predefined instances of this class: | |
29 | \begin{twocollist} | |
30 | \twocolitem{\textbf{wxConvLibc}}{Uses the standard ANSI C \texttt{mbstowcs()} and | |
31 | \texttt{wcstombs()} functions to perform the conversions; thus depends on the | |
32 | current locale.} | |
ef95ce41 VZ |
33 | \twocolitem{\textbf{wxConvLocal}}{Another conversion corresponding to the |
34 | current locale but this one uses the best available conversion.} | |
d5bef0a3 VZ |
35 | \twocolitem{\textbf{wxConvUI}}{The conversion used for the standard UI elements |
36 | such as menu items and buttons. This is a pointer which is initially set to | |
37 | \texttt{wxConvLocal} as the program uses the current locale by default but can | |
38 | be set to some specific conversion if the program needs to use a specific | |
39 | encoding for its UI.} | |
ef95ce41 VZ |
40 | \twocolitem{\textbf{wxConvISO8859\_1}}{Conversion to and from ISO-8859-1 (Latin I) |
41 | encoding.} | |
42 | \twocolitem{\textbf{wxConvUTF8}}{Conversion to and from UTF-8 encoding.} | |
eec47cc6 VZ |
43 | \twocolitem{\textbf{wxConvFile}}{The appropriate conversion for file names, |
44 | which depends on the system.} | |
ef95ce41 | 45 | % \twocolitem{\textbf{wxConvCurrent}}{Not really clear what is it for...} |
eec47cc6 | 46 | \end{twocollist} |
f6bcfd97 | 47 | |
483b0434 VZ |
48 | |
49 | \wxheading{Constants} | |
50 | ||
51 | The \texttt{wxCONV\_FAILED} value is defined as \texttt{(size\_t)$-1$} and is | |
52 | returned by the conversion functions instead of the length of the converted | |
53 | string if the conversion fails. | |
54 | ||
55 | ||
f6bcfd97 BP |
56 | \wxheading{Derived from} |
57 | ||
58 | No base class | |
59 | ||
60 | \wxheading{Include files} | |
61 | ||
62 | <wx/strconv.h> | |
63 | ||
a7af285d VZ |
64 | \wxheading{Library} |
65 | ||
66 | \helpref{wxBase}{librarieslist} | |
67 | ||
f6bcfd97 BP |
68 | \wxheading{See also} |
69 | ||
70 | \helpref{wxCSConv}{wxcsconv}, | |
71 | \helpref{wxEncodingConverter}{wxencodingconverter}, | |
72 | \helpref{wxMBConv classes overview}{mbconvclasses} | |
73 | ||
483b0434 | 74 | |
f6bcfd97 BP |
75 | \latexignore{\rtfignore{\wxheading{Members}}} |
76 | ||
77 | ||
78 | \membersection{wxMBConv::wxMBConv}\label{wxmbconvwxmbconv} | |
79 | ||
80 | \func{}{wxMBConv}{\void} | |
81 | ||
483b0434 VZ |
82 | Trivial default constructor. |
83 | ||
f6bcfd97 BP |
84 | |
85 | \membersection{wxMBConv::MB2WC}\label{wxmbconvmb2wc} | |
86 | ||
eec47cc6 VZ |
87 | \constfunc{virtual size\_t}{MB2WC}{\param{wchar\_t *}{out}, \param{const char *}{in}, \param{size\_t }{outLen}} |
88 | ||
483b0434 VZ |
89 | \deprecated{\helpref{ToWChar}{wxmbconvtowchar}} |
90 | ||
eec47cc6 VZ |
91 | Converts from a string \arg{in} in multibyte encoding to Unicode putting up to |
92 | \arg{outLen} characters into the buffer \arg{out}. | |
f6bcfd97 | 93 | |
eec47cc6 VZ |
94 | If \arg{out} is \NULL, only the length of the string which would result from |
95 | the conversion is calculated and returned. Note that this is the length and not | |
96 | size, i.e. the returned value does \emph{not} include the trailing \NUL. But | |
97 | when the function is called with a non-\NULL \arg{out} buffer, the \arg{outLen} | |
98 | parameter should be one more, so that the string can be properly \NUL-terminated. | |
2b5f62a0 VZ |
99 | |
100 | \wxheading{Parameters} | |
101 | ||
eec47cc6 | 102 | \docparam{out}{The output buffer, may be \NULL if the caller is only |
2b5f62a0 VZ |
103 | interested in the length of the resulting string} |
104 | ||
eec47cc6 | 105 | \docparam{in}{The \NUL-terminated input string, cannot be \NULL} |
2b5f62a0 | 106 | |
eec47cc6 VZ |
107 | \docparam{outLen}{The length of the output buffer, \emph{including} the trailing |
108 | \NUL; ignored if \arg{out} is \NULL} | |
2b5f62a0 VZ |
109 | |
110 | \wxheading{Return value} | |
111 | ||
5e51fb4c | 112 | The length of the converted string \emph{excluding} the trailing \NUL. |
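The two-pass calling convention described above (a \NULL buffer to obtain the length, then a buffer with one extra element for the \NUL) can be sketched with the standard \texttt{mbstowcs()} function, which wxConvLibc is documented as using. This is only an illustration under stated assumptions, not wxMBConv itself: the helper name ToWide is hypothetical, and passing a null output buffer to \texttt{mbstowcs()} to query the length is a common extension (POSIX and the major C runtimes), not something the C standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not part of wxWidgets: converts a NUL-terminated
// multibyte string to a wide string using the current C locale.
// Returns false if the input contains an invalid multibyte sequence.
bool ToWide(const char* in, std::wstring& out)
{
    assert(in != nullptr);  // the input string cannot be NULL

    // First pass: a null output buffer only computes the length,
    // which does not include the trailing NUL.
    const std::size_t len = std::mbstowcs(nullptr, in, 0);
    if (len == static_cast<std::size_t>(-1))
        return false;  // conversion failed

    // Second pass: the buffer has one extra element for the NUL.
    std::vector<wchar_t> buf(len + 1);
    const std::size_t written = std::mbstowcs(buf.data(), in, buf.size());
    if (written == static_cast<std::size_t>(-1))
        return false;

    out.assign(buf.data(), written);
    return true;
}
```

The failure check against \texttt{(size\_t)$-1$} mirrors the \texttt{wxCONV\_FAILED} convention described in the Constants section.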
eec47cc6 | 113 | |
f6bcfd97 BP |
114 | |
115 | \membersection{wxMBConv::WC2MB}\label{wxmbconvwc2mb} | |
116 | ||
117 | \constfunc{virtual size\_t}{WC2MB}{\param{char* }{buf}, \param{const wchar\_t* }{psz}, \param{size\_t }{n}} | |
118 | ||
483b0434 VZ |
119 | \deprecated{\helpref{FromWChar}{wxmbconvfromwchar}} |
120 | ||
2b5f62a0 VZ |
121 | Converts from Unicode to multibyte encoding. The semantics of this function |
122 | (including the meaning of the return value) are the same as for | |
123 | \helpref{MB2WC}{wxmbconvmb2wc}. | |
124 | ||
eec47cc6 VZ |
125 | Notice that when the function is called with a non-\NULL buffer, the |
126 | \arg{n} parameter should be the size of the buffer and so it \emph{should} take | |
5e51fb4c | 127 | into account the trailing \NUL, which might take two or four bytes for some |
eec47cc6 VZ |
128 | encodings (UTF-16 and UTF-32) and not one. |
129 | ||
f6bcfd97 BP |
130 | |
131 | \membersection{wxMBConv::cMB2WC}\label{wxmbconvcmb2wc} | |
132 | ||
eec47cc6 VZ |
133 | \constfunc{const wxWCharBuffer}{cMB2WC}{\param{const char *}{in}} |
134 | ||
135 | \constfunc{const wxWCharBuffer}{cMB2WC}{\param{const char *}{in}, \param{size\_t }{inLen}, \param{size\_t }{*outLen}} | |
136 | ||
137 | Converts from multibyte encoding to Unicode by calling | |
138 | \helpref{MB2WC}{wxmbconvmb2wc}, allocating a temporary wxWCharBuffer to hold | |
139 | the result. | |
140 | ||
141 | The first overload takes a \NUL-terminated input string. The second one takes a | |
142 | string of exactly the specified length, which may or may not include the | |
5e51fb4c | 143 | trailing \NUL character(s). If the string is not \NUL-terminated, a temporary |
eec47cc6 VZ |
144 | \NUL-terminated copy of it suitable for passing to \helpref{MB2WC}{wxmbconvmb2wc} |
145 | is made, so it is more efficient to ensure that the string does have the | |
146 | appropriate number of \NUL bytes (which is usually $1$ but may be $2$ or $4$ | |
7ef3ab50 VZ |
147 | for UTF-16 or UTF-32, see \helpref{GetMBNulLen}{wxmbconvgetmbnullen}), |
148 | especially for long strings. | |
eec47cc6 VZ |
149 | |
150 | If \arg{outLen} is not \NULL, it receives the length of the converted | |
151 | string. | |
f6bcfd97 | 152 | |
f6bcfd97 BP |
153 | |
154 | \membersection{wxMBConv::cWC2MB}\label{wxmbconvcwc2mb} | |
155 | ||
eec47cc6 VZ |
156 | \constfunc{const wxCharBuffer}{cWC2MB}{\param{const wchar\_t* }{in}} |
157 | ||
158 | \constfunc{const wxCharBuffer}{cWC2MB}{\param{const wchar\_t* }{in}, \param{size\_t }{inLen}, \param{size\_t }{*outLen}} | |
f6bcfd97 BP |
159 | |
160 | Converts from Unicode to multibyte encoding by calling WC2MB, | |
161 | allocating a temporary wxCharBuffer to hold the result. | |
162 | ||
eec47cc6 VZ |
163 | The second overload of this function allows converting a string of the given |
164 | length \arg{inLen}, whether it is \NUL-terminated or not (for wide character | |
165 | strings, unlike for the multibyte ones, a single \NUL is always enough). | |
166 | But notice that just as with \helpref{cMB2WC}{wxmbconvcmb2wc}, it is more | |
167 | efficient to pass an already terminated string to this function as otherwise a | |
168 | copy is made internally. | |
169 | ||
170 | If \arg{outLen} is not \NULL, it receives the length of the converted | |
171 | string. | |
172 | ||
173 | ||
f6bcfd97 BP |
174 | \membersection{wxMBConv::cMB2WX}\label{wxmbconvcmb2wx} |
175 | ||
176 | \constfunc{const char*}{cMB2WX}{\param{const char* }{psz}} | |
177 | ||
178 | \constfunc{const wxWCharBuffer}{cMB2WX}{\param{const char* }{psz}} | |
179 | ||
180 | Converts from multibyte encoding to the current wxChar type | |
181 | (which depends on whether wxUSE\_UNICODE is set to 1). If wxChar is char, | |
182 | it returns the parameter unaltered. If wxChar is wchar\_t, it returns the | |
183 | result in a wxWCharBuffer. The macro wxMB2WXbuf is defined as the correct | |
184 | return type (without const). | |
185 | ||
eec47cc6 | 186 | |
f6bcfd97 BP |
187 | \membersection{wxMBConv::cWX2MB}\label{wxmbconvcwx2mb} |
188 | ||
189 | \constfunc{const char*}{cWX2MB}{\param{const wxChar* }{psz}} | |
190 | ||
191 | \constfunc{const wxCharBuffer}{cWX2MB}{\param{const wxChar* }{psz}} | |
192 | ||
193 | Converts from the current wxChar type to multibyte encoding. If wxChar is char, | |
194 | it returns the parameter unaltered. If wxChar is wchar\_t, it returns the | |
195 | result in a wxCharBuffer. The macro wxWX2MBbuf is defined as the correct | |
196 | return type (without const). | |
197 | ||
eec47cc6 | 198 | |
f6bcfd97 BP |
199 | \membersection{wxMBConv::cWC2WX}\label{wxmbconvcwc2wx} |
200 | ||
201 | \constfunc{const wchar\_t*}{cWC2WX}{\param{const wchar\_t* }{psz}} | |
202 | ||
203 | \constfunc{const wxCharBuffer}{cWC2WX}{\param{const wchar\_t* }{psz}} | |
204 | ||
205 | Converts from Unicode to the current wxChar type. If wxChar is wchar\_t, | |
206 | it returns the parameter unaltered. If wxChar is char, it returns the | |
207 | result in a wxCharBuffer. The macro wxWC2WXbuf is defined as the correct | |
208 | return type (without const). | |
209 | ||
eec47cc6 | 210 | |
f6bcfd97 BP |
211 | \membersection{wxMBConv::cWX2WC}\label{wxmbconvcwx2wc} |
212 | ||
213 | \constfunc{const wchar\_t*}{cWX2WC}{\param{const wxChar* }{psz}} | |
214 | ||
215 | \constfunc{const wxWCharBuffer}{cWX2WC}{\param{const wxChar* }{psz}} | |
216 | ||
217 | Converts from the current wxChar type to Unicode. If wxChar is wchar\_t, | |
218 | it returns the parameter unaltered. If wxChar is char, it returns the | |
219 | result in a wxWCharBuffer. The macro wxWX2WCbuf is defined as the correct | |
220 | return type (without const). | |
221 | ||
7ef3ab50 | 222 | |
483b0434 VZ |
223 | \membersection{wxMBConv::FromWChar}\label{wxmbconvfromwchar} |
224 | ||
13b55525 | 225 | \constfunc{virtual size\_t}{FromWChar}{\param{char *}{dst}, \param{size\_t }{dstLen}, \param{const wchar\_t *}{src}, \param{size\_t }{srcLen = wxNO\_LEN}} |
483b0434 | 226 | |
13b55525 JS |
227 | This function has the same semantics as \helpref{ToWChar}{wxmbconvtowchar} |
228 | except that it converts a wide string to a multibyte one. | |
483b0434 VZ |
229 | |
230 | \membersection{wxMBConv::GetMaxMBNulLen}\label{wxmbconvgetmaxmbnullen} | |
231 | ||
232 | \func{const size\_t}{GetMaxMBNulLen}{\void} | |
233 | ||
234 | Returns the maximal value which can be returned by | |
235 | \helpref{GetMBNulLen}{wxmbconvgetmbnullen} for any conversion object. Currently | |
236 | this value is $4$. | |
237 | ||
238 | This method can be used to allocate the buffer with enough space for the | |
239 | trailing \NUL characters for any encoding. | |
240 | ||
241 | ||
7ef3ab50 VZ |
242 | \membersection{wxMBConv::GetMBNulLen}\label{wxmbconvgetmbnullen} |
243 | ||
244 | \constfunc{size\_t}{GetMBNulLen}{\void} | |
245 | ||
246 | This function returns $1$ for most of the multibyte encodings in which the | |
247 | string is terminated by a single \NUL, $2$ for UTF-16 and $4$ for UTF-32 for | |
248 | which the string is terminated with $2$ and $4$ \NUL characters respectively. | |
21de14b3 VZ |
249 | The other cases are not currently supported and \texttt{wxCONV\_FAILED} |
250 | (defined as $-1$) is returned for them. | |
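The terminator widths of $1$, $2$ and $4$ bytes can be observed with standard C++ string literals. The sketch below assumes that \texttt{char16\_t} and \texttt{char32\_t} literals stand in for UTF-16 and UTF-32 encoded strings; the helper name NulBytes is hypothetical and it does not call GetMBNulLen.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes occupied by the terminating NUL
// of a string literal whose code units have type CharT.
template <typename CharT, std::size_t N>
constexpr std::size_t NulBytes(const CharT (&)[N])
{
    return sizeof(CharT);
}
```

On common platforms this gives $1$ for a narrow literal, $2$ for a \texttt{u"..."} literal and $4$ for a \texttt{U"..."} literal, matching the values listed above.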
7ef3ab50 VZ |
251 | |
252 | ||
483b0434 VZ |
253 | \membersection{wxMBConv::ToWChar}\label{wxmbconvtowchar} |
254 | ||
13b55525 | 255 | \constfunc{virtual size\_t}{ToWChar}{\param{wchar\_t *}{dst}, \param{size\_t }{dstLen}, \param{const char *}{src}, \param{size\_t }{srcLen = wxNO\_LEN}} |
483b0434 | 256 | |
13b55525 JS |
257 | The most general function for converting a multibyte string to a wide string. |
258 | The main case is when \arg{dst} is not \NULL and \arg{srcLen} is not | |
259 | \texttt{wxNO\_LEN} (which is defined as \texttt{(size\_t)$-1$}): then | |
260 | the function converts exactly \arg{srcLen} bytes starting at \arg{src} into a | |
261 | wide string which it outputs to \arg{dst}. If the length of the resulting wide | |
262 | string is greater than \arg{dstLen}, an error is returned. Note that if | |
263 | \arg{srcLen} bytes don't include \NUL characters, the resulting wide string is | |
264 | not \NUL-terminated either. | |
483b0434 | 265 | |
13b55525 JS |
266 | If \arg{srcLen} is \texttt{wxNO\_LEN}, the function assumes that the string is |
267 | properly (i.e. as necessary for the encoding handled by this conversion) | |
268 | \NUL-terminated and converts the entire string, including any trailing \NUL | |
269 | bytes. In this case the wide string is also \NUL-terminated. | |
270 | ||
271 | Finally, if \arg{dst} is \NULL, the function returns the length of the needed | |
272 | buffer. | |
273 | ||
274 | \wxheading{Return value} | |
275 | ||
276 | The number of characters written to \arg{dst} (or the number of characters | |
277 | which would have been written to it if it were non-\NULL) on success or | |
278 | \texttt{wxCONV\_FAILED} on error. | |
483b0434 | 279 | |
f1e589cd | 280 |