X-Git-Url: https://git.saurik.com/wxWidgets.git/blobdiff_plain/2f365fcbd591ef4da63f5ca44d1f4b22ab20d287..4a2d030adfa836f6ada1830c9057170d053bcc64:/docs/doxygen/overviews/string.h

diff --git a/docs/doxygen/overviews/string.h b/docs/doxygen/overviews/string.h
index 3829548e3c..e493eaf9dc 100644
--- a/docs/doxygen/overviews/string.h
+++ b/docs/doxygen/overviews/string.h
@@ -2,30 +2,14 @@
 // Name: string.h
 // Purpose: topic overview
 // Author: wxWidgets team
-// RCS-ID: $Id$
-// Licence: wxWindows license
+// Licence: wxWindows licence
 /////////////////////////////////////////////////////////////////////////////
 /**
 @page overview_string wxString Overview
-Classes: wxString, wxArrayString, wxStringTokenizer
-
-@li @ref overview_string_intro
-@li @ref overview_string_internal
-@li @ref overview_string_binary
-@li @ref overview_string_comparison
-@li @ref overview_string_advice
-@li @ref overview_string_related
-@li @ref overview_string_tuning
-@li @ref overview_string_settings
-
-
-
-
-
-@section overview_string_intro Introduction
+@tableofcontents
 wxString is a class which represents a Unicode string of arbitrary length and
 containing arbitrary Unicode characters.
@@ -44,7 +28,7 @@ has been done to make existing code using ANSI string literals work as it did
 in previous versions.
-@section overview_string_internal Internal wxString encoding
+@section overview_string_internal Internal wxString Encoding
 Since wxWidgets 3.0 wxString internally uses UTF-16 (with Unicode
 code units stored in @c wchar_t) under Windows and UTF-8 (with Unicode
@@ -56,7 +40,7 @@ see the @ref overview_unicode_encodings paragraph.
 For simplicity of implementation, wxString when wxUSE_UNICODE_WCHAR==1 (e.g. on Windows)
 uses per code unit indexing instead of per code point indexing and doesn't know anything about surrogate pairs;
-in other words it always considers code points to be composed by 1 code point,
+in other words it always considers code points to be composed by 1 code unit,
 while this is really true only for characters in the @e BMP (Basic Multilingual Plane).
 Thus when iterating over a UTF-16 string stored in a wxString under Windows, the user
 code has to take care of surrogate pairs himself.
@@ -66,7 +50,9 @@ such as for drawing strings on screen.)
 @remarks Note that while the behaviour of wxString when wxUSE_UNICODE_WCHAR==1
 resembles UCS-2 encoding, it's not completely correct to refer to wxString as
-UCS-2 encoded since you can encode characters outside the @e BMP in a wxString.
+UCS-2 encoded since you can encode code points outside the @e BMP in a wxString
+as two code units (i.e. as a surrogate pair; as already mentioned however wxString
+will "see" them as two different code points)
 When instead wxUSE_UNICODE_UTF8==1 (e.g. on Linux and Mac OS X) wxString
 handles UTF8 multi-bytes sequences just fine also for characters outside
@@ -244,7 +230,7 @@ For this conversion, the @a wxConvLibc class instance is used.
 See wxCSConv and wxMBConv.
-@subsection overview_string_iterating Iterating wxString's characters
+@subsection overview_string_iterating Iterating wxString Characters
 As previously described, when wxUSE_UNICODE_UTF8==1, wxString internally
 uses the variable-length UTF8 encoding.
@@ -286,7 +272,7 @@ these problems: wxIsEmpty() verifies whether the string is empty (returning
 case-insensitive string comparison function known either as @c stricmp() or
 @c strcasecmp() on different platforms.
-The @ header also defines ::wxSnprintf and ::wxVsnprintf
+The @ header also defines wxSnprintf() and wxVsnprintf()
 functions which should be used instead of the inherently dangerous standard
 @c sprintf() and which use @c snprintf() instead which does buffer size checks
 whenever possible. Of course, you may also use wxString::Printf which is also
@@ -389,4 +375,3 @@ also defined, otherwise @c wxUSE_UNICODE_WCHAR is.
 See also @ref page_wxusedef_important.
 */
-