+The wxString class has been completely rewritten for wxWidgets 3.0 but much work
+has been done to make existing code using ANSI string literals work as it did
+in previous versions.
+
+
+@section overview_string_internal Internal wxString encoding
+
+Since wxWidgets 3.0 wxString internally uses <b>UCS-2</b> (with Unicode
+code units stored in @c wchar_t) under Windows and <b>UTF-8</b> (with Unicode
+code units stored in @c char) under Unix, Linux and Mac OS X to store its content.
+
+For definitions of the terms <em>code units</em> and <em>code points</em>, please
+see the @ref overview_unicode_encodings paragraph.
+
+Note that UCS-2 and UTF-16 differ: the former is a fixed-length
+encoding without <em>surrogate pairs</em>, while the latter is a
+variable-length encoding. Apart from this, the two encodings are identical.
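+
+The difference can be illustrated with a minimal sketch in standard C++. The
+helper @c EncodeUtf16 is hypothetical (it is not part of wxWidgets) and assumes
+its argument is a valid code point: values inside the BMP map to one code unit,
+exactly as in UCS-2, while values above it need a surrogate pair that UCS-2
+cannot express.
+
```cpp
#include <string>

// Hypothetical helper, not a wxWidgets API: encodes one code point as UTF-16.
// Assumes cp is a valid code point (<= 0x10FFFF and not itself a surrogate).
std::u16string EncodeUtf16(char32_t cp)
{
    std::u16string out;
    if (cp < 0x10000)
    {
        // Inside the BMP: one code unit, identical to UCS-2.
        out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
        // Outside the BMP: split the 20-bit offset into a surrogate pair.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}
```
+
+With this sketch, <tt>EncodeUtf16(0x20AC)</tt> (the euro sign) yields the single
+unit 0x20AC, while <tt>EncodeUtf16(0x10300)</tt> yields the pair 0xD800 0xDF00.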
+
+For simplicity of implementation, wxString when <tt>wxUSE_UNICODE_WCHAR==1</tt>
+(e.g. on Windows) uses UCS-2 and thus doesn't know anything about surrogate pairs;
+it always assumes 1 code unit per code point, although this is true only for
+characters in the @e BMP (Basic Multilingual Plane).
+Thus, when iterating over a UTF-16 string stored in a wxString under Windows, user
+code has to handle <em>surrogate pairs</em> itself.
+(Note however that Windows itself has built-in support for surrogate pairs in UTF-16,
+such as for drawing strings on screen.)
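+
+As a rough sketch of what such handling in user code might look like, the
+following standard C++ function counts code points in a sequence of UTF-16 code
+units. The name @c CountCodePoints is hypothetical, and @c std::u16string is
+used here only as a stand-in for the code units held by a wxString; it is an
+assumption of the example, not how wxString exposes its buffer.
+
```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a wxWidgets API: counts code points in a sequence
// of UTF-16 code units, treating a high+low surrogate pair as one code point.
// An unpaired surrogate is counted as one (invalid) code point.
std::size_t CountCodePoints(const std::u16string& units)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < units.size(); ++i)
    {
        const char16_t u = units[i];
        const bool isHigh = u >= 0xD800 && u <= 0xDBFF;
        if (isHigh && i + 1 < units.size())
        {
            const char16_t next = units[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF)
                ++i; // skip the low surrogate: the pair is a single code point
        }
        ++count;
    }
    return count;
}
```
+
+For the two code units 0xD800 0xDF00 (the surrogate pair of U+10300) this
+function returns 1, whereas the number of code units is 2.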
+
+When instead <tt>wxUSE_UNICODE_UTF8==1</tt> (e.g. on Linux and Mac OS X)
+wxString handles UTF-8 multibyte sequences correctly, so you can use
+UTF-8 in a completely transparent way:
+
+Example:
+@code
+ // first test, using exotic characters outside of the Unicode BMP:
+
+ wxString test = wxString::FromUTF8("\xF0\x90\x8C\x80");
+ // U+10300 is "OLD ITALIC LETTER A" and is part of Unicode Plane 1
+ // in UTF8 it's encoded as 0xF0 0x90 0x8C 0x80
+
+ // it's a single Unicode code-point encoded as:
+ // - a UTF16 surrogate pair under Windows
+ // - a UTF8 multibyte sequence under Linux
+ // (without considering the final NUL)
+
+ wxPrintf("wxString reports a length of %d character(s)", test.length());
+ // prints "wxString reports a length of 1 character(s)" on Linux
+ // prints "wxString reports a length of 2 character(s)" on Windows
+ // since wxString doesn't handle surrogate pairs under Windows!
+
+
+ // second test, this time using characters part of the Unicode BMP:
+
+ wxString test2 = wxString::FromUTF8("\x41\xC3\xA0\xE2\x82\xAC");
+ // this is the UTF8 encoding of capital letter A followed by
+ // 'small case letter a with grave' followed by the 'euro sign'
+
+ // they are 3 Unicode code-points encoded as:
+ // - 3 UTF16 code units under Windows
+ // - 6 UTF8 code units under Linux
+ // (without considering the final NUL)
+
+ wxPrintf("wxString reports a length of %d character(s)", test2.length());
+ // prints "wxString reports a length of 3 character(s)" on Linux
+ // prints "wxString reports a length of 3 character(s)" on Windows
+@endcode
+
+To better explain what is stated above, consider the second string of the example
+above; it's composed of 3 characters and the final @c NUL:
+
+@image html overview_wxstring_encoding.png
+
+As you can see, UCS-2/UTF-16 encoding is straightforward (for characters in the @e BMP)
+and in this example the UCS-2-encoded wxString takes 8 bytes.
+UTF-8 encoding is more elaborate and in this example takes 7 bytes.
+
+The type used by wxString to store Unicode code units is called wxStringCharType.
+
+In general, for strings containing many Latin characters, UTF-8 provides a big
+advantage in memory footprint compared to UTF-16, but requires more processing
+for common operations such as length calculation.
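+
+The trade-off can be sketched in standard C++ as follows. Both helpers are
+hypothetical (they are not wxWidgets functions) and assume well-formed UTF-8
+input: counting code points requires examining every byte, and the UTF-16 size
+of the same text is derived from the lead bytes. Sizes exclude the terminating
+NUL.
+
```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of code points in a well-formed UTF-8 string.
// Every byte is inspected; continuation bytes (10xxxxxx) are not counted.
std::size_t Utf8CodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
    {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Hypothetical helper: bytes the same text would occupy in UTF-16.
// A 4-byte UTF-8 sequence (lead byte >= 0xF0) needs a surrogate pair.
std::size_t Utf16Bytes(const std::string& utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8)
    {
        if ((byte & 0xC0) == 0x80)
            continue; // continuation byte, belongs to the previous lead byte
        units += (byte >= 0xF0) ? 2 : 1;
    }
    return units * 2;
}
```
+
+For the string <tt>"\x41\xC3\xA0\xE2\x82\xAC"</tt> used earlier, the UTF-8 form
+takes 6 bytes, @c Utf8CodePoints returns 3 and @c Utf16Bytes returns 6. For a
+plain ASCII string such as <tt>"hello"</tt>, the UTF-8 form takes 5 bytes while
+@c Utf16Bytes returns 10.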
+
+