+In a similar way, wxString provides access to its contents as either a
+@c wchar_t or a @c char character buffer. Of course, the latter only works if
+the string contains data representable in the current locale encoding. This is
+always the case if the string was initially constructed from a narrow string or
+if it contains only 7-bit ASCII data, but otherwise this conversion is not
+guaranteed to succeed. And as with the wxString::FromUTF8() example above, you
+can always use wxString::ToUTF8() to retrieve the string contents in UTF-8
+encoding -- this, unlike converting to @c char* using the current locale, never
+fails.
+
+For more info about how wxString works, please see the @ref overview_string.
+
+To summarize, Unicode support in wxWidgets is mostly @b transparent for the
+application, and if you use wxString objects for storing all the character data
+in your program, there is really nothing special to do. However, you should be
+aware of the potential problems covered by the following section.
+
+
+@subsection overview_unicode_support_utf Choosing Unicode Representation
+
+wxWidgets uses the system @c wchar_t in the wxString implementation by default
+under all systems. Thus, under Microsoft Windows, UCS-2 (a simplified version of
+UTF-16 without support for surrogate characters) is used, as @c wchar_t is 2
+bytes on this platform. Under Unix systems, including Mac OS X, UCS-4 (also
+known as UTF-32) is used by default. However, it is also possible to build
+wxWidgets to use UTF-8 internally by passing the @c --enable-utf8 option to
+configure.
+
+The interface provided by wxString is the same regardless of the format used
+internally. However, different formats have specific advantages and
+disadvantages. Notably, under Unix, the underlying graphical toolkit (e.g.
+GTK+) usually uses UTF-8 encoded strings, and using the same representation for
+the strings in wxWidgets avoids converting between UTF-32 and UTF-8 each time a
+string is shown in the UI or retrieved from it. The overhead of such
+conversions is usually negligible for small strings but may be significant for
+some programs. If you believe that it would be advantageous to use UTF-8 for
+the strings in your particular application, you may rebuild wxWidgets to use
+UTF-8 as explained above. Notice that this is currently not supported under
+Microsoft Windows and arguably doesn't make much sense there, as Windows itself
+uses UTF-16 and not UTF-8. Before doing this, make sure you understand the
+performance implications of using UTF-8 in wxString (see
+@ref overview_unicode_performance).
+
+Generally speaking, you should only use the non-default UTF-8 build in specific
+circumstances, e.g. when building for resource-constrained systems where the
+overhead of conversions (and also the reduced memory usage of UTF-8 compared to
+UTF-32 for the European languages) can be important. If the environment in
+which your program is running is under your control -- as is quite often the
+case in such scenarios -- consider ensuring that the system always uses a UTF-8
+locale and use the @c --enable-utf8only configure option to disable support for
+the other locales and consider all strings to be in UTF-8. This further reduces
+the code size and removes the need for conversions in more cases.
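+
+To make the memory argument concrete, here is a minimal sketch in standard C++
+(it does not use wxWidgets) that compares the storage needed for the same text
+in UTF-8 and UTF-32. The names @c Utf8Length, @c Utf8Size and @c Utf32Size are
+hypothetical helpers introduced only for this illustration.
+
```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes needed to encode one Unicode code point in UTF-8.
// Precondition (assumed, not validated beyond the assert): cp is a valid
// code point, i.e. no larger than U+10FFFF.
inline std::size_t Utf8Length(char32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80)    return 1;   // ASCII
    if (cp < 0x800)   return 2;   // accented Latin letters, Greek, Cyrillic
    if (cp < 0x10000) return 3;   // rest of the BMP, e.g. CJK
    return 4;                     // supplementary planes
}

// Bytes needed to store the text as UTF-8 (without a terminator).
inline std::size_t Utf8Size(const std::u32string& text)
{
    std::size_t bytes = 0;
    for (char32_t cp : text)
        bytes += Utf8Length(cp);
    return bytes;
}

// Bytes needed to store the same text as UTF-32 (without a terminator).
inline std::size_t Utf32Size(const std::u32string& text)
{
    return text.size() * sizeof(char32_t);
}
```
+
+For mostly-ASCII European text the UTF-8 size stays close to one byte per
+character, while text dominated by characters above U+07FF needs three or four
+bytes per character and gains much less.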
+
+
+@subsection overview_unicode_settings Unicode Related Preprocessor Symbols
+
+@c wxUSE_UNICODE is now defined as 1 to indicate Unicode support. It can be
+explicitly set to 0 in @c setup.h under MSW, or you can use @c --disable-unicode
+under Unix, but doing this is strongly discouraged. By default,
+@c wxUSE_UNICODE_WCHAR is also defined as 1. In the UTF-8 build (described in
+the previous section), however, it is set to 0 and @c wxUSE_UNICODE_UTF8, which
+is usually 0, is set to 1 instead. In the latter case, @c wxUSE_UTF8_LOCALE_ONLY
+can also be set to 1 to indicate that all strings are considered to be in UTF-8.
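+
+As a rough illustration, code that needs to know which representation is in
+use could test these symbols as sketched below. The fallback definitions only
+exist so that the fragment compiles outside of a wxWidgets build (they mimic
+the default configuration), and @c DescribeStringBuild is a hypothetical helper.
+
```cpp
#include <cassert>
#include <cstring>

// Stand-in values, used only when the wxWidgets headers were not included.
// They reproduce the default (wchar_t-based) configuration described above.
#ifndef wxUSE_UNICODE_WCHAR
    #define wxUSE_UNICODE_WCHAR 1
#endif
#ifndef wxUSE_UNICODE_UTF8
    #define wxUSE_UNICODE_UTF8 0
#endif
#ifndef wxUSE_UTF8_LOCALE_ONLY
    #define wxUSE_UTF8_LOCALE_ONLY 0
#endif

// Returns a short description of the internal string representation
// selected at compile time.
inline const char* DescribeStringBuild()
{
#if wxUSE_UNICODE_UTF8
    #if wxUSE_UTF8_LOCALE_ONLY
        return "UTF-8 (UTF-8 locales only)";
    #else
        return "UTF-8";
    #endif
#elif wxUSE_UNICODE_WCHAR
    return "wchar_t";
#else
    return "non-Unicode";
#endif
}
```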
+
+
+
+@section overview_unicode_pitfalls Potential Unicode Pitfalls
+
+The problems can be separated into three broad classes:
+
+@subsection overview_unicode_compilation_errors Unicode-Related Compilation Errors
+
+Because of the need to support implicit conversions to both @c char and
+@c wchar_t, the wxString implementation is rather involved and many of its
+operators don't return the types which they could be naively expected to return.
+For example, @c operator[] returns neither a @c char nor a @c wchar_t
+but an object of a helper class, wxUniChar or wxUniCharRef, which is implicitly
+convertible to either. Usually you don't need to worry about this as the
+conversions do their work behind the scenes, but in some cases they don't
+work. Here are some examples, using a wxString object @c s and some integer
+@c n:
+
+ - Writing @code switch ( s[n] ) @endcode doesn't work because the argument of
+ the switch statement must be an integer expression, so you need to replace
+ @c s[n] with @code s[n].GetValue() @endcode. You may also force the
+ conversion to @c char or @c wchar_t by using an explicit cast, but beware that
+ converting the value to @c char uses the conversion to the current locale
+ encoding and may return 0 if it fails. Finally, notice that writing
+ @code (wxChar)s[n] @endcode works both with wxWidgets 3.0 and previous library
+ versions and so should be used for writing code which should be compatible
+ with both 2.8 and 3.0.
+
+ - Similarly, @code &s[n] @endcode doesn't yield a pointer to char so you may
+ not pass it to functions expecting @c char* or @c wchar_t*. Consider using
+ string iterators instead if possible or replace this expression with
+ @code s.c_str() + n @endcode otherwise.
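+
+The @c switch problem can be reproduced without wxWidgets. The sketch below
+uses a hypothetical class @c UniCharSketch as a simplified stand-in for
+wxUniChar (the real class is more elaborate): it converts implicitly to both
+@c char and @c wchar_t, so the compiler cannot pick a single integral type for
+a @c switch condition, while an explicit accessor works.
+
```cpp
#include <cassert>

// Hypothetical, simplified stand-in for wxUniChar: it stores a code point
// and converts implicitly to both char and wchar_t.
class UniCharSketch
{
public:
    explicit UniCharSketch(unsigned long value) : m_value(value) {}

    // Explicit accessor, playing the role of GetValue() in the text above.
    unsigned long GetValue() const { return m_value; }

    // Simplification: anything outside ASCII is treated as not representable
    // in the narrow encoding and yields 0.
    operator char() const
    {
        return m_value < 0x80 ? static_cast<char>(m_value) : '\0';
    }

    operator wchar_t() const { return static_cast<wchar_t>(m_value); }

private:
    unsigned long m_value;
};

inline bool IsVowel(UniCharSketch ch)
{
    // "switch ( ch )" would not compile here: a class-type condition must
    // convert to exactly one integral type, and both char and wchar_t qualify.
    switch ( ch.GetValue() )
    {
        case 'a': case 'e': case 'i': case 'o': case 'u':
            return true;
        default:
            return false;
    }
}
```
+
+An explicit cast such as @c static_cast<char>(ch) also compiles, because the
+requested target type then selects one of the two conversions.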
+
+Another class of problems is related to the fact that the value returned by
+@c c_str() itself is also not just a pointer to a buffer but a value of the
+helper class wxCStrData, which is implicitly convertible to both narrow and wide
+strings. Again, this is mostly unnoticeable but can result in some
+problems:
+
+ - You shouldn't pass @c c_str() result to vararg functions such as standard
+ @c printf(). Some compilers (notably g++) warn about this but even if they
+ don't, this @code printf("Hello, %s", s.c_str()) @endcode is not going to
+ work. It can be corrected in one of the following ways:
+
+ - Preferred: @code wxPrintf("Hello, %s", s) @endcode (notice the absence
+ of @c c_str(), it is not needed at all with wxWidgets functions)
+ - Compatible with wxWidgets 2.8: @code wxPrintf("Hello, %s", s.c_str()) @endcode
+ - Using an explicit conversion to narrow, multibyte, string:
+ @code printf("Hello, %s", (const char *)s.mb_str()) @endcode
+ - Using a cast to force the issue (listed only for completeness):
+ @code printf("Hello, %s", (const char *)s.c_str()) @endcode
+
+ - The result of @c c_str() cannot be cast to @c char* but only to
+ @c const @c char*. Of course, modifying the string via the pointer returned
+ by this method has never been possible, but unfortunately it was occasionally
+ useful to use a @c const_cast here to pass the value to const-incorrect
+ functions. This can be done either by using the new wxString::char_str() (and
+ matching wchar_str()) method or by writing a double cast:
+ @code (char *)(const char *)s.c_str() @endcode
+
+ - One of the unfortunate consequences of the possibility to pass wxString to
+ @c wxPrintf() without using @c c_str() is that it is now impossible to pass
+ the elements of unnamed enumerations to @c wxPrintf() and other similar
+ vararg functions, i.e.
+ @code
+ enum { Red, Green, Blue };
+ wxPrintf("Red is %d", Red);
+ @endcode
+ doesn't compile. The easiest workaround is to give a name to the enum.
+
+Other unexpected compilation errors may arise but they should happen even more
+rarely than the above-mentioned ones and the solution should usually be quite
+simple: just use the explicit methods of wxUniChar and wxCStrData classes
+instead of relying on their implicit conversions if the compiler can't choose
+among them.
+
+
+@subsection overview_unicode_data_loss Data Loss Due to Unicode Conversion Errors
+
+The wxString API provides implicit conversion of the internal Unicode string
+contents to narrow @c char strings. This can be very convenient and is
+absolutely necessary for backwards compatibility with the existing code using
+wxWidgets. However, it is a rather dangerous operation as it can easily give
+unexpected results if the string contents aren't convertible to the current
+locale encoding.
+
+To be precise, the conversion will always succeed if the string was created
+from a narrow string initially. It will also succeed if the current encoding is
+UTF-8, as all Unicode strings are representable in this encoding. However,
+initializing the string using the wxString::FromUTF8() method and then accessing
+it as a @c char string via its wxString::c_str() method is a recipe for
+disaster: the program may work perfectly well during testing on Unix systems
+using a UTF-8 locale but completely fail under Windows, where UTF-8 locales are
+never used, because there wxString::c_str() would return an empty string.
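+
+The same failure mode can be observed with the standard C library alone. The
+sketch below is not how wxString converts internally; it only shows, using
+@c std::wcstombs(), that a wide string may have no representation in the
+current locale encoding. @c ToLocaleEncoding is a hypothetical helper, and the
+example assumes the program still runs in the default "C" locale, which in
+common C libraries cannot represent non-Latin characters such as U+0416.
+
```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Converts a wide string to the narrow encoding of the current C locale.
// Returns false, leaving 'narrow' untouched, if some character cannot be
// represented in that encoding.
inline bool ToLocaleEncoding(const std::wstring& wide, std::string& narrow)
{
    // Worst case: every wide character needs MB_CUR_MAX bytes, plus the
    // terminating null written by wcstombs().
    std::string buffer(wide.size() * MB_CUR_MAX + 1, '\0');

    const std::size_t written =
        std::wcstombs(&buffer[0], wide.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return false;   // at least one character is not representable

    buffer.resize(written);
    narrow.swap(buffer);
    return true;
}
```
+
+Pure ASCII text converts in any locale, which is why such bugs tend to stay
+hidden until the program meets real-world data in a non-UTF-8 locale.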
+
+The simplest way to ensure that this doesn't happen is to avoid conversions to
+@c char* completely by using wxString throughout your program. However, if the
+program never manipulates 8 bit strings internally, using @c char* pointers is
+safe as well. So the existing code needs to be reviewed when upgrading to
+wxWidgets 3.0, and new code should be written with this in mind, ideally
+avoiding implicit conversions to @c char*.
+
+
+@subsection overview_unicode_performance Performance Implications of Using UTF-8
+
+As mentioned above, under Unix systems the wxString class can use the
+variable-width UTF-8 encoding for its internal representation. In this case it
+can no longer guarantee constant-time access to the N-th element of the string,
+because finding the position of this character in the string requires examining
+all the preceding ones. Usually this doesn't matter much because most
+algorithms used on strings examine them sequentially anyhow and because
+wxString implements a cache for iterating over the string by index. It can,
+however, have serious consequences for algorithms using random access to string
+elements, as they typically acquire O(N^2) time complexity instead of O(N),
+where N is the length of the string.
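+
+The cost difference can be sketched with a plain UTF-8 @c std::string. This is
+an illustration only: it assumes well-formed UTF-8, does not model the index
+cache mentioned above, and all function names are hypothetical.
+
```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if the byte starts a code point, i.e. is not a 10xxxxxx continuation.
inline bool IsLeadByte(unsigned char b) { return (b & 0xC0) != 0x80; }

// Number of code points in the string.
inline std::size_t CodePointCount(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char b : utf8)
        if (IsLeadByte(b))
            ++count;
    return count;
}

// Byte offset of the n-th code point. All preceding characters have to be
// skipped one by one, so a single lookup takes O(n) time.
inline std::size_t OffsetOfNth(const std::string& utf8, std::size_t n)
{
    std::size_t pos = 0;
    while (n > 0 && pos < utf8.size()) {
        ++pos;                                          // skip the lead byte
        while (pos < utf8.size() && !IsLeadByte(utf8[pos]))
            ++pos;                                      // skip continuations
        --n;
    }
    return pos;
}

// Counts an ASCII character using indexed access: every lookup rescans the
// string from the start, giving O(N^2) overall.
inline std::size_t CountByIndex(const std::string& utf8, char ascii)
{
    std::size_t hits = 0;
    const std::size_t length = CodePointCount(utf8);
    for (std::size_t i = 0; i < length; ++i)
        if (utf8[OffsetOfNth(utf8, i)] == ascii)
            ++hits;
    return hits;
}

// Same result with a single sequential pass: O(N).
inline std::size_t CountSequentially(const std::string& utf8, char ascii)
{
    std::size_t hits = 0;
    for (std::size_t pos = 0; pos < utf8.size(); ) {
        if (utf8[pos] == ascii)
            ++hits;
        ++pos;
        while (pos < utf8.size() && !IsLeadByte(utf8[pos]))
            ++pos;
    }
    return hits;
}
```
+
+Both functions return the same count; only the amount of rescanning differs,
+and the gap grows with the length of the string.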
+
+Even though the index is cached, indexed access should be replaced with
+sequential access using string iterators. For example, a typical loop: