\section{\class{wxStringTokenizer}}\label{wxstringtokenizer}

wxStringTokenizer helps you to break a string up into a number of tokens. It
replaces the standard C function {\tt strtok()} and also extends it in a
number of ways.

To use this class, you should create a wxStringTokenizer object, give it the
string to tokenize and also the delimiters which separate tokens in the string
(by default, white space characters will be used).

Then \helpref{GetNextToken}{wxstringtokenizergetnexttoken} may be called
repeatedly until \helpref{HasMoreTokens}{wxstringtokenizerhasmoretokens}
returns \false.

For example:

\begin{verbatim}

wxStringTokenizer tkz(wxT("first:second:third:fourth"), wxT(":"));
while ( tkz.HasMoreTokens() )
{
    wxString token = tkz.GetNextToken();

    // process token here
}
\end{verbatim}

By default, wxStringTokenizer will behave in the same way as {\tt strtok()} if
the delimiter string contains only white space characters but, unlike the
standard function, it will return empty tokens if this is not the case. This
is helpful for parsing strictly formatted data where the number of fields is
fixed but some of them may be empty (e.g. {\tt TAB} or comma delimited text
files).
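
As an illustration of the default mode, the following sketch counts the tokens
found in a string for a given set of delimiters. The helper name
\texttt{CountFields} is made up for this example and is not part of wxWidgets;
the results in the trailing comment follow from the rules described above:

\begin{verbatim}
#include <wx/tokenzr.h>

// Hypothetical helper: counts the tokens of str using the default mode
// (the mode argument is left at wxTOKEN_DEFAULT).
static size_t CountFields(const wxString& str, const wxString& delims)
{
    wxStringTokenizer tkz(str, delims);

    size_t count = 0;
    while ( tkz.HasMoreTokens() )
    {
        tkz.GetNextToken();
        count++;
    }

    return count;
}

// CountFields(wxT("1,,3"), wxT(",")) is 3: "1", "" and "3" (the empty
//     field is returned because "," is not a white space character)
// CountFields(wxT("1  3"), wxT(" ")) is 2: "1" and "3", as with strtok()
\end{verbatim}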

The behaviour is governed by the last 
\helpref{constructor}{wxstringtokenizerwxstringtokenizer}/\helpref{SetString}{wxstringtokenizersetstring} 
parameter {\tt mode} which may be one of the following:

\twocolwidtha{5cm}%
\begin{twocollist}\itemsep=0pt
\twocolitem{{\tt wxTOKEN\_DEFAULT}}{Default behaviour (as described above):
same as {\tt wxTOKEN\_STRTOK} if the delimiter string contains only
white space characters, same as {\tt wxTOKEN\_RET\_EMPTY} otherwise.}
\twocolitem{{\tt wxTOKEN\_RET\_EMPTY}}{In this mode, the empty tokens in the
middle of the string will be returned, e.g. {\tt "a::b:"} will be tokenized
into three tokens: `a', `' and `b'. Notice that all trailing delimiters are
ignored in this mode, not just the last one, so the string \texttt{"a::b::"}
would still result in the same set of tokens.}
\twocolitem{{\tt wxTOKEN\_RET\_EMPTY\_ALL}}{In this mode, empty trailing tokens
(including the one after the last delimiter character) will be returned as
well. The string \texttt{"a::b:"} will be tokenized into four tokens: `a', `',
`b' and a final empty one, while the string \texttt{"a::b::"} will yield
five tokens.}
\twocolitem{{\tt wxTOKEN\_RET\_DELIMS}}{In this mode, the delimiter character
after the end of the current token (there may be none if this is the last
token) is returned appended to the token. Otherwise, this mode is the same as
\texttt{wxTOKEN\_RET\_EMPTY}. Notice that there is no mode which returns the
delimiters but behaves like \texttt{wxTOKEN\_RET\_EMPTY\_ALL} instead of
\texttt{wxTOKEN\_RET\_EMPTY}; use \texttt{wxTOKEN\_RET\_EMPTY\_ALL} and
\helpref{GetLastDelimiter()}{wxstringtokenizergetlastdelimiter} to emulate it.}
\twocolitem{{\tt wxTOKEN\_STRTOK}}{In this mode the class behaves exactly like
the standard {\tt strtok()} function: the empty tokens are never returned.}
\end{twocollist}
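
The differences between the modes can be checked with a small helper such as
the one sketched below. \texttt{SplitOnColons} is a hypothetical name used only
for this illustration; the token lists in the trailing comment restate the
descriptions above for the string \texttt{"a::b:"}:

\begin{verbatim}
#include <wx/arrstr.h>
#include <wx/tokenzr.h>

// Hypothetical helper (not part of wxWidgets): collects all tokens of
// str produced in the given mode, using ":" as the only delimiter.
static wxArrayString SplitOnColons(const wxString& str,
                                   wxStringTokenizerMode mode)
{
    wxArrayString tokens;

    wxStringTokenizer tkz(str, wxT(":"), mode);
    while ( tkz.HasMoreTokens() )
        tokens.Add(tkz.GetNextToken());

    return tokens;
}

// Tokens expected for SplitOnColons(wxT("a::b:"), mode):
//
//   wxTOKEN_STRTOK          "a", "b"
//   wxTOKEN_RET_EMPTY       "a", "", "b"
//   wxTOKEN_RET_EMPTY_ALL   "a", "", "b", ""
//   wxTOKEN_RET_DELIMS      "a:", ":", "b:"
\end{verbatim}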

\wxheading{Derived from}

\helpref{wxObject}{wxobject}

\wxheading{See also}

\helpref{wxStringTokenize}{wxstringtokenize}

\wxheading{Include files}

<wx/tokenzr.h>

\wxheading{Library}

\helpref{wxBase}{librarieslist}

\latexignore{\rtfignore{\wxheading{Members}}}


\membersection{wxStringTokenizer::wxStringTokenizer}\label{wxstringtokenizerwxstringtokenizer}

\func{}{wxStringTokenizer}{\void}

Default constructor. You must call 
\helpref{SetString}{wxstringtokenizersetstring} before calling any other
methods.

\func{}{wxStringTokenizer}{\param{const wxString\& }{str}, \param{const wxString\& }{delims = " $\backslash$t$\backslash$r$\backslash$n"}, \param{wxStringTokenizerMode }{mode = wxTOKEN\_DEFAULT}}

Constructor. Pass the string to tokenize, a string containing delimiters
and the mode specifying how the string should be tokenized.


\membersection{wxStringTokenizer::CountTokens}\label{wxstringtokenizercounttokens}

\constfunc{int}{CountTokens}{\void}

Returns the number of tokens remaining in the input string. This number is
decremented each time
\helpref{GetNextToken}{wxstringtokenizergetnexttoken} is called and, when it
reaches $0$, \helpref{HasMoreTokens}{wxstringtokenizerhasmoretokens} returns
\false.
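
A minimal sketch of how the count evolves while tokens are being extracted
(the function name \texttt{CountTokensExample} is only illustrative):

\begin{verbatim}
#include <wx/debug.h>
#include <wx/tokenzr.h>

// Illustrative function, not part of wxWidgets.
void CountTokensExample()
{
    wxStringTokenizer tkz(wxT("first:second:third:fourth"), wxT(":"));

    // Nothing has been extracted yet, so all four tokens remain.
    wxASSERT( tkz.CountTokens() == 4 );

    tkz.GetNextToken();     // extracts "first"
    wxASSERT( tkz.CountTokens() == 3 );

    // Extract the remaining tokens.
    while ( tkz.HasMoreTokens() )
        tkz.GetNextToken();

    wxASSERT( tkz.CountTokens() == 0 );
}
\end{verbatim}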


\membersection{wxStringTokenizer::HasMoreTokens}\label{wxstringtokenizerhasmoretokens}

\constfunc{bool}{HasMoreTokens}{\void}

Returns \true if the tokenizer has further tokens, \false if none are left.


\membersection{wxStringTokenizer::GetLastDelimiter}\label{wxstringtokenizergetlastdelimiter}

\func{wxChar}{GetLastDelimiter}{\void}

Returns the delimiter which ended the scan for the last token returned by
\helpref{GetNextToken()}{wxstringtokenizergetnexttoken}, or \texttt{NUL} if
\texttt{GetNextToken()} has not been called yet or if it returned the trailing
empty token in \texttt{wxTOKEN\_RET\_EMPTY\_ALL} mode.
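
This function can be used to emulate a variant of
\texttt{wxTOKEN\_RET\_DELIMS} which also returns the trailing empty token. The
following is only a sketch of that idea; \texttt{SplitKeepingDelims} is a
hypothetical helper name, not a wxWidgets function:

\begin{verbatim}
#include <wx/arrstr.h>
#include <wx/tokenzr.h>

// Hypothetical helper: returns the tokens of str with their terminating
// delimiter appended, including the trailing empty token.
static wxArrayString SplitKeepingDelims(const wxString& str,
                                        const wxString& delims)
{
    wxArrayString tokens;

    wxStringTokenizer tkz(str, delims, wxTOKEN_RET_EMPTY_ALL);
    while ( tkz.HasMoreTokens() )
    {
        wxString token = tkz.GetNextToken();

        // NUL means that the token was not followed by a delimiter.
        const wxChar delim = tkz.GetLastDelimiter();
        if ( delim != wxT('\0') )
            token += delim;

        tokens.Add(token);
    }

    return tokens;
}

// SplitKeepingDelims(wxT("a::b:"), wxT(":")) yields "a:", ":", "b:", ""
\end{verbatim}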

\newsince{2.7.0}


\membersection{wxStringTokenizer::GetNextToken}\label{wxstringtokenizergetnexttoken}

\func{wxString}{GetNextToken}{\void}

Returns the next token, or an empty string if the end of the string was reached.


\membersection{wxStringTokenizer::GetPosition}\label{wxstringtokenizergetposition}

\constfunc{size\_t}{GetPosition}{\void}

Returns the current position (i.e. one index after the last returned
token or 0 if GetNextToken() has never been called) in the original
string.


\membersection{wxStringTokenizer::GetString}\label{wxstringtokenizergetstring}

\constfunc{wxString}{GetString}{\void}

Returns the part of the original string which has not been tokenized yet, i.e.
the string without the tokens already extracted.
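
For instance, the rest of a string after its first field could be obtained
as sketched below. \texttt{RestAfterFirstToken} is a hypothetical helper, and
the result in the comment assumes that the delimiter which ended the extracted
token is skipped together with it:

\begin{verbatim}
#include <wx/tokenzr.h>

// Hypothetical helper: discards the first token of str and returns the
// part of the string which remains to be tokenized.
static wxString RestAfterFirstToken(const wxString& str,
                                    const wxString& delims)
{
    wxStringTokenizer tkz(str, delims);

    if ( tkz.HasMoreTokens() )
        tkz.GetNextToken();

    return tkz.GetString();
}

// RestAfterFirstToken(wxT("first:second:third"), wxT(":")) is expected
// to be "second:third"
\end{verbatim}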


\membersection{wxStringTokenizer::SetString}\label{wxstringtokenizersetstring}

\func{void}{SetString}{\param{const wxString\& }{to\_tokenize}, \param{const wxString\& }{delims = " $\backslash$t$\backslash$r$\backslash$n"}, \param{wxStringTokenizerMode }{mode = wxTOKEN\_DEFAULT}}

Initializes the tokenizer.

Pass the string to tokenize, a string containing delimiters,
and the mode specifying how the string should be tokenized.