% automatically generated by HelpGen from
% htmlparser.tex at 14/Mar/99 20:13:37
\section{\class{wxHtmlParser}}\label{wxhtmlparser}

This class handles {\bf generic} parsing of an HTML document: it scans
the document and divides it into blocks of tags (where one block
consists of a beginning tag, an ending tag and the text between these
two tags).

It is independent of wxHtmlWindow and can be used as a stand-alone parser
(for example for Julian Smart's idea of a speech-only HTML viewer or for a
wget-like utility; see the InetGet sample).

It uses a system of tag handlers to parse the HTML document. Tag handlers
are not statically shared by all instances but are created for each
wxHtmlParser instance. The reason is that a handler may contain
document-specific temporary data used during parsing (e.g. complicated
structures like tables).

Typically the user calls only the \helpref{Parse}{wxhtmlparserparse} method.

\wxheading{Derived from}

\wxheading{See also}

\helpref{Cells Overview}{cells},
\helpref{Tag Handlers Overview}{handlers},
\helpref{wxHtmlTag}{wxhtmltag}

\latexignore{\rtfignore{\wxheading{Members}}}

\membersection{wxHtmlParser::wxHtmlParser}\label{wxhtmlparserwxhtmlparser}

\func{}{wxHtmlParser}{\void}

\membersection{wxHtmlParser::SetFS}\label{wxhtmlparsersetfs}

\func{void}{SetFS}{\param{wxFileSystem}{*fs}}

Sets the virtual file system that will be used to request additional
files. (For example, the {\tt <IMG>} tag handler requests a wxFSFile with the

\membersection{wxHtmlParser::GetFS}\label{wxhtmlparsergetfs}

\constfunc{wxFileSystem*}{GetFS}{\void}

Returns a pointer to the file system. Because each tag handler has a
reference to its parent parser, it can easily request the file by
calling

\begin{verbatim}
wxFSFile *f = m_Parser->GetFS()->OpenFile("image.jpg");
\end{verbatim}

\membersection{wxHtmlParser::Parse}\label{wxhtmlparserparse}

\func{wxObject*}{Parse}{\param{const wxString\&}{source}}

Parses the document. This is the end-user method: simply
call it when you need to obtain the parsed output (which is parser-specific).

The method does these things:

\begin{enumerate}
\item calls \helpref{InitParser(source)}{wxhtmlparserinitparser}
\item calls \helpref{DoParsing}{wxhtmlparserdoparsing}
\item calls \helpref{GetProduct}{wxhtmlparsergetproduct}
\item calls \helpref{DoneParser}{wxhtmlparserdoneparser}
\item returns the value returned by GetProduct
\end{enumerate}

You shouldn't use InitParser, DoParsing, GetProduct or DoneParser directly.

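The call sequence performed by Parse can be sketched in a few lines of
stand-alone C++. This is only an illustration, not the wxHtmlParser source:
{\tt SketchParser} and its {\tt Calls} accessor are hypothetical names, and a
{\tt std::string} stands in for the wxObject-derived product so that the
example compiles without wxWindows.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a parser class; NOT the wxHtmlParser source.
// It records the order in which the four internal steps run.
class SketchParser {
public:
    virtual ~SketchParser() = default;

    // End-user entry point: runs the four steps in the documented order
    // and returns whatever GetProduct() returned.
    std::string Parse(const std::string& source) {
        InitParser(source);
        DoParsing();
        std::string product = GetProduct();
        DoneParser();
        return product;
    }

    // Assumption for illustration only: exposes the recorded call order.
    const std::vector<std::string>& Calls() const { return m_calls; }

protected:
    virtual void InitParser(const std::string& source) {
        m_source = source;
        m_calls.push_back("InitParser");
    }
    void DoParsing() { m_calls.push_back("DoParsing"); }
    virtual std::string GetProduct() {
        m_calls.push_back("GetProduct");
        return "product of: " + m_source;
    }
    virtual void DoneParser() {
        m_calls.push_back("DoneParser");
        m_source.clear();  // the source is no longer needed once parsing is done
    }

private:
    std::string m_source;
    std::vector<std::string> m_calls;
};
```

Because the product is fetched before DoneParser runs, clean-up in DoneParser
cannot invalidate the value that Parse returns in this sketch.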
\membersection{wxHtmlParser::InitParser}\label{wxhtmlparserinitparser}

\func{virtual void}{InitParser}{\param{const wxString\&}{source}}

Sets up the parser for parsing the {\it source} string. (Should be overridden

\membersection{wxHtmlParser::DoneParser}\label{wxhtmlparserdoneparser}

\func{virtual void}{DoneParser}{\void}

This must be called after DoParsing().

\membersection{wxHtmlParser::DoParsing}\label{wxhtmlparserdoparsing}

\func{void}{DoParsing}{\param{int}{begin\_pos}, \param{int}{end\_pos}}

\func{void}{DoParsing}{\void}

Parses m\_Source from begin\_pos to end\_pos-1.
(The version without parameters parses the whole m\_Source.)

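The range convention (from begin\_pos up to and including end\_pos-1) is the
usual half-open interval. A minimal sketch, assuming a plain
{\tt std::string} source: {\tt ParsedRange} is a hypothetical helper that only
shows which characters such a call would cover, and the clamping of
out-of-range positions is an assumption of the sketch, not documented
behaviour.

```cpp
#include <string>

// Hypothetical helper: returns the characters that a call with
// (begin_pos, end_pos) would cover, i.e. positions begin_pos .. end_pos-1.
std::string ParsedRange(const std::string& source, int begin_pos, int end_pos) {
    const int length = static_cast<int>(source.size());
    // Assumption of this sketch: out-of-range positions are clamped.
    if (begin_pos < 0) begin_pos = 0;
    if (end_pos > length) end_pos = length;
    if (begin_pos >= end_pos) return std::string();
    return source.substr(begin_pos, end_pos - begin_pos);
}

// Counterpart of the version without parameters: covers the whole source.
std::string ParsedRange(const std::string& source) {
    return ParsedRange(source, 0, static_cast<int>(source.size()));
}
```

With the source {\tt <p>Hi</p>}, the positions 3 and 5 select exactly the
text between the two tags.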
\membersection{wxHtmlParser::GetProduct}\label{wxhtmlparsergetproduct}

\func{virtual wxObject*}{GetProduct}{\void}

Returns the product of parsing, i.e. the result of parsing
the document. The type of this result depends on the internal
representation in the derived parser (but it must be derived from wxObject!).

See wxHtmlWinParser for details.

\membersection{wxHtmlParser::AddTagHandler}\label{wxhtmlparseraddtaghandler}

\func{virtual void}{AddTagHandler}{\param{wxHtmlTagHandler}{*handler}}

Adds a handler to the internal list (\& hash table) of handlers. This
method should not be called directly by the user but rather by the derived class'

This adds the handler to this {\bf instance} of wxHtmlParser, not to
all objects of this class! (A static front-end to AddTagHandler is provided

All handlers are deleted on object deletion.

\membersection{wxHtmlParser::GetSource}\label{wxhtmlparsergetsource}

\func{wxString*}{GetSource}{\void}

Returns a pointer to the source being parsed.

\membersection{wxHtmlParser::GetTempData}\label{wxhtmlparsergettempdata}

\func{virtual wxList*}{GetTempData}{\void}

This method returns a list of wxObjects that represents
all data allocated by the parser. These can't be freed
by the destructor because they must be valid as long as
GetProduct's return value is valid - the caller must
call

\begin{verbatim}
delete (MyParser->GetTempData());
\end{verbatim}

to free the memory (this method always sets the list to delete its contents).

Why is this necessary? Imagine wxHtmlWinParser: when handling a
FONT tag it creates some fonts. These fonts are then used by wxHtmlWindow
to display the text. But the wxHtmlWinParser object is needed only while parsing
the document - it may be deleted afterwards. The fonts, however, CAN'T be
deleted - they must exist as long as the window is displaying text.

GetTempData() solves the problem.

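The ownership hand-over described above can be sketched without wxWindows.
Everything in this example is hypothetical ({\tt Font}, {\tt Product},
{\tt TempData}, {\tt SketchWinParser}), and a vector of {\tt std::unique\_ptr}
replaces the wxList that the caller deletes; only the idea is the same: the
data outlives the parser, and the caller decides when it is freed.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical resource created while parsing (think of a font).
struct Font { std::string face; };

// Hypothetical product: it refers to the font but does not own it.
struct Product { const Font* font; std::string text; };

// Stand-in for the list returned by GetTempData(); destroying it frees the fonts.
using TempData = std::vector<std::unique_ptr<Font>>;

class SketchWinParser {
public:
    Product Parse(const std::string& text) {
        // The parser allocates data that the product keeps pointing to.
        m_temp.push_back(std::make_unique<Font>(Font{"Helvetica"}));
        return Product{m_temp.back().get(), text};
    }

    // Hands ownership of all temporary data to the caller, so that
    // destroying the parser no longer affects it.
    TempData GetTempData() {
        TempData out;
        out.swap(m_temp);
        return out;
    }

private:
    TempData m_temp;
};
```

In this sketch the caller keeps the returned {\tt TempData} next to the
product and lets both go out of scope together, which plays the role of the
explicit delete in the wxList-based interface.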
\membersection{wxHtmlParser::AddText}\label{wxhtmlparseraddword}

\func{virtual void}{AddText}{\param{const char*}{txt}}

Must be overridden in a derived class.

This method is called by \helpref{DoParsing}{wxhtmlparserdoparsing}
each time a part of the text is parsed. {\it txt} is NOT only one word, it is
a substring of the input. It is not formatted or preprocessed (so white spaces are

\membersection{wxHtmlParser::AddTag}\label{wxhtmlparseraddtag}

\func{void}{AddTag}{\param{const wxHtmlTag\&}{tag}}

This may (but need not) be overridden in a derived class.

This method is called each time a new tag is about to be added.
{\it tag} contains information about the tag. (See \helpref{wxHtmlTag}{wxhtmltag}
for details.)

The default (wxHtmlParser) behaviour is this:
first it finds a handler capable of handling this tag and then it calls
the handler's HandleTag method.
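
The default dispatch (find a handler for the tag, then call its HandleTag
method) can be sketched as follows. This is a simplified model, not the
library code: {\tt SketchTagHandler}, {\tt SketchDispatcher} and
{\tt CountingHandler} are hypothetical, tags are plain strings instead of
wxHtmlTag objects, each handler is assumed to support exactly one tag, and
{\tt AddTag} returns a {\tt bool} here only to make the outcome visible.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified handler interface.
class SketchTagHandler {
public:
    virtual ~SketchTagHandler() = default;
    // Assumption of this sketch: one supported tag name per handler.
    virtual std::string SupportedTag() const = 0;
    virtual bool HandleTag(const std::string& tagName) = 0;
};

// Hypothetical parser-side dispatcher. Handlers belong to this instance
// only and are destroyed together with it.
class SketchDispatcher {
public:
    void AddTagHandler(std::unique_ptr<SketchTagHandler> handler) {
        const std::string name = handler->SupportedTag();
        m_handlers[name] = std::move(handler);
    }

    // Finds a handler capable of handling the tag and calls its HandleTag.
    // Returns false when no handler is registered for the tag.
    bool AddTag(const std::string& tagName) {
        auto it = m_handlers.find(tagName);
        if (it == m_handlers.end()) return false;
        return it->second->HandleTag(tagName);
    }

private:
    std::map<std::string, std::unique_ptr<SketchTagHandler>> m_handlers;
};

// Example handler for the B tag that counts how often it was invoked.
class CountingHandler : public SketchTagHandler {
public:
    explicit CountingHandler(int* counter) : m_counter(counter) {}
    std::string SupportedTag() const override { return "B"; }
    bool HandleTag(const std::string&) override { ++*m_counter; return true; }

private:
    int* m_counter;
};
```

A handler added to one {\tt SketchDispatcher} is not visible to another one,
which mirrors the per-instance handler lists described for AddTagHandler.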