Debugging Your Parser
* Understanding:: Understanding the structure of your parser.
+* Graphviz:: Getting a visual representation of the parser.
* Tracing:: Tracing the execution of your parser.
Tracing Your Parser
@findex %define api.location.type
@itemize @bullet
-@item Language(s): C++
+@item Language(s): C++, Java
@item Purpose: Define the location type.
@xref{User Defined Location Type}.
+@c ================================================== api.token.constructor
+@item api.token.constructor
+@findex %define api.token.constructor
+
+@itemize @bullet
+@item Language(s):
+C++
+
+@item Purpose:
+When variant-based semantic values are enabled (@pxref{C++ Variants}),
+request that symbols be handled as a whole (type, value, and possibly
+location) in the scanner. @xref{Complete Symbols}, for details.
+
+@item Accepted Values:
+Boolean.
+
+@item Default Value:
+@code{false}
+@item History:
+introduced in Bison 2.8
+@end itemize
+@c api.token.constructor
+
+
@c ================================================== api.token.prefix
@item api.token.prefix
@findex %define api.token.prefix
@c api.token.prefix
-@c ================================================== lex_symbol
-@item lex_symbol
-@findex %define lex_symbol
+@c ================================================== lr.default-reduction
-@itemize @bullet
-@item Language(s):
-C++
-
-@item Purpose:
-When variant-based semantic values are enabled (@pxref{C++ Variants}),
-request that symbols be handled as a whole (type, value, and possibly
-location) in the scanner. @xref{Complete Symbols}, for details.
-
-@item Accepted Values:
-Boolean.
-
-@item Default Value:
-@code{false}
-@end itemize
-@c lex_symbol
-
-
-@c ================================================== lr.default-reductions
-
-@item lr.default-reductions
-@findex %define lr.default-reductions
+@item lr.default-reduction
+@findex %define lr.default-reduction
@itemize @bullet
@item Language(s): all
@item @code{accepting} if @code{lr.type} is @code{canonical-lr}.
@item @code{most} otherwise.
@end itemize
+@item History:
+introduced as @code{lr.default-reduction} in 2.5, renamed as
+@code{lr.default-reduction} in 2.8.
@end itemize
-@c ============================================ lr.keep-unreachable-states
+@c ============================================ lr.keep-unreachable-state
-@item lr.keep-unreachable-states
-@findex %define lr.keep-unreachable-states
+@item lr.keep-unreachable-state
+@findex %define lr.keep-unreachable-state
@itemize @bullet
@item Language(s): all
@item Accepted Values: Boolean
@item Default Value: @code{false}
@end itemize
-@c lr.keep-unreachable-states
+introduced as @code{lr.keep_unreachable_states} in 2.3b, renamed as
+@code{lr.keep-unreachable-state} in 2.5, and as
+@code{lr.keep-unreachable-state} in 2.8.
+@c lr.keep-unreachable-state
@c ================================================== lr.type
@node Default Reductions
@subsection Default Reductions
@cindex default reductions
-@findex %define lr.default-reductions
+@findex %define lr.default-reduction
@findex %nonassoc
After parser table construction, Bison identifies the reduction with the
split the parse instead.
To adjust which states have default reductions enabled, use the
-@code{%define lr.default-reductions} directive.
+@code{%define lr.default-reduction} directive.
-@deffn {Directive} {%define lr.default-reductions @var{WHERE}}
+@deffn {Directive} {%define lr.default-reduction @var{WHERE}}
Specify the kind of states that are permitted to contain default reductions.
The accepted values of @var{WHERE} are:
@itemize
@node Unreachable States
@subsection Unreachable States
-@findex %define lr.keep-unreachable-states
+@findex %define lr.keep-unreachable-state
@cindex unreachable states
If there exists no sequence of transitions from the parser's start state to
keeping unreachable states is sometimes useful when trying to understand the
relationship between the parser and the grammar.
-@deffn {Directive} {%define lr.keep-unreachable-states @var{VALUE}}
+@deffn {Directive} {%define lr.keep-unreachable-state @var{VALUE}}
Request that Bison allow unreachable states to remain in the parser tables.
@var{VALUE} must be a Boolean. The default is @code{false}.
@end deffn
@menu
* Understanding:: Understanding the structure of your parser.
+* Graphviz:: Getting a visual representation of the parser.
* Tracing:: Tracing the execution of your parser.
@end menu
@samp{*}, but also because the
associativity of @samp{/} is not specified.
+@c ================================================= Graphical Representation
+
+@node Graphviz
+@section Visualizing Your Parser
+@cindex dot
+
+As another means to gain better understanding of the shift/reduce
+automaton corresponding to the Bison parser, a DOT file can be generated. Note
+that debugging a real grammar with this is tedious at best, and impractical
+most of the times, because the generated files are huge (the generation of
+a PDF or PNG file from it will take very long, and more often than not it will
+fail due to memory exhaustion). This option was rather designed for beginners,
+to help them understand LR parsers.
+
+This file is generated when the @option{--graph} option is specified (see
+@pxref{Invocation, , Invoking Bison}). Its name is made by removing
+@samp{.tab.c} or @samp{.c} from the parser implementation file name, and
+adding @samp{.dot} instead. If the grammar file is @file{foo.y}, the
+Graphviz output file is called @file{foo.dot}.
+
+The following grammar file, @file{rr.y}, will be used in the sequel:
+
+@example
+%%
+@group
+exp: a ";" | b ".";
+a: "0";
+b: "0";
+@end group
+@end example
+
+The graphical output is very similar to the textual one, and as such it is
+easier understood by making direct comparisons between them. See
+@ref{Debugging, , Debugging Your Parser} for a detailled analysis of the
+textual report.
+
+@subheading Graphical Representation of States
+
+The items (pointed rules) for each state are grouped together in graph nodes.
+Their numbering is the same as in the verbose file. See the following points,
+about transitions, for examples
+
+When invoked with @option{--report=lookaheads}, the lookahead tokens, when
+needed, are shown next to the relevant rule between square brackets as a
+comma separated list. This is the case in the figure for the representation of
+reductions, below.
+
+@sp 1
+
+The transitions are represented as directed edges between the current and
+the target states.
+
+@subheading Graphical Representation of Shifts
+
+Shifts are shown as solid arrows, labelled with the lookahead token for that
+shift. The following describes a reduction in the @file{rr.output} file:
+
+@example
+@group
+state 3
+
+ 1 exp: a . ";"
+
+ ";" shift, and go to state 6
+@end group
+@end example
+
+A Graphviz rendering of this portion of the graph could be:
+
+@center @image{figs/example-shift, 100pt}
+
+@subheading Graphical Representation of Reductions
+
+Reductions are shown as solid arrows, leading to a diamond-shaped node
+bearing the number of the reduction rule. The arrow is labelled with the
+appropriate comma separated lookahead tokens. If the reduction is the default
+action for the given state, there is no such label.
+
+This is how reductions are represented in the verbose file @file{rr.output}:
+@example
+state 1
+
+ 3 a: "0" . [";"]
+ 4 b: "0" . ["."]
+
+ "." reduce using rule 4 (b)
+ $default reduce using rule 3 (a)
+@end example
+
+A Graphviz rendering of this portion of the graph could be:
+
+@center @image{figs/example-reduce, 120pt}
+
+When unresolved conflicts are present, because in deterministic parsing
+a single decision can be made, Bison can arbitrarily choose to disable a
+reduction, see @ref{Shift/Reduce, , Shift/Reduce Conflicts}. Discarded actions
+are distinguished by a red filling color on these nodes, just like how they are
+reported between square brackets in the verbose file.
+
+The reduction corresponding to the rule number 0 is the acceptation state. It
+is shown as a blue diamond, labelled "Acc".
+
+@subheading Graphical representation of go tos
+
+The @samp{go to} jump transitions are represented as dotted lines bearing
+the name of the rule being jumped to.
+
+@c ================================================= Tracing
@node Tracing
@section Tracing Your Parser
files, reused by other parsers as follows:
@example
-%define location_type "master::location"
+%define api.location.type "master::location"
%code requires @{ #include <master/location.hh> @}
@end example
@node Complete Symbols
@subsubsection Complete Symbols
-If you specified both @code{%define variant} and @code{%define lex_symbol},
+If you specified both @code{%define variant} and
+@code{%define api.token.constructor},
the @code{parser} class also defines the class @code{parser::symbol_type}
which defines a @emph{complete} symbol, aggregating its type (i.e., the
traditional value returned by @code{yylex}), its semantic value (i.e., the
@end example
@noindent
+@findex %define api.token.constructor
@findex %define variant
-@findex %define lex_symbol
This example will use genuine C++ objects as semantic values, therefore, we
require the variant-based interface. To make sure we properly use it, we
enable assertions. To fully benefit from type-safety and more natural
-definition of ``symbol'', we enable @code{lex_symbol}.
+definition of ``symbol'', we enable @code{api.token.constructor}.
@comment file: calc++-parser.yy
@example
-%define variant
+%define api.token.constructor
%define parse.assert
-%define lex_symbol
+%define variant
@end example
@noindent
defines a class representing a @dfn{location}, a range composed of a pair of
positions (possibly spanning several files). The location class is an inner
class of the parser; the name is @code{Location} by default, and may also be
-renamed using @samp{%define location_type "@var{class-name}"}.
+renamed using @code{%define api.location.type "@var{class-name}"}.
The location class treats the position as a completely opaque value.
By default, the class name is @code{Position}, but this can be changed
-with @samp{%define position_type "@var{class-name}"}. This class must
+with @code{%define api.position.type "@var{class-name}"}. This class must
be supplied by the user.
@deftypemethod {Lexer} {void} yyerror (Location @var{loc}, String @var{msg})
This method is defined by the user to emit an error message. The first
parameter is omitted if location tracking is not active. Its type can be
-changed using @samp{%define location_type "@var{class-name}".}
+changed using @code{%define api.location.type "@var{class-name}".}
@end deftypemethod
@deftypemethod {Lexer} {int} yylex ()
@code{yylex} returned, and the first position beyond it. These
methods are not needed unless location tracking is active.
-The return type can be changed using @samp{%define position_type
+The return type can be changed using @code{%define api.position.type
"@var{class-name}".}
@end deftypemethod
@xref{Java Scanner Interface}.
@end deffn
-@deffn {Directive} {%define location_type} "@var{class}"
+@deffn {Directive} {%define api.location.type} "@var{class}"
The name of the class used for locations (a range between two
positions). This class is generated as an inner class of the parser
class by @command{bison}. Default is @code{Location}.
+Formerly named @code{location_type}.
@xref{Java Location Values}.
@end deffn
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define position_type} "@var{class}"
+@deffn {Directive} {%define api.position.type} "@var{class}"
The name of the class used for positions. This class must be supplied by
the user. Default is @code{Position}.
+Formerly named @code{position_type}.
@xref{Java Location Values}.
@end deffn
@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett LALR's
@c LocalWords: subdirectory Solaris nonassociativity perror schemas Malloy ints
@c LocalWords: Scannerless ispell american ChangeLog smallexample CSTYPE CLTYPE
-@c LocalWords: clval CDEBUG cdebug deftypeopx yyterminate
+@c LocalWords: clval CDEBUG cdebug deftypeopx yyterminate LocationType
+@c LocalWords: errorVerbose
@c Local Variables:
@c ispell-dictionary: "american"