X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/f3ead217b8636f623399e66bd937b1c51774d4af..219458e22f163ede84daa71dfa98bab82a766954:/doc/bison.texi diff --git a/doc/bison.texi b/doc/bison.texi index 28af1ae6..39e84e7c 100644 --- a/doc/bison.texi +++ b/doc/bison.texi @@ -33,7 +33,7 @@ This manual (@value{UPDATED}) is for GNU Bison (version @value{VERSION}), the GNU parser generator. -Copyright @copyright{} 1988-1993, 1995, 1998-2012 Free Software +Copyright @copyright{} 1988-1993, 1995, 1998-2013 Free Software Foundation, Inc. @quotation @@ -211,6 +211,12 @@ Defining Language Semantics This says when, why and how to use the exceptional action in the middle of a rule. +Actions in Mid-Rule + +* Using Mid-Rule Actions:: Putting an action in the middle of a rule. +* Mid-Rule Action Translation:: How mid-rule actions are actually processed. +* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts. + Tracking Locations * Location Type:: Specifying a data type for locations. @@ -1532,6 +1538,7 @@ calculator. As in C, comments are placed between @samp{/*@dots{}*/}. @example /* Reverse polish notation calculator. */ +@group %@{ #define YYSTYPE double #include @@ -1539,6 +1546,7 @@ calculator. As in C, comments are placed between @samp{/*@dots{}*/}. int yylex (void); void yyerror (char const *); %@} +@end group %token NUM @@ -2405,7 +2413,7 @@ Here are the C and Bison declarations for the multi-function calculator. %type exp @group -%right '=' +%precedence '=' %left '-' '+' %left '*' '/' %precedence NEG /* negation--unary minus */ @@ -2791,6 +2799,9 @@ The Bison grammar file conventionally has a name ending in @samp{.y}. @node Grammar Outline @section Outline of a Bison Grammar +@cindex comment +@findex // @dots{} +@findex /* @dots{} */ A Bison grammar file has four main sections, shown here with the appropriate delimiters: @@ -2810,8 +2821,8 @@ appropriate delimiters: @end example Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections. 
-As a GNU extension, @samp{//} introduces a comment that -continues until end of line. +As a GNU extension, @samp{//} introduces a comment that continues until end +of line. @menu * Prologue:: Syntax and usage of the prologue. @@ -2848,21 +2859,27 @@ can be done with two @var{Prologue} blocks, one before and one after the @code{%union} declaration. @example +@group %@{ #define _GNU_SOURCE #include #include "ptypes.h" %@} +@end group +@group %union @{ long int n; tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ @} +@end group +@group %@{ static void print_token_value (FILE *, int, YYSTYPE); #define YYPRINT(F, N, L) print_token_value (F, N, L) %@} +@end group @dots{} @end example @@ -2894,21 +2911,27 @@ location, or it can be one of @code{requires}, @code{provides}, Look again at the example of the previous section: @example +@group %@{ #define _GNU_SOURCE #include #include "ptypes.h" %@} +@end group +@group %union @{ long int n; tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ @} +@end group +@group %@{ static void print_token_value (FILE *, int, YYSTYPE); #define YYPRINT(F, N, L) print_token_value (F, N, L) %@} +@end group @dots{} @end example @@ -2960,16 +2983,20 @@ Let's go ahead and add the new @code{YYLTYPE} definition and the @} YYLTYPE; @} +@group %union @{ long int n; tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ @} +@end group +@group %code @{ static void print_token_value (FILE *, int, YYSTYPE); #define YYPRINT(F, N, L) print_token_value (F, N, L) static void trace_token (enum yytokentype token, YYLTYPE loc); @} +@end group @dots{} @end example @@ -3799,6 +3826,15 @@ Occasionally it is useful to put an action in the middle of a rule. These actions are written just like usual end-of-rule actions, but they are executed before the parser even recognizes the following components. +@menu +* Using Mid-Rule Actions:: Putting an action in the middle of a rule. 
+* Mid-Rule Action Translation:: How mid-rule actions are actually processed. +* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts. +@end menu + +@node Using Mid-Rule Actions +@subsubsection Using Mid-Rule Actions + A mid-rule action may refer to the components preceding it using @code{$@var{n}}, but it may not refer to subsequent components because it is run before they are parsed. @@ -3831,10 +3867,16 @@ remove it afterward. Here is how it is done: @example @group stmt: - LET '(' var ')' - @{ $$ = push_context (); declare_variable ($3); @} + "let" '(' var ')' + @{ + $$ = push_context (); + declare_variable ($3); + @} stmt - @{ $$ = $6; pop_context ($5); @} + @{ + $$ = $6; + pop_context ($5); + @} @end group @end example @@ -3845,8 +3887,27 @@ list of accessible variables) as its semantic value, using alternative @code{context} in the data-type union. Then it calls @code{declare_variable} to add the new variable to that list. Once the first action is finished, the embedded statement @code{stmt} can be -parsed. Note that the mid-rule action is component number 5, so the -@samp{stmt} is component number 6. +parsed. + +Note that the mid-rule action is component number 5, so the @samp{stmt} is +component number 6. Named references can be used to improve the readability +and maintainability (@pxref{Named References}): + +@example +@group +stmt: + "let" '(' var ')' + @{ + $let = push_context (); + declare_variable ($3); + @}[let] + stmt + @{ + $$ = $6; + pop_context ($let); + @} +@end group +@end example After the embedded statement is parsed, its semantic value becomes the value of the entire @code{let}-statement. Then the semantic value from the @@ -3880,13 +3941,13 @@ stmt: let stmt @{ $$ = $2; - pop_context ($1); + pop_context ($let); @}; let: - LET '(' var ')' + "let" '(' var ')' @{ - $$ = push_context (); + $let = push_context (); declare_variable ($3); @}; @@ -3898,6 +3959,76 @@ Note that the action is now at the end of its rule. 
Any mid-rule action can be converted to an end-of-rule action in this way, and this is what Bison actually does to implement mid-rule actions. +@node Mid-Rule Action Translation +@subsubsection Mid-Rule Action Translation +@vindex $@@@var{n} +@vindex @@@var{n} + +As hinted earlier, mid-rule actions are actually transformed into regular +rules and actions. The various reports generated by Bison (textual, +graphical, etc., see @ref{Understanding, , Understanding Your Parser}) +reveal this translation, best explained by means of an example. The +following rule: + +@example +exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @}; +@end example + +@noindent +is translated into: + +@example +$@@1: /* empty */ @{ a(); @}; +$@@2: /* empty */ @{ c(); @}; +$@@3: /* empty */ @{ d(); @}; +exp: $@@1 "b" $@@2 $@@3 "e" @{ f(); @}; +@end example + +@noindent +with new nonterminal symbols @code{$@@@var{n}}, where @var{n} is a number. + +A mid-rule action is expected to generate a value if it uses @code{$$}, or +the (final) action uses @code{$@var{n}} where @var{n} denotes the mid-rule +action. In that case its nonterminal is instead named @code{@@@var{n}}: + +@example +exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @}; +@end example + +@noindent +is translated into: + +@example +@@1: /* empty */ @{ a(); @}; +@@2: /* empty */ @{ $$ = c(); @}; +$@@3: /* empty */ @{ d(); @}; +exp: @@1 "b" @@2 $@@3 "e" @{ f = $1; @}; +@end example + +There are probably two errors in the above example: the first mid-rule +action does not generate a value (it does not use @code{$$} although the +final action uses @code{$1}), and the value of the second one is not used (the +final action does not use @code{$3}). 
Bison reports these errors when the +@code{midrule-value} warnings are enabled (@pxref{Invocation, ,Invoking +Bison}): + +@example +$ bison -fcaret -Wmidrule-value mid.y +@group +mid.y:2.6-13: warning: unset value: $$ + exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @}; + ^^^^^^^^ +@end group +@group +mid.y:2.19-31: warning: unused value: $3 + exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @}; + ^^^^^^^^^^^^^ +@end group +@end example + + +@node Mid-Rule Conflicts +@subsubsection Conflicts due to Mid-Rule Actions Taking action before a rule is completely recognized often leads to conflicts since the parser must commit to a parse in order to execute the action. For example, the following two rules, without mid-rule actions, @@ -3995,6 +4126,7 @@ compound: Now Bison can execute the action in the rule for @code{subroutine} without deciding which rule for @code{compound} it will eventually use. + @node Tracking Locations @section Tracking Locations @cindex location @@ -4963,7 +5095,7 @@ calling convention is used for the lexical analyzer function @code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure Parsers}, for the details of this. The variable @code{yynerrs} becomes local in @code{yyparse} in pull mode but it becomes a member -of yypstate in push mode. (@pxref{Error Reporting, ,The Error +of @code{yypstate} in push mode. (@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}). The convention for calling @code{yyparse} itself is unchanged. @@ -5407,12 +5539,9 @@ values, depend on the selected target language and/or the parser skeleton (@pxref{Decl Summary,,%language}, @pxref{Decl Summary,,%skeleton}). Unaccepted @var{variable}s produce an error. -Some of the accepted @var{variable}s are: +Some of the accepted @var{variable}s are described below. 
-@table @code -@c ================================================== api.namespace -@item api.namespace -@findex %define api.namespace +@deffn Directive {%define api.namespace} "@var{namespace}" @itemize @item Languages(s): C++ @@ -5460,11 +5589,11 @@ lexical analyzer function. For example, if you specify: The parser namespace is @code{foo} and @code{yylex} is referenced as @code{bar::lex}. @end itemize -@c namespace +@end deffn +@c api.namespace @c ================================================== api.location.type -@item @code{api.location.type} -@findex %define api.location.type +@deffn {Directive} {%define api.location.type} @var{type} @itemize @bullet @item Language(s): C++, Java @@ -5476,12 +5605,14 @@ The parser namespace is @code{foo} and @code{yylex} is referenced as @item Default Value: none -@item History: introduced in Bison 2.7 +@item History: +Introduced in Bison 2.7 for C, C++ and Java. Introduced under the name +@code{location_type} for C++ in Bison 2.5 and for Java in Bison 2.4. @end itemize +@end deffn @c ================================================== api.prefix -@item api.prefix -@findex %define api.prefix +@deffn {Directive} {%define api.prefix} @var{prefix} @itemize @bullet @item Language(s): All @@ -5495,10 +5626,10 @@ The parser namespace is @code{foo} and @code{yylex} is referenced as @item History: introduced in Bison 2.6 @end itemize +@end deffn @c ================================================== api.pure -@item api.pure -@findex %define api.pure +@deffn Directive {%define api.pure} @itemize @bullet @item Language(s): C @@ -5524,8 +5655,8 @@ I.e., if @samp{%locations %define api.pure} is passed then the prototypes for @code{yyerror} are: @example -void yyerror (char const *msg); /* Yacc parsers. */ -void yyerror (YYLTYPE *locp, char const *msg); /* GLR parsers. */ +void yyerror (char const *msg); // Yacc parsers. +void yyerror (YYLTYPE *locp, char const *msg); // GLR parsers. 
@end example But if @samp{%locations %define api.pure %parse-param @{int *nastiness@}} is @@ -5540,15 +5671,16 @@ Reporting Function @code{yyerror}}) @item Default Value: @code{false} -@item History: the @code{full} value was introduced in Bison 2.7 +@item History: +the @code{full} value was introduced in Bison 2.7 @end itemize +@end deffn @c api.pure @c ================================================== api.push-pull -@item api.push-pull -@findex %define api.push-pull +@deffn Directive {%define api.push-pull} @var{kind} @itemize @bullet @item Language(s): C (deterministic parsers only) @@ -5562,13 +5694,13 @@ More user feedback will help to stabilize it.) @item Default Value: @code{pull} @end itemize +@end deffn @c api.push-pull @c ================================================== api.token.constructor -@item api.token.constructor -@findex %define api.token.constructor +@deffn Directive {%define api.token.constructor} @itemize @bullet @item Language(s): @@ -5587,12 +5719,12 @@ Boolean. @item History: introduced in Bison 2.8 @end itemize +@end deffn @c api.token.constructor @c ================================================== api.token.prefix -@item api.token.prefix -@findex %define api.token.prefix +@deffn Directive {%define api.token.prefix} @var{prefix} @itemize @item Languages(s): all @@ -5627,13 +5759,39 @@ empty @item History: introduced in Bison 2.8 @end itemize +@end deffn @c api.token.prefix +@c ================================================== api.value.type +@deffn Directive {%define api.value.type} @var{type} +@itemize @bullet +@item Language(s): +C++ + +@item Purpose: +Request variant-based semantic values. +@xref{C++ Variants}. + +@item Default Value: +FIXME: +@item History: +introduced in Bison 2.8. Was introduced for Java only in 2.3b as +@code{stype}. 
+@end itemize +@end deffn +@c api.value.type + + +@c ================================================== location_type +@deffn Directive {%define location_type} +Obsoleted by @code{api.location.type} since Bison 2.7. +@end deffn + + @c ================================================== lr.default-reduction -@item lr.default-reduction -@findex %define lr.default-reduction +@deffn Directive {%define lr.default-reduction} @var{when} @itemize @bullet @item Language(s): all @@ -5653,11 +5811,11 @@ feedback will help to stabilize it.) introduced as @code{lr.default-reductions} in 2.5, renamed as @code{lr.default-reduction} in 2.8. @end itemize +@end deffn @c ============================================ lr.keep-unreachable-state -@item lr.keep-unreachable-state -@findex %define lr.keep-unreachable-state +@deffn Directive {%define lr.keep-unreachable-state} @itemize @bullet @item Language(s): all @@ -5665,16 +5823,17 @@ introduced as @code{lr.default-reductions} in 2.5, renamed as remain in the parser tables. @xref{Unreachable States}. @item Accepted Values: Boolean @item Default Value: @code{false} -@end itemize +@item History: introduced as @code{lr.keep_unreachable_states} in 2.3b, renamed as @code{lr.keep-unreachable-states} in 2.5, and as @code{lr.keep-unreachable-state} in 2.8. +@end itemize +@end deffn @c lr.keep-unreachable-state @c ================================================== lr.type -@item lr.type -@findex %define lr.type +@deffn Directive {%define lr.type} @var{type} @itemize @bullet @item Language(s): all @@ -5687,18 +5846,16 @@ More user feedback will help to stabilize it.) 
@item Default Value: @code{lalr} @end itemize - +@end deffn @c ================================================== namespace -@item namespace -@findex %define namespace +@deffn Directive %define namespace @var{namespace} Obsoleted by @code{api.namespace} @c namespace - +@end deffn @c ================================================== parse.assert -@item parse.assert -@findex %define parse.assert +@deffn Directive {%define parse.assert} @itemize @item Languages(s): C++ @@ -5712,12 +5869,12 @@ destroyed properly. This option checks these constraints. @item Default Value: @code{false} @end itemize +@end deffn @c parse.assert @c ================================================== parse.error -@item parse.error -@findex %define parse.error +@deffn Directive {%define parse.error} @itemize @item Languages(s): all @@ -5739,12 +5896,12 @@ However, this report can often be incorrect when LAC is not enabled @item Default Value: @code{simple} @end itemize +@end deffn @c parse.error @c ================================================== parse.lac -@item parse.lac -@findex %define parse.lac +@deffn Directive {%define parse.lac} @itemize @item Languages(s): C (deterministic parsers only) @@ -5754,11 +5911,11 @@ syntax error handling. @xref{LAC}. @item Accepted Values: @code{none}, @code{full} @item Default Value: @code{none} @end itemize +@end deffn @c parse.lac @c ================================================== parse.trace -@item parse.trace -@findex %define parse.trace +@deffn Directive {%define parse.trace} @itemize @item Languages(s): C, C++, Java @@ -5776,30 +5933,9 @@ compiled. @item Default Value: @code{false} @end itemize +@end deffn @c parse.trace -@c ================================================== variant -@item variant -@findex %define variant - -@itemize @bullet -@item Language(s): -C++ - -@item Purpose: -Request variant-based semantic values. -@xref{C++ Variants}. - -@item Accepted Values: -Boolean. 
- -@item Default Value: -@code{false} -@end itemize -@c variant -@end table - - @node %code Summary @subsection %code Summary @findex %code @@ -5979,7 +6115,7 @@ parsers. To comply with this tradition, when @code{api.prefix} is used, @code{YYDEBUG} (not renamed) is used as a default value: @example -/* Enabling traces. */ +/* Debug traces. */ #ifndef CDEBUG # if defined YYDEBUG # if YYDEBUG @@ -6136,7 +6272,7 @@ function is available if either the @samp{%define api.push-pull push} or @samp{%define api.push-pull both} declaration is used. @xref{Push Decl, ,A Push Parser}. -@deftypefun int yypush_parse (yypstate *yyps) +@deftypefun int yypush_parse (yypstate *@var{yyps}) The value returned by @code{yypush_parse} is the same as for yyparse with the following exception: it returns @code{YYPUSH_MORE} if more input is required to finish parsing the grammar. @@ -6154,7 +6290,7 @@ stream. This function is available if the @samp{%define api.push-pull both} declaration is used. @xref{Push Decl, ,A Push Parser}. -@deftypefun int yypull_parse (yypstate *yyps) +@deftypefun int yypull_parse (yypstate *@var{yyps}) The value returned by @code{yypull_parse} is the same as for @code{yyparse}. @end deftypefun @@ -6189,7 +6325,7 @@ function is available if either the @samp{%define api.push-pull push} or @samp{%define api.push-pull both} declaration is used. @xref{Push Decl, ,A Push Parser}. -@deftypefun void yypstate_delete (yypstate *yyps) +@deftypefun void yypstate_delete (yypstate *@var{yyps}) This function will reclaim the memory associated with a parser instance. After this call, you should no longer attempt to use the parser instance. @end deftypefun @@ -6662,7 +6798,6 @@ Actions}). @end deffn @deffn {Value} @@$ -@findex @@$ Acts like a structure variable containing information on the textual location of the grouping made by the current rule. @xref{Tracking Locations}. @@ -6721,7 +6856,7 @@ GNU Automake. 
@item @cindex bison-i18n.m4 Into the directory containing the GNU Autoconf macros used -by the package---often called @file{m4}---copy the +by the package ---often called @file{m4}--- copy the @file{bison-i18n.m4} file installed by Bison under @samp{share/aclocal/bison-i18n.m4} in Bison's installation directory. For example: @@ -8526,8 +8661,26 @@ clear the flag. Developing a parser can be a challenge, especially if you don't understand the algorithm (@pxref{Algorithm, ,The Bison Parser Algorithm}). This -chapter explains how to generate and read the detailed description of the -automaton, and how to enable and understand the parser run-time traces. +chapter explains how to understand and debug a parser. + +The first sections focus on the static part of the parser: its structure. +They explain how to generate and read the detailed description of the +automaton. There are several formats available: +@itemize @minus +@item +as text, see @ref{Understanding, , Understanding Your Parser}; + +@item +as a graph, see @ref{Graphviz,, Visualizing Your Parser}; + +@item +or as a markup report that can be turned, for instance, into HTML, see +@ref{Xml,, Visualizing your parser in multiple formats}. +@end itemize + +The last section focuses on the dynamic part of the parser: how to enable +and understand the parser run-time traces (@pxref{Tracing, ,Tracing Your +Parser}). @menu * Understanding:: Understanding the structure of your parser. @@ -8542,8 +8695,7 @@ automaton, and how to enable and understand the parser run-time traces. As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm}) Bison parsers are @dfn{shift/reduce automata}. In some cases (much more frequent than one would hope), looking at this automaton is required to -tune or simply fix a parser. Bison provides two different -representation of it, either textually or graphically (as a DOT file). +tune or simply fix a parser. 
The textual file is generated when the options @option{--report} or @option{--verbose} are specified, see @ref{Invocation, , Invoking @@ -8557,9 +8709,12 @@ The following grammar file, @file{calc.y}, will be used in the sequel: @example %token NUM STR +@group %left '+' '-' %left '*' +@end group %% +@group exp: exp '+' exp | exp '-' exp @@ -8567,6 +8722,7 @@ exp: | exp '/' exp | NUM ; +@end group useless: STR; %% @end example @@ -8576,8 +8732,8 @@ useless: STR; @example calc.y: warning: 1 nonterminal useless in grammar calc.y: warning: 1 rule useless in grammar -calc.y:11.1-7: warning: nonterminal useless in grammar: useless -calc.y:11.10-12: warning: rule useless in grammar: useless: STR +calc.y:12.1-7: warning: nonterminal useless in grammar: useless +calc.y:12.10-12: warning: rule useless in grammar: useless: STR calc.y: conflicts: 7 shift/reduce @end example @@ -8671,7 +8827,7 @@ item is a production rule together with a point (@samp{.}) marking the location of the input cursor. @example -state 0 +State 0 0 $accept: . exp $end @@ -8701,7 +8857,7 @@ you want to see more detail you can invoke @command{bison} with @option{--report=itemset} to list the derived items as well: @example -state 0 +State 0 0 $accept: . exp $end 1 exp: . exp '+' exp @@ -8719,7 +8875,7 @@ state 0 In the state 1@dots{} @example -state 1 +State 1 5 exp: NUM . @@ -8729,11 +8885,11 @@ state 1 @noindent the rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead token (@samp{$default}), the parser will reduce it. If it was coming from -state 0, then, after this reduction it will return to state 0, and will +State 0, then, after this reduction it will return to state 0, and will jump to state 2 (@samp{exp: go to state 2}). @example -state 2 +State 2 0 $accept: exp . $end 1 exp: exp . '+' exp @@ -8761,7 +8917,7 @@ The state 3 is named the @dfn{final state}, or the @dfn{accepting state}: @example -state 3 +State 3 0 $accept: exp $end . 
@@ -8776,7 +8932,7 @@ The interpretation of states 4 to 7 is straightforward, and is left to the reader. @example -state 4 +State 4 1 exp: exp '+' . exp @@ -8785,7 +8941,7 @@ state 4 exp go to state 8 -state 5 +State 5 2 exp: exp '-' . exp @@ -8794,7 +8950,7 @@ state 5 exp go to state 9 -state 6 +State 6 3 exp: exp '*' . exp @@ -8803,7 +8959,7 @@ state 6 exp go to state 10 -state 7 +State 7 4 exp: exp '/' . exp @@ -8816,7 +8972,7 @@ As was announced in beginning of the report, @samp{State 8 conflicts: 1 shift/reduce}: @example -state 8 +State 8 1 exp: exp . '+' exp 1 | exp '+' exp . @@ -8859,7 +9015,7 @@ with some set of possible lookahead tokens. When run with @option{--report=lookahead}, Bison specifies these lookahead tokens: @example -state 8 +State 8 1 exp: exp . '+' exp 1 | exp '+' exp . [$end, '+', '-', '/'] @@ -8891,7 +9047,7 @@ The remaining states are similar: @example @group -state 9 +State 9 1 exp: exp . '+' exp 2 | exp . '-' exp @@ -8907,7 +9063,7 @@ state 9 @end group @group -state 10 +State 10 1 exp: exp . '+' exp 2 | exp . '-' exp @@ -8922,7 +9078,7 @@ state 10 @end group @group -state 11 +State 11 1 exp: exp . '+' exp 2 | exp . '-' exp @@ -8945,12 +9101,11 @@ state 11 @noindent Observe that state 11 contains conflicts not only due to the lack of -precedence of @samp{/} with respect to @samp{+}, @samp{-}, and -@samp{*}, but also because the -associativity of @samp{/} is not specified. +precedence of @samp{/} with respect to @samp{+}, @samp{-}, and @samp{*}, but +also because the associativity of @samp{/} is not specified. -Note that Bison may also produce an HTML version of this output, via an XML -file and XSLT processing (@pxref{Xml}). +Bison may also produce an HTML version of this output, via an XML file and +XSLT processing (@pxref{Xml,,Visualizing your parser in multiple formats}). 
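As a minimal sketch of how one might act on this report, and assuming the intent is for @samp{/} to behave like @samp{*}, the missing precedence and associativity can be declared by adding @samp{/} to the existing @code{%left} line of @file{calc.y}. Only the second @code{%left} declaration differs from the grammar used above. With it, one would expect the shift/reduce conflicts reported for states 8 to 11 to be resolved, while the warnings about @code{useless} remain:

```yacc
%token NUM STR
%left '+' '-'
%left '*' '/'   /* '/' now shares the precedence and left-associativity of '*'.  */
%%
exp:
  exp '+' exp
| exp '-' exp
| exp '*' exp
| exp '/' exp
| NUM
;
useless: STR;
%%
```

Regenerating the report with @option{--verbose} is a simple way to check whether the @samp{conflicts} lines have disappeared from @file{calc.output}.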
@c ================================================= Graphical Representation @@ -8970,7 +9125,10 @@ This file is generated when the @option{--graph} option is specified (@pxref{Invocation, , Invoking Bison}). Its name is made by removing @samp{.tab.c} or @samp{.c} from the parser implementation file name, and adding @samp{.dot} instead. If the grammar file is @file{foo.y}, the -Graphviz output file is called @file{foo.dot}. +Graphviz output file is called @file{foo.dot}. A DOT file may also be +produced via an XML file and XSLT processing (@pxref{Xml,,Visualizing your +parser in multiple formats}). + The following grammar file, @file{rr.y}, will be used in the sequel: @@ -8983,10 +9141,20 @@ b: "0"; @end group @end example -The graphical output is very similar to the textual one, and as such it is -easier understood by making direct comparisons between them. See -@ref{Debugging, , Debugging Your Parser} for a detailled analysis of the -textual report. +The graphical output +@ifnotinfo +(see @ref{fig:graph}) +@end ifnotinfo +is very similar to the textual one, and as such it is more easily understood by +making direct comparisons between them. @xref{Debugging, , Debugging Your +Parser}, for a detailed analysis of the textual report. + +@ifnotinfo +@float Figure,fig:graph +@image{figs/example, 430pt} +@caption{A graphical rendering of the parser.} +@end float +@end ifnotinfo @subheading Graphical Representation of States @@ -9011,7 +9179,7 @@ shift. The following describes a reduction in the @file{rr.output} file: @example @group -state 3 +State 3 1 exp: a . ";" @@ -9032,7 +9200,7 @@ action for the given state, there is no such label. This is how reductions are represented in the verbose file @file{rr.output}: @example -state 1 +State 1 3 a: "0" . [";"] 4 b: "0" . ["."] @@ -9051,17 +9219,14 @@ reduction, see @ref{Shift/Reduce, , Shift/Reduce Conflicts}. 
Discarded actions are distinguished by a red filling color on these nodes, just like how they are reported between square brackets in the verbose file. -The reduction corresponding to the rule number 0 is the acceptation state. It -is shown as a blue diamond, labelled "Acc". +The reduction corresponding to the rule number 0 is the acceptation +state. It is shown as a blue diamond, labelled ``Acc''. @subheading Graphical representation of go tos The @samp{go to} jump transitions are represented as dotted lines bearing the name of the rule being jumped to. -Note that a DOT file may also be produced via an XML file and XSLT -processing (@pxref{Xml}). - @c ================================================= XML @node Xml @@ -9069,8 +9234,10 @@ processing (@pxref{Xml}). @cindex xml Bison supports two major report formats: textual output -(@pxref{Understanding}) when invoked with option @option{--verbose}, and DOT -(@pxref{Graphviz}) when invoked with option @option{--graph}. However, +(@pxref{Understanding, ,Understanding Your Parser}) when invoked +with option @option{--verbose}, and DOT +(@pxref{Graphviz,, Visualizing Your Parser}) when invoked with +option @option{--graph}. However, another alternative is to output an XML file that may then be, with @command{xsltproc}, rendered as either a raw text format equivalent to the verbose file, or as an HTML version of the same file, with clickable @@ -9078,7 +9245,7 @@ transitions, or even as a DOT. The @file{.output} and DOT files obtained via XSLT have no difference whatsoever with those obtained by invoking @command{bison} with options @option{--verbose} or @option{--graph}. -The textual file is generated when the options @option{-x} or +The XML file is generated when the options @option{-x} or @option{--xml[=FILE]} are specified, see @ref{Invocation,,Invoking Bison}. If not specified, its name is made by removing @samp{.tab.c} or @samp{.c} from the parser implementation file name, and adding @samp{.xml} instead. 
@@ -9092,19 +9259,19 @@ files to apply to the XML file. Their names are non-ambiguous: @item xml2dot.xsl Used to output a copy of the DOT visualization of the automaton. @item xml2text.xsl -Used to output a copy of the .output file. +Used to output a copy of the @samp{.output} file. @item xml2xhtml.xsl -Used to output an xhtml enhancement of the .output file. +Used to output an xhtml enhancement of the @samp{.output} file. @end table -Sample usage (requires @code{xsltproc}): +Sample usage (requires @command{xsltproc}): @example -$ bison -x input.y +$ bison -x gr.y @group $ bison --print-datadir /usr/local/share/bison @end group -$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl input.xml > input.html +$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl gr.xml >gr.html @end example @c ================================================= Tracing @@ -9302,7 +9469,7 @@ Entering state 24 @noindent The previous reduction demonstrates the @code{%printer} directive for -@code{}: both the token @code{NUM} and the resulting non-terminal +@code{}: both the token @code{NUM} and the resulting nonterminal @code{exp} have @samp{1} as value. @example @@ -9567,6 +9734,67 @@ no effect on the conflict report. Deprecated constructs whose support will be removed in future versions of Bison. +@item precedence +Useless precedence and associativity directives. Disabled by default. + +Consider for instance the following grammar: + +@example +@group +%nonassoc "=" +%left "+" +%left "*" +%precedence "(" +@end group +%% +@group +stmt: + exp +| "var" "=" exp +; +@end group + +@group +exp: + exp "+" exp +| exp "*" "num" +| "(" exp ")" +| "num" +; +@end group +@end example + +Bison reports: + +@c cannot leave the location and the [-Wprecedence] for lack of +@c width in PDF. 
+@example +@group +warning: useless precedence and associativity for "=" + %nonassoc "=" + ^^^ +@end group +@group +warning: useless associativity for "*", use %precedence + %left "*" + ^^^ +@end group +@group +warning: useless precedence for "(" + %precedence "(" + ^^^ +@end group +@end example + +One would get the exact same parser with the following directives instead: + +@example +@group +%left "+" +%precedence "*" +@end group +@end example + @item other All warnings not categorized above. These warnings are enabled by default. @@ -9617,7 +9845,7 @@ Show caret errors, in a manner similar to GCC's @option{-fdiagnostics-show-caret}, or Clang's @option{-fcaret-diagnostics}. The location provided with the message is used to quote the corresponding line of the source file, underlining the important part of it with carets (^). Here is -an example, using the following file @file{input.y}: +an example, using the following file @file{in.y}: @example %type exp @@ -9625,36 +9853,50 @@ an example, using the following file @file{input.y}: exp: exp '+' exp @{ $exp = $1 + $2; @}; @end example -When invoked with @option{-fcaret}, Bison will report: +When invoked with @option{-fcaret} (or nothing), Bison will report: @example @group -input.y:3.20-23: error: ambiguous reference: '$exp' +in.y:3.20-23: error: ambiguous reference: '$exp' exp: exp '+' exp @{ $exp = $1 + $2; @}; ^^^^ @end group @group -input.y:3.1-3: refers to: $exp at $$ +in.y:3.1-3: refers to: $exp at $$ exp: exp '+' exp @{ $exp = $1 + $2; @}; ^^^ @end group @group -input.y:3.6-8: refers to: $exp at $1 +in.y:3.6-8: refers to: $exp at $1 exp: exp '+' exp @{ $exp = $1 + $2; @}; ^^^ @end group @group -input.y:3.14-16: refers to: $exp at $3 +in.y:3.14-16: refers to: $exp at $3 exp: exp '+' exp @{ $exp = $1 + $2; @}; ^^^ @end group @group -input.y:3.32-33: error: $2 of 'exp' has no declared type +in.y:3.32-33: error: $2 of 'exp' has no declared type exp: exp '+' exp @{ $exp = $1 + $2; @}; ^^ @end group @end example 
+Whereas, when invoked with @option{-fno-caret}, Bison will only report: + +@example +@group +in.y:3.20-23: error: ambiguous reference: ‘$exp’ +in.y:3.1-3: refers to: $exp at $$ +in.y:3.6-8: refers to: $exp at $1 +in.y:3.14-16: refers to: $exp at $3 +in.y:3.32-33: error: $2 of ‘exp’ has no declared type +@end group +@end example + +This option is activated by default. + @end table @end table @@ -9970,10 +10212,9 @@ Symbols}. @node C++ Variants @subsubsection C++ Variants -Starting with version 2.6, Bison provides a @emph{variant} based -implementation of semantic values for C++. This alleviates all the -limitations reported in the previous section, and in particular, object -types can be used without pointers. +Bison provides a @emph{variant} based implementation of semantic values for +C++. This alleviates all the limitations reported in the previous section, +and in particular, object types can be used without pointers. To enable variant-based semantic values, set @code{%define} variable @code{variant} (@pxref{%define Summary,, variant}). Once this defined, @@ -10045,14 +10286,6 @@ therefore, since, as far as we know, @code{double} is the most demanding type on all platforms, alignments are enforced for @code{double} whatever types are actually used. This may waste space in some cases. -@item -Our implementation is not conforming with strict aliasing rules. Alias -analysis is a technique used in optimizing compilers to detect when two -pointers are disjoint (they cannot ``meet''). Our implementation breaks -some of the rules that G++ 4.4 uses in its alias analysis, so @emph{strict -alias analysis must be disabled}. Use the option -@option{-fno-strict-aliasing} to compile the generated parser. - @item There might be portability issues we are not aware of. 
@end itemize @@ -10401,7 +10634,7 @@ or @node Complete Symbols @subsubsection Complete Symbols -If you specified both @code{%define variant} and +If you specified both @code{%define api.value.type variant} and @code{%define api.token.constructor}, the @code{parser} class also defines the class @code{parser::symbol_type} which defines a @emph{complete} symbol, aggregating its type (i.e., the @@ -10665,7 +10898,7 @@ the grammar for. @noindent @findex %define api.token.constructor -@findex %define variant +@findex %define api.value.type variant This example will use genuine C++ objects as semantic values, therefore, we require the variant-based interface. To make sure we properly use it, we enable assertions. To fully benefit from type-safety and more natural @@ -10674,8 +10907,8 @@ definition of ``symbol'', we enable @code{api.token.constructor}. @comment file: calc++-parser.yy @example %define api.token.constructor +%define api.value.type variant %define parse.assert -%define variant @end example @noindent @@ -10874,7 +11107,7 @@ Finally, we enable scanner tracing. @comment file: calc++-scanner.ll @example -%option noyywrap nounput batch debug +%option noyywrap nounput batch debug noinput @end example @noindent @@ -11090,11 +11323,11 @@ semantic values' types (class names) should be specified in the By default, the semantic stack is declared to have @code{Object} members, which means that the class types you specify can be of any class. To improve the type safety of the parser, you can declare the common -superclass of all the semantic values using the @samp{%define stype} +superclass of all the semantic values using the @samp{%define api.value.type} directive. 
For example, after the following declaration: @example -%define stype "ASTNode" +%define api.value.type "ASTNode" @end example @noindent @@ -11320,7 +11553,7 @@ The return type can be changed using @code{%define api.position.type @deftypemethod {Lexer} {Object} getLVal () Return the semantic value of the last token that yylex returned. -The return type can be changed using @samp{%define stype +The return type can be changed using @samp{%define api.value.type "@var{class-name}".} @end deftypemethod @@ -11348,7 +11581,7 @@ Like @code{$@var{n}} but specifies a alternative type @var{typealt}. @defvar $$ The semantic value for the grouping made by the current rule. As a value, this is in the base type (@code{Object} or as specified by -@samp{%define stype}) as in not cast to the declared subtype because +@samp{%define api.value.type}) as in not cast to the declared subtype because casts are not allowed on the left-hand side of Java assignments. Use an explicit Java cast if the correct subtype is needed. @xref{Java Semantic Values}. @@ -11430,7 +11663,7 @@ corresponds to these C macros.}. @item Java lacks unions, so @code{%union} has no effect. Instead, semantic values have a common base type: @code{Object} or as specified by -@samp{%define stype}. Angle brackets on @code{%token}, @code{type}, +@samp{%define api.value.type}. Angle brackets on @code{%token}, @code{type}, @code{$@var{n}} and @code{$$} specify subtypes rather than fields of an union. The type of @code{$$}, even with angle brackets, is the base type since Java casts are not allow on the left-hand side of assignments. @@ -11606,7 +11839,7 @@ Whether the parser class is declared @code{public}. Default is false. @xref{Java Bison Interface}. @end deffn -@deffn {Directive} {%define stype} "@var{class}" +@deffn {Directive} {%define api.value.type} "@var{class}" The base type of semantic values. Default is @code{Object}. @xref{Java Semantic Values}. 
@end deffn @@ -12040,18 +12273,23 @@ In an action, the location of the left-hand side of the rule. @end deffn @deffn {Variable} @@@var{n} +@deffnx {Symbol} @@@var{n} In an action, the location of the @var{n}-th symbol of the right-hand side of the rule. @xref{Tracking Locations}. + +In a grammar, the Bison-generated nonterminal symbol for a mid-rule action +with a semantical value. @xref{Mid-Rule Action Translation}. @end deffn @deffn {Variable} @@@var{name} -In an action, the location of a symbol addressed by name. @xref{Tracking -Locations}. +@deffnx {Variable} @@[@var{name}] +In an action, the location of a symbol addressed by @var{name}. +@xref{Tracking Locations}. @end deffn -@deffn {Variable} @@[@var{name}] -In an action, the location of a symbol addressed by name. @xref{Tracking -Locations}. +@deffn {Symbol} $@@@var{n} +In a grammar, the Bison-generated nonterminal symbol for a mid-rule action +with no semantical value. @xref{Mid-Rule Action Translation}. @end deffn @deffn {Variable} $$ @@ -12065,12 +12303,8 @@ right-hand side of the rule. @xref{Actions}. @end deffn @deffn {Variable} $@var{name} -In an action, the semantic value of a symbol addressed by name. -@xref{Actions}. -@end deffn - -@deffn {Variable} $[@var{name}] -In an action, the semantic value of a symbol addressed by name. +@deffnx {Variable} $[@var{name}] +In an action, the semantic value of a symbol addressed by @var{name}. @xref{Actions}. @end deffn @@ -12101,8 +12335,9 @@ More user feedback will help to determine whether it should become a permanent feature. @end deffn -@deffn {Construct} /*@dots{}*/ -Comment delimiters, as in C. +@deffn {Construct} /* @dots{} */ +@deffnx {Construct} // @dots{} +Comments, as in C/C++. @end deffn @deffn {Delimiter} : @@ -12459,13 +12694,6 @@ the next token. @xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}. 
@end deffn -@deffn {Macro} YYLEX_PARAM -An obsolete macro for specifying an extra argument (or list of extra -arguments) for @code{yyparse} to pass to @code{yylex}. The use of this -macro is deprecated, and is supported only for Yacc like parsers. -@xref{Pure Calling,, Calling Conventions for Pure Parsers}. -@end deffn - @deffn {Variable} yylloc External variable in which @code{yylex} should place the line and column numbers associated with a token. (In a pure parser, it is a local @@ -12501,7 +12729,7 @@ Management}. @deffn {Variable} yynerrs Global variable which Bison increments each time it reports a syntax error. (In a pure parser, it is a local variable within @code{yyparse}. In a -pure push parser, it is a member of yypstate.) +pure push parser, it is a member of @code{yypstate}.) @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. @end deffn @@ -12587,7 +12815,7 @@ Data type of semantic values; @code{int} by default. @item Accepting state A state whose only action is the accept action. The accepting state is thus a consistent state. -@xref{Understanding,,}. +@xref{Understanding, ,Understanding Your Parser}. @item Backus-Naur Form (BNF; also called ``Backus Normal Form'') Formal method of specifying context-free grammars originally proposed