This manual (@value{UPDATED}) is for GNU Bison (version
@value{VERSION}), the GNU parser generator.
-Copyright @copyright{} 1988-1993, 1995, 1998-2012 Free Software
+Copyright @copyright{} 1988-1993, 1995, 1998-2013 Free Software
Foundation, Inc.
@quotation
* Grammar Outline:: Overall layout of the grammar file.
* Symbols:: Terminal and nonterminal symbols.
* Rules:: How to write grammar rules.
-* Recursion:: Writing recursive rules.
* Semantics:: Semantic values and actions.
* Tracking Locations:: Locations and actions.
* Named References:: Using named references in actions.
* Grammar Rules:: Syntax and usage of the grammar rules section.
* Epilogue:: Syntax and usage of the epilogue.
+Grammar Rules
+
+* Rules Syntax:: Syntax of the rules.
+* Empty Rules:: Symbols that can match the empty string.
+* Recursion:: Writing recursive rules.
+
+
Defining Language Semantics
* Value Type:: Specifying one data type for all semantic values.
This says when, why and how to use the exceptional
action in the middle of a rule.
+Actions in Mid-Rule
+
+* Using Mid-Rule Actions:: Putting an action in the middle of a rule.
+* Mid-Rule Action Translation:: How mid-rule actions are actually processed.
+* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
+
Tracking Locations
* Location Type:: Specifying a data type for locations.
%%
prog:
- /* Nothing. */
+ %empty
| prog stmt @{ printf ("\n"); @}
;
@example
/* Reverse polish notation calculator. */
+@group
%@{
#define YYSTYPE double
#include <stdio.h>
int yylex (void);
void yyerror (char const *);
%@}
+@end group
%token NUM
@example
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
@example
input:
- /* empty */
+ %empty
| input line
;
@end example
colon and the first @samp{|}; this means that @code{input} can match an
empty string of input (no tokens). We write the rules this way because it
is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
-It's conventional to put an empty alternative first and write the comment
-@samp{/* empty */} in it.
+It's conventional to put an empty alternative first and to use the
+(optional) @code{%empty} directive, or to write the comment @samp{/* empty
+*/} in it (@pxref{Empty Rules}).
The second alternate rule (@code{input line}) handles all nontrivial input.
It means, ``After reading any number of lines, read one more line if
%% /* The grammar follows. */
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
@example
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
%@{
#include <stdio.h> /* For printf, etc. */
#include <math.h> /* For pow, used in the grammar. */
- #include "calc.h" /* Contains definition of `symrec'. */
+ #include "calc.h" /* Contains definition of 'symrec'. */
int yylex (void);
void yyerror (char const *);
%@}
%type <val> exp
@group
-%right '='
+%precedence '='
%left '-' '+'
%left '*' '/'
%precedence NEG /* negation--unary minus */
%% /* The grammar follows. */
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
@group
typedef struct symrec symrec;
-/* The symbol table: a chain of `struct symrec'. */
+/* The symbol table: a chain of 'struct symrec'. */
extern symrec *sym_table;
symrec *putsym (char const *, int);
@end group
@group
-/* The symbol table: a chain of `struct symrec'. */
+/* The symbol table: a chain of 'struct symrec'. */
symrec *sym_table;
@end group
* Grammar Outline:: Overall layout of the grammar file.
* Symbols:: Terminal and nonterminal symbols.
* Rules:: How to write grammar rules.
-* Recursion:: Writing recursive rules.
* Semantics:: Semantic values and actions.
* Tracking Locations:: Locations and actions.
* Named References:: Using named references in actions.
@node Grammar Outline
@section Outline of a Bison Grammar
+@cindex comment
+@findex // @dots{}
+@findex /* @dots{} */
A Bison grammar file has four main sections, shown here with the
appropriate delimiters:
@end example
Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections.
-As a GNU extension, @samp{//} introduces a comment that
-continues until end of line.
+As a GNU extension, @samp{//} introduces a comment that continues until end
+of line.
@menu
* Prologue:: Syntax and usage of the prologue.
@code{%union} declaration.
@example
+@group
%@{
#define _GNU_SOURCE
#include <stdio.h>
#include "ptypes.h"
%@}
+@end group
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%@{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
%@}
+@end group
@dots{}
@end example
Look again at the example of the previous section:
@example
+@group
%@{
#define _GNU_SOURCE
#include <stdio.h>
#include "ptypes.h"
%@}
+@end group
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%@{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
%@}
+@end group
@dots{}
@end example
#include <stdio.h>
/* WARNING: The following code really belongs
- * in a `%code requires'; see below. */
+ * in a '%code requires'; see below. */
#include "ptypes.h"
#define YYLTYPE YYLTYPE
@} YYLTYPE;
@}
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%code @{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
static void trace_token (enum yytokentype token, YYLTYPE loc);
@}
+@end group
@dots{}
@end example
one of your tokens with a @code{%token} declaration.
@node Rules
-@section Syntax of Grammar Rules
+@section Grammar Rules
+
+A Bison grammar is a list of rules.
+
+@menu
+* Rules Syntax:: Syntax of the rules.
+* Empty Rules:: Symbols that can match the empty string.
+* Recursion:: Writing recursive rules.
+@end menu
+
+@node Rules Syntax
+@subsection Syntax of Grammar Rules
@cindex rule syntax
@cindex grammar rule syntax
@cindex syntax of grammar rules
@noindent
They are still considered distinct rules even when joined in this way.
-If @var{components} in a rule is empty, it means that @var{result} can
-match the empty string. For example, here is how to define a
-comma-separated sequence of zero or more @code{exp} groupings:
+@node Empty Rules
+@subsection Empty Rules
+@cindex empty rule
+@cindex rule, empty
+@findex %empty
+
+A rule is said to be @dfn{empty} if its right-hand side (@var{components})
+is empty. It means that @var{result} can match the empty string. For
+example, here is how to define an optional semicolon:
+
+@example
+semicolon.opt: | ";";
+@end example
+
+@noindent
+It is easy not to see an empty rule, especially when @code{|} is used. The
+@code{%empty} directive allows to make explicit that a rule is empty on
+purpose:
@example
@group
-expseq:
- /* empty */
-| expseq1
+semicolon.opt:
+ %empty
+| ";"
;
@end group
+@end example
+
+Flagging a non-empty rule with @code{%empty} is an error. If run with
+@option{-Wempty-rule}, @command{bison} will report empty rules without
+@code{%empty}. Using @code{%empty} enables this warning, unless
+@option{-Wno-empty-rule} was specified.
+The @code{%empty} directive is a Bison extension, it does not work with
+Yacc. To remain compatible with POSIX Yacc, it is customary to write a
+comment @samp{/* empty */} in each rule with no components:
+
+@example
@group
-expseq1:
- exp
-| expseq1 ',' exp
+semicolon.opt:
+ /* empty */
+| ";"
;
@end group
@end example
-@noindent
-It is customary to write a comment @samp{/* empty */} in each rule
-with no components.
@node Recursion
-@section Recursive Rules
+@subsection Recursive Rules
@cindex recursive rule
+@cindex rule, recursive
A rule is called @dfn{recursive} when its @var{result} nonterminal
appears also on its right hand side. Nearly all Bison grammars need to
@group
bar:
- /* empty */ @{ previous_expr = $0; @}
+ %empty @{ previous_expr = $0; @}
;
@end group
@end example
These actions are written just like usual end-of-rule actions, but they
are executed before the parser even recognizes the following components.
+@menu
+* Using Mid-Rule Actions:: Putting an action in the middle of a rule.
+* Mid-Rule Action Translation:: How mid-rule actions are actually processed.
+* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
+@end menu
+
+@node Using Mid-Rule Actions
+@subsubsection Using Mid-Rule Actions
+
A mid-rule action may refer to the components preceding it using
@code{$@var{n}}, but it may not refer to subsequent components because
it is run before they are parsed.
@example
@group
stmt:
- LET '(' var ')'
- @{ $<context>$ = push_context (); declare_variable ($3); @}
+ "let" '(' var ')'
+ @{
+ $<context>$ = push_context ();
+ declare_variable ($3);
+ @}
stmt
- @{ $$ = $6; pop_context ($<context>5); @}
+ @{
+ $$ = $6;
+ pop_context ($<context>5);
+ @}
@end group
@end example
@code{context} in the data-type union. Then it calls
@code{declare_variable} to add the new variable to that list. Once the
first action is finished, the embedded statement @code{stmt} can be
-parsed. Note that the mid-rule action is component number 5, so the
-@samp{stmt} is component number 6.
+parsed.
+
+Note that the mid-rule action is component number 5, so the @samp{stmt} is
+component number 6. Named references can be used to improve the readability
+and maintainability (@pxref{Named References}):
+
+@example
+@group
+stmt:
+ "let" '(' var ')'
+ @{
+ $<context>let = push_context ();
+ declare_variable ($3);
+ @}[let]
+ stmt
+ @{
+ $$ = $6;
+ pop_context ($<context>let);
+ @}
+@end group
+@end example
After the embedded statement is parsed, its semantic value becomes the
value of the entire @code{let}-statement. Then the semantic value from the
@group
%type <context> let
%destructor @{ pop_context ($$); @} let
+@end group
%%
+@group
stmt:
let stmt
@{
$$ = $2;
- pop_context ($1);
+ pop_context ($let);
@};
+@end group
+@group
let:
- LET '(' var ')'
+ "let" '(' var ')'
@{
- $$ = push_context ();
+ $let = push_context ();
declare_variable ($3);
@};
Any mid-rule action can be converted to an end-of-rule action in this way, and
this is what Bison actually does to implement mid-rule actions.
+@node Mid-Rule Action Translation
+@subsubsection Mid-Rule Action Translation
+@vindex $@@@var{n}
+@vindex @@@var{n}
+
+As hinted earlier, mid-rule actions are actually transformed into regular
+rules and actions. The various reports generated by Bison (textual,
+graphical, etc., see @ref{Understanding, , Understanding Your Parser})
+reveal this translation, best explained by means of an example. The
+following rule:
+
+@example
+exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @};
+@end example
+
+@noindent
+is translated into:
+
+@example
+$@@1: %empty @{ a(); @};
+$@@2: %empty @{ c(); @};
+$@@3: %empty @{ d(); @};
+exp: $@@1 "b" $@@2 $@@3 "e" @{ f(); @};
+@end example
+
+@noindent
+with new nonterminal symbols @code{$@@@var{n}}, where @var{n} is a number.
+
+A mid-rule action is expected to generate a value if it uses @code{$$}, or
+the (final) action uses @code{$@var{n}} where @var{n} denote the mid-rule
+action. In that case its nonterminal is rather named @code{@@@var{n}}:
+
+@example
+exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
+@end example
+
+@noindent
+is translated into
+
+@example
+@@1: %empty @{ a(); @};
+@@2: %empty @{ $$ = c(); @};
+$@@3: %empty @{ d(); @};
+exp: @@1 "b" @@2 $@@3 "e" @{ f = $1; @}
+@end example
+
+There are probably two errors in the above example: the first mid-rule
+action does not generate a value (it does not use @code{$$} although the
+final action uses it), and the value of the second one is not used (the
+final action does not use @code{$3}). Bison reports these errors when the
+@code{midrule-value} warnings are enabled (@pxref{Invocation, ,Invoking
+Bison}):
+
+@example
+$ bison -fcaret -Wmidrule-value mid.y
+@group
+mid.y:2.6-13: warning: unset value: $$
+ exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
+ ^^^^^^^^
+@end group
+@group
+mid.y:2.19-31: warning: unused value: $3
+ exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
+ ^^^^^^^^^^^^^
+@end group
+@end example
+
+
+@node Mid-Rule Conflicts
+@subsubsection Conflicts due to Mid-Rule Actions
Taking action before a rule is completely recognized often leads to
conflicts since the parser must commit to a parse in order to execute the
action. For example, the following two rules, without mid-rule actions,
@example
@group
subroutine:
- /* empty */ @{ prepare_for_local_variables (); @}
+ %empty @{ prepare_for_local_variables (); @}
;
@end group
Now Bison can execute the action in the rule for @code{subroutine} without
deciding which rule for @code{compound} it will eventually use.
+
@node Tracking Locations
@section Tracking Locations
@cindex location
@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
Parsers}, for the details of this. The variable @code{yynerrs}
becomes local in @code{yyparse} in pull mode but it becomes a member
-of yypstate in push mode. (@pxref{Error Reporting, ,The Error
+of @code{yypstate} in push mode. (@pxref{Error Reporting, ,The Error
Reporting Function @code{yyerror}}). The convention for calling
@code{yyparse} itself is unchanged.
@end deffn
@deffn {Directive} %defines @var{defines-file}
-Same as above, but save in the file @var{defines-file}.
+Same as above, but save in the file @file{@var{defines-file}}.
@end deffn
@deffn {Directive} %destructor
supported languages include C, C++, and Java.
@var{language} is case-insensitive.
-This directive is experimental and its effect may be modified in future
-releases.
@end deffn
@deffn {Directive} %locations
@end deffn
@deffn {Directive} %output "@var{file}"
-Specify @var{file} for the parser implementation file.
+Generate the parser implementation in @file{@var{file}}.
@end deffn
@deffn {Directive} %pure-parser
skeleton (@pxref{Decl Summary,,%language}, @pxref{Decl
Summary,,%skeleton}).
Unaccepted @var{variable}s produce an error.
-Some of the accepted @var{variable}s are:
+Some of the accepted @var{variable}s are described below.
-@table @code
-@c ================================================== api.namespace
-@item api.namespace
-@findex %define api.namespace
+@deffn Directive {%define api.namespace} @{@var{namespace}@}
@itemize
@item Languages(s): C++
For example, if you specify:
@example
-%define api.namespace "foo::bar"
+%define api.namespace @{foo::bar@}
@end example
Bison uses @code{foo::bar} verbatim in references such as:
lexical analyzer function. For example, if you specify:
@example
-%define api.namespace "foo"
+%define api.namespace @{foo@}
%name-prefix "bar::"
@end example
The parser namespace is @code{foo} and @code{yylex} is referenced as
@code{bar::lex}.
@end itemize
-@c namespace
+@end deffn
+@c api.namespace
@c ================================================== api.location.type
-@item @code{api.location.type}
-@findex %define api.location.type
+@deffn {Directive} {%define api.location.type} @var{type}
@itemize @bullet
@item Language(s): C++, Java
@item Default Value: none
-@item History: introduced in Bison 2.7
+@item History:
+Introduced in Bison 2.7 for C, C++ and Java. Introduced under the name
+@code{location_type} for C++ in Bison 2.5 and for Java in Bison 2.4.
@end itemize
+@end deffn
@c ================================================== api.prefix
-@item api.prefix
-@findex %define api.prefix
+@deffn {Directive} {%define api.prefix} @var{prefix}
@itemize @bullet
@item Language(s): All
@item History: introduced in Bison 2.6
@end itemize
+@end deffn
@c ================================================== api.pure
-@item api.pure
-@findex %define api.pure
+@deffn Directive {%define api.pure}
@itemize @bullet
@item Language(s): C
@code{yyerror} are:
@example
-void yyerror (char const *msg); /* Yacc parsers. */
-void yyerror (YYLTYPE *locp, char const *msg); /* GLR parsers. */
+void yyerror (char const *msg); // Yacc parsers.
+void yyerror (YYLTYPE *locp, char const *msg); // GLR parsers.
@end example
But if @samp{%locations %define api.pure %parse-param @{int *nastiness@}} is
@item Default Value: @code{false}
-@item History: the @code{full} value was introduced in Bison 2.7
+@item History:
+the @code{full} value was introduced in Bison 2.7
@end itemize
+@end deffn
@c api.pure
@c ================================================== api.push-pull
-@item api.push-pull
-@findex %define api.push-pull
+@deffn Directive {%define api.push-pull} @var{kind}
@itemize @bullet
@item Language(s): C (deterministic parsers only)
@item Default Value: @code{pull}
@end itemize
+@end deffn
@c api.push-pull
@c ================================================== api.token.constructor
-@item api.token.constructor
-@findex %define api.token.constructor
+@deffn Directive {%define api.token.constructor}
@itemize @bullet
@item Language(s):
@item History:
introduced in Bison 2.8
@end itemize
+@end deffn
@c api.token.constructor
@c ================================================== api.token.prefix
-@item api.token.prefix
-@findex %define api.token.prefix
+@deffn Directive {%define api.token.prefix} @var{prefix}
@itemize
@item Languages(s): all
@item History:
introduced in Bison 2.8
@end itemize
+@end deffn
@c api.token.prefix
+@c ================================================== api.value.type
+@deffn Directive {%define api.value.type} @var{type}
+@itemize @bullet
+@item Language(s):
+C++
+
+@item Purpose:
+Request variant-based semantic values.
+@xref{C++ Variants}.
+
+@item Default Value:
+FIXME:
+@item History:
+introduced in Bison 2.8. Was introduced for Java only in 2.3b as
+@code{stype}.
+@end itemize
+@end deffn
+@c api.value.type
+
+
+@c ================================================== location_type
+@deffn Directive {%define location_type}
+Obsoleted by @code{api.location.type} since Bison 2.7.
+@end deffn
+
+
@c ================================================== lr.default-reduction
-@item lr.default-reduction
-@findex %define lr.default-reduction
+@deffn Directive {%define lr.default-reduction} @var{when}
@itemize @bullet
@item Language(s): all
introduced as @code{lr.default-reduction} in 2.5, renamed as
@code{lr.default-reduction} in 2.8.
@end itemize
+@end deffn
@c ============================================ lr.keep-unreachable-state
-@item lr.keep-unreachable-state
-@findex %define lr.keep-unreachable-state
+@deffn Directive {%define lr.keep-unreachable-state}
@itemize @bullet
@item Language(s): all
remain in the parser tables. @xref{Unreachable States}.
@item Accepted Values: Boolean
@item Default Value: @code{false}
-@end itemize
+@item History:
introduced as @code{lr.keep_unreachable_states} in 2.3b, renamed as
@code{lr.keep-unreachable-states} in 2.5, and as
@code{lr.keep-unreachable-state} in 2.8.
+@end itemize
+@end deffn
@c lr.keep-unreachable-state
@c ================================================== lr.type
-@item lr.type
-@findex %define lr.type
+@deffn Directive {%define lr.type} @var{type}
@itemize @bullet
@item Language(s): all
@item Default Value: @code{lalr}
@end itemize
-
+@end deffn
@c ================================================== namespace
-@item namespace
-@findex %define namespace
+@deffn Directive %define namespace @{@var{namespace}@}
Obsoleted by @code{api.namespace}
@c namespace
-
+@end deffn
@c ================================================== parse.assert
-@item parse.assert
-@findex %define parse.assert
+@deffn Directive {%define parse.assert}
@itemize
@item Languages(s): C++
@item Default Value: @code{false}
@end itemize
+@end deffn
@c parse.assert
@c ================================================== parse.error
-@item parse.error
-@findex %define parse.error
+@deffn Directive {%define parse.error}
@itemize
@item Languages(s):
all
@item Default Value:
@code{simple}
@end itemize
+@end deffn
@c parse.error
@c ================================================== parse.lac
-@item parse.lac
-@findex %define parse.lac
+@deffn Directive {%define parse.lac}
@itemize
@item Languages(s): C (deterministic parsers only)
@item Accepted Values: @code{none}, @code{full}
@item Default Value: @code{none}
@end itemize
+@end deffn
@c parse.lac
@c ================================================== parse.trace
-@item parse.trace
-@findex %define parse.trace
+@deffn Directive {%define parse.trace}
@itemize
@item Languages(s): C, C++, Java
@item Default Value: @code{false}
@end itemize
+@end deffn
@c parse.trace
-@c ================================================== variant
-@item variant
-@findex %define variant
-
-@itemize @bullet
-@item Language(s):
-C++
-
-@item Purpose:
-Request variant-based semantic values.
-@xref{C++ Variants}.
-
-@item Accepted Values:
-Boolean.
-
-@item Default Value:
-@code{false}
-@end itemize
-@c variant
-@end table
-
-
@node %code Summary
@subsection %code Summary
@findex %code
@code{YYDEBUG} (not renamed) is used as a default value:
@example
-/* Enabling traces. */
+/* Debug traces. */
#ifndef CDEBUG
# if defined YYDEBUG
# if YYDEBUG
@samp{%define api.push-pull both} declaration is used.
@xref{Push Decl, ,A Push Parser}.
-@deftypefun int yypush_parse (yypstate *yyps)
+@deftypefun int yypush_parse (yypstate *@var{yyps})
The value returned by @code{yypush_parse} is the same as for yyparse with
the following exception: it returns @code{YYPUSH_MORE} if more input is
required to finish parsing the grammar.
declaration is used.
@xref{Push Decl, ,A Push Parser}.
-@deftypefun int yypull_parse (yypstate *yyps)
+@deftypefun int yypull_parse (yypstate *@var{yyps})
The value returned by @code{yypull_parse} is the same as for @code{yyparse}.
@end deftypefun
@samp{%define api.push-pull both} declaration is used.
@xref{Push Decl, ,A Push Parser}.
-@deftypefun void yypstate_delete (yypstate *yyps)
+@deftypefun void yypstate_delete (yypstate *@var{yyps})
This function will reclaim the memory associated with a parser instance.
After this call, you should no longer attempt to use the parser instance.
@end deftypefun
return 0;
@dots{}
if (c == '+' || c == '-')
- return c; /* Assume token type for `+' is '+'. */
+ return c; /* Assume token type for '+' is '+'. */
@dots{}
return INT; /* Return the type of the token. */
@dots{}
@end deffn
@deffn {Value} @@$
-@findex @@$
Acts like a structure variable containing information on the textual
location of the grouping made by the current rule. @xref{Tracking
Locations}.
@item
@cindex bison-i18n.m4
Into the directory containing the GNU Autoconf macros used
-by the package---often called @file{m4}---copy the
+by the package ---often called @file{m4}--- copy the
@file{bison-i18n.m4} file installed by Bison under
@samp{share/aclocal/bison-i18n.m4} in Bison's installation directory.
For example:
@example
@group
sequence:
- /* empty */ @{ printf ("empty sequence\n"); @}
+ %empty @{ printf ("empty sequence\n"); @}
| maybeword
| sequence word @{ printf ("added word %s\n", $2); @}
;
@group
maybeword:
- /* empty */ @{ printf ("empty maybeword\n"); @}
-| word @{ printf ("single word %s\n", $1); @}
+ %empty @{ printf ("empty maybeword\n"); @}
+| word @{ printf ("single word %s\n", $1); @}
;
@end group
@end example
@example
@group
sequence:
- /* empty */ @{ printf ("empty sequence\n"); @}
+ %empty @{ printf ("empty sequence\n"); @}
| sequence word @{ printf ("added word %s\n", $2); @}
;
@end group
@example
@group
sequence:
- /* empty */
+ %empty
| sequence words
| sequence redirects
;
@group
words:
- /* empty */
+ %empty
| words word
;
@end group
@group
redirects:
- /* empty */
+ %empty
| redirects redirect
;
@end group
@example
sequence:
- /* empty */
+ %empty
| sequence word
| sequence redirect
;
@example
@group
sequence:
- /* empty */
+ %empty
| sequence words
| sequence redirects
;
%%
@group
sequence:
- /* empty */
+ %empty
| sequence word %prec "sequence"
| sequence redirect %prec "sequence"
;
%%
@group
sequence:
- /* empty */
+ %empty
| sequence word %prec "word"
| sequence redirect %prec "redirect"
;
@example
stmts:
- /* empty string */
+ %empty
| stmts '\n'
| stmts exp '\n'
| stmts error '\n'
Developing a parser can be a challenge, especially if you don't understand
the algorithm (@pxref{Algorithm, ,The Bison Parser Algorithm}). This
-chapter explains how to generate and read the detailed description of the
-automaton, and how to enable and understand the parser run-time traces.
+chapter explains how understand and debug a parser.
+
+The first sections focus on the static part of the parser: its structure.
+They explain how to generate and read the detailed description of the
+automaton. There are several formats available:
+@itemize @minus
+@item
+as text, see @ref{Understanding, , Understanding Your Parser};
+
+@item
+as a graph, see @ref{Graphviz,, Visualizing Your Parser};
+
+@item
+or as a markup report that can be turned, for instance, into HTML, see
+@ref{Xml,, Visualizing your parser in multiple formats}.
+@end itemize
+
+The last section focuses on the dynamic part of the parser: how to enable
+and understand the parser run-time traces (@pxref{Tracing, ,Tracing Your
+Parser}).
@menu
* Understanding:: Understanding the structure of your parser.
As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm})
Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
frequent than one would hope), looking at this automaton is required to
-tune or simply fix a parser. Bison provides two different
-representation of it, either textually or graphically (as a DOT file).
+tune or simply fix a parser.
The textual file is generated when the options @option{--report} or
@option{--verbose} are specified, see @ref{Invocation, , Invoking
@example
%token NUM STR
+@group
%left '+' '-'
%left '*'
+@end group
%%
+@group
exp:
exp '+' exp
| exp '-' exp
| exp '/' exp
| NUM
;
+@end group
useless: STR;
%%
@end example
@example
calc.y: warning: 1 nonterminal useless in grammar
calc.y: warning: 1 rule useless in grammar
-calc.y:11.1-7: warning: nonterminal useless in grammar: useless
-calc.y:11.10-12: warning: rule useless in grammar: useless: STR
+calc.y:12.1-7: warning: nonterminal useless in grammar: useless
+calc.y:12.10-12: warning: rule useless in grammar: useless: STR
calc.y: conflicts: 7 shift/reduce
@end example
the location of the input cursor.
@example
-state 0
+State 0
0 $accept: . exp $end
@option{--report=itemset} to list the derived items as well:
@example
-state 0
+State 0
0 $accept: . exp $end
1 exp: . exp '+' exp
In the state 1@dots{}
@example
-state 1
+State 1
5 exp: NUM .
@noindent
the rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead token
(@samp{$default}), the parser will reduce it. If it was coming from
-state 0, then, after this reduction it will return to state 0, and will
+State 0, then, after this reduction it will return to state 0, and will
jump to state 2 (@samp{exp: go to state 2}).
@example
-state 2
+State 2
0 $accept: exp . $end
1 exp: exp . '+' exp
state}:
@example
-state 3
+State 3
0 $accept: exp $end .
the reader.
@example
-state 4
+State 4
1 exp: exp '+' . exp
exp go to state 8
-state 5
+State 5
2 exp: exp '-' . exp
exp go to state 9
-state 6
+State 6
3 exp: exp '*' . exp
exp go to state 10
-state 7
+State 7
4 exp: exp '/' . exp
1 shift/reduce}:
@example
-state 8
+State 8
1 exp: exp . '+' exp
1 | exp '+' exp .
@option{--report=lookahead}, Bison specifies these lookahead tokens:
@example
-state 8
+State 8
1 exp: exp . '+' exp
1 | exp '+' exp . [$end, '+', '-', '/']
@example
@group
-state 9
+State 9
1 exp: exp . '+' exp
2 | exp . '-' exp
@end group
@group
-state 10
+State 10
1 exp: exp . '+' exp
2 | exp . '-' exp
@end group
@group
-state 11
+State 11
1 exp: exp . '+' exp
2 | exp . '-' exp
@noindent
Observe that state 11 contains conflicts not only due to the lack of
-precedence of @samp{/} with respect to @samp{+}, @samp{-}, and
-@samp{*}, but also because the
-associativity of @samp{/} is not specified.
+precedence of @samp{/} with respect to @samp{+}, @samp{-}, and @samp{*}, but
+also because the associativity of @samp{/} is not specified.
-Note that Bison may also produce an HTML version of this output, via an XML
-file and XSLT processing (@pxref{Xml}).
+Bison may also produce an HTML version of this output, via an XML file and
+XSLT processing (@pxref{Xml,,Visualizing your parser in multiple formats}).
@c ================================================= Graphical Representation
(@pxref{Invocation, , Invoking Bison}). Its name is made by removing
@samp{.tab.c} or @samp{.c} from the parser implementation file name, and
adding @samp{.dot} instead. If the grammar file is @file{foo.y}, the
-Graphviz output file is called @file{foo.dot}.
+Graphviz output file is called @file{foo.dot}. A DOT file may also be
+produced via an XML file and XSLT processing (@pxref{Xml,,Visualizing your
+parser in multiple formats}).
+
The following grammar file, @file{rr.y}, will be used in the sequel:
@end group
@end example
-The graphical output is very similar to the textual one, and as such it is
-easier understood by making direct comparisons between them. See
-@ref{Debugging, , Debugging Your Parser} for a detailled analysis of the
-textual report.
+The graphical output
+@ifnotinfo
+(see @ref{fig:graph})
+@end ifnotinfo
+is very similar to the textual one, and as such it is easier understood by
+making direct comparisons between them. @xref{Debugging, , Debugging Your
+Parser}, for a detailled analysis of the textual report.
+
+@ifnotinfo
+@float Figure,fig:graph
+@image{figs/example, 430pt}
+@caption{A graphical rendering of the parser.}
+@end float
+@end ifnotinfo
@subheading Graphical Representation of States
@example
@group
-state 3
+State 3
1 exp: a . ";"
This is how reductions are represented in the verbose file @file{rr.output}:
@example
-state 1
+State 1
3 a: "0" . [";"]
4 b: "0" . ["."]
are distinguished by a red filling color on these nodes, just like how they are
reported between square brackets in the verbose file.
-The reduction corresponding to the rule number 0 is the acceptation state. It
-is shown as a blue diamond, labelled "Acc".
+The reduction corresponding to the rule number 0 is the acceptation
+state. It is shown as a blue diamond, labelled ``Acc''.
@subheading Graphical representation of go tos
The @samp{go to} jump transitions are represented as dotted lines bearing
the name of the rule being jumped to.
-Note that a DOT file may also be produced via an XML file and XSLT
-processing (@pxref{Xml}).
-
@c ================================================= XML
@node Xml
@cindex xml
Bison supports two major report formats: textual output
-(@pxref{Understanding}) when invoked with option @option{--verbose}, and DOT
-(@pxref{Graphviz}) when invoked with option @option{--graph}. However,
+(@pxref{Understanding, ,Understanding Your Parser}) when invoked
+with option @option{--verbose}, and DOT
+(@pxref{Graphviz,, Visualizing Your Parser}) when invoked with
+option @option{--graph}. However,
another alternative is to output an XML file that may then be, with
@command{xsltproc}, rendered as either a raw text format equivalent to the
verbose file, or as an HTML version of the same file, with clickable
XSLT have no difference whatsoever with those obtained by invoking
@command{bison} with options @option{--verbose} or @option{--graph}.
-The textual file is generated when the options @option{-x} or
+The XML file is generated when the options @option{-x} or
@option{--xml[=FILE]} are specified, see @ref{Invocation,,Invoking Bison}.
If not specified, its name is made by removing @samp{.tab.c} or @samp{.c}
from the parser implementation file name, and adding @samp{.xml} instead.
@item xml2dot.xsl
Used to output a copy of the DOT visualization of the automaton.
@item xml2text.xsl
-Used to output a copy of the .output file.
+Used to output a copy of the @samp{.output} file.
@item xml2xhtml.xsl
-Used to output an xhtml enhancement of the .output file.
+Used to output an xhtml enhancement of the @samp{.output} file.
@end table
-Sample usage (requires @code{xsltproc}):
+Sample usage (requires @command{xsltproc}):
@example
-$ bison -x input.y
+$ bison -x gr.y
@group
$ bison --print-datadir
/usr/local/share/bison
@end group
-$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl input.xml > input.html
+$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl gr.xml >gr.html
@end example
@c ================================================= Tracing
@noindent
The previous reduction demonstrates the @code{%printer} directive for
-@code{<val>}: both the token @code{NUM} and the resulting non-terminal
+@code{<val>}: both the token @code{NUM} and the resulting nonterminal
@code{exp} have @samp{1} as value.
@example
short option. It is followed by a cross key alphabetized by long
option.
-@c Please, keep this ordered as in `bison --help'.
+@c Please, keep this ordered as in 'bison --help'.
@noindent
Operations modes:
@table @option
Deprecated constructs whose support will be removed in future versions of
Bison.
+@item empty-rule
+Empty rules without @code{%empty}. @xref{Empty Rules}. Disabled by
+default, but enabled by uses of @code{%empty}, unless
+@option{-Wno-empty-rule} was specified.
+
+@item precedence
+Useless precedence and associativity directives. Disabled by default.
+
+Consider for instance the following grammar:
+
+@example
+@group
+%nonassoc "="
+%left "+"
+%left "*"
+%precedence "("
+@end group
+%%
+@group
+stmt:
+ exp
+| "var" "=" exp
+;
+@end group
+
+@group
+exp:
+ exp "+" exp
+| exp "*" "num"
+| "(" exp ")"
+| "num"
+;
+@end group
+@end example
+
+Bison reports:
+
+@c cannot leave the location and the [-Wprecedence] for lack of
+@c width in PDF.
+@example
+@group
+warning: useless precedence and associativity for "="
+ %nonassoc "="
+ ^^^
+@end group
+@group
+warning: useless associativity for "*", use %precedence
+ %left "*"
+ ^^^
+@end group
+@group
+warning: useless precedence for "("
+ %precedence "("
+ ^^^
+@end group
+@end example
+
+One would get the exact same parser with the following directives instead:
+
+@example
+@group
+%left "+"
+%precedence "*"
+@end group
+@end example
+
@item other
All warnings not categorized above. These warnings are enabled by default.
categories.
@item all
-All the warnings.
+All the warnings except @code{yacc}.
+
@item none
Turn off all the warnings.
+
@item error
See @option{-Werror}, below.
@end table
$ bison -Werror=yacc,conflicts-sr input.y
$ bison -Werror=yacc,error=conflicts-sr input.y
@end example
+
+@item -f [@var{feature}]
+@itemx --feature[=@var{feature}]
+Activate miscellaneous @var{feature}. @var{feature} can be one of:
+@table @code
+@item caret
+@itemx diagnostics-show-caret
+Show caret errors, in a manner similar to GCC's
+@option{-fdiagnostics-show-caret}, or Clang's @option{-fcaret-diagnotics}. The
+location provided with the message is used to quote the corresponding line of
+the source file, underlining the important part of it with carets (^). Here is
+an example, using the following file @file{in.y}:
+
+@example
+%type <ival> exp
+%%
+exp: exp '+' exp @{ $exp = $1 + $2; @};
+@end example
+
+When invoked with @option{-fcaret} (or nothing), Bison will report:
+
+@example
+@group
+in.y:3.20-23: error: ambiguous reference: '$exp'
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^^
+@end group
+@group
+in.y:3.1-3: refers to: $exp at $$
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^
+@end group
+@group
+in.y:3.6-8: refers to: $exp at $1
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^
+@end group
+@group
+in.y:3.14-16: refers to: $exp at $3
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^
+@end group
+@group
+in.y:3.32-33: error: $2 of 'exp' has no declared type
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^
+@end group
+@end example
+
+Whereas, when invoked with @option{-fno-caret}, Bison will only report:
+
+@example
+@group
+in.y:3.20-23: error: ambiguous reference: ‘$exp’
+in.y:3.1-3: refers to: $exp at $$
+in.y:3.6-8: refers to: $exp at $1
+in.y:3.14-16: refers to: $exp at $3
+in.y:3.32-33: error: $2 of ‘exp’ has no declared type
+@end group
+@end example
+
+This option is activated by default.
+
+@end table
@end table
@noindent
Summary}). Currently supported languages include C, C++, and Java.
@var{language} is case-insensitive.
-This option is experimental and its effect may be modified in future
-releases.
-
@item --locations
Pretend that @code{%locations} was specified. @xref{Decl Summary}.
@node C++ Variants
@subsubsection C++ Variants
-Starting with version 2.6, Bison provides a @emph{variant} based
-implementation of semantic values for C++. This alleviates all the
-limitations reported in the previous section, and in particular, object
-types can be used without pointers.
+Bison provides a @emph{variant} based implementation of semantic values for
+C++. This alleviates all the limitations reported in the previous section,
+and in particular, object types can be used without pointers.
To enable variant-based semantic values, set @code{%define} variable
@code{variant} (@pxref{%define Summary,, variant}). Once this defined,
type on all platforms, alignments are enforced for @code{double} whatever
types are actually used. This may waste space in some cases.
-@item
-Our implementation is not conforming with strict aliasing rules. Alias
-analysis is a technique used in optimizing compilers to detect when two
-pointers are disjoint (they cannot ``meet''). Our implementation breaks
-some of the rules that G++ 4.4 uses in its alias analysis, so @emph{strict
-alias analysis must be disabled}. Use the option
-@option{-fno-strict-aliasing} to compile the generated parser.
-
@item
There might be portability issues we are not aware of.
@end itemize
@node Complete Symbols
@subsubsection Complete Symbols
-If you specified both @code{%define variant} and
+If you specified both @code{%define api.value.type variant} and
@code{%define api.token.constructor},
the @code{parser} class also defines the class @code{parser::symbol_type}
which defines a @emph{complete} symbol, aggregating its type (i.e., the
@noindent
@findex %define api.token.constructor
-@findex %define variant
+@findex %define api.value.type variant
This example will use genuine C++ objects as semantic values, therefore, we
require the variant-based interface. To make sure we properly use it, we
enable assertions. To fully benefit from type-safety and more natural
@comment file: calc++-parser.yy
@example
%define api.token.constructor
+%define api.value.type variant
%define parse.assert
-%define variant
@end example
@noindent
unit: assignments exp @{ driver.result = $2; @};
assignments:
- /* Nothing. */ @{@}
+ %empty @{@}
| assignments assignment @{@};
assignment:
@comment file: calc++-scanner.ll
@example
-%option noyywrap nounput batch debug
+%option noyywrap nounput batch debug noinput
@end example
@noindent
By default, the semantic stack is declared to have @code{Object} members,
which means that the class types you specify can be of any class.
To improve the type safety of the parser, you can declare the common
-superclass of all the semantic values using the @samp{%define stype}
+superclass of all the semantic values using the @samp{%define api.value.type}
directive. For example, after the following declaration:
@example
-%define stype "ASTNode"
+%define api.value.type "ASTNode"
@end example
@noindent
@deftypemethod {Lexer} {Object} getLVal ()
Return the semantic value of the last token that yylex returned.
-The return type can be changed using @samp{%define stype
+The return type can be changed using @samp{%define api.value.type
"@var{class-name}".}
@end deftypemethod
@defvar $$
The semantic value for the grouping made by the current rule. As a
value, this is in the base type (@code{Object} or as specified by
-@samp{%define stype}) as in not cast to the declared subtype because
+@samp{%define api.value.type}) as in not cast to the declared subtype because
casts are not allowed on the left-hand side of Java assignments.
Use an explicit Java cast if the correct subtype is needed.
@xref{Java Semantic Values}.
@item
Java lacks unions, so @code{%union} has no effect. Instead, semantic
values have a common base type: @code{Object} or as specified by
-@samp{%define stype}. Angle brackets on @code{%token}, @code{type},
+@samp{%define api.value.type}. Angle brackets on @code{%token}, @code{type},
@code{$@var{n}} and @code{$$} specify subtypes rather than fields of
an union. The type of @code{$$}, even with angle brackets, is the base
type since Java casts are not allow on the left-hand side of assignments.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define stype} "@var{class}"
+@deffn {Directive} {%define api.value.type} "@var{class}"
The base type of semantic values. Default is @code{Object}.
@xref{Java Semantic Values}.
@end deffn
version. If you have trouble compiling, you should also include a
transcript of the build session, starting with the invocation of
`configure'. Depending on the nature of the bug, you may be asked to
-send additional files as well (such as `config.h' or `config.cache').
+send additional files as well (such as @file{config.h} or @file{config.cache}).
Patches are most welcome, but not required. That is, do not hesitate to
send a bug report just because you cannot provide a fix.
@end deffn
@deffn {Variable} @@@var{n}
+@deffnx {Symbol} @@@var{n}
In an action, the location of the @var{n}-th symbol of the right-hand side
of the rule. @xref{Tracking Locations}.
+
+In a grammar, the Bison-generated nonterminal symbol for a mid-rule action
+with a semantical value. @xref{Mid-Rule Action Translation}.
@end deffn
@deffn {Variable} @@@var{name}
-In an action, the location of a symbol addressed by name. @xref{Tracking
-Locations}.
+@deffnx {Variable} @@[@var{name}]
+In an action, the location of a symbol addressed by @var{name}.
+@xref{Tracking Locations}.
@end deffn
-@deffn {Variable} @@[@var{name}]
-In an action, the location of a symbol addressed by name. @xref{Tracking
-Locations}.
+@deffn {Symbol} $@@@var{n}
+In a grammar, the Bison-generated nonterminal symbol for a mid-rule action
+with no semantical value. @xref{Mid-Rule Action Translation}.
@end deffn
@deffn {Variable} $$
@end deffn
@deffn {Variable} $@var{name}
-In an action, the semantic value of a symbol addressed by name.
-@xref{Actions}.
-@end deffn
-
-@deffn {Variable} $[@var{name}]
-In an action, the semantic value of a symbol addressed by name.
+@deffnx {Variable} $[@var{name}]
+In an action, the semantic value of a symbol addressed by @var{name}.
@xref{Actions}.
@end deffn
feature.
@end deffn
-@deffn {Construct} /*@dots{}*/
-Comment delimiters, as in C.
+@deffn {Construct} /* @dots{} */
+@deffnx {Construct} // @dots{}
+Comments, as in C/C++.
@end deffn
@deffn {Delimiter} :
GLR Parsers}.
@end deffn
+@deffn {Directive} %empty
+Bison declaration to declare make explicit that a rule has an empty
+right-hand side. @xref{Empty Rules}.
+@end deffn
+
@deffn {Symbol} $end
The predefined token marking the end of the token stream. It cannot be
used in the grammar.
@code{yylex}}.
@end deffn
-@deffn {Macro} YYLEX_PARAM
-An obsolete macro for specifying an extra argument (or list of extra
-arguments) for @code{yyparse} to pass to @code{yylex}. The use of this
-macro is deprecated, and is supported only for Yacc like parsers.
-@xref{Pure Calling,, Calling Conventions for Pure Parsers}.
-@end deffn
-
@deffn {Variable} yylloc
External variable in which @code{yylex} should place the line and column
numbers associated with a token. (In a pure parser, it is a local
@deffn {Variable} yynerrs
Global variable which Bison increments each time it reports a syntax error.
(In a pure parser, it is a local variable within @code{yyparse}. In a
-pure push parser, it is a member of yypstate.)
+pure push parser, it is a member of @code{yypstate}.)
@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
@end deffn
@item Accepting state
A state whose only action is the accept action.
The accepting state is thus a consistent state.
-@xref{Understanding,,}.
+@xref{Understanding, ,Understanding Your Parser}.
@item Backus-Naur Form (BNF; also called ``Backus Normal Form'')
Formal method of specifying context-free grammars originally proposed