@dfn{nonterminal symbols}; those which can't be subdivided are called
@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
corresponding to a single terminal symbol a @dfn{token}, and a piece
-corresponding to a single nonterminal symbol a @dfn{grouping}.@refill
+corresponding to a single nonterminal symbol a @dfn{grouping}.
We can use the C language as an example of what symbols, terminal and
nonterminal, mean. The tokens of C are identifiers, constants (numeric and
@emph{any} integer constant is grammatically valid in that position. The
precise value of the constant is irrelevant to how to parse the input: if
@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equally
-grammatical.@refill
+grammatical.
But the precise value is very important for what the input means once it is
parsed. A compiler is useless if it fails to distinguish between 4, 1 and
@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everything
you need to know to decide where the token may validly appear and how to
group it with other tokens. The grammar rules know nothing about tokens
-except their types.@refill
+except their types.
The semantic value has all the rest of the information about the
meaning of the token, such as the value of an integer, or the name of an
@example
%@{
-@var{Prologue (declarations)}
+@var{Prologue}
%@}
@var{Bison declarations}
%%
@var{Grammar rules}
%%
-@var{Epilogue (additional code)}
+@var{Epilogue}
@end example
@noindent
@code{yyerrok}, a macro defined automatically by Bison; its meaning is
that error recovery is complete (@pxref{Error Recovery}). Note the
difference between @code{yyerrok} and @code{yyerror}; neither one is a
-misprint.@refill
+misprint.
This form of error recovery deals with syntax errors. There are other
kinds of errors; for example, division by zero, which raises an exception
(@code{VAR} or @code{FNCT}) is returned to @code{yyparse}. If it is not
already in the table, then it is installed as a @code{VAR} using
@code{putsym}. Again, a pointer and its type (which must be @code{VAR}) is
-returned to @code{yyparse}.@refill
+returned to @code{yyparse}.
No change is needed in the handling of numeric values and arithmetic
operators in @code{yylex}.
@cindex Prologue
@cindex declarations
-The @var{prologue} section contains macro definitions and
+The @var{Prologue} section contains macro definitions and
declarations of functions and variables that are used in the actions in the
grammar rules. These are copied to the beginning of the parser file so
that they precede the definition of @code{yyparse}. You can use
@cindex epilogue
@cindex C code, section for additional
-The @var{epilogue} is copied verbatim to the end of the parser file, just as
-the @var{prologue} is copied to the beginning. This is the most convenient
+The @var{Epilogue} is copied verbatim to the end of the parser file, just as
+the @var{Prologue} is copied to the beginning. This is the most convenient
place to put anything that you want to have in the parser file but which need
not come before the definition of @code{yyparse}. For example, the
definitions of @code{yylex} and @code{yyerror} often go here.
The symbol @code{error} is a terminal symbol reserved for error recovery
(@pxref{Error Recovery}); you shouldn't use it for any other purpose.
-In particular, @code{yylex} should never return this value.
-The default value of the error token is 256, so in the
-unlikely event that you need to use a character token with numeric
-value 256 you must reassign the error token's value with a
-@code{%token} declaration.
+In particular, @code{yylex} should never return this value. The default
+value of the error token is 256, unless you explicitly assigned 256 to
+one of your tokens with a @code{%token} declaration.
@node Rules
@section Syntax of Grammar Rules
The sum is stored into @code{$$} so that it becomes the semantic value of
the addition-expression just recognized by the rule. If there were a
useful semantic value associated with the @samp{+} token, it could be
-referred to as @code{$2}.@refill
+referred to as @code{$2}.
+
+Note that the vertical-bar character @samp{|} is really a rule
+separator, and actions are attached to a single rule. This differs
+from tools like Flex, for which @samp{|} stands for either
+``or'', or ``the same action as that of the next rule''. In the
+following example, the action is triggered only when @samp{b} is found:
+
+@example
+@group
+a-or-b: 'a'|'b' @{ a_or_b_found = 1; @};
+@end group
+@end example
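
To make the point above concrete, here is a small hand-written C model of the two rules. It is only a sketch: the `struct rule` table, `mark_found`, and `reduce_a_or_b` are hypothetical names invented for illustration and are not what Bison generates. It shows that each alternative is its own rule with its own (possibly absent) action, so the action written after `'b'` never runs for `'a'`.

```c
#include <stddef.h>

/* Flag set by the action, as in the grammar fragment above. */
static int a_or_b_found = 0;

/* The user-written action; it belongs to the 'b' rule only. */
static void mark_found(void)
{
  a_or_b_found = 1;
}

/* Hypothetical model: one entry per rule, pairing the token the rule
   matches with that rule's own action (NULL means no user action). */
struct rule
{
  int token;
  void (*action)(void);
};

static const struct rule a_or_b_rules[] = {
  { 'a', NULL },       /* a-or-b: 'a'          no action written here  */
  { 'b', mark_found }, /*       | 'b' { ... }  action attached here    */
};

/* Return 1 if TOKEN matches one of the rules (running that rule's
   action, if any), 0 otherwise. */
static int reduce_a_or_b(int token)
{
  size_t i;
  for (i = 0; i < sizeof a_or_b_rules / sizeof a_or_b_rules[0]; i++)
    if (a_or_b_rules[i].token == token)
      {
        if (a_or_b_rules[i].action != NULL)
          a_or_b_rules[i].action();
        return 1;
      }
  return 0;
}
```

In this model, matching `'a'` succeeds but leaves `a_or_b_found` untouched. To set the flag for both alternatives in a real grammar, the action would have to be written after each of them.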
@cindex default action
If you don't specify an action for a rule, Bison supplies a default:
must declare a choice among these types for each terminal or nonterminal
symbol that can have a semantic value. Then each time you use @code{$$} or
@code{$@var{n}}, its data type is determined by which symbol it refers to
-in the rule. In this example,@refill
+in the rule. In this example,
@example
@group
@code{$1} and @code{$3} refer to instances of @code{exp}, so they all
have the data type declared for the nonterminal symbol @code{exp}. If
@code{$2} were used, it would have the data type declared for the
-terminal symbol @code{'+'}, whatever that might be.@refill
+terminal symbol @code{'+'}, whatever that might be.
Alternatively, you can specify the data type when you refer to the value,
by inserting @samp{<@var{type}>} after the @samp{$} at the beginning of the
@code{YYSTYPE}, as well as a few @code{extern} variable declarations.
If the parser output file is named @file{@var{name}.c} then this file
-is named @file{@var{name}.h}.@refill
+is named @file{@var{name}.h}.
This output file is essential if you wish to put the definition of
@code{yylex} in a separate source file, because @code{yylex} needs to
be able to refer to token type codes and the variable
-@code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}.@refill
+@code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}.
@item %file-prefix="@var{prefix}"
Specify a prefix to use for all Bison output file names. The names are
operator precedence and the unresolved ones.
The file's name is made by removing @samp{.tab.c} or @samp{.c} from
-the parser output file name, and adding @samp{.output} instead.@refill
+the parser output file name, and adding @samp{.output} instead.
Therefore, if the input file is @file{foo.y}, then the parser file is
called @file{foo.tab.c} by default. As a consequence, the verbose
-output file is called @file{foo.output}.@refill
+output file is called @file{foo.output}.
@item %yacc
Pretend the option @option{--yacc} was given, i.e., imitate Yacc,
To do this, use the @samp{-d} option when you run Bison, so that it will
write these macro definitions into a separate header file
@file{@var{name}.tab.h} which you can include in the other source files
-that need it. @xref{Invocation, ,Invoking Bison}.@refill
+that need it. @xref{Invocation, ,Invoking Bison}.
@menu
* Calling Convention:: How @code{yyparse} calls @code{yylex}.
@item $<@var{typealt}>@var{n}
Like @code{$@var{n}} but specifies alternative @var{typealt} in the
union specified by the @code{%union} declaration.
-@xref{Action Types, ,Data Types of Values in Actions}.@refill
+@xref{Action Types, ,Data Types of Values in Actions}.
@item YYABORT;
Return immediately from @code{yyparse}, indicating failure.
@code{%nonassoc}, can only be used once for a given token; so a token has
only one precedence declared in this way. For context-dependent
precedence, you need to use an additional mechanism: the @code{%prec}
-modifier for rules.@refill
+modifier for rules.
The @code{%prec} modifier declares the precedence of a particular rule by
specifying a terminal symbol whose precedence should be used for that rule.
@node Debugging
@chapter Debugging Your Parser
-@findex YYDEBUG
@findex yydebug
@cindex debugging
@cindex tracing the parser
If a Bison grammar compiles properly but doesn't do what you want when it
runs, the @code{yydebug} parser-trace feature can help you figure out why.
-To enable compilation of trace facilities, you must define the macro
-@code{YYDEBUG} to a nonzero value when you compile the parser. You
-could use @samp{-DYYDEBUG=1} as a compiler option or you could put
-@samp{#define YYDEBUG 1} in the prologue of the grammar file
-(@pxref{Prologue, , The Prologue}). Alternatively, use the @samp{-t}
-option when you run Bison (@pxref{Invocation, ,Invoking Bison}) or the
-@code{%debug} declaration (@pxref{Decl Summary, ,Bison Declaration
-Summary}). We suggest that you always define @code{YYDEBUG} so that
-debugging is always possible.
+There are several means to enable compilation of trace facilities:
+
+@table @asis
+@item the macro @code{YYDEBUG}
+@findex YYDEBUG
+Define the macro @code{YYDEBUG} to a nonzero value when you compile the
+parser. This is compliant with POSIX Yacc. You could use
+@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define
+YYDEBUG 1} in the prologue of the grammar file (@pxref{Prologue, , The
+Prologue}).
+
+@item the option @option{-t}, @option{--debug}
+Use the @samp{-t} option when you run Bison (@pxref{Invocation,
+,Invoking Bison}). This is POSIX compliant too.
+
+@item the directive @samp{%debug}
+@findex %debug
+Add the @code{%debug} directive (@pxref{Decl Summary, ,Bison
+Declaration Summary}). This is a Bison extension, which will prove
+useful when Bison outputs parsers for languages that don't use a
+preprocessor. Unless POSIX and Yacc portability matter to you, this is
+the preferred solution.
+@end table
+
+We suggest that you always enable the debug option so that debugging is
+always possible.
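
Compiling the trace code in is only half of the job: tracing is then switched on at run time by storing a nonzero value in the variable `yydebug`. The C sketch below illustrates the usual guard pattern. It assumes tracing was compiled in (as with `-DYYDEBUG=1`), and it defines its own `yydebug` as a stand-in so the fragment is self-contained; `enable_parser_trace` is a hypothetical helper name, not part of Bison.

```c
/* Assumption: trace facilities are compiled in, as with -DYYDEBUG=1. */
#ifndef YYDEBUG
# define YYDEBUG 1
#endif

#if YYDEBUG
/* Stand-in for the variable defined by the Bison-generated parser.
   In a real program, write `extern int yydebug;` instead. */
int yydebug = 0;
#endif

/* Hypothetical helper: request tracing at run time.
   Returns 1 if tracing is now active, 0 if the trace code was not
   compiled into the parser. */
static int enable_parser_trace(void)
{
#if YYDEBUG
  yydebug = 1;
  return 1;
#else
  return 0;
#endif
}
```

A program would typically call such a helper (or simply assign `yydebug = 1;`) before calling `yyparse`, for instance in response to a command-line flag.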
The trace facility outputs messages with macro calls of the form
@code{YYFPRINTF (stderr, @var{format}, @var{args})} where
@file{y.tab.c}, and the other outputs are called @file{y.output} and
@file{y.tab.h}. The purpose of this option is to imitate Yacc's output
file name conventions. Thus, the following shell script can substitute
-for Yacc:@refill
+for Yacc:
@example
bison -y $*
@cindex symbols in Bison, table of
@table @code
+@item @@$
+In an action, the location of the left-hand side of the rule.
+@xref{Locations, , Locations Overview}.
+
+@item @@@var{n}
+In an action, the location of the @var{n}-th symbol of the right-hand
+side of the rule. @xref{Locations, , Locations Overview}.
+
+@item $$
+In an action, the semantic value of the left-hand side of the rule.
+@xref{Actions}.
+
+@item $@var{n}
+In an action, the semantic value of the @var{n}-th symbol of the
+right-hand side of the rule. @xref{Actions}.
+
@item error
A token name reserved for error recovery. This token may be used in
grammar rules so as to allow the Bison parser to recognize an error in
Macro to discard a value from the parser stack and fake a look-ahead
token. @xref{Action Features, ,Special Features for Use in Actions}.
+@item YYDEBUG
+Macro to define to equip the parser with tracing code. @xref{Debugging,
+,Debugging Your Parser}.
+
@item YYERROR
Macro to pretend that a syntax error has just been detected: call
@code{yyerror} and then perform normal error recovery if possible
values. @xref{Union Decl, ,The Collection of Value Types}.
@end table
+@sp 1
+
These are the punctuation and delimiters used in Bison input:
@table @samp