@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
@cindex ambiguous grammars
@cindex non-deterministic parsing
-Parsers for @acronym{LALR}(1) grammars are @dfn{deterministic},
-meaning roughly that
-the next grammar rule to apply at any point in the input is uniquely
-determined by the preceding input and a fixed, finite portion (called
-a @dfn{look-ahead}) of the remaining input.
-A context-free grammar can be @dfn{ambiguous}, meaning that
-there are multiple ways to apply the grammar rules to get the some inputs.
-Even unambiguous grammars can be @dfn{non-deterministic}, meaning that no
-fixed look-ahead always suffices to determine the next grammar rule to apply.
-With the proper declarations, Bison is also able to parse these more general
-context-free grammars, using a technique known as @acronym{GLR} parsing (for
-Generalized @acronym{LR}). Bison's @acronym{GLR} parsers are able to
-handle any context-free
-grammar for which the number of possible parses of any given string
-is finite.
+
+Parsers for @acronym{LALR}(1) grammars are @dfn{deterministic}, meaning
+roughly that the next grammar rule to apply at any point in the input is
+uniquely determined by the preceding input and a fixed, finite portion
+(called a @dfn{look-ahead}) of the remaining input. A context-free
+grammar can be @dfn{ambiguous}, meaning that there are multiple ways to
+apply the grammar rules to get the same inputs. Even unambiguous
+grammars can be @dfn{non-deterministic}, meaning that no fixed
+look-ahead always suffices to determine the next grammar rule to apply.
+With the proper declarations, Bison is also able to parse these more
+general context-free grammars, using a technique known as @acronym{GLR}
+parsing (for Generalized @acronym{LR}). Bison's @acronym{GLR} parsers
+are able to handle any context-free grammar for which the number of
+possible parses of any given string is finite.
@cindex symbols (abstract)
@cindex token
@cindex syntactic grouping
@cindex grouping, syntactic
-In the formal grammatical rules for a language, each kind of syntactic unit
-or grouping is named by a @dfn{symbol}. Those which are built by grouping
-smaller constructs according to grammatical rules are called
+In the formal grammatical rules for a language, each kind of syntactic
+unit or grouping is named by a @dfn{symbol}. Those which are built by
+grouping smaller constructs according to grammatical rules are called
@dfn{nonterminal symbols}; those which can't be subdivided are called
@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
corresponding to a single terminal symbol a @dfn{token}, and a piece
corresponding to a single nonterminal symbol a @dfn{grouping}.
We can use the C language as an example of what symbols, terminal and
-nonterminal, mean. The tokens of C are identifiers, constants (numeric and
-string), and the various keywords, arithmetic operators and punctuation
-marks. So the terminal symbols of a grammar for C include `identifier',
-`number', `string', plus one symbol for each keyword, operator or
-punctuation mark: `if', `return', `const', `static', `int', `char',
-`plus-sign', `open-brace', `close-brace', `comma' and many more. (These
-tokens can be subdivided into characters, but that is a matter of
+nonterminal, mean. The tokens of C are identifiers, constants (numeric
+and string), and the various keywords, arithmetic operators and
+punctuation marks. So the terminal symbols of a grammar for C include
+`identifier', `number', `string', plus one symbol for each keyword,
+operator or punctuation mark: `if', `return', `const', `static', `int',
+`char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
+(These tokens can be subdivided into characters, but that is a matter of
lexicography, not grammar.)
Here is a simple C function subdivided into tokens:
@cindex conflicts
@cindex shift/reduce conflicts
-In some grammars, there will be cases where Bison's standard @acronym{LALR}(1)
-parsing algorithm cannot decide whether to apply a certain grammar rule
-at a given point. That is, it may not be able to decide (on the basis
-of the input read so far) which of two possible reductions (applications
-of a grammar rule) applies, or whether to apply a reduction or read more
-of the input and apply a reduction later in the input. These are known
-respectively as @dfn{reduce/reduce} conflicts (@pxref{Reduce/Reduce}),
-and @dfn{shift/reduce} conflicts (@pxref{Shift/Reduce}).
-
-To use a grammar that is not easily modified to be @acronym{LALR}(1), a more
-general parsing algorithm is sometimes necessary. If you include
+In some grammars, there will be cases where Bison's standard
+@acronym{LALR}(1) parsing algorithm cannot decide whether to apply a
+certain grammar rule at a given point. That is, it may not be able to
+decide (on the basis of the input read so far) which of two possible
+reductions (applications of a grammar rule) applies, or whether to apply
+a reduction or read more of the input and apply a reduction later in the
+input. These are known respectively as @dfn{reduce/reduce} conflicts
+(@pxref{Reduce/Reduce}), and @dfn{shift/reduce} conflicts
+(@pxref{Shift/Reduce}).
+
+To use a grammar that is not easily modified to be @acronym{LALR}(1), a
+more general parsing algorithm is sometimes necessary. If you include
@code{%glr-parser} among the Bison declarations in your file
-(@pxref{Grammar Outline}), the result will be a Generalized
-@acronym{LR} (@acronym{GLR})
-parser. These parsers handle Bison grammars that contain no unresolved
-conflicts (i.e., after applying precedence declarations) identically to
-@acronym{LALR}(1) parsers. However, when faced with unresolved
-shift/reduce and reduce/reduce conflicts, @acronym{GLR} parsers use
-the simple expedient of doing
-both, effectively cloning the parser to follow both possibilities. Each
-of the resulting parsers can again split, so that at any given time,
-there can be any number of possible parses being explored. The parsers
+(@pxref{Grammar Outline}), the result will be a Generalized @acronym{LR}
+(@acronym{GLR}) parser. These parsers handle Bison grammars that
+contain no unresolved conflicts (i.e., after applying precedence
+declarations) identically to @acronym{LALR}(1) parsers. However, when
+faced with unresolved shift/reduce and reduce/reduce conflicts,
+@acronym{GLR} parsers use the simple expedient of doing both,
+effectively cloning the parser to follow both possibilities. Each of
+the resulting parsers can again split, so that at any given time, there
+can be any number of possible parses being explored. The parsers
proceed in lockstep; that is, all of them consume (shift) a given input
symbol before any of them proceed to the next. Each of the cloned
parsers eventually meets one of two possible fates: either it runs into
"x" y z + T <init-declare> x T <cast> y z + = <OR>
@end example
+@sp 1
+
+@cindex @code{inline}
+@cindex @acronym{GLR} parsers and @code{inline}
+Note that the @acronym{GLR} parsers require an ISO C89 compiler. In
+addition, they use the @code{inline} keyword, which is not C89, but a
+common extension. It is up to the user of these parsers to handle
+portability issues. For instance, if using Autoconf and the Autoconf
+macro @code{AC_C_INLINE}, a mere
+
+@example
+%@{
+#include <config.h>
+%@}
+@end example
+
+@noindent
+will suffice. Otherwise, we suggest
+
+@example
+%@{
+#if ! defined __GNUC__ && ! defined inline
+# define inline
+#endif
+%@}
+@end example
@node Locations Overview
@section Locations
;
line: '\n'
- | exp '\n' @{ printf ("\t%.10g\n", $1); @}
+ | exp '\n' @{ printf ("\t%.10g\n", $1); @}
;
-exp: NUM @{ $$ = $1; @}
- | exp exp '+' @{ $$ = $1 + $2; @}
- | exp exp '-' @{ $$ = $1 - $2; @}
- | exp exp '*' @{ $$ = $1 * $2; @}
- | exp exp '/' @{ $$ = $1 / $2; @}
- /* Exponentiation */
- | exp exp '^' @{ $$ = pow ($1, $2); @}
- /* Unary minus */
- | exp 'n' @{ $$ = -$1; @}
+exp: NUM @{ $$ = $1; @}
+ | exp exp '+' @{ $$ = $1 + $2; @}
+ | exp exp '-' @{ $$ = $1 - $2; @}
+ | exp exp '*' @{ $$ = $1 * $2; @}
+ | exp exp '/' @{ $$ = $1 / $2; @}
+ /* Exponentiation */
+ | exp exp '^' @{ $$ = pow ($1, $2); @}
+ /* Unary minus */
+ | exp 'n' @{ $$ = -$1; @}
;
%%
@end example
When @code{yyparse} detects a syntax error, it calls the error reporting
function @code{yyerror} to print an error message (usually but not
-always @code{"parse error"}). It is up to the programmer to supply
+always @code{"syntax error"}). It is up to the programmer to supply
@code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so
here is the definition we will use:
@end example
This addition to the grammar allows for simple error recovery in the
-event of a parse error. If an expression that cannot be evaluated is
+event of a syntax error. If an expression that cannot be evaluated is
read, the error will be recognized by the third rule for @code{line},
and parsing will continue. (The @code{yyerror} function is still called
upon to print its message as well.) The action executes the statement
yylex (void)
@{
int c;
+@end group
+@group
/* Skip white space. */
while ((c = getchar ()) == ' ' || c == '\t')
++yylloc.last_column;
+@end group
+@group
/* Step. */
yylloc.first_line = yylloc.last_line;
yylloc.first_column = yylloc.last_column;
Here are the C and Bison declarations for the multi-function calculator.
@smallexample
+@group
%@{
#include <math.h> /* For math functions, cos(), sin(), etc. */
-#include "calc.h" /* Contains definition of `symrec' */
+#include "calc.h" /* Contains definition of `symrec' */
%@}
+@end group
+@group
%union @{
-double val; /* For returning numbers. */
-symrec *tptr; /* For returning symbol-table pointers */
+ double val; /* For returning numbers. */
+ symrec *tptr; /* For returning symbol-table pointers. */
@}
-
-%token <val> NUM /* Simple double precision number */
-%token <tptr> VAR FNCT /* Variable and Function */
+@end group
+%token <val> NUM /* Simple double precision number. */
+%token <tptr> VAR FNCT /* Variable and Function. */
%type <val> exp
+@group
%right '='
%left '-' '+'
%left '*' '/'
%left NEG /* Negation--unary minus */
%right '^' /* Exponentiation */
-
+@end group
/* Grammar follows */
%%
@end smallexample
those which mention @code{VAR} or @code{FNCT}, are new.
@smallexample
+@group
input: /* empty */
| input line
;
+@end group
+@group
line:
'\n'
| exp '\n' @{ printf ("\t%.10g\n", $1); @}
| error '\n' @{ yyerrok; @}
;
+@end group
+@group
exp: NUM @{ $$ = $1; @}
| VAR @{ $$ = $1->value.var; @}
| VAR '=' exp @{ $$ = $3; $1->value.var = $3; @}
| exp '^' exp @{ $$ = pow ($1, $3); @}
| '(' exp ')' @{ $$ = $2; @}
;
+@end group
/* End of grammar */
%%
@end smallexample
@code{init_table} as well:
@smallexample
-@group
#include <stdio.h>
+@group
int
main (void)
@{
@{
printf ("%s\n", s);
@}
+@end group
+@group
struct init
@{
char *fname;
"sqrt", sqrt,
0, 0
@};
+@end group
+@group
/* The symbol table: a chain of `struct symrec'. */
symrec *sym_table = (symrec *) 0;
@end group
@smallexample
@group
#include <ctype.h>
+@end group
+@group
int
yylex (void)
@{
if (i == length)
@{
length *= 2;
- symbuf = (char *)realloc (symbuf, length + 1);
+ symbuf = (char *) realloc (symbuf, length + 1);
@}
/* Add this character to the buffer. */
symbuf[i++] = c;
@strong{Warning:} as of Bison 1.875, this feature is still considered as
experimental, as there has not been enough user feedback. In particular,
-the syntax might still change, and for the time being, only the default
-@acronym{LALR}(1) skeleton supports this feature.
+the syntax might still change.
@end deffn
For instance:
Here is a summary of the declarations used to define a grammar:
-@table @code
-@item %union
+@deffn {Directive} %union
Declare the collection of data types that semantic values may have
(@pxref{Union Decl, ,The Collection of Value Types}).
+@end deffn
-@item %token
+@deffn {Directive} %token
Declare a terminal symbol (token type name) with no precedence
or associativity specified (@pxref{Token Decl, ,Token Type Names}).
+@end deffn
-@item %right
+@deffn {Directive} %right
Declare a terminal symbol (token type name) that is right-associative
(@pxref{Precedence Decl, ,Operator Precedence}).
+@end deffn
-@item %left
+@deffn {Directive} %left
Declare a terminal symbol (token type name) that is left-associative
(@pxref{Precedence Decl, ,Operator Precedence}).
+@end deffn
-@item %nonassoc
+@deffn {Directive} %nonassoc
Declare a terminal symbol (token type name) that is nonassociative
(using it in a way that would be associative is a syntax error)
(@pxref{Precedence Decl, ,Operator Precedence}).
+@end deffn
-@item %type
+@deffn {Directive} %type
Declare the type of semantic values for a nonterminal symbol
(@pxref{Type Decl, ,Nonterminal Symbols}).
+@end deffn
-@item %start
+@deffn {Directive} %start
Specify the grammar's start symbol (@pxref{Start Decl, ,The
Start-Symbol}).
+@end deffn
-@item %expect
+@deffn {Directive} %expect
Declare the expected number of shift-reduce conflicts
(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
-@end table
+@end deffn
+
@sp 1
@noindent
In order to change the behavior of @command{bison}, use the following
directives:
-@table @code
-@item %debug
+@deffn {Directive} %debug
In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
already defined, so that the debugging facilities are compiled.
@xref{Tracing, ,Tracing Your Parser}.
+@end deffn
-@item %defines
+@deffn {Directive} %defines
Write an extra output file containing macro definitions for the token
type names defined in the grammar and the semantic value type
@code{YYSTYPE}, as well as a few @code{extern} variable declarations.
@code{yylex} in a separate source file, because @code{yylex} needs to
be able to refer to token type codes and the variable
@code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}.
+@end deffn
-@item %destructor
+@deffn {Directive} %destructor
Specify how the parser should reclaim the memory associated with
discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
+@end deffn
-@item %file-prefix="@var{prefix}"
+@deffn {Directive} %file-prefix="@var{prefix}"
Specify a prefix to use for all Bison output file names. The names are
chosen as if the input file were named @file{@var{prefix}.y}.
+@end deffn
-@c @item %header-extension
-@c Specify the extension of the parser header file generated when
-@c @code{%define} or @samp{-d} are used.
-@c
-@c For example, a grammar file named @file{foo.ypp} and containing a
-@c @code{%header-extension .hh} directive will produce a header file
-@c named @file{foo.tab.hh}
-
-@item %locations
+@deffn {Directive} %locations
Generate the code processing the locations (@pxref{Action Features,
,Special Features for Use in Actions}). This mode is enabled as soon as
the grammar uses the special @samp{@@@var{n}} tokens, but if your
grammar does not use it, using @samp{%locations} allows for more
-accurate parse error messages.
+accurate syntax error messages.
+@end deffn
-@item %name-prefix="@var{prefix}"
+@deffn {Directive} %name-prefix="@var{prefix}"
Rename the external symbols used in the parser so that they start with
@var{prefix} instead of @samp{yy}. The precise list of symbols renamed
is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
@samp{%name-prefix="c_"}, the names become @code{c_parse}, @code{c_lex},
and so on. @xref{Multiple Parsers, ,Multiple Parsers in the Same
Program}.
+@end deffn
-@item %no-parser
+@deffn {Directive} %no-parser
Do not include any C code in the parser file; generate tables only. The
parser file contains just @code{#define} directives and static variable
declarations.
This option also tells Bison to write the C code for the grammar actions
into a file named @file{@var{filename}.act}, in the form of a
brace-surrounded body fit for a @code{switch} statement.
+@end deffn
-@item %no-lines
+@deffn {Directive} %no-lines
Don't generate any @code{#line} preprocessor commands in the parser
file. Ordinarily Bison writes these commands in the parser file so that
the C compiler and debuggers will associate errors and object code with
your source file (the grammar file). This directive causes them to
associate errors with the parser file, treating it as an independent source
file in its own right.
+@end deffn
-@item %output="@var{filename}"
+@deffn {Directive} %output="@var{filename}"
Specify the @var{filename} for the parser file.
+@end deffn
-@item %pure-parser
+@deffn {Directive} %pure-parser
Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure
(Reentrant) Parser}).
+@end deffn
-@c @item %source-extension
-@c Specify the extension of the parser output file.
-@c
-@c For example, a grammar file named @file{foo.yy} and containing a
-@c @code{%source-extension .cpp} directive will produce a parser file
-@c named @file{foo.tab.cpp}
-
-@item %token-table
+@deffn {Directive} %token-table
Generate an array of token names in the parser file. The name of the
array is @code{yytname}; @code{yytname[@var{i}]} is the name of the
token whose internal Bison token code number is @var{i}. The first
@item YYNSTATES
The number of parser states (@pxref{Parser States}).
@end table
+@end deffn
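As a rough sketch of how a lexical analyzer might consult such a table,
the following searches for a multicharacter literal token by its
spelling. It assumes that such tokens are recorded with their
surrounding double quotes. The array @code{demo_tname} is an invented
stand-in for the generated @code{yytname}, and
@code{find_literal_token} is a hypothetical helper, not something Bison
provides.

```c
#include <string.h>

/* Stand-in for the table Bison would generate under %token-table;
   the entries are invented for illustration and end with a null
   pointer.  */
static const char *const demo_tname[] =
  { "$end", "error", "$undefined", "NUM", "\"<=\"", "'+'", 0 };

/* Return the index of the multicharacter literal token spelled TEXT
   (given without its quotes), or -1 if the table has no such entry.  */
static int
find_literal_token (const char *text)
{
  size_t len = strlen (text);
  int i;

  for (i = 0; demo_tname[i] != 0; i++)
    {
      const char *name = demo_tname[i];

      /* Match an opening quote, the spelling, a closing quote,
         and then the end of the string.  */
      if (name[0] == '"'
          && strncmp (name + 1, text, len) == 0
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

With a real parser, the index found this way is Bison's internal token
code number, as described above, and the loop would run over
@code{yytname} itself.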
-@item %verbose
+@deffn {Directive} %verbose
Write an extra output file containing verbose descriptions of the
parser states and what is done for each type of look-ahead token in
that state. @xref{Understanding, , Understanding Your Parser}, for more
information.
+@end deffn
-@item %yacc
+@deffn {Directive} %yacc
Pretend the option @option{--yacc} was given, i.e., imitate Yacc,
including its naming conventions. @xref{Bison Options}, for more.
-@end table
-
-
+@end deffn
@node Multiple Parsers
parameter information to it in a reentrant way. To do so, use the
declaration @code{%parse-param}:
-@deffn {Directive} %parse-param @var{argument-declaration} @var{argument-name}
+@deffn {Directive} %parse-param @{@var{argument-declaration}@}, @{@var{argument-name}@}
@findex %parse-param
Declare that @code{argument-name} is an additional @code{yyparse}
argument. This argument is also passed to @code{yyerror}. The
Here's an example. Write this in the parser:
@example
-%parse-param "int *nastiness" "nastiness"
-%parse-param "int *randomness" "randomness"
+%parse-param @{int *nastiness@}, @{nastiness@}
+%parse-param @{int *randomness@}, @{randomness@}
@end example
@noindent
@code{%lex-param} just like @code{%parse-param} (@pxref{Parser
Function}).
-@deffn {Directive} lex-param @var{argument-declaration} @var{argument-name}
+@deffn {Directive} %lex-param @{@var{argument-declaration}@}, @{@var{argument-name}@}
@findex %lex-param
Declare that @code{argument-name} is an additional @code{yylex}
argument.
For instance:
@example
-%parse-param "int *nastiness" "nastiness"
-%lex-param "int *nastiness" "nastiness"
-%parse-param "int *randomness" "randomness"
+%parse-param @{int *nastiness@}, @{nastiness@}
+%lex-param @{int *nastiness@}, @{nastiness@}
+%parse-param @{int *randomness@}, @{randomness@}
@end example
@noindent
@cindex parse error
@cindex syntax error
-The Bison parser detects a @dfn{parse error} or @dfn{syntax error}
+The Bison parser detects a @dfn{syntax error} or @dfn{parse error}
whenever it reads a token which cannot satisfy any syntax rule. An
action in the grammar can also explicitly proclaim an error, using the
macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use
The Bison parser expects to report the error by calling an error
reporting function named @code{yyerror}, which you must supply. It is
called by @code{yyparse} whenever a syntax error is found, and it
-receives one argument. For a parse error, the string is normally
-@w{@code{"parse error"}}.
+receives one argument. For a syntax error, the string is normally
+@w{@code{"syntax error"}}.
@findex %error-verbose
If you invoke the directive @code{%error-verbose} in the Bison
declarations section (@pxref{Bison Declarations, ,The Bison Declarations
Section}), then Bison provides a more verbose and specific error message
-string instead of just plain @w{@code{"parse error"}}.
+string instead of just plain @w{@code{"syntax error"}}.
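A minimal sketch of such a reporting function follows, assuming the
single-argument form used by ordinary (non-pure, Yacc-style) parsers.
The @code{error_count} variable is a hypothetical addition for
illustration only; it is not part of Bison's interface.

```c
#include <stdio.h>

/* Hypothetical counter kept by the program, for illustration only.  */
static int error_count = 0;

/* Called by yyparse with the message text, normally "syntax error"
   or a more specific string when verbose messages are requested.  */
void
yyerror (const char *msg)
{
  ++error_count;
  fprintf (stderr, "%s\n", msg);
}
```

Writing to @code{stderr} keeps diagnostics apart from the program's
normal output; a real program might also report the input position.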
The parser can detect one other kind of error: stack overflow. This
happens when the input contains constructions that are very deeply
void yyerror (YYLTYPE *locp, const char *msg); /* GLR parsers. */
@end example
-If @samp{%parse-param "int *nastiness" "nastiness"} is used, then:
+If @samp{%parse-param @{int *nastiness@}, @{nastiness@}} is used, then:
@example
void yyerror (int *nastiness, const char *msg); /* Yacc parsers. */
%locations
/* Pure yylex. */
%pure-parser
-%lex-param "int *nastiness" "nastiness"
+%lex-param @{int *nastiness@}, @{nastiness@}
/* Pure yyparse. */
-%parse-param "int *nastiness" "nastiness"
-%parse-param "int *randomness" "randomness"
+%parse-param @{int *nastiness@}, @{nastiness@}
+%parse-param @{int *randomness@}, @{randomness@}
@end example
@noindent
Here is a table of Bison constructs, variables and macros that
are useful in actions.
-@table @samp
-@item $$
+@deffn {Variable} $$
Acts like a variable that contains the semantic value for the
grouping made by the current rule. @xref{Actions}.
+@end deffn
-@item $@var{n}
+@deffn {Variable} $@var{n}
Acts like a variable that contains the semantic value for the
@var{n}th component of the current rule. @xref{Actions}.
+@end deffn
-@item $<@var{typealt}>$
+@deffn {Variable} $<@var{typealt}>$
Like @code{$$} but specifies alternative @var{typealt} in the union
specified by the @code{%union} declaration. @xref{Action Types, ,Data
Types of Values in Actions}.
+@end deffn
-@item $<@var{typealt}>@var{n}
+@deffn {Variable} $<@var{typealt}>@var{n}
Like @code{$@var{n}} but specifies alternative @var{typealt} in the
union specified by the @code{%union} declaration.
@xref{Action Types, ,Data Types of Values in Actions}.
+@end deffn
-@item YYABORT;
+@deffn {Macro} YYABORT;
Return immediately from @code{yyparse}, indicating failure.
@xref{Parser Function, ,The Parser Function @code{yyparse}}.
+@end deffn
-@item YYACCEPT;
+@deffn {Macro} YYACCEPT;
Return immediately from @code{yyparse}, indicating success.
@xref{Parser Function, ,The Parser Function @code{yyparse}}.
+@end deffn
-@item YYBACKUP (@var{token}, @var{value});
+@deffn {Macro} YYBACKUP (@var{token}, @var{value});
@findex YYBACKUP
Unshift a token. This macro is allowed only for rules that reduce
a single value, and only when there is no look-ahead token.
recovery.
In either case, the rest of the action is not executed.
+@end deffn
-@item YYEMPTY
+@deffn {Macro} YYEMPTY
@vindex YYEMPTY
Value stored in @code{yychar} when there is no look-ahead token.
+@end deffn
-@item YYERROR;
+@deffn {Macro} YYERROR;
@findex YYERROR
Cause an immediate syntax error. This statement initiates error
recovery just as if the parser itself had detected an error; however, it
does not call @code{yyerror}, and does not print any message. If you
want to print an error message, call @code{yyerror} explicitly before
the @samp{YYERROR;} statement. @xref{Error Recovery}.
+@end deffn
-@item YYRECOVERING
+@deffn {Macro} YYRECOVERING
This macro stands for an expression that has the value 1 when the parser
is recovering from a syntax error, and 0 the rest of the time.
@xref{Error Recovery}.
+@end deffn
-@item yychar
+@deffn {Variable} yychar
Variable containing the current look-ahead token. (In a pure parser,
this is actually a local variable within @code{yyparse}.) When there is
no look-ahead token, the value @code{YYEMPTY} is stored in the variable.
@xref{Look-Ahead, ,Look-Ahead Tokens}.
+@end deffn
-@item yyclearin;
+@deffn {Macro} yyclearin;
Discard the current look-ahead token. This is useful primarily in
error rules. @xref{Error Recovery}.
+@end deffn
-@item yyerrok;
+@deffn {Macro} yyerrok;
Resume generating error messages immediately for subsequent syntax
errors. This is useful primarily in error rules.
@xref{Error Recovery}.
+@end deffn
-@item @@$
+@deffn {Value} @@$
@findex @@$
Acts like a structure variable containing information on the textual position
of the grouping made by the current rule. @xref{Locations, ,
@c those members.
@c The use of this feature makes the parser noticeably slower.
+@end deffn
-@item @@@var{n}
+@deffn {Value} @@@var{n}
@findex @@@var{n}
Acts like a structure variable containing information on the textual position
of the @var{n}th component of the current rule. @xref{Locations, ,
Tracking Locations}.
+@end deffn
-@end table
@node Algorithm
@chapter The Bison Parser Algorithm
@cindex error recovery
@cindex recovery from errors
-It is not usually acceptable to have a program terminate on a parse
+It is not usually acceptable to have a program terminate on a syntax
error. For example, a compiler should recover sufficiently to parse the
rest of the input file and check it for errors; a calculator should accept
another expression.
this token. Write the statement @samp{yyclearin;} in the error rule's
action.
-For example, suppose that on a parse error, an error handling routine is
+For example, suppose that on a syntax error, an error handling routine is
called that advances the input stream to some point where parsing should
once again commence. The next symbol returned by the lexical scanner is
probably correct. The previous look-ahead token ought to be discarded
flow jumps to state 2. If there is no such transition on a nonterminal
symbol, and the lookahead is a @code{NUM}, then this token is shifted on
the parse stack, and the control flow jumps to state 1. Any other
-lookahead triggers a parse error.''
+lookahead triggers a syntax error.''
@cindex core, item set
@cindex item set core
@samp{+}, it will be shifted on the parse stack, and the automaton
control will jump to state 4, corresponding to the item @samp{exp -> exp
'+' . exp}. Since there is no default action, any other token than
-those listed above will trigger a parse error.
+those listed above will trigger a syntax error.
The state 3 is named the @dfn{final state}, or the @dfn{accepting
state}:
yyprint (FILE *file, int type, YYSTYPE value)
@{
if (type == VAR)
- fprintf (file, " %s", value.tptr->name);
+ fprintf (file, "%s", value.tptr->name);
else if (type == NUM)
- fprintf (file, " %d", value.val);
+ fprintf (file, "%g", value.val);
@}
@end smallexample
@cindex Bison symbols, table of
@cindex symbols in Bison, table of
-@table @code
-@item @@$
+@deffn {Variable} @@$
In an action, the location of the left-hand side of the rule.
@xref{Locations, , Locations Overview}.
+@end deffn
-@item @@@var{n}
+@deffn {Variable} @@@var{n}
In an action, the location of the @var{n}-th symbol of the right-hand
side of the rule. @xref{Locations, , Locations Overview}.
+@end deffn
-@item $$
+@deffn {Variable} $$
In an action, the semantic value of the left-hand side of the rule.
@xref{Actions}.
+@end deffn
-@item $@var{n}
+@deffn {Variable} $@var{n}
In an action, the semantic value of the @var{n}-th symbol of the
right-hand side of the rule. @xref{Actions}.
+@end deffn
-@item $accept
+@deffn {Symbol} $accept
The predefined nonterminal whose only rule is @samp{$accept: @var{start}
$end}, where @var{start} is the start symbol. @xref{Start Decl, , The
Start-Symbol}. It cannot be used in the grammar.
+@end deffn
-@item $end
+@deffn {Symbol} $end
The predefined token marking the end of the token stream. It cannot be
used in the grammar.
+@end deffn
-@item $undefined
+@deffn {Symbol} $undefined
The predefined token onto which all undefined values returned by
@code{yylex} are mapped. It cannot be used in the grammar, rather, use
@code{error}.
+@end deffn
-@item error
+@deffn {Symbol} error
A token name reserved for error recovery. This token may be used in
grammar rules so as to allow the Bison parser to recognize an error in
the grammar without halting the process. In effect, a sentence
-containing an error may be recognized as valid. On a parse error, the
+containing an error may be recognized as valid. On a syntax error, the
token @code{error} becomes the current look-ahead token. Actions
corresponding to @code{error} are then executed, and the look-ahead
token is reset to the token that originally caused the violation.
@xref{Error Recovery}.
+@end deffn
-@item YYABORT
+@deffn {Macro} YYABORT
Macro to pretend that an unrecoverable syntax error has occurred, by
making @code{yyparse} return 1 immediately. The error reporting
function @code{yyerror} is not called. @xref{Parser Function, ,The
Parser Function @code{yyparse}}.
+@end deffn
-@item YYACCEPT
+@deffn {Macro} YYACCEPT
Macro to pretend that a complete utterance of the language has been
read, by making @code{yyparse} return 0 immediately.
@xref{Parser Function, ,The Parser Function @code{yyparse}}.
+@end deffn
-@item YYBACKUP
+@deffn {Macro} YYBACKUP
Macro to discard a value from the parser stack and fake a look-ahead
token. @xref{Action Features, ,Special Features for Use in Actions}.
+@end deffn
-@item YYDEBUG
+@deffn {Macro} YYDEBUG
Macro to define to equip the parser with tracing code. @xref{Tracing,
,Tracing Your Parser}.
+@end deffn
-@item YYERROR
+@deffn {Macro} YYERROR
Macro to pretend that a syntax error has just been detected: perform
normal error recovery if possible (@pxref{Error Recovery}), without
calling @code{yyerror}, or (if recovery is impossible) make
@code{yyparse} return 1. @xref{Error Recovery}.
+@end deffn
-@item YYERROR_VERBOSE
+@deffn {Macro} YYERROR_VERBOSE
An obsolete macro that you define with @code{#define} in the Bison
declarations section to request verbose, specific error message strings
when @code{yyerror} is called. It doesn't matter what definition you
use for @code{YYERROR_VERBOSE}, just whether you define it. Using
@code{%error-verbose} is preferred.
+@end deffn
-@item YYINITDEPTH
+@deffn {Macro} YYINITDEPTH
Macro for specifying the initial size of the parser stack.
@xref{Stack Overflow}.
+@end deffn
-@item YYLEX_PARAM
+@deffn {Macro} YYLEX_PARAM
An obsolete macro for specifying an extra argument (or list of extra
arguments) for @code{yyparse} to pass to @code{yylex}. The use of this
macro is deprecated, and is supported only for Yacc-like parsers.
@xref{Pure Calling,, Calling Conventions for Pure Parsers}.
+@end deffn
-@item YYLTYPE
+@deffn {Macro} YYLTYPE
Macro for the data type of @code{yylloc}; a structure with four
members. @xref{Location Type, , Data Types of Locations}.
+@end deffn
-@item yyltype
+@deffn {Type} yyltype
Default value for YYLTYPE.
+@end deffn
-@item YYMAXDEPTH
-Macro for specifying the maximum size of the parser stack.
-@xref{Stack Overflow}.
+@deffn {Macro} YYMAXDEPTH
+Macro for specifying the maximum size of the parser stack. @xref{Stack
+Overflow}.
+@end deffn
-@item YYPARSE_PARAM
+@deffn {Macro} YYPARSE_PARAM
An obsolete macro for specifying the name of a parameter that
@code{yyparse} should accept. The use of this macro is deprecated, and
is supported only for Yacc-like parsers. @xref{Pure Calling,, Calling
Conventions for Pure Parsers}.
+@end deffn
-@item YYRECOVERING
+@deffn {Macro} YYRECOVERING
Macro whose value indicates whether the parser is recovering from a
syntax error. @xref{Action Features, ,Special Features for Use in Actions}.
+@end deffn
-@item YYSTACK_USE_ALLOCA
+@deffn {Macro} YYSTACK_USE_ALLOCA
Macro used to control the use of @code{alloca}. If defined to @samp{0},
the parser will not use @code{alloca} but @code{malloc} when trying to
grow its internal stacks. Do @emph{not} define @code{YYSTACK_USE_ALLOCA}
to anything else.
+@end deffn
-@item YYSTYPE
+@deffn {Macro} YYSTYPE
Macro for the data type of semantic values; @code{int} by default.
@xref{Value Type, ,Data Types of Semantic Values}.
+@end deffn
-@item yychar
+@deffn {Variable} yychar
External integer variable that contains the integer value of the current
look-ahead token. (In a pure parser, it is a local variable within
@code{yyparse}.) Error-recovery rule actions may examine this variable.
@xref{Action Features, ,Special Features for Use in Actions}.
+@end deffn
-@item yyclearin
+@deffn {Macro} yyclearin
Macro used in error-recovery rule actions. It clears the previous
look-ahead token. @xref{Error Recovery}.
+@end deffn
-@item yydebug
+@deffn {Variable} yydebug
External integer variable set to zero by default. If @code{yydebug}
is given a nonzero value, the parser will output information on input
symbols and parser action. @xref{Tracing, ,Tracing Your Parser}.
+@end deffn
-@item yyerrok
+@deffn {Macro} yyerrok
Macro to cause the parser to recover immediately to its normal mode
-after a parse error. @xref{Error Recovery}.
+after a syntax error. @xref{Error Recovery}.
+@end deffn
-@item yyerror
+@deffn {Function} yyerror
User-supplied function to be called by @code{yyparse} on error. The
function receives one argument, a pointer to a character string
containing an error message. @xref{Error Reporting, ,The Error
Reporting Function @code{yyerror}}.
+@end deffn
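As an illustration of the entry above, here is a minimal sketch of a user-supplied @code{yyerror}. The @code{const char *} parameter type, the @code{void} return type, and the @code{reported_errors} counter are assumptions made for this example; they are not mandated by Bison.

```c
#include <stdio.h>

/* Hypothetical counter, not part of Bison: it only lets the caller
   see how many messages were reported.  */
static int reported_errors = 0;

/* Called by yyparse on error with a pointer to the message string.
   The exact prototype is an assumption of this sketch.  */
void
yyerror (const char *msg)
{
  ++reported_errors;
  fprintf (stderr, "%s\n", msg);
}
```

A real parser would typically add the current location to the message; this sketch only prints the string it is given.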
-@item yylex
+@deffn {Function} yylex
User-supplied lexical analyzer function, called with no arguments to get
the next token. @xref{Lexical, ,The Lexical Analyzer Function
@code{yylex}}.
+@end deffn
-@item yylval
+@deffn {Variable} yylval
External variable in which @code{yylex} should place the semantic
value associated with a token. (In a pure parser, it is a local
variable within @code{yyparse}, and its address is passed to
@code{yylex}.) @xref{Token Values, ,Semantic Values of Tokens}.
+@end deffn
-@item yylloc
+@deffn {Variable} yylloc
External variable in which @code{yylex} should place the line and column
numbers associated with a token. (In a pure parser, it is a local
variable within @code{yyparse}, and its address is passed to
@code{yylex}.) You can ignore this variable if you don't use the
@samp{@@} feature in the grammar actions. @xref{Token Positions,
,Textual Positions of Tokens}.
+@end deffn
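The roles of @code{yylex}, @code{yylval} and @code{yylloc} described above can be sketched for a non-pure parser as follows. This is only an illustration under stated assumptions: the token code @code{NUM}, the input buffer, and the helper @code{set_input} are hypothetical, and the types are declared by hand here, whereas in a real parser Bison provides them.

```c
#include <ctype.h>

/* Location structure with four members, declared by hand for this sketch.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

typedef int YYSTYPE;        /* int is the default semantic value type.  */

enum { NUM = 258 };         /* Hypothetical token code.  */

YYSTYPE yylval;             /* Semantic value of the current token.  */
YYLTYPE yylloc;             /* Position of the current token.  */

static const char *input = "";   /* Hypothetical input buffer.  */

/* Hypothetical helper: select the text to scan and reset the position.  */
void
set_input (const char *text)
{
  input = text;
  yylloc.first_line = yylloc.last_line = 1;
  yylloc.first_column = yylloc.last_column = 1;
}

/* Return the next token code, or 0 at end of input.  In this sketch
   last_column points one past the last character of the token.  */
int
yylex (void)
{
  /* Skip white space while tracking the position.  */
  while (*input == ' ' || *input == '\t' || *input == '\n')
    {
      if (*input == '\n')
        {
          ++yylloc.last_line;
          yylloc.last_column = 1;
        }
      else
        ++yylloc.last_column;
      ++input;
    }

  /* The token starts where the skipped text ended.  */
  yylloc.first_line = yylloc.last_line;
  yylloc.first_column = yylloc.last_column;

  if (*input == '\0')
    return 0;

  if (isdigit ((unsigned char) *input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *input))
        {
          yylval = yylval * 10 + (*input - '0');
          ++input;
          ++yylloc.last_column;
        }
      return NUM;
    }

  /* Any other character is returned as its own token code.  */
  ++yylloc.last_column;
  return (unsigned char) *input++;
}
```

In a pure parser the same information would instead be stored through the addresses that @code{yyparse} passes to @code{yylex}, so the two global variables above would not exist.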
-@item yynerrs
-Global variable which Bison increments each time there is a parse error.
+@deffn {Variable} yynerrs
+Global variable which Bison increments each time there is a syntax error.
(In a pure parser, it is a local variable within @code{yyparse}.)
@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+@end deffn
-@item yyparse
+@deffn {Function} yyparse
The parser function produced by Bison; call this function to start
parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}.
+@end deffn
-@item %debug
+@deffn {Directive} %debug
Equip the parser for debugging. @xref{Decl Summary}.
+@end deffn
-@item %defines
+@deffn {Directive} %defines
Bison declaration to create a header file meant for the scanner.
@xref{Decl Summary}.
+@end deffn
-@item %destructor
+@deffn {Directive} %destructor
Bison declaration to specify how the parser should reclaim the memory
associated with discarded symbols. @xref{Destructor Decl, , Freeing
Discarded Symbols}.
+@end deffn
-@item %dprec
+@deffn {Directive} %dprec
Bison declaration to assign a precedence to a rule that is used at parse
time to resolve reduce/reduce conflicts. @xref{GLR Parsers, ,Writing
@acronym{GLR} Parsers}.
+@end deffn
-@item %error-verbose
+@deffn {Directive} %error-verbose
Bison declaration to request verbose, specific error message strings
when @code{yyerror} is called.
+@end deffn
-@item %file-prefix="@var{prefix}"
+@deffn {Directive} %file-prefix="@var{prefix}"
Bison declaration to set the prefix of the output files. @xref{Decl
Summary}.
+@end deffn
-@item %glr-parser
+@deffn {Directive} %glr-parser
Bison declaration to produce a @acronym{GLR} parser. @xref{GLR
Parsers, ,Writing @acronym{GLR} Parsers}.
+@end deffn
-@c @item %source-extension
-@c Bison declaration to specify the generated parser output file extension.
-@c @xref{Decl Summary}.
-@c
-@c @item %header-extension
-@c Bison declaration to specify the generated parser header file extension
-@c if required. @xref{Decl Summary}.
-
-@item %left
+@deffn {Directive} %left
Bison declaration to assign left associativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
+@end deffn
-@item %lex-param "@var{argument-declaration}" "@var{argument-name}"
+@deffn {Directive} %lex-param @{@var{argument-declaration}@}, @{@var{argument-name}@}
Bison declaration to specify an additional parameter that
@code{yylex} should accept. @xref{Pure Calling,, Calling Conventions
for Pure Parsers}.
+@end deffn
-@item %merge
+@deffn {Directive} %merge
Bison declaration to assign a merging function to a rule. If there is a
reduce/reduce conflict with a rule having the same merging function, the
function is applied to the two semantic values to get a single result.
@xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}.
+@end deffn
-@item %name-prefix="@var{prefix}"
+@deffn {Directive} %name-prefix="@var{prefix}"
Bison declaration to rename the external symbols. @xref{Decl Summary}.
+@end deffn
-@item %no-lines
+@deffn {Directive} %no-lines
Bison declaration to avoid generating @code{#line} directives in the
parser file. @xref{Decl Summary}.
+@end deffn
-@item %nonassoc
+@deffn {Directive} %nonassoc
Bison declaration to assign non-associativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
+@end deffn
-@item %output="@var{filename}"
+@deffn {Directive} %output="@var{filename}"
Bison declaration to set the name of the parser file. @xref{Decl
Summary}.
+@end deffn
-@item %parse-param "@var{argument-declaration}" "@var{argument-name}"
+@deffn {Directive} %parse-param @{@var{argument-declaration}@}, @{@var{argument-name}@}
Bison declaration to specify an additional parameter that
@code{yyparse} should accept. @xref{Parser Function,, The Parser
Function @code{yyparse}}.
+@end deffn
-@item %prec
+@deffn {Directive} %prec
Bison declaration to assign a precedence to a specific rule.
@xref{Contextual Precedence, ,Context-Dependent Precedence}.
+@end deffn
-@item %pure-parser
+@deffn {Directive} %pure-parser
Bison declaration to request a pure (reentrant) parser.
@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
+@end deffn
-@item %right
+@deffn {Directive} %right
Bison declaration to assign right associativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
+@end deffn
-@item %start
+@deffn {Directive} %start
Bison declaration to specify the start symbol. @xref{Start Decl, ,The
Start-Symbol}.
+@end deffn
-@item %token
+@deffn {Directive} %token
Bison declaration to declare token(s) without specifying precedence.
@xref{Token Decl, ,Token Type Names}.
+@end deffn
-@item %token-table
+@deffn {Directive} %token-table
Bison declaration to include a token name table in the parser file.
@xref{Decl Summary}.
+@end deffn
-@item %type
+@deffn {Directive} %type
Bison declaration to declare nonterminals. @xref{Type Decl,
,Nonterminal Symbols}.
+@end deffn
-@item %union
+@deffn {Directive} %union
Bison declaration to specify several possible data types for semantic
values. @xref{Union Decl, ,The Collection of Value Types}.
-@end table
+@end deffn
@sp 1
These are the punctuation and delimiters used in Bison input:
-@table @samp
-@item %%
+@deffn {Delimiter} %%
Delimiter used to separate the grammar rule section from the
Bison declarations section or the epilogue.
@xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}.
+@end deffn
-@item %@{ %@}
+@c Don't insert spaces, or check the DVI output.
+@deffn {Delimiter} %@{@var{code}%@}
All code listed between @samp{%@{} and @samp{%@}} is copied directly to
the output file uninterpreted. Such code forms the prologue of the input
file. @xref{Grammar Outline, ,Outline of a Bison
Grammar}.
+@end deffn
-@item /*@dots{}*/
+@deffn {Construct} /*@dots{}*/
Comment delimiters, as in C.
+@end deffn
-@item :
+@deffn {Delimiter} :
Separates a rule's result from its components. @xref{Rules, ,Syntax of
Grammar Rules}.
+@end deffn
-@item ;
+@deffn {Delimiter} ;
Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}.
+@end deffn
-@item |
+@deffn {Delimiter} |
Separates alternate rules for the same result nonterminal.
@xref{Rules, ,Syntax of Grammar Rules}.
-@end table
+@end deffn
@node Glossary
@appendix Glossary
be expressed through rules in terms of smaller constructs; in other
words, a construct that is not a token. @xref{Symbols}.
-@item Parse error
-An error encountered during parsing of an input stream due to invalid
-syntax. @xref{Error Recovery}.
-
@item Parser
A function that recognizes valid sentences of a language by analyzing
the syntax structure of a set of tokens passed to it from a lexical
during parsing to allow for recognition and use of existing
information in repeated uses of a symbol. @xref{Multi-function Calc}.
+@item Syntax error
+An error encountered during parsing of an input stream due to invalid
+syntax. @xref{Error Recovery}.
+
@item Token
A basic, grammatically indivisible unit of a language. The symbol
that describes a token in the grammar is a terminal symbol.