\input texinfo @c -*-texinfo-*-
@comment %**start of header
@setfilename bison.info
-@settitle Bison 1.23
+@include version.texi
+@settitle Bison @value{VERSION}
@setchapternewpage odd
@iftex
@finalout
@end iftex
-@c SMALL BOOK version
+@c SMALL BOOK version
@c This edition has been formatted so that you can format and print it in
-@c the smallbook format.
+@c the smallbook format.
@c @smallbook
-@c next time, consider using @set for edition number, etc...
-
@c Set following if you have the new `shorttitlepage' command
@c @clear shorttitlepage-enabled
@c @set shorttitlepage-enabled
@end ifinfo
@comment %**end of header
+@ifinfo
+@format
+START-INFO-DIR-ENTRY
+* bison: (bison). GNU Project parser generator (yacc replacement).
+END-INFO-DIR-ENTRY
+@end format
+@end ifinfo
+
@ifinfo
This file documents the Bison parser generator.
-Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993 Free Software Foundation, Inc.
+Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999, 2000
+Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@titlepage
@title Bison
@subtitle The YACC-compatible Parser Generator
-@subtitle December 1993, Bison Version 1.23
+@subtitle @value{UPDATED}, Bison Version @value{VERSION}
@author by Charles Donnelly and Richard Stallman
@page
@vskip 0pt plus 1filll
-Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993 Free Software
-Foundation
+Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998,
+1999, 2000
+Free Software Foundation, Inc.
@sp 2
Published by the Free Software Foundation @*
-675 Massachusetts Avenue @*
-Cambridge, MA 02139 USA @*
-Printed copies are available for $15 each.@*
-ISBN-1-882114-30-2
+59 Temple Place, Suite 330 @*
+Boston, MA 02111-1307 USA @*
+Printed copies are available from the Free Software Foundation.@*
+ISBN 1-882114-44-2
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@sp 2
Cover art by Etienne Suvasa.
@end titlepage
-@page
+
+@contents
@node Top, Introduction, (dir), (dir)
@ifinfo
-This manual documents version 1.23 of Bison.
+This manual documents version @value{VERSION} of Bison.
@end ifinfo
@menu
-* Introduction::
-* Conditions::
+* Introduction::
+* Conditions::
* Copying:: The GNU General Public License says
how you can copy and share Bison
Grammar Rules for @code{rpcalc}
-* Rpcalc Input::
-* Rpcalc Line::
-* Rpcalc Expr::
+* Rpcalc Input::
+* Rpcalc Line::
+* Rpcalc Expr::
Multi-Function Calculator: @code{mfcalc}
Parser C-Language Interface
* Parser Function:: How to call @code{yyparse} and what it returns.
-* Lexical:: You must supply a function @code{yylex}
+* Lexical:: You must supply a function @code{yylex}
which reads tokens.
* Error Reporting:: You must supply a function @code{yyerror}.
* Action Features:: Special features for use in actions.
* Pure Calling:: How the calling convention differs
in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
-The Bison Parser Algorithm
+The Bison Parser Algorithm
* Look-Ahead:: Parser looks one token ahead when deciding what to do.
* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
Invoking Bison
-* Bison Options:: All the options described in detail,
+* Bison Options:: All the options described in detail,
in alphabetical order by short options.
* Option Cross Key:: Alphabetical list of long options.
* VMS Invocation:: Bison command syntax on VMS.
don't know Bison or Yacc, start by reading these chapters. Reference
chapters follow which describe specific aspects of Bison in detail.
-Bison was written primarily by Robert Corbett; Richard Stallman made
-it Yacc-compatible. This edition corresponds to version 1.23 of Bison.
+Bison was written primarily by Robert Corbett; Richard Stallman made it
+Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added
+multicharacter string literals and other features.
+
+This edition corresponds to version @value{VERSION} of Bison.
@node Conditions, Copying, Introduction, Top
@unnumbered Conditions for Using Bison
-Bison grammars can be used only in programs that are free software. This
-is in contrast to what happens with the GNU C compiler and the other
-GNU programming tools.
-
-The reason Bison is special is that the output of the Bison utility---the
-Bison parser file---contains a verbatim copy of a sizable piece of Bison,
-which is the code for the @code{yyparse} function. (The actions from your
-grammar are inserted into this function at one point, but the rest of the
-function is not changed.)
-
-As a result, the Bison parser file is covered by the same copying
-conditions that cover Bison itself and the rest of the GNU system: any
-program containing it has to be distributed under the standard GNU copying
-conditions.
-
-Occasionally people who would like to use Bison to develop proprietary
-programs complain about this.
-
-We don't particularly sympathize with their complaints. The purpose of the
-GNU project is to promote the right to share software and the practice of
-sharing software; it is a means of changing society. The people who
-complain are planning to be uncooperative toward the rest of the world; why
-should they deserve our help in doing so?
-
-However, it's possible that a change in these conditions might encourage
-computer companies to use and distribute the GNU system. If so, then we
-might decide to change the terms on @code{yyparse} as a matter of the
-strategy of promoting the right to share. Such a change would be
-irrevocable. Since we stand by the copying permissions we have announced,
-we cannot withdraw them once given.
-
-We mustn't make an irrevocable change hastily. We have to wait until there
-is a complete GNU system and there has been time to learn how this issue
-affects its reception.
+As of Bison version 1.24, we have changed the distribution terms for
+@code{yyparse} to permit using Bison's output in nonfree programs.
+Formerly, Bison parsers could be used only in programs that were free
+software.
+
+The other GNU programming tools, such as the GNU C compiler, have never
+had such a requirement. They could always be used for nonfree
+software. Bison was different not because of a special policy
+decision, but as a result of applying the usual General Public
+License to all of the Bison source code.
+
+The output of the Bison utility---the Bison parser file---contains a
+verbatim copy of a sizable piece of Bison, which is the code for the
+@code{yyparse} function. (The actions from your grammar are inserted
+into this function at one point, but the rest of the function is not
+changed.) When we applied the GPL terms to the code for @code{yyparse},
+the effect was to restrict the use of Bison output to free software.
+
+We didn't change the terms because of sympathy for people who want to
+make software proprietary. @strong{Software should be free.} But we
+concluded that limiting Bison's use to free software was doing little to
+encourage people to make other software free. So we decided to make the
+practical conditions for using Bison match the practical conditions for
+using the other GNU tools.
@node Copying, Concepts, Conditions, Top
@unnumbered GNU GENERAL PUBLIC LICENSE
@display
Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc.
-675 Mass Ave, Cambridge, MA 02139, USA
+59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
-Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+Foundation, Inc., 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.
@end smallexample
Also add information on how to contact you by electronic and paper mail.
@smallexample
Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
-Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
+Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
@code{RETURN}. A terminal symbol that stands for a particular keyword in
the language should be named after that keyword converted to upper case.
The terminal symbol @code{error} is reserved for error recovery.
-@xref{Symbols}.@refill
+@xref{Symbols}.
A terminal symbol can also be represented as a character literal, just like
a C character constant. You should do this whenever a token is just a
single character (parenthesis, plus-sign, etc.): use that same character in
a literal as the terminal symbol for that token.
+A third way to represent a terminal symbol is with a C string constant
+containing several characters. @xref{Symbols}, for more information.
+
The grammar rules also have an expression in Bison syntax. For example,
here is the Bison rule for a C @code{return} statement. The semicolon in
quotes is a literal character token, representing part of the C syntax for
rule can have an @dfn{action} made up of C statements. Each time the
parser recognizes a match for that rule, the action is executed.
@xref{Actions}.
-
+
Most of the time, the purpose of an action is to compute the semantic value
of the whole construct from the semantic values of its parts. For example,
suppose we have a rule which says an expression can be the sum of two
rule are referred to as @code{$1}, @code{$2}, and so on.
@menu
-* Rpcalc Input::
-* Rpcalc Line::
-* Rpcalc Expr::
+* Rpcalc Input::
+* Rpcalc Line::
+* Rpcalc Expr::
@end menu
@node Rpcalc Input, Rpcalc Line, , Rpcalc Rules
@example
@group
-/* Lexical analyzer returns a double floating point
+/* Lexical analyzer returns a double floating point
number on the stack and the token NUM, or the ASCII
character read if not a number. Skips all blanks
and tabs, returns 0 for EOF. */
@end group
@group
-yylex ()
+int
+yylex (void)
@{
int c;
/* skip white space */
- while ((c = getchar ()) == ' ' || c == '\t')
+ while ((c = getchar ()) == ' ' || c == '\t')
;
@end group
@group
/* process numbers */
- if (c == '.' || isdigit (c))
+ if (c == '.' || isdigit (c))
@{
ungetc (c, stdin);
scanf ("%lf", &yylval);
@end group
@group
/* return end-of-file */
- if (c == EOF)
+ if (c == EOF)
return 0;
/* return single chars */
- return c;
+ return c;
@}
@end group
@end example
@example
@group
-main ()
+int
+main (void)
@{
- yyparse ();
+ return yyparse ();
@}
@end group
@end example
@cindex error reporting routine
When @code{yyparse} detects a syntax error, it calls the error reporting
-function @code{yyerror} to print an error message (usually but not always
-@code{"parse error"}). It is up to the programmer to supply @code{yyerror}
-(@pxref{Interface, ,Parser C-Language Interface}), so here is the definition we will use:
+function @code{yyerror} to print an error message (usually but not
+always @code{"parse error"}). It is up to the programmer to supply
+@code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so
+here is the definition we will use:
@example
@group
#include <stdio.h>
-yyerror (s) /* Called by yyparse on error */
- char *s;
+void
+yyerror (const char *s) /* Called by yyparse on error */
@{
printf ("%s\n", s);
@}
(@pxref{Error Recovery}). Otherwise, @code{yyparse} returns nonzero. We
have not written any error rules in this example, so any invalid input will
cause the calculator program to exit. This is not clean behavior for a
-real calculator, but it is adequate in the first example.
+real calculator, but it is adequate for the first example.
@node Rpcalc Gen, Rpcalc Compile, Rpcalc Error, RPN Calc
@subsection Running Bison to Make the Parser
@cindex running Bison (introduction)
-Before running Bison to produce a parser, we need to decide how to arrange
-all the source code in one or more source files. For such a simple example,
-the easiest thing is to put everything in one file. The definitions of
-@code{yylex}, @code{yyerror} and @code{main} go at the end, in the
-``additional C code'' section of the file (@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}).
+Before running Bison to produce a parser, we need to decide how to
+arrange all the source code in one or more source files. For such a
+simple example, the easiest thing is to put everything in one file. The
+definitions of @code{yylex}, @code{yyerror} and @code{main} go at the
+end, in the ``additional C code'' section of the file (@pxref{Grammar
+Layout, ,The Overall Layout of a Bison Grammar}).
For a large project, you would probably have several source files, and use
@code{make} to arrange to recompile them.
@end example
@noindent
-The functions @code{yylex}, @code{yyerror} and @code{main} can be the same
-as before.
+The functions @code{yylex}, @code{yyerror} and @code{main} can be the
+same as before.
There are two important new features shown in this code.
Up to this point, this manual has not addressed the issue of @dfn{error
recovery}---how to continue parsing after the parser detects a syntax
-error. All we have handled is error reporting with @code{yyerror}. Recall
-that by default @code{yyparse} returns after calling @code{yyerror}. This
-means that an erroneous input line causes the calculator program to exit.
-Now we show how to rectify this deficiency.
+error. All we have handled is error reporting with @code{yyerror}.
+Recall that by default @code{yyparse} returns after calling
+@code{yyerror}. This means that an erroneous input line causes the
+calculator program to exit. Now we show how to rectify this deficiency.
The Bison language itself includes the reserved word @code{error}, which
may be included in the grammar rules. In the example below it has
@end group
@end example
-This addition to the grammar allows for simple error recovery in the event
-of a parse error. If an expression that cannot be evaluated is read, the
-error will be recognized by the third rule for @code{line}, and parsing
-will continue. (The @code{yyerror} function is still called upon to print
-its message as well.) The action executes the statement @code{yyerrok}, a
-macro defined automatically by Bison; its meaning is that error recovery is
-complete (@pxref{Error Recovery}). Note the difference between
-@code{yyerrok} and @code{yyerror}; neither one is a misprint.@refill
+This addition to the grammar allows for simple error recovery in the
+event of a parse error. If an expression that cannot be evaluated is
+read, the error will be recognized by the third rule for @code{line},
+and parsing will continue. (The @code{yyerror} function is still called
+upon to print its message as well.) The action executes the statement
+@code{yyerrok}, a macro defined automatically by Bison; its meaning is
+that error recovery is complete (@pxref{Error Recovery}). Note the
+difference between @code{yyerrok} and @code{yyerror}; neither one is a
+misprint.@refill
This form of error recovery deals with syntax errors. There are other
kinds of errors; for example, division by zero, which raises an exception
It is easy to add new operators to the infix calculator as long as they are
only single-character literals. The lexical analyzer @code{yylex} passes
-back all non-number characters as tokens, so new grammar rules suffice for
+back all nonnumber characters as tokens, so new grammar rules suffice for
adding a new operator. But we want something more flexible: built-in
functions whose syntax has this form:
definition, which is kept in the header @file{calc.h}, is as follows. It
provides for either functions or variables to be placed in the table.
+@c FIXME: ANSIfy the prototypes for FNCTPTR etc.
@smallexample
@group
/* Data type for links in the chain of symbols. */
@group
#include <stdio.h>
-main ()
+int
+main (void)
@{
init_table ();
- yyparse ();
+ return yyparse ();
@}
@end group
@group
-yyerror (s) /* Called by yyparse on error */
- char *s;
+void
+yyerror (const char *s) /* Called by yyparse on error */
@{
printf ("%s\n", s);
@}
@end group
@group
-struct init arith_fncts[]
- = @{
- "sin", sin,
- "cos", cos,
- "atan", atan,
- "ln", log,
- "exp", exp,
- "sqrt", sqrt,
- 0, 0
- @};
+struct init arith_fncts[] =
+@{
+ "sin", sin,
+ "cos", cos,
+ "atan", atan,
+ "ln", log,
+ "exp", exp,
+ "sqrt", sqrt,
+ 0, 0
+@};
/* The symbol table: a chain of `struct symrec'. */
symrec *sym_table = (symrec *)0;
@end group
@group
-init_table () /* puts arithmetic functions in table. */
+/* Put arithmetic functions in table. */
+void
+init_table (void)
@{
int i;
symrec *ptr;
@smallexample
symrec *
-putsym (sym_name,sym_type)
- char *sym_name;
- int sym_type;
+putsym (char *sym_name, int sym_type)
@{
symrec *ptr;
ptr = (symrec *) malloc (sizeof (symrec));
@}
symrec *
-getsym (sym_name)
- char *sym_name;
+getsym (const char *sym_name)
@{
symrec *ptr;
for (ptr = sym_table; ptr != (symrec *) 0;
@smallexample
@group
#include <ctype.h>
-yylex ()
+
+int
+yylex (void)
@{
int c;
@cindex additional C code section
@cindex C code, section for additional
-The @var{additional C code} section is copied verbatim to the end of
-the parser file, just as the @var{C declarations} section is copied to
-the beginning. This is the most convenient place to put anything
-that you want to have in the parser file but which need not come before
-the definition of @code{yyparse}. For example, the definitions of
-@code{yylex} and @code{yyerror} often go here. @xref{Interface, ,Parser C-Language Interface}.
+The @var{additional C code} section is copied verbatim to the end of the
+parser file, just as the @var{C declarations} section is copied to the
+beginning. This is the most convenient place to put anything that you
+want to have in the parser file but which need not come before the
+definition of @code{yyparse}. For example, the definitions of
+@code{yylex} and @code{yyerror} often go here. @xref{Interface, ,Parser
+C-Language Interface}.
If the last section is empty, you may omit the @samp{%%} that separates it
from the grammar rules.
Symbol names can contain letters, digits (not at the beginning),
underscores and periods. Periods make sense only in nonterminals.
-There are two ways of writing terminal symbols in the grammar:
+There are three ways of writing terminal symbols in the grammar:
@itemize @bullet
@item
@cindex character token
@cindex literal token
@cindex single-character literal
-A @dfn{character token type} (or @dfn{literal token}) is written in
-the grammar using the same syntax used in C for character constants;
-for example, @code{'+'} is a character token type. A character token
-type doesn't need to be declared unless you need to specify its
-semantic value data type (@pxref{Value Type, ,Data Types of Semantic Values}), associativity, or
-precedence (@pxref{Precedence, ,Operator Precedence}).
+A @dfn{character token type} (or @dfn{literal character token}) is
+written in the grammar using the same syntax used in C for character
+constants; for example, @code{'+'} is a character token type. A
+character token type doesn't need to be declared unless you need to
+specify its semantic value data type (@pxref{Value Type, ,Data Types of
+Semantic Values}), associativity, or precedence (@pxref{Precedence,
+,Operator Precedence}).
By convention, a character token type is used only to represent a
token that consists of that particular character. Thus, the token
All the usual escape sequences used in character literals in C can be
used in Bison as well, but you must not use the null character as a
-character literal because its ASCII code, zero, is the code
-@code{yylex} returns for end-of-input (@pxref{Calling Convention, ,Calling Convention for @code{yylex}}).
+character literal because its ASCII code, zero, is the code @code{yylex}
+returns for end-of-input (@pxref{Calling Convention, ,Calling Convention
+for @code{yylex}}).
+
+@item
+@cindex string token
+@cindex literal string token
+@cindex multicharacter literal
+A @dfn{literal string token} is written like a C string constant; for
+example, @code{"<="} is a literal string token. A literal string token
+doesn't need to be declared unless you need to specify its semantic
+value data type (@pxref{Value Type}), associativity, or precedence
+(@pxref{Precedence}).
+
+You can associate the literal string token with a symbolic name as an
+alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token
+Declarations}). If you don't do that, the lexical analyzer has to
+retrieve the token number for the literal string token from the
+@code{yytname} table (@pxref{Calling Convention}).
+
+@strong{WARNING}: literal string tokens do not work in Yacc.
+
+By convention, a literal string token is used only to represent a token
+that consists of that particular string. Thus, you should use the token
+type @code{"<="} to represent the string @samp{<=} as a token. Bison
+does not enforce this convention, but if you depart from it, people who
+read your program will be confused.
+
+All the escape sequences used in string literals in C can be used in
+Bison as well. A literal string token must contain two or more
+characters; for a token containing just one character, use a character
+token (see above).
@end itemize
How you choose to write a terminal symbol has no effect on its
the character, so @code{yylex} can use the identical character constant to
generate the requisite code. Each named token type becomes a C macro in
the parser file, so @code{yylex} can use the name to stand for the code.
-(This is why periods don't make sense in terminal symbols.)
+(This is why periods don't make sense in terminal symbols.)
@xref{Calling Convention, ,Calling Convention for @code{yylex}}.
If @code{yylex} is defined in a separate file, you need to arrange for the
@end example
@noindent
-where @var{result} is the nonterminal symbol that this rule describes
+where @var{result} is the nonterminal symbol that this rule describes,
and @var{components} are various terminal and nonterminal symbols that
-are put together by this rule (@pxref{Symbols}).
+are put together by this rule (@pxref{Symbols}).
For example,
A rule is called @dfn{recursive} when its @var{result} nonterminal appears
also on its right hand side. Nearly all Bison grammars need to use
recursion, because that is the only way to define a sequence of any number
-of somethings. Consider this recursive definition of a comma-separated
-sequence of one or more expressions:
+of a particular thing. Consider this recursive definition of a
+comma-separated sequence of one or more expressions:
@example
@group
@dfn{Indirect} or @dfn{mutual} recursion occurs when the result of the
rule does not appear directly on its right hand side, but does appear
in rules for other nonterminals which do appear on its right hand
-side.
+side.
For example:
@node Semantics, Declarations, Recursion, Grammar File
@section Defining Language Semantics
@cindex defining language semantics
-@cindex language semantics, defining
+@cindex language semantics, defining
The grammar rules for a language determine only the syntax. The semantics
are determined by the semantic values associated with various tokens and
@subsection Token Type Names
@cindex declaring token type names
@cindex token type names, declaring
+@cindex declaring literal string tokens
@findex %token
The basic way to declare a token type name (terminal symbol) is as follows:
can use the name @var{name} to stand for this token type's code.
Alternatively, you can use @code{%left}, @code{%right}, or @code{%nonassoc}
-instead of @code{%token}, if you wish to specify precedence.
+instead of @code{%token}, if you wish to specify associativity and precedence.
@xref{Precedence Decl, ,Operator Precedence}.
You can explicitly specify the numeric code for a token type by appending
In the event that the stack type is a union, you must augment the
@code{%token} or other token declaration to include the data type
-alternative delimited by angle-brackets (@pxref{Multiple Types, ,More Than One Value Type}).
+alternative delimited by angle-brackets (@pxref{Multiple Types, ,More Than One Value Type}).
For example:
@end group
@end example
+You can associate a literal string token with a token type name by
+writing the literal string at the end of a @code{%token}
+declaration which declares the name. For example:
+
+@example
+%token arrow "=>"
+@end example
+
+@noindent
+A grammar for the C language, for instance, might specify these names
+with equivalent literal string tokens:
+
+@example
+%token <operator> OR "||"
+%token <operator> LE 134 "<="
+%left OR "<="
+@end example
+
+@noindent
+Once you equate the literal string and the token name, you can use them
+interchangeably in further declarations or the grammar rules. The
+@code{yylex} function can use the token name or the literal string to
+obtain the token type code number (@pxref{Calling Convention}).
+
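As a rough sketch of how a hand-written scanner might use the symbolic names once they are aliased to literal strings, consider the function below. The name scan_operator and the code assumed for OR are hypothetical; in a real parser these macros come from the Bison output, and only the value 134 for LE is taken from the declaration above.

```c
/* Token codes.  In a real parser these macros come from the Bison
   output.  LE's value 134 follows the %token declaration above; OR's
   value is an assumption made only for this sketch.  */
#define LE 134
#define OR 258

/* Scan one operator at *INPUT, advance *INPUT past it, and return its
   token code.  Return 0 at the end of the string, which is the code
   yylex uses for end-of-input.  */
static int
scan_operator (const char **input)
{
  const char *p = *input;

  if (p[0] == '\0')
    return 0;
  if (p[0] == '<' && p[1] == '=')
    {
      *input = p + 2;
      return LE;            /* Same token as "<=" in the grammar.  */
    }
  if (p[0] == '|' && p[1] == '|')
    {
      *input = p + 2;
      return OR;            /* Same token as "||" in the grammar.  */
    }
  *input = p + 1;
  return (unsigned char) p[0];  /* Single-character token.  */
}
```

Called repeatedly on the string `<=||<`, this sketch would return LE, then OR, then the character code for `<`, then 0.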
@node Precedence Decl, Union Decl, Token Decl, Declarations
@subsection Operator Precedence
@cindex precedence declarations
The @code{%union} declaration specifies the entire collection of possible
data types for semantic values. The keyword @code{%union} is followed by a
pair of braces containing the same thing that goes inside a @code{union} in
-C.
+C.
For example:
the same @code{%type} declaration, if they have the same value type. Use
spaces to separate the symbol names.
+You can also declare the value type of a terminal symbol. To do this,
+use the same @code{<@var{type}>} construction in a declaration for the
+terminal symbol. All kinds of token declarations allow
+@code{<@var{type}>}.
+
@node Expect Decl, Start Decl, Type Decl, Declarations
@subsection Suppressing Conflict Warnings
@cindex suppressing conflict warnings
handler. In systems with multiple threads of control, a nonreentrant
program must be called only within interlocks.
-The Bison parser is not normally a reentrant program, because it uses
-statically allocated variables for communication with @code{yylex}. These
-variables include @code{yylval} and @code{yylloc}.
+Normally, Bison generates a parser which is not reentrant. This is
+suitable for most uses, and it permits compatibility with YACC. (The
+standard YACC interfaces are inherently nonreentrant, because they use
+statically allocated variables for communication with @code{yylex},
+including @code{yylval} and @code{yylloc}.)
-The Bison declaration @code{%pure_parser} says that you want the parser
-to be reentrant. It looks like this:
+Alternatively, you can generate a pure, reentrant parser. The Bison
+declaration @code{%pure_parser} says that you want the parser to be
+reentrant. It looks like this:
@example
%pure_parser
@end example
-The effect is that the two communication variables become local
-variables in @code{yyparse}, and a different calling convention is used
-for the lexical analyzer function @code{yylex}. @xref{Pure Calling,
-,Calling Conventions for Pure Parsers}, for the details of this. The
-variable @code{yynerrs} also becomes local in @code{yyparse}
-(@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}).
-The convention for calling @code{yyparse} itself is unchanged.
+The result is that the communication variables @code{yylval} and
+@code{yylloc} become local variables in @code{yyparse}, and a different
+calling convention is used for the lexical analyzer function
+@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
+Parsers}, for the details of this. The variable @code{yynerrs} also
+becomes local in @code{yyparse} (@pxref{Error Reporting, ,The Error
+Reporting Function @code{yyerror}}). The convention for calling
+@code{yyparse} itself is unchanged.
+
+Whether the parser is pure has nothing to do with the grammar rules.
+You can generate either a pure parser or a nonreentrant parser from any
+valid grammar.
@node Decl Summary, , Pure Decl, Declarations
@subsection Bison Declaration Summary
@item %pure_parser
Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
+
+@item %no_lines
+Don't generate any @code{#line} preprocessor commands in the parser
+file. Ordinarily Bison writes these commands in the parser file so that
+the C compiler and debuggers will associate errors and object code with
+your source file (the grammar file). This directive causes them to
+associate errors with the parser file, treating it as an independent
+source file in its own right.
+
+@item %raw
+The output file @file{@var{name}.h} normally defines the tokens with
+Yacc-compatible token numbers. If this option is specified, the
+internal Bison numbers are used instead. (Yacc-compatible numbers start
+at 257 except for single-character tokens; Bison assigns token numbers
+sequentially for all tokens starting at 3.)
+
+@item %token_table
+Generate an array of token names in the parser file. The name of the
+array is @code{yytname}; @code{yytname[@var{i}]} is the name of the
+token whose internal Bison token code number is @var{i}. The first three
+elements of @code{yytname} are always @code{"$"}, @code{"error"}, and
+@code{"$illegal"}; after these come the symbols defined in the grammar
+file.
+
+For single-character literal tokens and literal string tokens, the name
+in the table includes the single-quote or double-quote characters: for
+example, @code{"'+'"} is a single-character literal and @code{"\"<=\""}
+is a literal string token. All the characters of the literal string
+token appear verbatim in the string found in the table; even
+double-quote characters are not escaped. For example, if the token
+consists of three characters @samp{*"*}, its string in @code{yytname}
+contains @samp{"*"*"}. (In C, that would be written as
+@code{"\"*\"*\""}.)
+
+When you specify @code{%token_table}, Bison also generates
+definitions for the macros @code{YYNTOKENS}, @code{YYNNTS},
+@code{YYNRULES}, and @code{YYNSTATES}:
+
+@table @code
+@item YYNTOKENS
+The highest token number, plus one.
+@item YYNNTS
+The number of nonterminal symbols.
+@item YYNRULES
+The number of grammar rules.
+@item YYNSTATES
+The number of parser states (@pxref{Parser States}).
+@end table
@end table
-@node Multiple Parsers, , Declarations, Grammar File
+@node Multiple Parsers,, Declarations, Grammar File
@section Multiple Parsers in the Same Program
Most programs that use Bison parse only one language and therefore contain
@menu
* Parser Function:: How to call @code{yyparse} and what it returns.
-* Lexical:: You must supply a function @code{yylex}
+* Lexical:: You must supply a function @code{yylex}
which reads tokens.
* Error Reporting:: You must supply a function @code{yyerror}.
* Action Features:: Special features for use in actions.
Here is an example showing these things:
@example
-yylex ()
+int
+yylex (void)
@{
@dots{}
if (c == EOF) /* Detect end of file. */
This interface has been designed so that the output from the @code{lex}
utility can be used without change as the definition of @code{yylex}.
+If the grammar uses literal string tokens, there are two ways that
+@code{yylex} can determine the token type codes for them:
+
+@itemize @bullet
+@item
+If the grammar defines symbolic token names as aliases for the
+literal string tokens, @code{yylex} can use these symbolic names like
+all others. In this case, the use of the literal string tokens in
+the grammar file has no effect on @code{yylex}.
+
+@item
+@code{yylex} can find the multicharacter token in the @code{yytname}
+table. The index of the token in the table is the token type's code.
+The name of a multicharacter token is recorded in @code{yytname} with a
+double-quote, the token's characters, and another double-quote. The
+token's characters are not escaped in any way; they appear verbatim in
+the contents of the string in the table.
+
+Here's code for looking up a token in @code{yytname}, assuming that the
+characters of the token are stored in @code{token_buffer}.
+
+@smallexample
+for (i = 0; i < YYNTOKENS; i++)
+ @{
+ if (yytname[i] != 0
+ && yytname[i][0] == '"'
+ && strncmp (yytname[i] + 1, token_buffer,
+ strlen (token_buffer)) == 0
+ && yytname[i][strlen (token_buffer) + 1] == '"'
+ && yytname[i][strlen (token_buffer) + 2] == 0)
+ break;
+ @}
+@end smallexample
+
+The @code{yytname} table is generated only if you use the
+@code{%token_table} declaration. @xref{Decl Summary}.
+@end itemize
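
The lookup described above can be tried outside a generated parser. What follows is only a minimal sketch: @code{demo_yytname}, @code{DEMO_YYNTOKENS} and @code{demo_lookup_token} are hypothetical stand-ins for illustration, not names that Bison emits, and the table contents are invented. Note that @code{strncmp} returns zero on a match, so the comparison must be tested against zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the table that %token_table would generate.
   Literal string tokens are stored with surrounding double-quotes.  */
static const char *const demo_yytname[] = {
  "$", "error", "$undefined.", "NUM", "\"<=\"", "\"<\"", "\"==\"", NULL
};

/* Hypothetical stand-in for YYNTOKENS: number of entries before the NULL.  */
#define DEMO_YYNTOKENS 7

/* Return the table index of the literal string token whose characters
   are in TOKEN_BUFFER, or -1 if the table has no such token.  */
static int
demo_lookup_token (const char *token_buffer)
{
  size_t len;
  int i;

  assert (token_buffer != NULL);
  len = strlen (token_buffer);

  for (i = 0; i < DEMO_YYNTOKENS; i++)
    {
      const char *name = demo_yytname[i];

      if (name != NULL
          && name[0] == '"'
          /* strncmp yields 0 when the first LEN characters agree.  */
          && strncmp (name + 1, token_buffer, len) == 0
          /* Require the closing quote right after the token text,
             so that "<" does not match the entry for "<=".  */
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

With this sample table, @code{demo_lookup_token ("<")} skips the entry for @samp{<=} because the character after the matched text is not the closing double-quote. A symbolic name such as @code{NUM} is never found, since its entry does not begin with a double-quote.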
+
@node Token Values, Token Positions, Calling Convention, Lexical
@subsection Semantic Values of Tokens
pointers.
@example
-yylex (lvalp, llocp)
- YYSTYPE *lvalp;
- YYLTYPE *llocp;
+int
+yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
@{
@dots{}
*lvalp = value; /* Put value onto Bison stack. */
only one argument.
@vindex YYPARSE_PARAM
-You can pass parameter information to a reentrant parser in a reentrant
-way. Define the macro @code{YYPARSE_PARAM} as a variable name. The
-resulting @code{yyparse} function then accepts one argument, of type
-@code{void *}, with that name.
+If you use a reentrant parser, you can optionally pass additional
+parameter information to it in a reentrant way. To do so, define the
+macro @code{YYPARSE_PARAM} as a variable name. This modifies the
+@code{yyparse} function to accept one argument, of type @code{void *},
+with that name.
When you call @code{yyparse}, pass the address of an object, casting the
address to @code{void *}. The grammar actions can refer to the contents
the proper object type, or you can declare it as @code{void *} and
access the contents as shown above.
+You can use @samp{%pure_parser} to request a reentrant parser without
+also using @code{YYPARSE_PARAM}. Then you should call @code{yyparse}
+with no arguments, as usual.
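
As an illustration of the calling pattern with @code{YYPARSE_PARAM}, here is a small sketch that does not involve a generated parser at all. The names @code{struct parser_control}, @code{demo_yyparse} and @code{demo_run} are assumptions made for this example. @code{demo_yyparse} merely imitates a @code{yyparse} built with @code{YYPARSE_PARAM} defined as @code{parm}, and its body stands in for what a grammar action might do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-parse state supplied by the caller; not part of Bison.  */
struct parser_control
{
  int nastiness;
  int randomness;
};

/* Stand-in for a yyparse generated with `#define YYPARSE_PARAM parm':
   it accepts one argument of type `void *' with that name.  */
static int
demo_yyparse (void *parm)
{
  /* An action recovers the real object type by casting the pointer.  */
  struct parser_control *pc = (struct parser_control *) parm;

  assert (pc != NULL);
  pc->randomness += pc->nastiness;
  return 0;                     /* 0 reports success, as yyparse does.  */
}

/* Caller side: pass the address of the object, cast to `void *'.  */
static int
demo_run (void)
{
  struct parser_control foo = { 3, 4 };
  int status = demo_yyparse ((void *) &foo);

  return status == 0 ? foo.randomness : -1;
}
```

Because all of the state travels through the argument, no global variables are shared between calls, which is the property a reentrant parser needs.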
+
@node Error Reporting, Action Features, Lexical, Interface
@section The Error Reporting Function @code{yyerror}
@cindex error reporting function
@cindex syntax error
The Bison parser detects a @dfn{parse error} or @dfn{syntax error}
-whenever it reads a token which cannot satisfy any syntax rule. A
+whenever it reads a token which cannot satisfy any syntax rule. An
action in the grammar can also explicitly proclaim an error, using the
-macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use in Actions}).
+macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use
+in Actions}).
The Bison parser expects to report the error by calling an error
reporting function named @code{yyerror}, which you must supply. It is
@findex YYERROR_VERBOSE
If you define the macro @code{YYERROR_VERBOSE} in the Bison declarations
-section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then Bison provides a more verbose
-and specific error message string instead of just plain @w{@code{"parse
-error"}}. It doesn't matter what definition you use for
-@code{YYERROR_VERBOSE}, just whether you define it.
+section (@pxref{Bison Declarations, ,The Bison Declarations Section}),
+then Bison provides a more verbose and specific error message string
+instead of just plain @w{@code{"parse error"}}. It doesn't matter what
+definition you use for @code{YYERROR_VERBOSE}, just whether you define
+it.
The parser can detect one other kind of error: stack overflow. This
happens when the input contains constructions that are very deeply
@example
@group
-yyerror (s)
- char *s;
+void
+yyerror (char *s)
@{
@end group
@group
@item $<@var{typealt}>@var{n}
Like @code{$@var{n}} but specifies alternative @var{typealt} in the
-union specified by the @code{%union} declaration.
+union specified by the @code{%union} declaration.
@xref{Action Types, ,Data Types of Values in Actions}.@refill
@item YYABORT;
@item yyerrok;
Resume generating error messages immediately for subsequent syntax
-errors. This is useful primarily in error rules.
+errors. This is useful primarily in error rules.
@xref{Error Recovery}.
@item @@@var{n}
@};
@end example
-Thus, to get the starting line number of the third component, use
-@samp{@@3.first_line}.
+Thus, to get the starting line number of the third component, you would
+use @samp{@@3.first_line}.
In order for the members of this structure to contain valid information,
you must make @code{yylex} supply this information about each token.
@end table
@node Algorithm, Error Recovery, Interface, Top
-@chapter The Bison Parser Algorithm
-@cindex Bison parser algorithm
+@chapter The Bison Parser Algorithm
+@cindex Bison parser algorithm
@cindex algorithm of parser
@cindex shifting
@cindex reduction
@noindent
Suppose the parser has seen the tokens @samp{1}, @samp{-} and @samp{2};
-should it reduce them via the rule for the addition operator? It depends
+should it reduce them via the rule for the subtraction operator? It depends
on the next token. Of course, if the next token is @samp{)}, we must
reduce; shifting is invalid because no single rule can reduce the token
sequence @w{@samp{- 2 )}} or anything starting with that. But if the next
To decide which one Bison should do, we must consider the
results. If the next operator token @var{op} is shifted, then it
must be reduced first in order to permit another opportunity to
-reduce the sum. The result is (in effect) @w{@samp{1 - (2
+reduce the difference. The result is (in effect) @w{@samp{1 - (2
@var{op} 3)}}. On the other hand, if the subtraction is reduced
before shifting @var{op}, the result is @w{@samp{(1 - 2) @var{op}
3}}. Clearly, then, the choice of shift or reduce should depend
@end example
It would seem that this grammar can be parsed with only a single token
-of look-ahead: when a @code{param_spec} is being read, an @code{ID} is
+of look-ahead: when a @code{param_spec} is being read, an @code{ID} is
a @code{name} if a comma or colon follows, or a @code{type} if another
@code{ID} follows. In other words, this grammar is LR(1).
is always defined (you need not declare it) and reserved for error
handling. The Bison parser generates an @code{error} token whenever a
syntax error happens; if you have provided a rule to recognize this token
-in the current context, the parse can continue.
+in the current context, the parse can continue.
For example:
Unfortunately, the name being declared is separated from the declaration
construct itself by a complicated syntactic structure---the ``declarator''.
-As a result, the part of Bison parser for C needs to be duplicated, with
+As a result, part of the Bison parser for C needs to be duplicated, with
all the nonterminal names changed: once for parsing a declaration in which
a typedef name can be redefined, and once for parsing a declaration in
which that can't be done. Here is a part of the duplication, with actions
with letters are parsed as integers if possible.
The declaration of @code{hexflag} shown in the C declarations section of
-the parser file is needed to make it accessible to the actions
+the parser file is needed to make it accessible to the actions
(@pxref{C Declarations, ,The C Declarations Section}). You must also write the code in @code{yylex}
to obey the flag.
To enable compilation of trace facilities, you must define the macro
@code{YYDEBUG} when you compile the parser. You could use
@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define
-YYDEBUG 1} in the C declarations section of the grammar file
+YYDEBUG 1} in the C declarations section of the grammar file
(@pxref{C Declarations, ,The C Declarations Section}). Alternatively, use the @samp{-t} option when
you run Bison (@pxref{Invocation, ,Invoking Bison}). We always define @code{YYDEBUG} so that
debugging is always possible.
#define YYPRINT(file, type, value) yyprint (file, type, value)
static void
-yyprint (file, type, value)
- FILE *file;
- int type;
- YYSTYPE value;
+yyprint (FILE *file, int type, YYSTYPE value)
@{
if (type == VAR)
fprintf (file, " %s", value.tptr->name);
@file{hack/foo.tab.c}.@refill
@menu
-* Bison Options:: All the options described in detail,
+* Bison Options:: All the options described in detail,
in alphabetical order by short options.
+* Environment Variables:: Variables which affect Bison execution.
* Option Cross Key:: Alphabetical list of long options.
* VMS Invocation:: Bison command syntax on VMS.
@end menu
-@node Bison Options, Option Cross Key, , Invocation
+@node Bison Options, Environment Variables, , Invocation
@section Bison Options
Bison supports both traditional single-letter options and mnemonic long
Ordinarily Bison puts them in the parser file so that the C compiler
and debuggers will associate errors with your source file, the
grammar file. This option causes them to associate errors with the
-parser file, treating it an independent source file in its own right.
+parser file, treating it as an independent source file in its own right.
+
+@item -n
+@itemx --no-parser
+Do not include any C code in the parser file; generate tables only. The
+parser file contains just @code{#define} directives and static variable
+declarations.
+
+This option also tells Bison to write the C code for the grammar actions
+into a file named @file{@var{filename}.act}, in the form of a
+brace-surrounded body fit for a @code{switch} statement.
@item -o @var{outfile}
@itemx --output-file=@var{outfile}
Specify the name @var{outfile} for the parser file.
The other output files' names are constructed from @var{outfile}
-as described under the @samp{-v} and @samp{-d} switches.
+as described under the @samp{-v} and @samp{-d} options.
@item -p @var{prefix}
@itemx --name-prefix=@var{prefix}
@xref{Multiple Parsers, ,Multiple Parsers in the Same Program}.
+@item -r
+@itemx --raw
+Pretend that @code{%raw} was specified. @xref{Decl Summary}.
+
@item -t
@itemx --debug
Output a definition of the macro @code{YYDEBUG} into the parser file,
@itemx --fixed-output-files
Equivalent to @samp{-o y.tab.c}; the parser output file is called
@file{y.tab.c}, and the other outputs are called @file{y.output} and
-@file{y.tab.h}. The purpose of this switch is to imitate Yacc's output
+@file{y.tab.h}. The purpose of this option is to imitate Yacc's output
file name conventions. Thus, the following shell script can substitute
for Yacc:@refill
@end example
@end table
-@node Option Cross Key, VMS Invocation, Bison Options, Invocation
+@node Environment Variables, Option Cross Key, Bison Options, Invocation
+@section Environment Variables
+@cindex environment variables
+@cindex BISON_HAIRY
+@cindex BISON_SIMPLE
+
+Here is a list of environment variables which affect the way Bison
+runs.
+
+@table @samp
+@item BISON_SIMPLE
+@itemx BISON_HAIRY
+Much of the parser generated by Bison is copied verbatim from a file
+called @file{bison.simple}. If Bison cannot find that file, or if you
+would like to direct Bison to use a different copy, setting the
+environment variable @code{BISON_SIMPLE} to the path of the file will
+cause Bison to use that copy instead.
+
+When the @samp{%semantic_parser} declaration is used, Bison copies from
+a file called @file{bison.hairy} instead. The location of this file can
+also be specified or overridden in a similar fashion, with the
+@code{BISON_HAIRY} environment variable.
+
+@end table
+
+@node Option Cross Key, VMS Invocation, Environment Variables, Invocation
@section Option Cross Key
Here is a list of options, alphabetized by long option, to help you find
\line{ --help \leaderfill -h}
\line{ --name-prefix \leaderfill -p}
\line{ --no-lines \leaderfill -l}
+\line{ --no-parser \leaderfill -n}
\line{ --output-file \leaderfill -o}
+\line{ --raw \leaderfill -r}
+\line{ --token-table \leaderfill -k}
\line{ --verbose \leaderfill -v}
\line{ --version \leaderfill -V}
\line{ --yacc \leaderfill -y}
--file-prefix=@var{prefix} -b @var{file-prefix}
--fixed-output-files --yacc -y
--help -h
---name-prefix -p
+--name-prefix=@var{prefix} -p @var{name-prefix}
--no-lines -l
+--no-parser -n
--output-file=@var{outfile} -o @var{outfile}
+--raw -r
+--token-table -k
--verbose -v
--version -V
@end example
@item YYABORT
Macro to pretend that an unrecoverable syntax error has occurred, by
making @code{yyparse} return 1 immediately. The error reporting
-function @code{yyerror} is not called. @xref{Parser Function, ,The Parser Function @code{yyparse}}.
+function @code{yyerror} is not called. @xref{Parser Function, ,The
+Parser Function @code{yyparse}}.
@item YYACCEPT
Macro to pretend that a complete utterance of the language has been
-read, by making @code{yyparse} return 0 immediately.
+read, by making @code{yyparse} return 0 immediately.
@xref{Parser Function, ,The Parser Function @code{yyparse}}.
@item YYBACKUP
Macro for the data type of @code{yylloc}; a structure with four
members. @xref{Token Positions, ,Textual Positions of Tokens}.
+@item yyltype
+Default value for @code{YYLTYPE}.
+
@item YYMAXDEPTH
Macro for specifying the maximum size of the parser stack.
@xref{Stack Overflow}.
@xref{Value Type, ,Data Types of Semantic Values}.
@item yychar
-External integer variable that contains the integer value of the
-current look-ahead token. (In a pure parser, it is a local variable
-within @code{yyparse}.) Error-recovery rule actions may examine this
-variable. @xref{Action Features, ,Special Features for Use in Actions}.
+External integer variable that contains the integer value of the current
+look-ahead token. (In a pure parser, it is a local variable within
+@code{yyparse}.) Error-recovery rule actions may examine this variable.
+@xref{Action Features, ,Special Features for Use in Actions}.
@item yyclearin
Macro used in error-recovery rule actions. It clears the previous
@item yyerror
User-supplied function to be called by @code{yyparse} on error. The
function receives one argument, a pointer to a character string
-containing an error message. @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+containing an error message. @xref{Error Reporting, ,The Error
+Reporting Function @code{yyerror}}.
@item yylex
User-supplied lexical analyzer function, called with no arguments
@code{yylex}.) @xref{Token Values, ,Semantic Values of Tokens}.
@item yylloc
-External variable in which @code{yylex} should place the line and
-column numbers associated with a token. (In a pure parser, it is a
-local variable within @code{yyparse}, and its address is passed to
+External variable in which @code{yylex} should place the line and column
+numbers associated with a token. (In a pure parser, it is a local
+variable within @code{yyparse}, and its address is passed to
@code{yylex}.) You can ignore this variable if you don't use the
-@samp{@@} feature in the grammar actions. @xref{Token Positions, ,Textual Positions of Tokens}.
+@samp{@@} feature in the grammar actions. @xref{Token Positions,
+,Textual Positions of Tokens}.
@item yynerrs
-Global variable which Bison increments each time there is a parse
-error. (In a pure parser, it is a local variable within
-@code{yyparse}.) @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+Global variable which Bison increments each time there is a parse error.
+(In a pure parser, it is a local variable within @code{yyparse}.)
+@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
@item yyparse
The parser function produced by Bison; call this function to start
Bison declaration to assign left associativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
+@item %no_lines
+Bison declaration to avoid generating @code{#line} directives in the
+parser file. @xref{Decl Summary}.
+
@item %nonassoc
Bison declaration to assign nonassociativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
Bison declaration to request a pure (reentrant) parser.
@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
+@item %raw
+Bison declaration to use Bison internal token code numbers in token
+tables instead of the usual Yacc-compatible token code numbers.
+@xref{Decl Summary}.
+
@item %right
Bison declaration to assign right associativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
Bison declaration to declare token(s) without specifying precedence.
@xref{Token Decl, ,Token Type Names}.
+@item %token_table
+Bison declaration to include a token name table in the parser file.
+@xref{Decl Summary}.
+
@item %type
Bison declaration to declare nonterminals. @xref{Type Decl, ,Nonterminal Symbols}.
@item Grouping
A language construct that is (in general) grammatically divisible;
-for example, `expression' or `declaration' in C.
+for example, `expression' or `declaration' in C.
@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
@item Infix operator
A flag, set by actions in the grammar rules, which alters the way
tokens are parsed. @xref{Lexical Tie-ins}.
+@item Literal string token
+A token which consists of two or more fixed characters.
+@xref{Symbols}.
+
@item Look-ahead token
A token already read but not yet shifted. @xref{Look-Ahead, ,Look-Ahead Tokens}.
@item Start symbol
The nonterminal symbol that stands for a complete valid utterance in
the language being parsed. The start symbol is usually listed as the
-first nonterminal symbol in a language specification.
+first nonterminal symbol in a language specification.
@xref{Start Decl, ,The Start-Symbol}.
@item Symbol table
@printindex cp
-@contents
-
@bye
-
-
-\f
-
-@c old menu
-
-* Introduction::
-* Conditions::
-* Copying:: The GNU General Public License says
- how you can copy and share Bison
-
-Tutorial sections:
-* Concepts:: Basic concepts for understanding Bison.
-* Examples:: Three simple explained examples of using Bison.
-
-Reference sections:
-* Grammar File:: Writing Bison declarations and rules.
-* Interface:: C-language interface to the parser function @code{yyparse}.
-* Algorithm:: How the Bison parser works at run-time.
-* Error Recovery:: Writing rules for error recovery.
-* Context Dependency::What to do if your language syntax is too
- messy for Bison to handle straightforwardly.
-* Debugging:: Debugging Bison parsers that parse wrong.
-* Invocation:: How to run Bison (to produce the parser source file).
-* Table of Symbols:: All the keywords of the Bison language are explained.
-* Glossary:: Basic concepts are explained.
-* Index:: Cross-references to the text.
-