	1	\input texinfo @c --texinfo--
	2	@comment %**start of header
	3	@setfilename bison.info
	4	@include version.texi
	5	@settitle Bison @value{VERSION}
	6	@setchapternewpage odd
	7
	8	@finalout
	9
	10	@c SMALL BOOK version
	11	@c This edition has been formatted so that you can format and print it in
	12	@c the smallbook format.
	13	@c @smallbook
	14
	15	@c Set following if you have the new `shorttitlepage' command
	16	@c @clear shorttitlepage-enabled
	17	@c @set shorttitlepage-enabled
	18
	19	@c Set following if you want to document %default-prec and %no-default-prec.
	20	@c This feature is experimental and may change in future Bison versions.
	21	@c @set defaultprec
	22
	23	@c ISPELL CHECK: done, 14 Jan 1993 --bob
	24
	25	@c Check COPYRIGHT dates. should be updated in the titlepage, ifinfo
	26	@c titlepage; should NOT be changed in the GPL. --mew
	27
	28	@c FIXME: I don't understand this `iftex'. Obsolete? --akim.
	29	@iftex
	30	@syncodeindex fn cp
	31	@syncodeindex vr cp
	32	@syncodeindex tp cp
	33	@end iftex
	34	@ifinfo
	35	@synindex fn cp
	36	@synindex vr cp
	37	@synindex tp cp
	38	@end ifinfo
	39	@comment %**end of header
	40
	41	@copying
	42
	43	This manual is for @acronym{GNU} Bison (version @value{VERSION},
	44	@value{UPDATED}), the @acronym{GNU} parser generator.
	45
	46	Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998,
	47	1999, 2000, 2001, 2002, 2003, 2004 Free Software Foundation, Inc.
	48
	49	@quotation
	50	Permission is granted to copy, distribute and/or modify this document
	51	under the terms of the @acronym{GNU} Free Documentation License,
	52	Version 1.1 or any later version published by the Free Software
	53	Foundation; with no Invariant Sections, with the Front-Cover texts
	54	being ``A @acronym{GNU} Manual,'' and with the Back-Cover Texts as in
	55	(a) below. A copy of the license is included in the section entitled
	56	``@acronym{GNU} Free Documentation License.''
	57
	58	(a) The @acronym{FSF}'s Back-Cover Text is: ``You have freedom to copy
	59	and modify this @acronym{GNU} Manual, like @acronym{GNU} software.
	60	Copies published by the Free Software Foundation raise funds for
	61	@acronym{GNU} development.''
	62	@end quotation
	63	@end copying
	64
	65	@dircategory GNU programming tools
	66	@direntry
	67	* bison: (bison). @acronym{GNU} parser generator (Yacc replacement).
	68	@end direntry
	69
	70	@ifset shorttitlepage-enabled
	71	@shorttitlepage Bison
	72	@end ifset
	73	@titlepage
	74	@title Bison
	75	@subtitle The Yacc-compatible Parser Generator
	76	@subtitle @value{UPDATED}, Bison Version @value{VERSION}
	77
	78	@author by Charles Donnelly and Richard Stallman
	79
	80	@page
	81	@vskip 0pt plus 1filll
	82	@insertcopying
	83	@sp 2
	84	Published by the Free Software Foundation @*
	85	59 Temple Place, Suite 330 @*
	86	Boston, MA 02111-1307 USA @*
	87	Printed copies are available from the Free Software Foundation.@*
	88	@acronym{ISBN} 1-882114-44-2
	89	@sp 2
	90	Cover art by Etienne Suvasa.
	91	@end titlepage
	92
	93	@contents
	94
	95	@ifnottex
	96	@node Top
	97	@top Bison
	98	@insertcopying
	99	@end ifnottex
	100
	101	@menu
	102	* Introduction::
	103	* Conditions::
	104	* Copying:: The @acronym{GNU} General Public License says
	105	how you can copy and share Bison
	106
	107	Tutorial sections:
	108	* Concepts:: Basic concepts for understanding Bison.
	109	* Examples:: Three simple explained examples of using Bison.
	110
	111	Reference sections:
	112	* Grammar File:: Writing Bison declarations and rules.
	113	* Interface:: C-language interface to the parser function @code{yyparse}.
	114	* Algorithm:: How the Bison parser works at run-time.
	115	* Error Recovery:: Writing rules for error recovery.
	116	* Context Dependency:: What to do if your language syntax is too
	117	messy for Bison to handle straightforwardly.
	118	* Debugging:: Understanding or debugging Bison parsers.
	119	* Invocation:: How to run Bison (to produce the parser source file).
	120	* Table of Symbols:: All the keywords of the Bison language are explained.
	121	* Glossary:: Basic concepts are explained.
	122	* FAQ:: Frequently Asked Questions
	123	* Copying This Manual:: License for copying this manual.
	124	* Index:: Cross-references to the text.
	125
	126	@detailmenu
	127	--- The Detailed Node Listing ---
	128
	129	The Concepts of Bison
	130
	131	* Language and Grammar:: Languages and context-free grammars,
	132	as mathematical ideas.
	133	* Grammar in Bison:: How we represent grammars for Bison's sake.
	134	* Semantic Values:: Each token or syntactic grouping can have
	135	a semantic value (the value of an integer,
	136	the name of an identifier, etc.).
	137	* Semantic Actions:: Each rule can have an action containing C code.
	138	* GLR Parsers:: Writing parsers for general context-free languages
	139	* Locations Overview:: Tracking Locations.
	140	* Bison Parser:: What are Bison's input and output,
	141	how is the output used?
	142	* Stages:: Stages in writing and running Bison grammars.
	143	* Grammar Layout:: Overall structure of a Bison grammar file.
	144
	145	Examples
	146
	147	* RPN Calc:: Reverse Polish notation calculator;
	148	a first example with no operator precedence.
	149	* Infix Calc:: Infix (algebraic) notation calculator.
	150	Operator precedence is introduced.
	151	* Simple Error Recovery:: Continuing after syntax errors.
	152	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	153	* Multi-function Calc:: Calculator with memory and trig functions.
	154	It uses multiple data-types for semantic values.
	155	* Exercises:: Ideas for improving the multi-function calculator.
	156
	157	Reverse Polish Notation Calculator
	158
	159	* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc.
	160	* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation.
	161	* Lexer: Rpcalc Lexer. The lexical analyzer.
	162	* Main: Rpcalc Main. The controlling function.
	163	* Error: Rpcalc Error. The error reporting function.
	164	* Gen: Rpcalc Gen. Running Bison on the grammar file.
	165	* Comp: Rpcalc Compile. Run the C compiler on the output code.
	166
	167	Grammar Rules for @code{rpcalc}
	168
	169	* Rpcalc Input::
	170	* Rpcalc Line::
	171	* Rpcalc Expr::
	172
	173	Location Tracking Calculator: @code{ltcalc}
	174
	175	* Decls: Ltcalc Decls. Bison and C declarations for ltcalc.
	176	* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations.
	177	* Lexer: Ltcalc Lexer. The lexical analyzer.
	178
	179	Multi-Function Calculator: @code{mfcalc}
	180
	181	* Decl: Mfcalc Decl. Bison declarations for multi-function calculator.
	182	* Rules: Mfcalc Rules. Grammar rules for the calculator.
	183	* Symtab: Mfcalc Symtab. Symbol table management subroutines.
	184
	185	Bison Grammar Files
	186
	187	* Grammar Outline:: Overall layout of the grammar file.
	188	* Symbols:: Terminal and nonterminal symbols.
	189	* Rules:: How to write grammar rules.
	190	* Recursion:: Writing recursive rules.
	191	* Semantics:: Semantic values and actions.
	192	* Locations:: Locations and actions.
	193	* Declarations:: All kinds of Bison declarations are described here.
	194	* Multiple Parsers:: Putting more than one Bison parser in one program.
	195
	196	Outline of a Bison Grammar
	197
	198	* Prologue:: Syntax and usage of the prologue.
	199	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	200	* Grammar Rules:: Syntax and usage of the grammar rules section.
	201	* Epilogue:: Syntax and usage of the epilogue.
	202
	203	Defining Language Semantics
	204
	205	* Value Type:: Specifying one data type for all semantic values.
	206	* Multiple Types:: Specifying several alternative data types.
	207	* Actions:: An action is the semantic definition of a grammar rule.
	208	* Action Types:: Specifying data types for actions to operate on.
	209	* Mid-Rule Actions:: Most actions go at the end of a rule.
	210	This says when, why and how to use the exceptional
	211	action in the middle of a rule.
	212
	213	Tracking Locations
	214
	215	* Location Type:: Specifying a data type for locations.
	216	* Actions and Locations:: Using locations in actions.
	217	* Location Default Action:: Defining a general way to compute locations.
	218
	219	Bison Declarations
	220
	221	* Token Decl:: Declaring terminal symbols.
	222	* Precedence Decl:: Declaring terminals with precedence and associativity.
	223	* Union Decl:: Declaring the set of all semantic value types.
	224	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	225	* Destructor Decl:: Declaring how symbols are freed.
	226	* Expect Decl:: Suppressing warnings about parsing conflicts.
	227	* Start Decl:: Specifying the start symbol.
	228	* Pure Decl:: Requesting a reentrant parser.
	229	* Decl Summary:: Table of all Bison declarations.
	230
	231	Parser C-Language Interface
	232
	233	* Parser Function:: How to call @code{yyparse} and what it returns.
	234	* Lexical:: You must supply a function @code{yylex}
	235	which reads tokens.
	236	* Error Reporting:: You must supply a function @code{yyerror}.
	237	* Action Features:: Special features for use in actions.
	238
	239	The Lexical Analyzer Function @code{yylex}
	240
	241	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	242	* Token Values:: How @code{yylex} must return the semantic value
	243	of the token it has read.
	244	* Token Locations:: How @code{yylex} must return the text location
	245	(line number, etc.) of the token, if the
	246	actions want that.
	247	* Pure Calling:: How the calling convention differs
	248	in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	249
	250	The Bison Parser Algorithm
	251
	252	* Look-Ahead:: Parser looks one token ahead when deciding what to do.
	253	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	254	* Precedence:: Operator precedence works by resolving conflicts.
	255	* Contextual Precedence:: When an operator's precedence depends on context.
	256	* Parser States:: The parser is a finite-state-machine with stack.
	257	* Reduce/Reduce:: When two rules are applicable in the same situation.
	258	* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
	259	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	260	* Stack Overflow:: What happens when stack gets full. How to avoid it.
	261
	262	Operator Precedence
	263
	264	* Why Precedence:: An example showing why precedence is needed.
	265	* Using Precedence:: How to specify precedence in Bison grammars.
	266	* Precedence Examples:: How these features are used in the previous example.
	267	* How Precedence:: How they work.
	268
	269	Handling Context Dependencies
	270
	271	* Semantic Tokens:: Token parsing can depend on the semantic context.
	272	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	273	* Tie-in Recovery:: Lexical tie-ins have implications for how
	274	error recovery rules must be written.
	275
	276	Debugging Your Parser
	277
	278	* Understanding:: Understanding the structure of your parser.
	279	* Tracing:: Tracing the execution of your parser.
	280
	281	Invoking Bison
	282
	283	* Bison Options:: All the options described in detail,
	284	in alphabetical order by short options.
	285	* Option Cross Key:: Alphabetical list of long options.
	286	* Yacc Library:: Yacc-compatible @code{yylex} and @code{main}.
	287
	288	Frequently Asked Questions
	289
	290	* Parser Stack Overflow:: Breaking the Stack Limits
	291	* How Can I Reset the Parser:: @code{yyparse} Keeps some State
	292	* Strings are Destroyed:: @code{yylval} Loses Track of Strings
	293	* C++ Parsers:: Compiling Parsers with C++ Compilers
	294	* Implementing Loops:: Control Flow in the Calculator
	295
	296	Copying This Manual
	297
	298	* GNU Free Documentation License:: License for copying this manual.
	299
	300	@end detailmenu
	301	@end menu
	302
	303	@node Introduction
	304	@unnumbered Introduction
	305	@cindex introduction
	306
	307	@dfn{Bison} is a general-purpose parser generator that converts a
	308	grammar description for an @acronym{LALR}(1) context-free grammar into a C
	309	program to parse that grammar. Once you are proficient with Bison,
	310	you may use it to develop a wide range of language parsers, from those
	311	used in simple desk calculators to complex programming languages.
	312
	313	Bison is upward compatible with Yacc: all properly-written Yacc grammars
	314	ought to work with Bison with no change. Anyone familiar with Yacc
	315	should be able to use Bison with little trouble. You need to be fluent in
	316	C programming in order to use Bison or to understand this manual.
	317
	318	We begin with tutorial chapters that explain the basic concepts of using
	319	Bison and show three explained examples, each building on the last. If you
	320	don't know Bison or Yacc, start by reading these chapters. Reference
	321	chapters follow which describe specific aspects of Bison in detail.
	322
	323	Bison was written primarily by Robert Corbett; Richard Stallman made it
	324	Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added
	325	multi-character string literals and other features.
	326
	327	This edition corresponds to version @value{VERSION} of Bison.
	328
	329	@node Conditions
	330	@unnumbered Conditions for Using Bison
	331
	332	As of Bison version 1.24, we have changed the distribution terms for
	333	@code{yyparse} to permit using Bison's output in nonfree programs when
	334	Bison is generating C code for @acronym{LALR}(1) parsers. Formerly, these
	335	parsers could be used only in programs that were free software.
	336
	337	The other @acronym{GNU} programming tools, such as the @acronym{GNU} C
	338	compiler, have never
	339	had such a requirement. They could always be used for nonfree
	340	software. The reason Bison was different was not due to a special
	341	policy decision; it resulted from applying the usual General Public
	342	License to all of the Bison source code.
	343
	344	The output of the Bison utility---the Bison parser file---contains a
	345	verbatim copy of a sizable piece of Bison, which is the code for the
	346	@code{yyparse} function. (The actions from your grammar are inserted
	347	into this function at one point, but the rest of the function is not
	348	changed.) When we applied the @acronym{GPL} terms to the code for
	349	@code{yyparse},
	350	the effect was to restrict the use of Bison output to free software.
	351
	352	We didn't change the terms because of sympathy for people who want to
	353	make software proprietary. @strong{Software should be free.} But we
	354	concluded that limiting Bison's use to free software was doing little to
	355	encourage people to make other software free. So we decided to make the
	356	practical conditions for using Bison match the practical conditions for
	357	using the other @acronym{GNU} tools.
	358
	359	This exception applies only when Bison is generating C code for an
	360	@acronym{LALR}(1) parser; otherwise, the @acronym{GPL} terms operate
	361	as usual. You can
	362	tell whether the exception applies to your @samp{.c} output file by
	363	inspecting it to see whether it says ``As a special exception, when
	364	this file is copied by Bison into a Bison output file, you may use
	365	that output file without restriction.''
	366
	367	@include gpl.texi
	368
	369	@node Concepts
	370	@chapter The Concepts of Bison
	371
	372	This chapter introduces many of the basic concepts without which the
	373	details of Bison will not make sense. If you do not already know how to
	374	use Bison or Yacc, we suggest you start by reading this chapter carefully.
	375
	376	@menu
	377	* Language and Grammar:: Languages and context-free grammars,
	378	as mathematical ideas.
	379	* Grammar in Bison:: How we represent grammars for Bison's sake.
	380	* Semantic Values:: Each token or syntactic grouping can have
	381	a semantic value (the value of an integer,
	382	the name of an identifier, etc.).
	383	* Semantic Actions:: Each rule can have an action containing C code.
	384	* GLR Parsers:: Writing parsers for general context-free languages
	385	* Locations Overview:: Tracking Locations.
	386	* Bison Parser:: What are Bison's input and output,
	387	how is the output used?
	388	* Stages:: Stages in writing and running Bison grammars.
	389	* Grammar Layout:: Overall structure of a Bison grammar file.
	390	@end menu
	391
	392	@node Language and Grammar
	393	@section Languages and Context-Free Grammars
	394
	395	@cindex context-free grammar
	396	@cindex grammar, context-free
	397	In order for Bison to parse a language, it must be described by a
	398	@dfn{context-free grammar}. This means that you specify one or more
	399	@dfn{syntactic groupings} and give rules for constructing them from their
	400	parts. For example, in the C language, one kind of grouping is called an
	401	`expression'. One rule for making an expression might be, ``An expression
	402	can be made of a minus sign and another expression''. Another would be,
	403	``An expression can be an integer''. As you can see, rules are often
	404	recursive, but there must be at least one rule which leads out of the
	405	recursion.
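
As a rough illustration of the two informal rules above, here is a
hand-written recognizer in C@. It is only a sketch: the function names
@code{match_expression} and @code{is_expression} are made up for this
example, and a parser generated by Bison does not work this way
internally. The point is just that the rule for a minus sign calls
itself, while the rule for an integer is what leads out of the recursion.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical recognizer for the two informal rules:
     expression -> '-' expression
     expression -> integer
   Return a pointer just past the expression, or NULL if the text
   at S does not start with one.  */
const char *
match_expression (const char *s)
{
  while (*s == ' ')
    s++;
  if (*s == '-')
    /* Recursive rule: a minus sign followed by another expression.  */
    return match_expression (s + 1);
  if (isdigit ((unsigned char) *s))
    {
      /* Non-recursive rule: an integer.  This ends the recursion.  */
      while (isdigit ((unsigned char) *s))
        s++;
      return s;
    }
  return NULL;
}

/* Return 1 if the whole string S is a single expression, 0 otherwise.  */
int
is_expression (const char *s)
{
  const char *end = match_expression (s);
  return end != NULL && *end == '\0';
}
```

With these definitions, @code{is_expression ("--7")} succeeds by applying
the recursive rule twice and the integer rule once, whereas
@code{is_expression ("-")} fails because no rule leads out of the
recursion.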
	406
	407	@cindex @acronym{BNF}
	408	@cindex Backus-Naur form
	409	The most common formal system for presenting such rules for humans to read
	410	is @dfn{Backus-Naur Form} or ``@acronym{BNF}'', which was developed in
	411	order to specify the language Algol 60. Any grammar expressed in
	412	@acronym{BNF} is a context-free grammar. The input to Bison is
	413	essentially machine-readable @acronym{BNF}.
	414
	415	@cindex @acronym{LALR}(1) grammars
	416	@cindex @acronym{LR}(1) grammars
	417	There are various important subclasses of context-free grammar. Although it
	418	can handle almost all context-free grammars, Bison is optimized for what
	419	are called @acronym{LALR}(1) grammars.
	420	In brief, in these grammars, it must be possible to
	421	tell how to parse any portion of an input string with just a single
	422	token of look-ahead. Strictly speaking, that is a description of an
	423	@acronym{LR}(1) grammar, and @acronym{LALR}(1) involves additional
	424	restrictions that are
	425	hard to explain simply; but it is rare in actual practice to find an
	426	@acronym{LR}(1) grammar that fails to be @acronym{LALR}(1).
	427	@xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}, for
	428	more information on this.
	429
	430	@cindex @acronym{GLR} parsing
	431	@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
	432	@cindex ambiguous grammars
	433	@cindex non-deterministic parsing
	434
	435	Parsers for @acronym{LALR}(1) grammars are @dfn{deterministic}, meaning
	436	roughly that the next grammar rule to apply at any point in the input is
	437	uniquely determined by the preceding input and a fixed, finite portion
	438	(called a @dfn{look-ahead}) of the remaining input. A context-free
	439	grammar can be @dfn{ambiguous}, meaning that there are multiple ways to
	440	apply the grammar rules to get the same inputs. Even unambiguous
	441	grammars can be @dfn{non-deterministic}, meaning that no fixed
	442	look-ahead always suffices to determine the next grammar rule to apply.
	443	With the proper declarations, Bison is also able to parse these more
	444	general context-free grammars, using a technique known as @acronym{GLR}
	445	parsing (for Generalized @acronym{LR}). Bison's @acronym{GLR} parsers
	446	are able to handle any context-free grammar for which the number of
	447	possible parses of any given string is finite.
	448
	449	@cindex symbols (abstract)
	450	@cindex token
	451	@cindex syntactic grouping
	452	@cindex grouping, syntactic
	453	In the formal grammatical rules for a language, each kind of syntactic
	454	unit or grouping is named by a @dfn{symbol}. Those which are built by
	455	grouping smaller constructs according to grammatical rules are called
	456	@dfn{nonterminal symbols}; those which can't be subdivided are called
	457	@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
	458	corresponding to a single terminal symbol a @dfn{token}, and a piece
	459	corresponding to a single nonterminal symbol a @dfn{grouping}.
	460
	461	We can use the C language as an example of what symbols, terminal and
	462	nonterminal, mean. The tokens of C are identifiers, constants (numeric
	463	and string), and the various keywords, arithmetic operators and
	464	punctuation marks. So the terminal symbols of a grammar for C include
	465	`identifier', `number', `string', plus one symbol for each keyword,
	466	operator or punctuation mark: `if', `return', `const', `static', `int',
	467	`char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
	468	(These tokens can be subdivided into characters, but that is a matter of
	469	lexicography, not grammar.)
	470
	471	Here is a simple C function subdivided into tokens:
	472
	473	@ifinfo
	474	@example
	475	int             /* @r{keyword `int'} */
	476	square (int x)  /* @r{identifier, open-paren, keyword `int',}
	477	                   @r{identifier, close-paren} */
	478	@{               /* @r{open-brace} */
	479	  return x * x; /* @r{keyword `return', identifier, asterisk,}
	480	                   @r{identifier, semicolon} */
	481	@}               /* @r{close-brace} */
	482	@end example
	483	@end ifinfo
	484	@ifnotinfo
	485	@example
	486	int             /* @r{keyword `int'} */
	487	square (int x)  /* @r{identifier, open-paren, keyword `int', identifier, close-paren} */
	488	@{               /* @r{open-brace} */
	489	  return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */
	490	@}               /* @r{close-brace} */
	491	@end example
	492	@end ifnotinfo
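
The subdivision into tokens can be sketched in C as follows. This is a
minimal illustration, not the lexical analyzer of any real compiler: the
names @code{token_kind}, @code{next_token} and @code{count_tokens} are
assumptions made for this example, only the keywords @code{int} and
@code{return} are recognized, and every character that is not part of a
word or a number is treated as a one-character punctuation token.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; a real C grammar has many more.  */
enum token_kind
  { TOK_END, TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_PUNCT };

/* Read one token starting at *P, advance *P past it, and return its
   kind.  The token's text is copied into TEXT, truncated to SIZE - 1
   characters; SIZE must be at least 1.  */
enum token_kind
next_token (const char **p, char *text, size_t size)
{
  static const char *const keywords[] = { "int", "return" };
  const char *s = *p;
  enum token_kind kind;
  size_t n = 0;
  size_t i;

  while (isspace ((unsigned char) *s))
    s++;

  if (*s == '\0')
    kind = TOK_END;
  else if (isalpha ((unsigned char) *s) || *s == '_')
    {
      kind = TOK_IDENTIFIER;
      while (isalnum ((unsigned char) *s) || *s == '_')
        {
          if (n + 1 < size)
            text[n++] = *s;
          s++;
        }
    }
  else if (isdigit ((unsigned char) *s))
    {
      kind = TOK_NUMBER;
      while (isdigit ((unsigned char) *s))
        {
          if (n + 1 < size)
            text[n++] = *s;
          s++;
        }
    }
  else
    {
      /* Anything else is a single-character punctuation token.  */
      kind = TOK_PUNCT;
      if (n + 1 < size)
        text[n++] = *s;
      s++;
    }
  text[n] = '\0';

  /* A word that matches a keyword is a keyword, not an identifier.  */
  if (kind == TOK_IDENTIFIER)
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
      if (strcmp (text, keywords[i]) == 0)
        kind = TOK_KEYWORD;

  *p = s;
  return kind;
}

/* Count the tokens of kind WANTED in the string S.  */
int
count_tokens (const char *s, enum token_kind wanted)
{
  char text[64];
  enum token_kind kind;
  int count = 0;

  while ((kind = next_token (&s, text, sizeof text)) != TOK_END)
    if (kind == wanted)
      count++;
  return count;
}
```

Applied to the text of the @code{square} function, this sketch finds
three keywords, four identifiers and six punctuation tokens, thirteen
tokens in all.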
	493
	494	The syntactic groupings of C include the expression, the statement, the
	495	declaration, and the function definition. These are represented in the
	496	grammar of C by nonterminal symbols `expression', `statement',
	497	`declaration' and `function definition'. The full grammar uses dozens of
	498	additional language constructs, each with its own nonterminal symbol, in
	499	order to express the meanings of these four. The example above is a
	500	function definition; it contains one declaration, and one statement. In
	501	the statement, each @samp{x} is an expression and so is @samp{x * x}.
	502
	503	Each nonterminal symbol must have grammatical rules showing how it is made
	504	out of simpler constructs. For example, one kind of C statement is the
	505	@code{return} statement; this would be described with a grammar rule which
	506	reads informally as follows:
	507
	508	@quotation
	509	A `statement' can be made of a `return' keyword, an `expression' and a
	510	`semicolon'.
	511	@end quotation
	512
	513	@noindent
	514	There would be many other rules for `statement', one for each kind of
	515	statement in C.
	516
	517	@cindex start symbol
	518	One nonterminal symbol must be distinguished as the special one which
	519	defines a complete utterance in the language. It is called the @dfn{start
	520	symbol}. In a compiler, this means a complete input program. In the C
	521	language, the nonterminal symbol `sequence of definitions and declarations'
	522	plays this role.
	523
	524	For example, @samp{1 + 2} is a valid C expression---a valid part of a C
	525	program---but it is not valid as an @emph{entire} C program. In the
	526	context-free grammar of C, this follows from the fact that `expression' is
	527	not the start symbol.
	528
	529	The Bison parser reads a sequence of tokens as its input, and groups the
	530	tokens using the grammar rules. If the input is valid, the end result is
	531	that the entire token sequence reduces to a single grouping whose symbol is
	532	the grammar's start symbol. If we use a grammar for C, the entire input
	533	must be a `sequence of definitions and declarations'. If not, the parser
	534	reports a syntax error.
	535
	536	@node Grammar in Bison
	537	@section From Formal Rules to Bison Input
	538	@cindex Bison grammar
	539	@cindex grammar, Bison
	540	@cindex formal grammar
	541
	542	A formal grammar is a mathematical construct. To define the language
	543	for Bison, you must write a file expressing the grammar in Bison syntax:
	544	a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}.
	545
	546	A nonterminal symbol in the formal grammar is represented in Bison input
	547	as an identifier, like an identifier in C@. By convention, it should be
	548	in lower case, such as @code{expr}, @code{stmt} or @code{declaration}.
	549
	550	The Bison representation for a terminal symbol is also called a @dfn{token
	551	type}. Token types can also be represented as C-like identifiers. By
	552	convention, these identifiers should be upper case to distinguish them from
	553	nonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or
	554	@code{RETURN}. A terminal symbol that stands for a particular keyword in
	555	the language should be named after that keyword converted to upper case.
	556	The terminal symbol @code{error} is reserved for error recovery.
	557	@xref{Symbols}.
	558
	559	A terminal symbol can also be represented as a character literal, just like
	560	a C character constant. You should do this whenever a token is just a
	561	single character (parenthesis, plus-sign, etc.): use that same character in
	562	a literal as the terminal symbol for that token.
	563
	564	A third way to represent a terminal symbol is with a C string constant
	565	containing several characters. @xref{Symbols}, for more information.
	566
	567	The grammar rules also have an expression in Bison syntax. For example,
	568	here is the Bison rule for a C @code{return} statement. The semicolon in
	569	quotes is a literal character token, representing part of the C syntax for
	570	the statement; the naked semicolon, and the colon, are Bison punctuation
	571	used in every rule.
	572
	573	@example
	574	stmt: RETURN expr ';'
	575	;
	576	@end example
	577
	578	@noindent
	579	@xref{Rules, ,Syntax of Grammar Rules}.
	580
	581	@node Semantic Values
	582	@section Semantic Values
	583	@cindex semantic value
	584	@cindex value, semantic
	585
	586	A formal grammar selects tokens only by their classifications: for example,
	587	if a rule mentions the terminal symbol `integer constant', it means that
	588	@emph{any} integer constant is grammatically valid in that position. The
	589	precise value of the constant is irrelevant to how to parse the input: if
	590	@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equally
	591	grammatical.
	592
	593	But the precise value is very important for what the input means once it is
	594	parsed. A compiler is useless if it fails to distinguish between 4, 1 and
	595	3989 as constants in the program! Therefore, each token in a Bison grammar
	596	has both a token type and a @dfn{semantic value}. @xref{Semantics,
	597	,Defining Language Semantics},
	598	for details.
	599
	600	The token type is a terminal symbol defined in the grammar, such as
	601	@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everything
	602	you need to know to decide where the token may validly appear and how to
	603	group it with other tokens. The grammar rules know nothing about tokens
	604	except their types.
	605
	606	The semantic value has all the rest of the information about the
	607	meaning of the token, such as the value of an integer, or the name of an
	608	identifier. (A token such as @code{','} which is just punctuation doesn't
	609	need to have any semantic value.)
	610
	611	For example, an input token might be classified as token type
	612	@code{INTEGER} and have the semantic value 4. Another input token might
	613	have the same token type @code{INTEGER} but value 3989. When a grammar
	614	rule says that @code{INTEGER} is allowed, either of these tokens is
	615	acceptable because each is an @code{INTEGER}. When the parser accepts the
	616	token, it keeps track of the token's semantic value.
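
The distinction can be pictured with a small C sketch. The
@code{struct token} type and the helper functions below are hypothetical
and exist only for this illustration; they are not the data structures
of a Bison parser. The sketch shows that a check made on behalf of the
grammar consults the token type alone, so the two @code{INTEGER} tokens
are equally acceptable even though their semantic values differ.

```c
/* Hypothetical token representation: a token type plus a semantic
   value.  */
enum token_type { INTEGER, IDENTIFIER, COMMA };

struct token
{
  enum token_type type;  /* What the grammar rules look at.  */
  int value;             /* Meaningful only for INTEGER in this sketch.  */
};

/* Build a token from its type and semantic value.  */
struct token
make_token (enum token_type type, int value)
{
  struct token t;
  t.type = type;
  t.value = value;
  return t;
}

/* The grammar's view: is this token acceptable where INTEGER is
   allowed?  Only the type is consulted, never the value.  */
int
acceptable_as_integer (struct token t)
{
  return t.type == INTEGER;
}
```

Here @code{make_token (INTEGER, 4)} and @code{make_token (INTEGER, 3989)}
both pass @code{acceptable_as_integer}, while
@code{make_token (COMMA, 0)} does not.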
	617
	618	Each grouping can have a semantic value as well as its nonterminal
	619	symbol. For example, in a calculator, an expression typically has a
	620	semantic value that is a number. In a compiler for a programming
	621	language, an expression typically has a semantic value that is a tree
	622	structure describing the meaning of the expression.
	623
	624	@node Semantic Actions
	625	@section Semantic Actions
	626	@cindex semantic actions
	627	@cindex actions, semantic
	628
	629	In order to be useful, a program must do more than parse input; it must
	630	also produce some output based on the input. In a Bison grammar, a grammar
	631	rule can have an @dfn{action} made up of C statements. Each time the
	632	parser recognizes a match for that rule, the action is executed.
	633	@xref{Actions}.
	634
	635	Most of the time, the purpose of an action is to compute the semantic value
	636	of the whole construct from the semantic values of its parts. For example,
	637	suppose we have a rule which says an expression can be the sum of two
	638	expressions. When the parser recognizes such a sum, each of the
	639	subexpressions has a semantic value which describes how it was built up.
	640	The action for this rule should create a similar sort of value for the
	641	newly recognized larger expression.
	642
	643	For example, here is a rule that says an expression can be the sum of
	644	two subexpressions:
	645
	646	@example
	647	expr: expr '+' expr @{ $$ = $1 + $3; @}
	648	;
	649	@end example
	650
	651	@noindent
	652	The action says how to produce the semantic value of the sum expression
	653	from the values of the two subexpressions.
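
One way to picture what such an action does is as an operation on a
stack of semantic values. The following C sketch is a simplification
under stated assumptions: the @code{struct value_stack} type and the
functions @code{push_value}, @code{reduce_sum} and @code{sum_example}
are invented for this illustration, semantic values are plain
@code{int}s, and there are no overflow checks. The parser that Bison
generates manages its own stack and differs in detail.

```c
/* Hypothetical stack of semantic values.  */
struct value_stack
{
  int values[16];
  int depth;
};

/* Push V on the stack.  This sketch does not check for overflow.  */
void
push_value (struct value_stack *st, int v)
{
  st->values[st->depth++] = v;
}

/* Reduce by the rule `expr: expr '+' expr'.  The top three entries
   hold the semantic values of $1, of the '+' token (unused) and of
   $3; they are replaced by the single value $$.  */
void
reduce_sum (struct value_stack *st)
{
  int v1 = st->values[st->depth - 3];
  int v3 = st->values[st->depth - 1];
  st->depth -= 3;
  push_value (st, v1 + v3);     /* $$ = $1 + $3; */
}

/* Push the values for `A + B', reduce, and return the result.  */
int
sum_example (int a, int b)
{
  struct value_stack st;
  st.depth = 0;
  push_value (&st, a);          /* Value of the first expr.  */
  push_value (&st, 0);          /* '+' carries no useful value.  */
  push_value (&st, b);          /* Value of the second expr.  */
  reduce_sum (&st);
  return st.values[st.depth - 1];
}
```

After the reduction the stack holds one value where it held three, just
as the three symbols on the right-hand side of the rule have been
replaced by the single grouping @code{expr}.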
	654
	655	@node GLR Parsers
	656	@section Writing @acronym{GLR} Parsers
	657	@cindex @acronym{GLR} parsing
	658	@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
	659	@findex %glr-parser
	660	@cindex conflicts
	661	@cindex shift/reduce conflicts
	662
	663	In some grammars, there will be cases where Bison's standard
	664	@acronym{LALR}(1) parsing algorithm cannot decide whether to apply a
	665	certain grammar rule at a given point. That is, it may not be able to
	666	decide (on the basis of the input read so far) which of two possible
	667	reductions (applications of a grammar rule) applies, or whether to apply
	668	a reduction or read more of the input and apply a reduction later in the
	669	input. These are known respectively as @dfn{reduce/reduce} conflicts
	670	(@pxref{Reduce/Reduce}), and @dfn{shift/reduce} conflicts
	671	(@pxref{Shift/Reduce}).
	672
	673	To use a grammar that is not easily modified to be @acronym{LALR}(1), a
	674	more general parsing algorithm is sometimes necessary. If you include
	675	@code{%glr-parser} among the Bison declarations in your file
	676	(@pxref{Grammar Outline}), the result will be a Generalized @acronym{LR}
	677	(@acronym{GLR}) parser. These parsers handle Bison grammars that
	678	contain no unresolved conflicts (i.e., after applying precedence
	679	declarations) identically to @acronym{LALR}(1) parsers. However, when
	680	faced with unresolved shift/reduce and reduce/reduce conflicts,
	681	@acronym{GLR} parsers use the simple expedient of doing both,
	682	effectively cloning the parser to follow both possibilities. Each of
	683	the resulting parsers can again split, so that at any given time, there
	684	can be any number of possible parses being explored. The parsers
	685	proceed in lockstep; that is, all of them consume (shift) a given input
	686	symbol before any of them proceed to the next. Each of the cloned
	687	parsers eventually meets one of two possible fates: either it runs into
	688	a parsing error, in which case it simply vanishes, or it merges with
	689	another parser, because the two of them have reduced the input to an
	690	identical set of symbols.
	691
	692	During the time that there are multiple parsers, semantic actions are
	693	recorded, but not performed. When a parser disappears, its recorded
	694	semantic actions disappear as well, and are never performed. When a
	695	reduction makes two parsers identical, causing them to merge, Bison
	696	records both sets of semantic actions. Whenever the last two parsers
	697	merge, reverting to the single-parser case, Bison resolves all the
	698	outstanding actions either by precedences given to the grammar rules
	699	involved, or by performing both actions, and then calling a designated
	700	user-defined function on the resulting values to produce an arbitrary
	701	merged result.
	702
	703	Let's consider an example, vastly simplified from a C++ grammar.
	704
	705	@example
	706	%@{
	707	#include <stdio.h>
	708	#define YYSTYPE char const *
	709	int yylex (void);
	710	void yyerror (char const *);
	711	%@}
	712
	713	%token TYPENAME ID
	714
	715	%right '='
	716	%left '+'
	717
	718	%glr-parser
	719
	720	%%
	721
	722	prog :
	723	| prog stmt @{ printf ("\n"); @}
	724	;
	725
	726	stmt : expr ';' %dprec 1
	727	| decl %dprec 2
	728	;
	729
	730	expr : ID @{ printf ("%s ", $1); @}
	731	| TYPENAME '(' expr ')'
	732	@{ printf ("%s <cast> ", $1); @}
	733	| expr '+' expr @{ printf ("+ "); @}
	734	| expr '=' expr @{ printf ("= "); @}
	735	;
	736
	737	decl : TYPENAME declarator ';'
	738	@{ printf ("%s <declare> ", $1); @}
	739	| TYPENAME declarator '=' expr ';'
	740	@{ printf ("%s <init-declare> ", $1); @}
	741	;
	742
	743	declarator : ID @{ printf ("\"%s\" ", $1); @}
	744	| '(' declarator ')'
	745	;
	746	@end example
	747
	748	@noindent
	749	This models a problematic part of the C++ grammar---the ambiguity between
	750	certain declarations and statements. For example,
	751
	752	@example
	753	T (x) = y+z;
	754	@end example
	755
	756	@noindent
	757	parses as either an @code{expr} or a @code{decl}
	758	(assuming that @samp{T} is recognized as a @code{TYPENAME} and
	759	@samp{x} as an @code{ID}).
	760	Bison detects this as a reduce/reduce conflict between the rules
	761	@code{expr : ID} and @code{declarator : ID}, which it cannot resolve at the
	762	time it encounters @code{x} in the example above. The two @code{%dprec}
	763	declarations, however, give precedence to interpreting the example as a
	764	@code{decl}, which implies that @code{x} is a declarator.
	765	The parser therefore prints
	766
	767	@example
	768	"x" y z + T <init-declare>
	769	@end example
	770
	771	Consider a different input string for this parser:
	772
	773	@example
	774	T (x) + y;
	775	@end example
	776
	777	@noindent
	778	Here, there is no ambiguity (this cannot be parsed as a declaration).
	779	However, at the time the Bison parser encounters @code{x}, it does not
	780	have enough information to resolve the reduce/reduce conflict (again,
	781	between @code{x} as an @code{expr} or a @code{declarator}). In this
	782	case, no precedence declaration is used. Instead, the parser splits
	783	into two, one assuming that @code{x} is an @code{expr}, and the other
	784	assuming @code{x} is a @code{declarator}. The second of these parsers
	785	then vanishes when it sees @code{+}, and the parser prints
	786
	787	@example
	788	x T <cast> y +
	789	@end example
	790
	791	Suppose that instead of resolving the ambiguity, you wanted to see all
	792	the possibilities. For this purpose, we must @dfn{merge} the semantic
	793	actions of the two possible parsers, rather than choosing one over the
	794	other. To do so, you could change the declaration of @code{stmt} as
	795	follows:
	796
	797	@example
	798	stmt : expr ';' %merge <stmtMerge>
	799	| decl %merge <stmtMerge>
	800	;
	801	@end example
	802
	803	@noindent
	805	and define the @code{stmtMerge} function as:
	806
	807	@example
	808	static YYSTYPE
	809	stmtMerge (YYSTYPE x0, YYSTYPE x1)
	810	@{
	811	printf ("<OR> ");
	812	return "";
	813	@}
	814	@end example
	815
	816	@noindent
	817	with an accompanying forward declaration
	818	in the C declarations at the beginning of the file:
	819
	820	@example
	821	%@{
	822	#define YYSTYPE char const *
	823	static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);
	824	%@}
	825	@end example
	826
	827	@noindent
	828	With these declarations, the resulting parser will parse the first example
	829	as both an @code{expr} and a @code{decl}, and print
	830
	831	@example
	832	"x" y z + T <init-declare> x T <cast> y z + = <OR>
	833	@end example
	834
	835	@sp 1
	836
	837	@cindex @code{inline}
	838	@cindex @acronym{GLR} parsers and @code{inline}
	839	The @acronym{GLR} parsers require a compiler for @acronym{ISO} C89 or
	840	later. In addition, they use the @code{inline} keyword, which is not
	841	C89, but is C99 and is a common extension in pre-C99 compilers. It is
	842	up to the user of these parsers to handle
	843	portability issues. For instance, if using Autoconf and the Autoconf
	844	macro @code{AC_C_INLINE}, a mere
	845
	846	@example
	847	%@{
	848	#include <config.h>
	849	%@}
	850	@end example
	851
	852	@noindent
	853	will suffice. Otherwise, we suggest
	854
	855	@example
	856	%@{
	857	#if __STDC_VERSION__ < 199901 && ! defined __GNUC__ && ! defined inline
	858	#define inline
	859	#endif
	860	%@}
	861	@end example
	862
	863	@node Locations Overview
	864	@section Locations
	865	@cindex location
	866	@cindex textual location
	867	@cindex location, textual
	868
	869	Many applications, like interpreters or compilers, have to produce verbose
	870	and useful error messages. To achieve this, one must be able to keep track of
	871	the @dfn{textual location}, or @dfn{location}, of each syntactic construct.
	872	Bison provides a mechanism for handling these locations.
	873
	874	Each token has a semantic value. In a similar fashion, each token has an
	875	associated location, but the type of locations is the same for all tokens and
	876	groupings. Moreover, the output parser is equipped with a default data
	877	structure for storing locations (@pxref{Locations}, for more details).
	878
	879	Like semantic values, locations can be reached in actions using a dedicated
	880	set of constructs. In a rule such as @code{exp: exp '+' exp}, the location of
	881	the whole grouping is @code{@@$}, while the locations of the subexpressions
	882	are @code{@@1} and @code{@@3}.
	883
	884	When a rule is matched, a default action is used to compute the semantic value
	885	of its left hand side (@pxref{Actions}). In the same way, another default
	886	action is used for locations. However, the action for locations is general
	887	enough for most cases, meaning there is usually no need to describe for each
	888	rule how @code{@@$} should be formed. When building a new location for a given
	889	grouping, the default behavior of the output parser is to take the beginning
	890	of the first symbol, and the end of the last symbol.
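
The default rule for locations can be sketched in plain C. This is only
an illustration, not the code Bison generates: the names @code{location}
and @code{span_location} are hypothetical, and the four fields are assumed
to mirror the default location structure.

```c
/* Hypothetical location record for this sketch; the four fields are
   assumed to mirror the parser's default location structure.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Build the location of a grouping from the locations of its N
   components (N >= 1): begin where the first component begins and
   end where the last component ends.  */
location
span_location (location const *rhs, int n)
{
  location result;
  result.first_line = rhs[0].first_line;
  result.first_column = rhs[0].first_column;
  result.last_line = rhs[n - 1].last_line;
  result.last_column = rhs[n - 1].last_column;
  return result;
}
```

For a grouping with three components, the result starts at the first
component's start and stops at the third component's end; the middle
component does not influence it.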
	891
	892	@node Bison Parser
	893	@section Bison Output: the Parser File
	894	@cindex Bison parser
	895	@cindex Bison utility
	896	@cindex lexical analyzer, purpose
	897	@cindex parser
	898
	899	When you run Bison, you give it a Bison grammar file as input. The output
	900	is a C source file that parses the language described by the grammar.
	901	This file is called a @dfn{Bison parser}. Keep in mind that the Bison
	902	utility and the Bison parser are two distinct programs: the Bison utility
	903	is a program whose output is the Bison parser that becomes part of your
	904	program.
	905
	906	The job of the Bison parser is to group tokens into groupings according to
	907	the grammar rules---for example, to build identifiers and operators into
	908	expressions. As it does this, it runs the actions for the grammar rules it
	909	uses.
	910
	911	The tokens come from a function called the @dfn{lexical analyzer} that
	912	you must supply in some fashion (such as by writing it in C). The Bison
	913	parser calls the lexical analyzer each time it wants a new token. It
	914	doesn't know what is ``inside'' the tokens (though their semantic values
	915	may reflect this). Typically the lexical analyzer makes the tokens by
	916	parsing characters of text, but Bison does not depend on this.
	917	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	918
	919	The Bison parser file is C code which defines a function named
	920	@code{yyparse} which implements that grammar. This function does not make
	921	a complete C program: you must supply some additional functions. One is
	922	the lexical analyzer. Another is an error-reporting function which the
	923	parser calls to report an error. In addition, a complete C program must
	924	start with a function called @code{main}; you have to provide this, and
	925	arrange for it to call @code{yyparse} or the parser will never run.
	926	@xref{Interface, ,Parser C-Language Interface}.
	927
	928	Aside from the token type names and the symbols in the actions you
	929	write, all symbols defined in the Bison parser file itself
	930	begin with @samp{yy} or @samp{YY}. This includes interface functions
	931	such as the lexical analyzer function @code{yylex}, the error reporting
	932	function @code{yyerror} and the parser function @code{yyparse} itself.
	933	This also includes numerous identifiers used for internal purposes.
	934	Therefore, you should avoid using C identifiers starting with @samp{yy}
	935	or @samp{YY} in the Bison grammar file except for the ones defined in
	936	this manual.
	937
	938	In some cases the Bison parser file includes system headers, and in
	939	those cases your code should respect the identifiers reserved by those
	940	headers. On some non-@acronym{GNU} hosts, @code{<alloca.h>},
	941	@code{<stddef.h>}, and @code{<stdlib.h>} are included as needed to
	942	declare memory allocators and related types. Other system headers may
	943	be included if you define @code{YYDEBUG} to a nonzero value
	944	(@pxref{Tracing, ,Tracing Your Parser}).
	945
	946	@node Stages
	947	@section Stages in Using Bison
	948	@cindex stages in using Bison
	949	@cindex using Bison
	950
	951	The actual language-design process using Bison, from grammar specification
	952	to a working compiler or interpreter, has these parts:
	953
	954	@enumerate
	955	@item
	956	Formally specify the grammar in a form recognized by Bison
	957	(@pxref{Grammar File, ,Bison Grammar Files}). For each grammatical rule
	958	in the language, describe the action that is to be taken when an
	959	instance of that rule is recognized. The action is described by a
	960	sequence of C statements.
	961
	962	@item
	963	Write a lexical analyzer to process input and pass tokens to the parser.
	964	The lexical analyzer may be written by hand in C (@pxref{Lexical, ,The
	965	Lexical Analyzer Function @code{yylex}}). It could also be produced
	966	using Lex, but the use of Lex is not discussed in this manual.
	967
	968	@item
	969	Write a controlling function that calls the Bison-produced parser.
	970
	971	@item
	972	Write error-reporting routines.
	973	@end enumerate
	974
	975	To turn this source code as written into a runnable program, you
	976	must follow these steps:
	977
	978	@enumerate
	979	@item
	980	Run Bison on the grammar to produce the parser.
	981
	982	@item
	983	Compile the code output by Bison, as well as any other source files.
	984
	985	@item
	986	Link the object files to produce the finished product.
	987	@end enumerate
	988
	989	@node Grammar Layout
	990	@section The Overall Layout of a Bison Grammar
	991	@cindex grammar file
	992	@cindex file format
	993	@cindex format of grammar file
	994	@cindex layout of Bison grammar
	995
	996	The input file for the Bison utility is a @dfn{Bison grammar file}. The
	997	general form of a Bison grammar file is as follows:
	998
	999	@example
	1000	%@{
	1001	@var{Prologue}
	1002	%@}
	1003
	1004	@var{Bison declarations}
	1005
	1006	%%
	1007	@var{Grammar rules}
	1008	%%
	1009	@var{Epilogue}
	1010	@end example
	1011
	1012	@noindent
	1013	The @samp{%%}, @samp{%@{} and @samp{%@}} are punctuation that appears
	1014	in every Bison grammar file to separate the sections.
	1015
	1016	The prologue may define types and variables used in the actions. You can
	1017	also use preprocessor commands to define macros used there, and use
	1018	@code{#include} to include header files that do any of these things.
	1019	You need to declare the lexical analyzer @code{yylex} and the error
	1020	printer @code{yyerror} here, along with any other global identifiers
	1021	used by the actions in the grammar rules.
	1022
	1023	The Bison declarations declare the names of the terminal and nonterminal
	1024	symbols, and may also describe operator precedence and the data types of
	1025	semantic values of various symbols.
	1026
	1027	The grammar rules define how to construct each nonterminal symbol from its
	1028	parts.
	1029
	1030	The epilogue can contain any code you want to use. Often the
	1031	definitions of functions declared in the prologue go here. In a
	1032	simple program, all the rest of the program can go here.
	1033
	1034	@node Examples
	1035	@chapter Examples
	1036	@cindex simple examples
	1037	@cindex examples, simple
	1038
	1039	Now we show and explain three sample programs written using Bison: a
	1040	reverse polish notation calculator, an algebraic (infix) notation
	1041	calculator, and a multi-function calculator. All three have been tested
	1042	under BSD Unix 4.3; each produces a usable, though limited, interactive
	1043	desk-top calculator.
	1044
	1045	These examples are simple, but Bison grammars for real programming
	1046	languages are written the same way.
	1047	@ifinfo
	1048	You can copy these examples out of the Info file and into a source file
	1049	to try them.
	1050	@end ifinfo
	1051
	1052	@menu
	1053	* RPN Calc:: Reverse polish notation calculator;
	1054	a first example with no operator precedence.
	1055	* Infix Calc:: Infix (algebraic) notation calculator.
	1056	Operator precedence is introduced.
	1057	* Simple Error Recovery:: Continuing after syntax errors.
	1058	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	1059	* Multi-function Calc:: Calculator with memory and trig functions.
	1060	It uses multiple data-types for semantic values.
	1061	* Exercises:: Ideas for improving the multi-function calculator.
	1062	@end menu
	1063
	1064	@node RPN Calc
	1065	@section Reverse Polish Notation Calculator
	1066	@cindex reverse polish notation
	1067	@cindex polish notation calculator
	1068	@cindex @code{rpcalc}
	1069	@cindex calculator, simple
	1070
	1071	The first example is that of a simple double-precision @dfn{reverse polish
	1072	notation} calculator (a calculator using postfix operators). This example
	1073	provides a good starting point, since operator precedence is not an issue.
	1074	The second example will illustrate how operator precedence is handled.
	1075
	1076	The source code for this calculator is named @file{rpcalc.y}. The
	1077	@samp{.y} extension is a convention used for Bison input files.
	1078
	1079	@menu
	1080	* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc.
	1081	* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation.
	1082	* Lexer: Rpcalc Lexer. The lexical analyzer.
	1083	* Main: Rpcalc Main. The controlling function.
	1084	* Error: Rpcalc Error. The error reporting function.
	1085	* Gen: Rpcalc Gen. Running Bison on the grammar file.
	1086	* Comp: Rpcalc Compile. Run the C compiler on the output code.
	1087	@end menu
	1088
	1089	@node Rpcalc Decls
	1090	@subsection Declarations for @code{rpcalc}
	1091
	1092	Here are the C and Bison declarations for the reverse polish notation
	1093	calculator. As in C, comments are placed between @samp{/*@dots{}*/}.
	1094
	1095	@example
	1096	/* Reverse polish notation calculator. */
	1097
	1098	%@{
	1099	#define YYSTYPE double
	1100	#include <math.h>
	1101	int yylex (void);
	1102	void yyerror (char const *);
	1103	%@}
	1104
	1105	%token NUM
	1106
	1107	%% /* Grammar rules and actions follow. */
	1108	@end example
	1109
	1110	The declarations section (@pxref{Prologue, , The prologue}) contains two
	1111	preprocessor directives and two forward declarations.
	1112
	1113	The @code{#define} directive defines the macro @code{YYSTYPE}, thus
	1114	specifying the C data type for semantic values of both tokens and
	1115	groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The
	1116	Bison parser will use whatever type @code{YYSTYPE} is defined as; if you
	1117	don't define it, @code{int} is the default. Because we specify
	1118	@code{double}, each token and each expression has an associated value,
	1119	which is a floating point number.
	1120
	1121	The @code{#include} directive is used to declare the exponentiation
	1122	function @code{pow}.
	1123
	1124	The forward declarations for @code{yylex} and @code{yyerror} are
	1125	needed because the C language requires that functions be declared
	1126	before they are used. These functions will be defined in the
	1127	epilogue, but the parser calls them so they must be declared in the
	1128	prologue.
	1129
	1130	The second section, Bison declarations, provides information to Bison
	1131	about the token types (@pxref{Bison Declarations, ,The Bison
	1132	Declarations Section}). Each terminal symbol that is not a
	1133	single-character literal must be declared here. (Single-character
	1134	literals normally don't need to be declared.) In this example, all the
	1135	arithmetic operators are designated by single-character literals, so the
	1136	only terminal symbol that needs to be declared is @code{NUM}, the token
	1137	type for numeric constants.
	1138
	1139	@node Rpcalc Rules
	1140	@subsection Grammar Rules for @code{rpcalc}
	1141
	1142	Here are the grammar rules for the reverse polish notation calculator.
	1143
	1144	@example
	1145	input: /* empty */
	1146	| input line
	1147	;
	1148
	1149	line: '\n'
	1150	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1151	;
	1152
	1153	exp: NUM @{ $$ = $1; @}
	1154	| exp exp '+' @{ $$ = $1 + $2; @}
	1155	| exp exp '-' @{ $$ = $1 - $2; @}
	1156	| exp exp '*' @{ $$ = $1 * $2; @}
	1157	| exp exp '/' @{ $$ = $1 / $2; @}
	1158	/* Exponentiation */
	1159	| exp exp '^' @{ $$ = pow ($1, $2); @}
	1160	/* Unary minus */
	1161	| exp 'n' @{ $$ = -$1; @}
	1162	;
	1163	%%
	1164	@end example
	1165
	1166	The groupings of the rpcalc ``language'' defined here are the expression
	1167	(given the name @code{exp}), the line of input (@code{line}), and the
	1168	complete input transcript (@code{input}). Each of these nonterminal
	1169	symbols has several alternate rules, joined by the @samp{|} punctuator
	1170	which is read as ``or''. The following sections explain what these rules
	1171	mean.
	1172
	1173	The semantics of the language is determined by the actions taken when a
	1174	grouping is recognized. The actions are the C code that appears inside
	1175	braces. @xref{Actions}.
	1176
	1177	You must specify these actions in C, but Bison provides the means for
	1178	passing semantic values between the rules. In each action, the
	1179	pseudo-variable @code{$$} stands for the semantic value for the grouping
	1180	that the rule is going to construct. Assigning a value to @code{$$} is the
	1181	main job of most actions. The semantic values of the components of the
	1182	rule are referred to as @code{$1}, @code{$2}, and so on.
	1183
	1184	@menu
	1185	* Rpcalc Input::
	1186	* Rpcalc Line::
	1187	* Rpcalc Expr::
	1188	@end menu
	1189
	1190	@node Rpcalc Input
	1191	@subsubsection Explanation of @code{input}
	1192
	1193	Consider the definition of @code{input}:
	1194
	1195	@example
	1196	input: /* empty */
	1197	| input line
	1198	;
	1199	@end example
	1200
	1201	This definition reads as follows: ``A complete input is either an empty
	1202	string, or a complete input followed by an input line''. Notice that
	1203	``complete input'' is defined in terms of itself. This definition is said
	1204	to be @dfn{left recursive} since @code{input} appears always as the
	1205	leftmost symbol in the sequence. @xref{Recursion, ,Recursive Rules}.
	1206
	1207	The first alternative is empty because there are no symbols between the
	1208	colon and the first @samp{|}; this means that @code{input} can match an
	1209	empty string of input (no tokens). We write the rules this way because it
	1210	is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
	1211	It's conventional to put an empty alternative first and write the comment
	1212	@samp{/* empty */} in it.
	1213
	1214	The second alternate rule (@code{input line}) handles all nontrivial input.
	1215	It means, ``After reading any number of lines, read one more line if
	1216	possible.'' The left recursion makes this rule into a loop. Since the
	1217	first alternative matches empty input, the loop can be executed zero or
	1218	more times.
	1219
	1220	The parser function @code{yyparse} continues to process input until a
	1221	grammatical error is seen or the lexical analyzer says there are no more
	1222	input tokens; we will arrange for the latter to happen at end-of-input.
	1223
	1224	@node Rpcalc Line
	1225	@subsubsection Explanation of @code{line}
	1226
	1227	Now consider the definition of @code{line}:
	1228
	1229	@example
	1230	line: '\n'
	1231	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1232	;
	1233	@end example
	1234
	1235	The first alternative is a token which is a newline character; this means
	1236	that rpcalc accepts a blank line (and ignores it, since there is no
	1237	action). The second alternative is an expression followed by a newline.
	1238	This is the alternative that makes rpcalc useful. The semantic value of
	1239	the @code{exp} grouping is the value of @code{$1} because the @code{exp} in
	1240	question is the first symbol in the alternative. The action prints this
	1241	value, which is the result of the computation the user asked for.
	1242
	1243	This action is unusual because it does not assign a value to @code{$$}. As
	1244	a consequence, the semantic value associated with the @code{line} is
	1245	uninitialized (its value will be unpredictable). This would be a bug if
	1246	that value were ever used, but we don't use it: once rpcalc has printed the
	1247	value of the user's input line, that value is no longer needed.
	1248
	1249	@node Rpcalc Expr
	1250	@subsubsection Explanation of @code{exp}
	1251
	1252	The @code{exp} grouping has several rules, one for each kind of expression.
	1253	The first rule handles the simplest expressions: those that are just numbers.
	1254	The second handles an addition-expression, which looks like two expressions
	1255	followed by a plus-sign. The third handles subtraction, and so on.
	1256
	1257	@example
	1258	exp: NUM
	1259	| exp exp '+' @{ $$ = $1 + $2; @}
	1260	| exp exp '-' @{ $$ = $1 - $2; @}
	1261	@dots{}
	1262	;
	1263	@end example
	1264
	1265	We have used @samp{|} to join all the rules for @code{exp}, but we could
	1266	equally well have written them separately:
	1267
	1268	@example
	1269	exp: NUM ;
	1270	exp: exp exp '+' @{ $$ = $1 + $2; @} ;
	1271	exp: exp exp '-' @{ $$ = $1 - $2; @} ;
	1272	@dots{}
	1273	@end example
	1274
	1275	Most of the rules have actions that compute the value of the expression in
	1276	terms of the value of its parts. For example, in the rule for addition,
	1277	@code{$1} refers to the first component @code{exp} and @code{$2} refers to
	1278	the second one. The third component, @code{'+'}, has no meaningful
	1279	associated semantic value, but if it had one you could refer to it as
	1280	@code{$3}. When @code{yyparse} recognizes a sum expression using this
	1281	rule, the sum of the two subexpressions' values is produced as the value of
	1282	the entire expression. @xref{Actions}.
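
To see what these actions amount to, here is a hand-written sketch of the
same computation with an explicit value stack. It is not the parser Bison
produces; @code{rpn_eval} and @code{STACK_MAX} are hypothetical names, the
input comes from a string, and exponentiation is left out so that the
sketch does not need the math library.

```c
#include <ctype.h>
#include <stdlib.h>

enum { STACK_MAX = 64 };        /* Arbitrary limit for this sketch.  */

/* Evaluate one line of reverse polish notation held in TEXT.
   Return 0 and store the value in *RESULT on success, -1 if the
   line is malformed.  */
int
rpn_eval (char const *text, double *result)
{
  double stack[STACK_MAX];
  int depth = 0;

  while (*text != '\0' && *text != '\n')
    {
      if (*text == ' ' || *text == '\t')
        {
          ++text;
          continue;
        }
      if (*text == '.' || isdigit ((unsigned char) *text))
        {
          /* exp: NUM -- push the value of the number.  */
          char *end;
          double value = strtod (text, &end);
          if (end == text || depth == STACK_MAX)
            return -1;
          stack[depth++] = value;
          text = end;
          continue;
        }
      if (*text == 'n')
        {
          /* exp: exp 'n' -- replace $1 by -$1.  */
          if (depth < 1)
            return -1;
          stack[depth - 1] = -stack[depth - 1];
        }
      else
        {
          /* exp: exp exp OP -- pop $2 and $1, push the result.  */
          double lhs, rhs;
          if (depth < 2)
            return -1;
          rhs = stack[--depth];
          lhs = stack[--depth];
          switch (*text)
            {
            case '+': stack[depth++] = lhs + rhs; break;
            case '-': stack[depth++] = lhs - rhs; break;
            case '*': stack[depth++] = lhs * rhs; break;
            case '/': stack[depth++] = lhs / rhs; break;
            default: return -1;
            }
        }
      ++text;
    }
  if (depth != 1)
    return -1;
  *result = stack[0];
  return 0;
}
```

Each branch corresponds to one rule for @code{exp}: the values popped from
the stack play the role of @code{$1} and @code{$2}, and the value pushed
back plays the role of @code{$$}.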
	1283
	1284	You don't have to give an action for every rule. When a rule has no
	1285	action, Bison by default copies the value of @code{$1} into @code{$$}.
	1286	This is what happens in the first rule (the one that uses @code{NUM}).
	1287
	1288	The formatting shown here is the recommended convention, but Bison does
	1289	not require it. You can add or change white space as much as you wish.
	1290	For example, this:
	1291
	1292	@example
	1293	exp : NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{}
	1294	@end example
	1295
	1296	@noindent
	1297	means the same thing as this:
	1298
	1299	@example
	1300	exp: NUM
	1301	| exp exp '+' @{ $$ = $1 + $2; @}
	1302	| @dots{}
	1303	@end example
	1304
	1305	@noindent
	1306	The latter, however, is much more readable.
	1307
	1308	@node Rpcalc Lexer
	1309	@subsection The @code{rpcalc} Lexical Analyzer
	1310	@cindex writing a lexical analyzer
	1311	@cindex lexical analyzer, writing
	1312
	1313	The lexical analyzer's job is low-level parsing: converting characters
	1314	or sequences of characters into tokens. The Bison parser gets its
	1315	tokens by calling the lexical analyzer. @xref{Lexical, ,The Lexical
	1316	Analyzer Function @code{yylex}}.
	1317
	1318	Only a simple lexical analyzer is needed for the @acronym{RPN}
	1319	calculator. This
	1320	lexical analyzer skips blanks and tabs, then reads in numbers as
	1321	@code{double} and returns them as @code{NUM} tokens. Any other character
	1322	that isn't part of a number is a separate token. Note that the token-code
	1323	for such a single-character token is the character itself.
	1324
	1325	The return value of the lexical analyzer function is a numeric code which
	1326	represents a token type. The same text used in Bison rules to stand for
	1327	this token type is also a C expression for the numeric code for the type.
	1328	This works in two ways. If the token type is a character literal, then its
	1329	numeric code is that of the character; you can use the same
	1330	character literal in the lexical analyzer to express the number. If the
	1331	token type is an identifier, that identifier is defined by Bison as a C
	1332	macro whose definition is the appropriate number. In this example,
	1333	therefore, @code{NUM} becomes a macro for @code{yylex} to use.
	1334
	1335	The semantic value of the token (if it has one) is stored into the
	1336	global variable @code{yylval}, which is where the Bison parser will look
	1337	for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was
	1338	defined at the beginning of the grammar; @pxref{Rpcalc Decls,
	1339	,Declarations for @code{rpcalc}}.)
	1340
	1341	A token type code of zero is returned if the end-of-input is encountered.
	1342	(Bison recognizes any nonpositive value as indicating end-of-input.)
	1343
	1344	Here is the code for the lexical analyzer:
	1345
	1346	@example
	1347	@group
	1348	/* The lexical analyzer returns a double floating point
	1349	number on the stack and the token NUM, or the numeric code
	1350	of the character read if not a number. It skips all blanks
	1351	and tabs, and returns 0 for end-of-input. */
	1352
	1353	#include <ctype.h>
	1354	@end group
	1355
	1356	@group
	1357	int
	1358	yylex (void)
	1359	@{
	1360	int c;
	1361
	1362	/* Skip white space. */
	1363	while ((c = getchar ()) == ' ' || c == '\t')
	1364	;
	1365	@end group
	1366	@group
	1367	/* Process numbers. */
	1368	if (c == '.' || isdigit (c))
	1369	@{
	1370	ungetc (c, stdin);
	1371	scanf ("%lf", &yylval);
	1372	return NUM;
	1373	@}
	1374	@end group
	1375	@group
	1376	/* Return end-of-input. */
	1377	if (c == EOF)
	1378	return 0;
	1379	/* Return a single char. */
	1380	return c;
	1381	@}
	1382	@end group
	1383	@end example
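
Since the parser does not care where the characters come from, the same
logic can read from a string instead of standard input, which is
convenient for testing. The following is a sketch under stated
assumptions: @code{next_token}, @code{token_value}, @code{lex_input} and
@code{NUM_CODE} are hypothetical stand-ins for @code{yylex},
@code{yylval}, the input stream and the Bison-defined @code{NUM}, and the
value 258 is assumed only so that the code lies above the character
range.

```c
#include <ctype.h>
#include <stdlib.h>

/* Assumed code for the NUM token; in a real parser file Bison
   supplies this definition.  */
enum { NUM_CODE = 258 };

/* Stand-in for yylval: the semantic value of the last NUM token.  */
double token_value;

/* The text still to be scanned.  */
char const *lex_input = "";

int
next_token (void)
{
  /* Skip white space.  */
  while (*lex_input == ' ' || *lex_input == '\t')
    ++lex_input;

  /* The end of the string plays the role of end-of-input.  */
  if (*lex_input == '\0')
    return 0;

  /* Process numbers.  */
  if (*lex_input == '.' || isdigit ((unsigned char) *lex_input))
    {
      char *end;
      token_value = strtod (lex_input, &end);
      if (end != lex_input)
        {
          lex_input = end;
          return NUM_CODE;
        }
    }

  /* Any other character is its own token code.  */
  return (unsigned char) *lex_input++;
}
```

Pointing @code{lex_input} at the text @samp{4 9.5 +} followed by a
newline yields two number tokens, then the codes of @samp{+} and the
newline character, then zero on every further call.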
	1384
	1385	@node Rpcalc Main
	1386	@subsection The Controlling Function
	1387	@cindex controlling function
	1388	@cindex main function in simple example
	1389
	1390	In keeping with the spirit of this example, the controlling function is
	1391	kept to the bare minimum. The only requirement is that it call
	1392	@code{yyparse} to start the process of parsing.
	1393
	1394	@example
	1395	@group
	1396	int
	1397	main (void)
	1398	@{
	1399	return yyparse ();
	1400	@}
	1401	@end group
	1402	@end example
	1403
	1404	@node Rpcalc Error
	1405	@subsection The Error Reporting Routine
	1406	@cindex error reporting routine
	1407
	1408	When @code{yyparse} detects a syntax error, it calls the error reporting
	1409	function @code{yyerror} to print an error message (usually but not
	1410	always @code{"syntax error"}). It is up to the programmer to supply
	1411	@code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so
	1412	here is the definition we will use:
	1413
	1414	@example
	1415	@group
	1416	#include <stdio.h>
	1417
	1418	/* Called by yyparse on error. */
	1419	void
	1420	yyerror (char const *s)
	1421	@{
	1422	fprintf (stderr, "%s\n", s);
	1423	@}
	1424	@end group
	1425	@end example
	1426
	1427	After @code{yyerror} returns, the Bison parser may recover from the error
	1428	and continue parsing if the grammar contains a suitable error rule
	1429	(@pxref{Error Recovery}). Otherwise, @code{yyparse} returns nonzero. We
	1430	have not written any error rules in this example, so any invalid input will
	1431	cause the calculator program to exit. This is not clean behavior for a
	1432	real calculator, but it is adequate for the first example.
	1433
	1434	@node Rpcalc Gen
	1435	@subsection Running Bison to Make the Parser
	1436	@cindex running Bison (introduction)
	1437
	1438	Before running Bison to produce a parser, we need to decide how to
	1439	arrange all the source code in one or more source files. For such a
	1440	simple example, the easiest thing is to put everything in one file. The
	1441	definitions of @code{yylex}, @code{yyerror} and @code{main} go at the
	1442	end, in the epilogue of the file
	1443	(@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}).
	1444
	1445	For a large project, you would probably have several source files, and use
	1446	@code{make} to arrange to recompile them.
	1447
	1448	With all the source in a single file, you use the following command to
	1449	convert it into a parser file:
	1450
	1451	@example
	1452	bison @var{file_name}.y
	1453	@end example
	1454
	1455	@noindent
	1456	In this example the file was called @file{rpcalc.y} (for ``Reverse Polish
	1457	@sc{calc}ulator''). Bison produces a file named @file{@var{file_name}.tab.c},
	1458	removing the @samp{.y} from the original file name. The file output by
	1459	Bison contains the source code for @code{yyparse}. The additional
	1460	functions in the input file (@code{yylex}, @code{yyerror} and @code{main})
	1461	are copied verbatim to the output.
	1462
	1463	@node Rpcalc Compile
	1464	@subsection Compiling the Parser File
	1465	@cindex compiling the parser
	1466
	1467	Here is how to compile and run the parser file:
	1468
	1469	@example
	1470	@group
	1471	# @r{List files in current directory.}
	1472	$ @kbd{ls}
	1473	rpcalc.tab.c rpcalc.y
	1474	@end group
	1475
	1476	@group
	1477	# @r{Compile the Bison parser.}
	1478	# @r{@samp{-lm} tells compiler to search math library for @code{pow}.}
	1479	$ @kbd{cc rpcalc.tab.c -lm -o rpcalc}
	1480	@end group
	1481
	1482	@group
	1483	# @r{List files again.}
	1484	$ @kbd{ls}
	1485	rpcalc rpcalc.tab.c rpcalc.y
	1486	@end group
	1487	@end example
	1488
	1489	The file @file{rpcalc} now contains the executable code. Here is an
	1490	example session using @code{rpcalc}.
	1491
	1492	@example
	1493	$ @kbd{rpcalc}
	1494	@kbd{4 9 +}
	1495	13
	1496	@kbd{3 7 + 3 4 5 *+-}
	1497	-13
	1498	@kbd{3 7 + 3 4 5 * + - n} @r{Note the unary minus, @samp{n}}
	1499	13
	1500	@kbd{5 6 / 4 n +}
	1501	-3.166666667
	1502	@kbd{3 4 ^} @r{Exponentiation}
	1503	81
	1504	@kbd{^D} @r{End-of-file indicator}
	1505	$
	1506	@end example
	1507
	1508	@node Infix Calc
	1509	@section Infix Notation Calculator: @code{calc}
	1510	@cindex infix notation calculator
	1511	@cindex @code{calc}
	1512	@cindex calculator, infix notation
	1513
	1514	We now modify rpcalc to handle infix operators instead of postfix. Infix
	1515	notation involves the concept of operator precedence and the need for
	1516	parentheses nested to arbitrary depth. Here is the Bison code for
	1517	@file{calc.y}, an infix desk-top calculator.
	1518
	1519	@example
	1520	/* Infix notation calculator. */
	1521
	1522	%@{
	1523	#define YYSTYPE double
	1524	#include <math.h>
	1525	#include <stdio.h>
	1526	int yylex (void);
	1527	void yyerror (char const *);
	1528	%@}
	1529
	1530	/* Bison declarations. */
	1531	%token NUM
	1532	%left '-' '+'
	1533	%left '*' '/'
	1534	%left NEG /* negation--unary minus */
	1535	%right '^' /* exponentiation */
	1536
	1537	%% /* The grammar follows. */
	1538	input: /* empty */
	1539	| input line
	1540	;
	1541
	1542	line: '\n'
	1543	\| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1544	;
	1545
	1546	exp: NUM @{ $$ = $1; @}
	1547	\| exp '+' exp @{ $$ = $1 + $3; @}
	1548	\| exp '-' exp @{ $$ = $1 - $3; @}
	1549	\| exp '' exp @{ $$ = $1 $3; @}
	1550	\| exp '/' exp @{ $$ = $1 / $3; @}
	1551	\| '-' exp %prec NEG @{ $$ = -$2; @}
	1552	\| exp '^' exp @{ $$ = pow ($1, $3); @}
	1553	\| '(' exp ')' @{ $$ = $2; @}
	1554	;
	1555	%%
	1556	@end example
	1557
	1558	@noindent
	1559	The functions @code{yylex}, @code{yyerror} and @code{main} can be the
	1560	same as before.
	1561
	1562	There are two important new features shown in this code.
	1563
	1564	In the second section (Bison declarations), @code{%left} declares token
	1565	types and says they are left-associative operators. The declarations
	1566	@code{%left} and @code{%right} (right associativity) take the place of
	1567	@code{%token} which is used to declare a token type name without
	1568	associativity. (These tokens are single-character literals, which
	1569	ordinarily don't need to be declared. We declare them here to specify
	1570	the associativity.)
	1571
	1572	Operator precedence is determined by the line ordering of the
	1573	declarations; the higher the line number of the declaration (lower on
	1574	the page or screen), the higher the precedence. Hence, exponentiation
	1575	has the highest precedence, unary minus (@code{NEG}) is next, followed
	1576	by @samp{*} and @samp{/}, and so on. @xref{Precedence, ,Operator
	1577	Precedence}.
	1578
	1579	The other important new feature is the @code{%prec} in the grammar
	1580	section for the unary minus operator. The @code{%prec} simply instructs
	1581	Bison that the rule @samp{| '-' exp} has the same precedence as
	1582	@code{NEG}---in this case the next-to-highest. @xref{Contextual
	1583	Precedence, ,Context-Dependent Precedence}.
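
To see what these precedence and associativity declarations amount to, here is a minimal hand-written sketch in C. It is not the code Bison generates; it uses the precedence-climbing technique, with levels numbered in the order of the declaration lines. The names @code{eval_infix}, @code{parse_exp}, @code{parse_operand} and @code{int_pow} are hypothetical, and @code{int_pow} is a stand-in for @code{pow} that assumes a non-negative integer exponent.

```c
#include <assert.h>
#include <stdlib.h>

/* Precedence levels, numbered in the order of the declaration lines:
   a later line means a higher level.  */
enum { PREC_ADD = 1, PREC_MUL = 2, PREC_NEG = 3, PREC_POW = 4 };

static const char *cursor;      /* Next unread input character.  */

static double parse_exp (int min_prec);

static void
skip_blanks (void)
{
  while (*cursor == ' ' || *cursor == '\t')
    ++cursor;
}

/* Hypothetical stand-in for pow: non-negative integer exponents only.  */
static double
int_pow (double base, double exponent)
{
  double result = 1;
  assert (exponent >= 0);
  for (; exponent >= 1; exponent -= 1)
    result *= base;
  return result;
}

static int
binary_prec (int op)
{
  switch (op)
    {
    case '+': case '-': return PREC_ADD;
    case '*': case '/': return PREC_MUL;
    case '^':           return PREC_POW;
    default:            return 0;   /* Not a binary operator.  */
    }
}

/* Parse a number, a parenthesized expression, or a negated operand.  */
static double
parse_operand (void)
{
  double value;
  char *end;

  skip_blanks ();
  if (*cursor == '-')
    {
      /* Unary minus has the level of NEG, so its operand may
         contain '^' but no lower-level binary operator.  */
      ++cursor;
      return -parse_exp (PREC_NEG);
    }
  if (*cursor == '(')
    {
      ++cursor;
      value = parse_exp (PREC_ADD);
      skip_blanks ();
      assert (*cursor == ')');
      ++cursor;
      return value;
    }
  value = strtod (cursor, &end);
  assert (end != cursor);       /* A number was expected here.  */
  cursor = end;
  return value;
}

/* Parse binary operators whose level is at least MIN_PREC.  */
static double
parse_exp (int min_prec)
{
  double lhs = parse_operand ();
  for (;;)
    {
      int op, prec;
      double rhs;

      skip_blanks ();
      op = *cursor;
      prec = binary_prec (op);
      if (prec == 0 || prec < min_prec)
        return lhs;
      ++cursor;
      /* Left-associative: the right operand takes only higher levels.
         Right-associative ('^'): it may take the same level again.  */
      rhs = parse_exp (op == '^' ? prec : prec + 1);
      switch (op)
        {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        default:  lhs = int_pow (lhs, rhs); break;
        }
    }
}

double
eval_infix (const char *text)
{
  cursor = text;
  return parse_exp (PREC_ADD);
}
```

With these levels, @code{eval_infix ("-2 ^ 2")} groups as @samp{-(2 ^ 2)} because @samp{^} is declared after @code{NEG}, and @code{eval_infix ("2 ^ 3 ^ 2")} groups as @samp{2 ^ (3 ^ 2)} because @samp{^} is right-associative.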
	1584
	1585	Here is a sample run of @file{calc.y}:
	1586
	1587	@need 500
	1588	@example
	1589	$ @kbd{calc}
	1590	@kbd{4 + 4.5 - (34/(8*3+-3))}
	1591	6.880952381
	1592	@kbd{-56 + 2}
	1593	-54
	1594	@kbd{3 ^ 2}
	1595	9
	1596	@end example
	1597
	1598	@node Simple Error Recovery
	1599	@section Simple Error Recovery
	1600	@cindex error recovery, simple
	1601
	1602	Up to this point, this manual has not addressed the issue of @dfn{error
	1603	recovery}---how to continue parsing after the parser detects a syntax
	1604	error. All we have handled is error reporting with @code{yyerror}.
	1605	Recall that by default @code{yyparse} returns after calling
	1606	@code{yyerror}. This means that an erroneous input line causes the
	1607	calculator program to exit. Now we show how to rectify this deficiency.
	1608
	1609	The Bison language itself includes the reserved word @code{error}, which
	1610	may be included in the grammar rules. In the example below it has
	1611	been added to one of the alternatives for @code{line}:
	1612
	1613	@example
	1614	@group
	1615	line:     '\n'
	1616	        | exp '\n'   @{ printf ("\t%.10g\n", $1); @}
	1617	        | error '\n' @{ yyerrok;                  @}
	1618	;
	1619	@end group
	1620	@end example
	1621
	1622	This addition to the grammar allows for simple error recovery in the
	1623	event of a syntax error. If an expression that cannot be evaluated is
	1624	read, the error will be recognized by the third rule for @code{line},
	1625	and parsing will continue. (The @code{yyerror} function is still called
	1626	upon to print its message as well.) The action executes the statement
	1627	@code{yyerrok}, a macro defined automatically by Bison; its meaning is
	1628	that error recovery is complete (@pxref{Error Recovery}). Note the
	1629	difference between @code{yyerrok} and @code{yyerror}; neither one is a
	1630	misprint.
	1631
	1632	This form of error recovery deals with syntax errors. There are other
	1633	kinds of errors; for example, division by zero, which raises an exception
	1634	signal that is normally fatal. A real calculator program must handle this
	1635	signal and use @code{longjmp} to return to @code{main} and resume parsing
	1636	input lines; it would also have to discard the rest of the current line of
	1637	input. We won't discuss this issue further because it is not specific to
	1638	Bison programs.
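
For readers who have not used @code{longjmp}, the following is a minimal sketch of the control flow only, under stated assumptions. It detects a zero divisor with an explicit test instead of a signal handler, since on many systems floating-point division by zero yields an infinity instead of a signal. The names @code{recovery_point}, @code{checked_divide} and @code{evaluate_line} are hypothetical and are not part of any calculator in this manual.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf recovery_point;  /* Hypothetical: where to resume.  */

/* Divide, or abandon the current computation on a zero divisor.  */
static double
checked_divide (double numerator, double divisor)
{
  if (divisor == 0)
    longjmp (recovery_point, 1);
  return numerator / divisor;
}

/* Evaluate A / (B - C).  Return 0 and store the quotient in *RESULT
   on success; return -1 if the computation was abandoned.  */
int
evaluate_line (double a, double b, double c, double *result)
{
  assert (result != 0);
  if (setjmp (recovery_point) != 0)
    return -1;                  /* Arrived here through longjmp.  */
  *result = checked_divide (a, b - c);
  return 0;
}
```

A real calculator would place the @code{setjmp} call in its main loop and, after a jump, discard the rest of the current input line before parsing again.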
	1639
	1640	@node Location Tracking Calc
	1641	@section Location Tracking Calculator: @code{ltcalc}
	1642	@cindex location tracking calculator
	1643	@cindex @code{ltcalc}
	1644	@cindex calculator, location tracking
	1645
	1646	This example extends the infix notation calculator with location
	1647	tracking. This feature will be used to improve the error messages. For
	1648	the sake of clarity, this example is a simple integer calculator, since
	1649	most of the work needed to use locations will be done in the lexical
	1650	analyzer.
	1651
	1652	@menu
	1653	* Decls: Ltcalc Decls. Bison and C declarations for ltcalc.
	1654	* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations.
	1655	* Lexer: Ltcalc Lexer. The lexical analyzer.
	1656	@end menu
	1657
	1658	@node Ltcalc Decls
	1659	@subsection Declarations for @code{ltcalc}
	1660
	1661	The C and Bison declarations for the location tracking calculator are the same as
	1662	those for the infix notation calculator, except that semantic values are now integers.
	1663
	1664	@example
	1665	/* Location tracking calculator. */
	1666
	1667	%@{
	1668	#define YYSTYPE int
	1669	#include <math.h>
	1670	int yylex (void);
	1671	void yyerror (char const *);
	1672	%@}
	1673
	1674	/* Bison declarations. */
	1675	%token NUM
	1676
	1677	%left '-' '+'
	1678	%left '*' '/'
	1679	%left NEG
	1680	%right '^'
	1681
	1682	%% /* The grammar follows. */
	1683	@end example
	1684
	1685	@noindent
	1686	Note there are no declarations specific to locations. Defining a data
	1687	type for storing locations is not needed: we will use the type provided
	1688	by default (@pxref{Location Type, ,Data Types of Locations}), which is a
	1689	four member structure with the following integer fields:
	1690	@code{first_line}, @code{first_column}, @code{last_line} and
	1691	@code{last_column}.
	1692
	1693	@node Ltcalc Rules
	1694	@subsection Grammar Rules for @code{ltcalc}
	1695
	1696	Whether or not you handle locations has no effect on the syntax of your
	1697	language. Therefore, grammar rules for this example will be very close
	1698	to those of the previous example: we will only modify them to benefit
	1699	from the new information.
	1700
	1701	Here, we will use locations to report divisions by zero, and locate the
	1702	wrong expressions or subexpressions.
	1703
	1704	@example
	1705	@group
	1706	input   : /* empty */
	1707	        | input line
	1708	;
	1709	@end group
	1710
	1711	@group
	1712	line    : '\n'
	1713	        | exp '\n' @{ printf ("%d\n", $1); @}
	1714	;
	1715	@end group
	1716
	1717	@group
	1718	exp     : NUM           @{ $$ = $1; @}
	1719	        | exp '+' exp   @{ $$ = $1 + $3; @}
	1720	        | exp '-' exp   @{ $$ = $1 - $3; @}
	1721	        | exp '*' exp   @{ $$ = $1 * $3; @}
	1722	@end group
	1723	@group
	1724	        | exp '/' exp
	1725	            @{
	1726	              if ($3)
	1727	                $$ = $1 / $3;
	1728	              else
	1729	                @{
	1730	                  $$ = 1;
	1731	                  fprintf (stderr, "%d.%d-%d.%d: division by zero",
	1732	                           @@3.first_line, @@3.first_column,
	1733	                           @@3.last_line, @@3.last_column);
	1734	                @}
	1735	            @}
	1736	@end group
	1737	@group
	1738	        | '-' exp %prec NEG     @{ $$ = -$2; @}
	1739	        | exp '^' exp           @{ $$ = pow ($1, $3); @}
	1740	        | '(' exp ')'           @{ $$ = $2; @}
	1741	@end group
	1742	@end example
	1743
	1744	This code shows how to reach locations inside of semantic actions, by
	1745	using the pseudo-variables @code{@@@var{n}} for rule components, and the
	1746	pseudo-variable @code{@@$} for groupings.
	1747
	1748	We don't need to assign a value to @code{@@$}: the output parser does it
	1749	automatically. By default, before executing the C code of each action,
	1750	@code{@@$} is set to range from the beginning of @code{@@1} to the end
	1751	of @code{@@@var{n}}, for a rule with @var{n} components. This behavior
	1752	can be redefined (@pxref{Location Default Action, , Default Action for
	1753	Locations}), and for very specific rules, @code{@@$} can be computed by
	1754	hand.
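
The default computation described above can be sketched in plain C as follows. This only illustrates the rule "from the beginning of the first component to the end of the last"; it is not the code Bison emits. The type @code{location} and the function @code{default_location} are hypothetical names, with the four integer fields assumed to match the default location type.

```c
#include <assert.h>

/* Hypothetical stand-in for the default location type.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Return the span from the start of RHS[0] to the end of RHS[N - 1],
   for a rule with N >= 1 components.  */
location
default_location (const location rhs[], int n)
{
  location result;

  assert (n >= 1);
  result.first_line = rhs[0].first_line;
  result.first_column = rhs[0].first_column;
  result.last_line = rhs[n - 1].last_line;
  result.last_column = rhs[n - 1].last_column;
  return result;
}
```

For a rule such as @samp{exp '/' exp}, @code{rhs} would hold the three component locations, and the result covers the whole division.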
	1755
	1756	@node Ltcalc Lexer
	1757	@subsection The @code{ltcalc} Lexical Analyzer
	1758
	1759	Until now, we relied on Bison's defaults to enable location
	1760	tracking. The next step is to rewrite the lexical analyzer, and make it
	1761	able to feed the parser with the token locations, as it already does for
	1762	semantic values.
	1763
	1764	To this end, we must take into account every single character of the
	1765	input text, to keep the computed locations from being fuzzy or wrong:
	1766
	1767	@example
	1768	@group
	1769	int
	1770	yylex (void)
	1771	@{
	1772	int c;
	1773	@end group
	1774
	1775	@group
	1776	/* Skip white space. */
	1777	while ((c = getchar ()) == ' ' || c == '\t')
	1778	++yylloc.last_column;
	1779	@end group
	1780
	1781	@group
	1782	/* Step. */
	1783	yylloc.first_line = yylloc.last_line;
	1784	yylloc.first_column = yylloc.last_column;
	1785	@end group
	1786
	1787	@group
	1788	/* Process numbers. */
	1789	if (isdigit (c))
	1790	@{
	1791	yylval = c - '0';
	1792	++yylloc.last_column;
	1793	while (isdigit (c = getchar ()))
	1794	@{
	1795	++yylloc.last_column;
	1796	yylval = yylval * 10 + c - '0';
	1797	@}
	1798	ungetc (c, stdin);
	1799	return NUM;
	1800	@}
	1801	@end group
	1802
	1803	/* Return end-of-input. */
	1804	if (c == EOF)
	1805	return 0;
	1806
	1807	/* Return a single char, and update location. */
	1808	if (c == '\n')
	1809	@{
	1810	++yylloc.last_line;
	1811	yylloc.last_column = 0;
	1812	@}
	1813	else
	1814	++yylloc.last_column;
	1815	return c;
	1816	@}
	1817	@end example
	1818
	1819	Basically, the lexical analyzer performs the same processing as before:
	1820	it skips blanks and tabs, and reads numbers or single-character tokens.
	1821	In addition, it updates @code{yylloc}, the global variable (of type
	1822	@code{YYLTYPE}) containing the token's location.
	1823
	1824	Now, each time this function returns a token, the parser has its number
	1825	as well as its semantic value, and its location in the text. The last
	1826	needed change is to initialize @code{yylloc}, for example in the
	1827	controlling function:
	1828
	1829	@example
	1830	@group
	1831	int
	1832	main (void)
	1833	@{
	1834	yylloc.first_line = yylloc.last_line = 1;
	1835	yylloc.first_column = yylloc.last_column = 0;
	1836	return yyparse ();
	1837	@}
	1838	@end group
	1839	@end example
	1840
	1841	Remember that computing locations is not a matter of syntax. Every
	1842	character must be associated with a location update, whether it is in
	1843	valid input, in comments, in literal strings, and so on.
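
One way to honor this rule is to route every character the lexer consumes through a single update function. The sketch below assumes the same four-field layout as the default location type; the names @code{text_location}, @code{advance_location} and @code{advance_over} are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the default location type.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} text_location;

/* Move the end of LOC past the single character C.  */
void
advance_location (text_location *loc, int c)
{
  assert (loc != 0);
  if (c == '\n')
    {
      ++loc->last_line;
      loc->last_column = 0;
    }
  else
    ++loc->last_column;
}

/* Advance over every character of TEXT, whatever it means to the
   grammar: digits, blanks, comments and newlines all count.  */
void
advance_over (text_location *loc, const char *text)
{
  for (; *text != '\0'; ++text)
    advance_location (loc, *text);
}
```

A lexer that skips a comment would still call @code{advance_over} on the comment text, so that the tokens after it are reported at the right place.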
	1844
	1845	@node Multi-function Calc
	1846	@section Multi-Function Calculator: @code{mfcalc}
	1847	@cindex multi-function calculator
	1848	@cindex @code{mfcalc}
	1849	@cindex calculator, multi-function
	1850
	1851	Now that the basics of Bison have been discussed, it is time to move on to
	1852	a more advanced problem. The above calculators provided only five
	1853	functions, @samp{+}, @samp{-}, @samp{*}, @samp{/} and @samp{^}. It would
	1854	be nice to have a calculator that provides other mathematical functions such
	1855	as @code{sin}, @code{cos}, etc.
	1856
	1857	It is easy to add new operators to the infix calculator as long as they are
	1858	only single-character literals. The lexical analyzer @code{yylex} passes
	1859	back all nonnumber characters as tokens, so new grammar rules suffice for
	1860	adding a new operator. But we want something more flexible: built-in
	1861	functions whose syntax has this form:
	1862
	1863	@example
	1864	@var{function_name} (@var{argument})
	1865	@end example
	1866
	1867	@noindent
	1868	At the same time, we will add memory to the calculator, by allowing you
	1869	to create named variables, store values in them, and use them later.
	1870	Here is a sample session with the multi-function calculator:
	1871
	1872	@example
	1873	$ @kbd{mfcalc}
	1874	@kbd{pi = 3.141592653589}
	1875	3.1415926536
	1876	@kbd{sin(pi)}
	1877	0.0000000000
	1878	@kbd{alpha = beta1 = 2.3}
	1879	2.3000000000
	1880	@kbd{alpha}
	1881	2.3000000000
	1882	@kbd{ln(alpha)}
	1883	0.8329091229
	1884	@kbd{exp(ln(beta1))}
	1885	2.3000000000
	1886	$
	1887	@end example
	1888
	1889	Note that multiple assignment and nested function calls are permitted.
	1890
	1891	@menu
	1892	* Decl: Mfcalc Decl. Bison declarations for multi-function calculator.
	1893	* Rules: Mfcalc Rules. Grammar rules for the calculator.
	1894	* Symtab: Mfcalc Symtab. Symbol table management subroutines.
	1895	@end menu
	1896
	1897	@node Mfcalc Decl
	1898	@subsection Declarations for @code{mfcalc}
	1899
	1900	Here are the C and Bison declarations for the multi-function calculator.
	1901
	1902	@smallexample
	1903	@group
	1904	%@{
	1905	#include <math.h> /* For math functions, cos(), sin(), etc. */
	1906	#include "calc.h" /* Contains definition of `symrec'. */
	1907	int yylex (void);
	1908	void yyerror (char const *);
	1909	%@}
	1910	@end group
	1911	@group
	1912	%union @{
	1913	double val; /* For returning numbers. */
	1914	symrec *tptr; /* For returning symbol-table pointers. */
	1915	@}
	1916	@end group
	1917	%token <val> NUM /* Simple double precision number. */
	1918	%token <tptr> VAR FNCT /* Variable and Function. */
	1919	%type <val> exp
	1920
	1921	@group
	1922	%right '='
	1923	%left '-' '+'
	1924	%left '*' '/'
	1925	%left NEG /* negation--unary minus */
	1926	%right '^' /* exponentiation */
	1927	@end group
	1928	%% /* The grammar follows. */
	1929	@end smallexample
	1930
	1931	The above grammar introduces only two new features of the Bison language.
	1932	These features allow semantic values to have various data types
	1933	(@pxref{Multiple Types, ,More Than One Value Type}).
	1934
	1935	The @code{%union} declaration specifies the entire list of possible types;
	1936	it takes the place of a definition of @code{YYSTYPE}. The allowable types are now
	1937	double-floats (for @code{exp} and @code{NUM}) and pointers to entries in
	1938	the symbol table. @xref{Union Decl, ,The Collection of Value Types}.
	1939
	1940	Since values can now have various types, it is necessary to associate a
	1941	type with each grammar symbol whose semantic value is used. These symbols
	1942	are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their
	1943	declarations are augmented with information about their data type (placed
	1944	between angle brackets).
	1945
	1946	The Bison construct @code{%type} is used for declaring nonterminal
	1947	symbols, just as @code{%token} is used for declaring token types. We
	1948	have not used @code{%type} before because nonterminal symbols are
	1949	normally declared implicitly by the rules that define them. But
	1950	@code{exp} must be declared explicitly so we can specify its value type.
	1951	@xref{Type Decl, ,Nonterminal Symbols}.
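
In C terms, the @code{%union} declaration corresponds to a union type whose member is selected by the kind of symbol. The following sketch illustrates that idea only. The type names @code{semantic_value} and the simplified @code{symrec}, the function @code{numeric_value}, and the token codes 258 and 259 are all assumptions made for the example, not what Bison generates for @code{mfcalc}.

```c
#include <assert.h>

enum { NUM = 258, VAR = 259 };  /* Hypothetical token codes.  */

/* Simplified, hypothetical symbol record.  */
typedef struct symrec
{
  const char *name;
  double var;
} symrec;

/* What the %union declaration amounts to: one member per value type.  */
typedef union
{
  double val;                   /* For NUM and exp.  */
  symrec *tptr;                 /* For VAR.  */
} semantic_value;

/* Pick the member that matches the declared type of the symbol.  */
double
numeric_value (int token_type, semantic_value value)
{
  switch (token_type)
    {
    case NUM:
      return value.val;
    case VAR:
      assert (value.tptr != 0);
      return value.tptr->var;
    default:
      assert (!"no value type declared for this symbol");
      return 0;
    }
}
```

The angle-bracket annotations in @code{%token} and @code{%type} play the role of the @code{switch} here: they tell Bison which member @code{$1}, @code{$$} and so on refer to.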
	1952
	1953	@node Mfcalc Rules
	1954	@subsection Grammar Rules for @code{mfcalc}
	1955
	1956	Here are the grammar rules for the multi-function calculator.
	1957	Most of them are copied directly from @code{calc}; three rules,
	1958	those which mention @code{VAR} or @code{FNCT}, are new.
	1959
	1960	@smallexample
	1961	@group
	1962	input:    /* empty */
	1963	        | input line
	1964	;
	1965	@end group
	1966
	1967	@group
	1968	line:
	1969	          '\n'
	1970	        | exp '\n'   @{ printf ("\t%.10g\n", $1); @}
	1971	        | error '\n' @{ yyerrok;                  @}
	1972	;
	1973	@end group
	1974
	1975	@group
	1976	exp:      NUM                @{ $$ = $1;                         @}
	1977	        | VAR                @{ $$ = $1->value.var;              @}
	1978	        | VAR '=' exp        @{ $$ = $3; $1->value.var = $3;     @}
	1979	        | FNCT '(' exp ')'   @{ $$ = (*($1->value.fnctptr))($3); @}
	1980	        | exp '+' exp        @{ $$ = $1 + $3;                    @}
	1981	        | exp '-' exp        @{ $$ = $1 - $3;                    @}
	1982	        | exp '*' exp        @{ $$ = $1 * $3;                    @}
	1983	        | exp '/' exp        @{ $$ = $1 / $3;                    @}
	1984	        | '-' exp  %prec NEG @{ $$ = -$2;                        @}
	1985	        | exp '^' exp        @{ $$ = pow ($1, $3);               @}
	1986	        | '(' exp ')'        @{ $$ = $2;                         @}
	1987	;
	1988	@end group
	1989	/* End of grammar. */
	1990	%%
	1991	@end smallexample
	1992
	1993	@node Mfcalc Symtab
	1994	@subsection The @code{mfcalc} Symbol Table
	1995	@cindex symbol table example
	1996
	1997	The multi-function calculator requires a symbol table to keep track of the
	1998	names and meanings of variables and functions. This doesn't affect the
	1999	grammar rules (except for the actions) or the Bison declarations, but it
	2000	requires some additional C functions for support.
	2001
	2002	The symbol table itself consists of a linked list of records. Its
	2003	definition, which is kept in the header @file{calc.h}, is as follows. It
	2004	provides for either functions or variables to be placed in the table.
	2005
	2006	@smallexample
	2007	@group
	2008	/* Function type. */
	2009	typedef double (*func_t) (double);
	2010	@end group
	2011
	2012	@group
	2013	/* Data type for links in the chain of symbols. */
	2014	struct symrec
	2015	@{
	2016	char *name; /* name of symbol */
	2017	int type; /* type of symbol: either VAR or FNCT */
	2018	union
	2019	@{
	2020	double var; /* value of a VAR */
	2021	func_t fnctptr; /* value of a FNCT */
	2022	@} value;
	2023	struct symrec *next; /* link field */
	2024	@};
	2025	@end group
	2026
	2027	@group
	2028	typedef struct symrec symrec;
	2029
	2030	/* The symbol table: a chain of `struct symrec'. */
	2031	extern symrec *sym_table;
	2032
	2033	symrec *putsym (char const *, int);
	2034	symrec *getsym (char const *);
	2035	@end group
	2036	@end smallexample
	2037
	2038	The new version of @code{main} includes a call to @code{init_table}, a
	2039	function that initializes the symbol table. Here it is, and
	2040	@code{init_table} as well:
	2041
	2042	@smallexample
	2043	#include <stdio.h>
	2044
	2045	@group
	2046	/* Called by yyparse on error. */
	2047	void
	2048	yyerror (char const *s)
	2049	@{
	2050	printf ("%s\n", s);
	2051	@}
	2052	@end group
	2053
	2054	@group
	2055	struct init
	2056	@{
	2057	char const *fname;
	2058	double (*fnct) (double);
	2059	@};
	2060	@end group
	2061
	2062	@group
	2063	struct init const arith_fncts[] =
	2064	@{
	2065	"sin", sin,
	2066	"cos", cos,
	2067	"atan", atan,
	2068	"ln", log,
	2069	"exp", exp,
	2070	"sqrt", sqrt,
	2071	0, 0
	2072	@};
	2073	@end group
	2074
	2075	@group
	2076	/* The symbol table: a chain of `struct symrec'. */
	2077	symrec *sym_table;
	2078	@end group
	2079
	2080	@group
	2081	/* Put arithmetic functions in table. */
	2082	void
	2083	init_table (void)
	2084	@{
	2085	int i;
	2086	symrec *ptr;
	2087	for (i = 0; arith_fncts[i].fname != 0; i++)
	2088	@{
	2089	ptr = putsym (arith_fncts[i].fname, FNCT);
	2090	ptr->value.fnctptr = arith_fncts[i].fnct;
	2091	@}
	2092	@}
	2093	@end group
	2094
	2095	@group
	2096	int
	2097	main (void)
	2098	@{
	2099	init_table ();
	2100	return yyparse ();
	2101	@}
	2102	@end group
	2103	@end smallexample
	2104
	2105	By simply editing the initialization list and adding the necessary include
	2106	files, you can add further functions to the calculator.
	2107
	2108	Two important functions allow look-up and installation of symbols in the
	2109	symbol table. The function @code{putsym} is passed a name and the type
	2110	(@code{VAR} or @code{FNCT}) of the object to be installed. The object is
	2111	linked to the front of the list, and a pointer to the object is returned.
	2112	The function @code{getsym} is passed the name of the symbol to look up. If
	2113	found, a pointer to that symbol is returned; otherwise zero is returned.
	2114
	2115	@smallexample
	2116	symrec *
	2117	putsym (char const *sym_name, int sym_type)
	2118	@{
	2119	symrec *ptr;
	2120	ptr = (symrec *) malloc (sizeof (symrec));
	2121	ptr->name = (char *) malloc (strlen (sym_name) + 1);
	2122	strcpy (ptr->name,sym_name);
	2123	ptr->type = sym_type;
	2124	ptr->value.var = 0; /* Set value to 0 even if fctn. */
	2125	ptr->next = (struct symrec *)sym_table;
	2126	sym_table = ptr;
	2127	return ptr;
	2128	@}
	2129
	2130	symrec *
	2131	getsym (char const *sym_name)
	2132	@{
	2133	symrec *ptr;
	2134	for (ptr = sym_table; ptr != (symrec *) 0;
	2135	ptr = (symrec *)ptr->next)
	2136	if (strcmp (ptr->name,sym_name) == 0)
	2137	return ptr;
	2138	return 0;
	2139	@}
	2140	@end smallexample
	2141
	2142	The function @code{yylex} must now recognize variables, numeric values, and
	2143	the single-character arithmetic operators. Strings of alphanumeric
	2144	characters with a leading non-digit are recognized as either variables or
	2145	functions depending on what the symbol table says about them.
	2146
	2147	The string is passed to @code{getsym} for lookup in the symbol table. If
	2148	the name appears in the table, a pointer to its location and its type
	2149	(@code{VAR} or @code{FNCT}) are returned to @code{yyparse}. If it is not
	2150	already in the table, then it is installed as a @code{VAR} using
	2151	@code{putsym}. Again, a pointer and its type (which must be @code{VAR}) are
	2152	returned to @code{yyparse}.
	2153
	2154	No change is needed in the handling of numeric values and arithmetic
	2155	operators in @code{yylex}.
	2156
	2157	@smallexample
	2158	@group
	2159	#include <ctype.h>
	2160	@end group
	2161
	2162	@group
	2163	int
	2164	yylex (void)
	2165	@{
	2166	int c;
	2167
	2168	/* Ignore white space, get first nonwhite character. */
	2169	while ((c = getchar ()) == ' ' || c == '\t');
	2170
	2171	if (c == EOF)
	2172	return 0;
	2173	@end group
	2174
	2175	@group
	2176	/* Char starts a number => parse the number. */
	2177	if (c == '.' || isdigit (c))
	2178	@{
	2179	ungetc (c, stdin);
	2180	scanf ("%lf", &yylval.val);
	2181	return NUM;
	2182	@}
	2183	@end group
	2184
	2185	@group
	2186	/* Char starts an identifier => read the name. */
	2187	if (isalpha (c))
	2188	@{
	2189	symrec *s;
	2190	static char *symbuf = 0;
	2191	static int length = 0;
	2192	int i;
	2193	@end group
	2194
	2195	@group
	2196	/* Initially make the buffer long enough
	2197	for a 40-character symbol name. */
	2198	if (length == 0)
	2199	length = 40, symbuf = (char *)malloc (length + 1);
	2200
	2201	i = 0;
	2202	do
	2203	@end group
	2204	@group
	2205	@{
	2206	/* If buffer is full, make it bigger. */
	2207	if (i == length)
	2208	@{
	2209	length *= 2;
	2210	symbuf = (char *) realloc (symbuf, length + 1);
	2211	@}
	2212	/* Add this character to the buffer. */
	2213	symbuf[i++] = c;
	2214	/* Get another character. */
	2215	c = getchar ();
	2216	@}
	2217	@end group
	2218	@group
	2219	while (isalnum (c));
	2220
	2221	ungetc (c, stdin);
	2222	symbuf[i] = '\0';
	2223	@end group
	2224
	2225	@group
	2226	s = getsym (symbuf);
	2227	if (s == 0)
	2228	s = putsym (symbuf, VAR);
	2229	yylval.tptr = s;
	2230	return s->type;
	2231	@}
	2232
	2233	/* Any other character is a token by itself. */
	2234	return c;
	2235	@}
	2236	@end group
	2237	@end smallexample
	2238
	2239	This program is both powerful and flexible. You may easily add new
	2240	functions, and it is a simple job to modify this code to install
	2241	predefined variables such as @code{pi} or @code{e} as well.
	2242
	2243	@node Exercises
	2244	@section Exercises
	2245	@cindex exercises
	2246
	2247	@enumerate
	2248	@item
	2249	Add some new functions from @file{math.h} to the initialization list.
	2250
	2251	@item
	2252	Add another array that contains constants and their values. Then
	2253	modify @code{init_table} to add these constants to the symbol table.
	2254	It will be easiest to give the constants type @code{VAR}.
	2255
	2256	@item
	2257	Make the program report an error if the user refers to an
	2258	uninitialized variable in any way except to store a value in it.
	2259	@end enumerate
	2260
	2261	@node Grammar File
	2262	@chapter Bison Grammar Files
	2263
	2264	Bison takes as input a context-free grammar specification and produces a
	2265	C-language function that recognizes correct instances of the grammar.
	2266
	2267	The Bison grammar input file conventionally has a name ending in @samp{.y}.
	2268	@xref{Invocation, ,Invoking Bison}.
	2269
	2270	@menu
	2271	* Grammar Outline:: Overall layout of the grammar file.
	2272	* Symbols:: Terminal and nonterminal symbols.
	2273	* Rules:: How to write grammar rules.
	2274	* Recursion:: Writing recursive rules.
	2275	* Semantics:: Semantic values and actions.
	2276	* Locations:: Locations and actions.
	2277	* Declarations:: All kinds of Bison declarations are described here.
	2278	* Multiple Parsers:: Putting more than one Bison parser in one program.
	2279	@end menu
	2280
	2281	@node Grammar Outline
	2282	@section Outline of a Bison Grammar
	2283
	2284	A Bison grammar file has four main sections, shown here with the
	2285	appropriate delimiters:
	2286
	2287	@example
	2288	%@{
	2289	@var{Prologue}
	2290	%@}
	2291
	2292	@var{Bison declarations}
	2293
	2294	%%
	2295	@var{Grammar rules}
	2296	%%
	2297
	2298	@var{Epilogue}
	2299	@end example
	2300
	2301	Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections.
	2302	As a @acronym{GNU} extension, @samp{//} introduces a comment that
	2303	continues until end of line.
	2304
	2305	@menu
	2306	* Prologue:: Syntax and usage of the prologue.
	2307	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	2308	* Grammar Rules:: Syntax and usage of the grammar rules section.
	2309	* Epilogue:: Syntax and usage of the epilogue.
	2310	@end menu
	2311
	2312	@node Prologue
	2313	@subsection The prologue
	2314	@cindex declarations section
	2315	@cindex Prologue
	2316	@cindex declarations
	2317
	2318	The @var{Prologue} section contains macro definitions and
	2319	declarations of functions and variables that are used in the actions in the
	2320	grammar rules. These are copied to the beginning of the parser file so
	2321	that they precede the definition of @code{yyparse}. You can use
	2322	@samp{#include} to get the declarations from a header file. If you don't
	2323	need any C declarations, you may omit the @samp{%@{} and @samp{%@}}
	2324	delimiters that bracket this section.
	2325
	2326	You may have more than one @var{Prologue} section, intermixed with the
	2327	@var{Bison declarations}. This allows you to have C and Bison
	2328	declarations that refer to each other. For example, the @code{%union}
	2329	declaration may use types defined in a header file, and you may wish to
	2330	prototype functions that take arguments of type @code{YYSTYPE}. This
	2331	can be done with two @var{Prologue} blocks, one before and one after the
	2332	@code{%union} declaration.
	2333
	2334	@smallexample
	2335	%@{
	2336	#include <stdio.h>
	2337	#include "ptypes.h"
	2338	%@}
	2339
	2340	%union @{
	2341	long int n;
	2342	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2343	@}
	2344
	2345	%@{
	2346	static void print_token_value (FILE *, int, YYSTYPE);
	2347	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2348	%@}
	2349
	2350	@dots{}
	2351	@end smallexample
	2352
	2353	@node Bison Declarations
	2354	@subsection The Bison Declarations Section
	2355	@cindex Bison declarations (introduction)
	2356	@cindex declarations, Bison (introduction)
	2357
	2358	The @var{Bison declarations} section contains declarations that define
	2359	terminal and nonterminal symbols, specify precedence, and so on.
	2360	In some simple grammars you may not need any declarations.
	2361	@xref{Declarations, ,Bison Declarations}.
	2362
	2363	@node Grammar Rules
	2364	@subsection The Grammar Rules Section
	2365	@cindex grammar rules section
	2366	@cindex rules section for grammar
	2367
	2368	The @dfn{grammar rules} section contains one or more Bison grammar
	2369	rules, and nothing else. @xref{Rules, ,Syntax of Grammar Rules}.
	2370
	2371	There must always be at least one grammar rule, and the first
	2372	@samp{%%} (which precedes the grammar rules) may never be omitted even
	2373	if it is the first thing in the file.
	2374
	2375	@node Epilogue
	2376	@subsection The epilogue
	2377	@cindex additional C code section
	2378	@cindex epilogue
	2379	@cindex C code, section for additional
	2380
	2381	The @var{Epilogue} is copied verbatim to the end of the parser file, just as
	2382	the @var{Prologue} is copied to the beginning. This is the most convenient
	2383	place to put anything that you want to have in the parser file but which need
	2384	not come before the definition of @code{yyparse}. For example, the
	2385	definitions of @code{yylex} and @code{yyerror} often go here. Because
	2386	C requires functions to be declared before being used, you often need
	2387	to declare functions like @code{yylex} and @code{yyerror} in the Prologue,
	2388	even if you define them in the Epilogue.
	2389	@xref{Interface, ,Parser C-Language Interface}.
	2390
	2391	If the last section is empty, you may omit the @samp{%%} that separates it
	2392	from the grammar rules.
	2393
	2394	The Bison parser itself contains many macros and identifiers whose
	2395	names start with @samp{yy} or @samp{YY}, so it is a
	2396	good idea to avoid using any such names (except those documented in this
	2397	manual) in the epilogue of the grammar file.
	2398
	2399	@node Symbols
	2400	@section Symbols, Terminal and Nonterminal
	2401	@cindex nonterminal symbol
	2402	@cindex terminal symbol
	2403	@cindex token type
	2404	@cindex symbol
	2405
	2406	@dfn{Symbols} in Bison grammars represent the grammatical classifications
	2407	of the language.
	2408
	2409	A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
	2410	class of syntactically equivalent tokens. You use the symbol in grammar
	2411	rules to mean that a token in that class is allowed. The symbol is
	2412	represented in the Bison parser by a numeric code, and the @code{yylex}
	2413	function returns a token type code to indicate what kind of token has been
	2414	read. You don't need to know what the code value is; you can use the
	2415	symbol to stand for it.
	2416
	2417	A @dfn{nonterminal symbol} stands for a class of syntactically equivalent
	2418	groupings. The symbol name is used in writing grammar rules. By convention,
	2419	it should be all lower case.
	2420
	2421	Symbol names can contain letters, digits (not at the beginning),
	2422	underscores and periods. Periods make sense only in nonterminals.
	2423
	2424	There are three ways of writing terminal symbols in the grammar:
	2425
	2426	@itemize @bullet
	2427	@item
	2428	A @dfn{named token type} is written with an identifier, like an
	2429	identifier in C@. By convention, it should be all upper case. Each
	2430	such name must be defined with a Bison declaration such as
	2431	@code{%token}. @xref{Token Decl, ,Token Type Names}.
	2432
	2433	@item
	2434	@cindex character token
	2435	@cindex literal token
	2436	@cindex single-character literal
	2437	A @dfn{character token type} (or @dfn{literal character token}) is
	2438	written in the grammar using the same syntax used in C for character
	2439	constants; for example, @code{'+'} is a character token type. A
	2440	character token type doesn't need to be declared unless you need to
	2441	specify its semantic value data type (@pxref{Value Type, ,Data Types of
	2442	Semantic Values}), associativity, or precedence (@pxref{Precedence,
	2443	,Operator Precedence}).
	2444
	2445	By convention, a character token type is used only to represent a
	2446	token that consists of that particular character. Thus, the token
	2447	type @code{'+'} is used to represent the character @samp{+} as a
	2448	token. Nothing enforces this convention, but if you depart from it,
	2449	your program will confuse other readers.
	2450
	2451	All the usual escape sequences used in character literals in C can be
	2452	used in Bison as well, but you must not use the null character as a
	2453	character literal because its numeric code, zero, signifies
	2454	end-of-input (@pxref{Calling Convention, ,Calling Convention
	2455	for @code{yylex}}). Also, unlike standard C, trigraphs have no
	2456	special meaning in Bison character literals, nor is backslash-newline
	2457	allowed.
	2458
	2459	@item
	2460	@cindex string token
	2461	@cindex literal string token
	2462	@cindex multicharacter literal
	2463	A @dfn{literal string token} is written like a C string constant; for
	2464	example, @code{"<="} is a literal string token. A literal string token
	2465	doesn't need to be declared unless you need to specify its semantic
	2466	value data type (@pxref{Value Type}), associativity, or precedence
	2467	(@pxref{Precedence}).
	2468
	2469	You can associate the literal string token with a symbolic name as an
	2470	alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token
	2471	Declarations}). If you don't do that, the lexical analyzer has to
	2472	retrieve the token number for the literal string token from the
	2473	@code{yytname} table (@pxref{Calling Convention}).
	2474
	2475	@strong{Warning}: literal string tokens do not work in Yacc.
	2476
	2477	By convention, a literal string token is used only to represent a token
	2478	that consists of that particular string. Thus, you should use the token
	2479	type @code{"<="} to represent the string @samp{<=} as a token. Bison
	2480	does not enforce this convention, but if you depart from it, people who
	2481	read your program will be confused.
	2482
	2483	All the escape sequences used in string literals in C can be used in
	2484	Bison as well, except that you must not use a null character within a
	2485	string literal. Also, unlike Standard C, trigraphs have no special
	2486	meaning in Bison string literals, nor is backslash-newline allowed. A
	2487	literal string token must contain two or more characters; for a token
	2488	containing just one character, use a character token (see above).
	2489	@end itemize
	2490
	2491	How you choose to write a terminal symbol has no effect on its
	2492	grammatical meaning. That depends only on where it appears in rules and
	2493	on when the lexical analyzer function returns that symbol.
	2494
	2495	The value returned by @code{yylex} is always one of the terminal
	2496	symbols, except that a zero or negative value signifies end-of-input.
	2497	Whichever way you write the token type in the grammar rules, you write
	2498	it the same way in the definition of @code{yylex}. The numeric code
	2499	for a character token type is simply the positive numeric code of the
	2500	character, so @code{yylex} can use the identical value to generate the
	2501	requisite code, though you may need to convert it to @code{unsigned
	2502	char} to avoid sign-extension on hosts where @code{char} is signed.
	2503	Each named token type becomes a C macro in
	2504	the parser file, so @code{yylex} can use the name to stand for the code.
	2505	(This is why periods don't make sense in terminal symbols.)
	2506	@xref{Calling Convention, ,Calling Convention for @code{yylex}}.
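
As a minimal sketch of the conversion described above (an illustration,
not code taken from Bison), the following C fragment scans a string
rather than a real input stream. The names @code{input_text},
@code{input_pos} and @code{next_char_token} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input buffer; a real yylex would read a file or stream.  */
static const char *input_text = "";
static size_t input_pos = 0;

/* Return the code of the next character token, or 0 at end-of-input.  */
int
next_char_token (void)
{
  assert (input_text != NULL);
  char c = input_text[input_pos];
  if (c == '\0')
    return 0;                   /* Zero signifies end-of-input.  */
  input_pos++;
  /* Convert through unsigned char so that the code is positive even
     on hosts where plain char is signed.  */
  return (unsigned char) c;
}
```

Without the conversion, a character whose code is above 127 could be
returned as a negative number on some hosts, which the parser would
take for end-of-input.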
	2507
	2508	If @code{yylex} is defined in a separate file, you need to arrange for the
	2509	token-type macro definitions to be available there. Use the @samp{-d}
	2510	option when you run Bison, so that it will write these macro definitions
	2511	into a separate header file @file{@var{name}.tab.h} which you can include
	2512	in the other source files that need it. @xref{Invocation, ,Invoking Bison}.
	2513
	2514	If you want to write a grammar that is portable to any Standard C
	2515	host, you must use only non-null character tokens taken from the basic
	2516	execution character set of Standard C@. This set consists of the ten
	2517	digits, the 52 lower- and upper-case English letters, and the
	2518	characters in the following C-language string:
	2519
	2520	@example
	2521	"\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~"
	2522	@end example
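
The set just described can be checked mechanically. The following is a
sketch only; the function name @code{is_portable_char_token} is
hypothetical and nothing like it is provided by Bison.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Return 1 if C is a non-null member of the basic execution character
   set of Standard C as listed in the text, and 0 otherwise.  */
int
is_portable_char_token (int c)
{
  static const char portable[] =
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_{|}~";

  /* Token codes for characters are expected in unsigned char range.  */
  assert (c >= 0 && c <= UCHAR_MAX);
  /* strchr also matches the terminating null, so reject it first.  */
  if (c == '\0')
    return 0;
  return strchr (portable, c) != NULL;
}
```

For instance, this accepts @code{'+'} but rejects @code{'$'} and
@code{'@@'}, which are not in the basic execution character set.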
	2523
	2524	The @code{yylex} function and Bison must use a consistent character
	2525	set and encoding for character tokens. For example, if you run Bison in an
	2526	@acronym{ASCII} environment, but then compile and run the resulting program
	2527	in an environment that uses an incompatible character set like
	2528	@acronym{EBCDIC}, the resulting program may not work because the
	2529	tables generated by Bison will assume @acronym{ASCII} numeric values for
	2530	character tokens. It is standard
	2531	practice for software distributions to contain C source files that
	2532	were generated by Bison in an @acronym{ASCII} environment, so installers on
	2533	platforms that are incompatible with @acronym{ASCII} must rebuild those
	2534	files before compiling them.
	2535
	2536	The symbol @code{error} is a terminal symbol reserved for error recovery
	2537	(@pxref{Error Recovery}); you shouldn't use it for any other purpose.
	2538	In particular, @code{yylex} should never return this value. The default
	2539	value of the error token is 256, unless you explicitly assigned 256 to
	2540	one of your tokens with a @code{%token} declaration.
	2541
	2542	@node Rules
	2543	@section Syntax of Grammar Rules
	2544	@cindex rule syntax
	2545	@cindex grammar rule syntax
	2546	@cindex syntax of grammar rules
	2547
	2548	A Bison grammar rule has the following general form:
	2549
	2550	@example
	2551	@group
	2552	@var{result}: @var{components}@dots{}
	2553	        ;
	2554	@end group
	2555	@end example
	2556
	2557	@noindent
	2558	where @var{result} is the nonterminal symbol that this rule describes,
	2559	and @var{components} are various terminal and nonterminal symbols that
	2560	are put together by this rule (@pxref{Symbols}).
	2561
	2562	For example,
	2563
	2564	@example
	2565	@group
	2566	exp:      exp '+' exp
	2567	        ;
	2568	@end group
	2569	@end example
	2570
	2571	@noindent
	2572	says that two groupings of type @code{exp}, with a @samp{+} token in between,
	2573	can be combined into a larger grouping of type @code{exp}.
	2574
	2575	White space in rules is significant only to separate symbols. You can add
	2576	extra white space as you wish.
	2577
	2578	Scattered among the components can be @var{actions} that determine
	2579	the semantics of the rule. An action looks like this:
	2580
	2581	@example
	2582	@{@var{C statements}@}
	2583	@end example
	2584
	2585	@noindent
	2586	Usually there is only one action and it follows the components.
	2587	@xref{Actions}.
	2588
	2589	@findex |
	2590	Multiple rules for the same @var{result} can be written separately or can
	2591	be joined with the vertical-bar character @samp{|} as follows:
	2592
	2593	@ifinfo
	2594	@example
	2595	@var{result}:    @var{rule1-components}@dots{}
	2596	        | @var{rule2-components}@dots{}
	2597	        @dots{}
	2598	        ;
	2599	@end example
	2600	@end ifinfo
	2601	@iftex
	2602	@example
	2603	@group
	2604	@var{result}:    @var{rule1-components}@dots{}
	2605	        | @var{rule2-components}@dots{}
	2606	        @dots{}
	2607	        ;
	2608	@end group
	2609	@end example
	2610	@end iftex
	2611
	2612	@noindent
	2613	They are still considered distinct rules even when joined in this way.
	2614
	2615	If @var{components} in a rule is empty, it means that @var{result} can
	2616	match the empty string. For example, here is how to define a
	2617	comma-separated sequence of zero or more @code{exp} groupings:
	2618
	2619	@example
	2620	@group
	2621	expseq:   /* empty */
	2622	        | expseq1
	2623	        ;
	2624	@end group
	2625
	2626	@group
	2627	expseq1:  exp
	2628	        | expseq1 ',' exp
	2629	        ;
	2630	@end group
	2631	@end example
	2632
	2633	@noindent
	2634	It is customary to write a comment @samp{/* empty */} in each rule
	2635	with no components.
	2636
	2637	@node Recursion
	2638	@section Recursive Rules
	2639	@cindex recursive rule
	2640
	2641	A rule is called @dfn{recursive} when its @var{result} nonterminal also
	2642	appears on its right hand side. Nearly all Bison grammars need to use
	2643	recursion, because that is the only way to define a sequence of any number
	2644	of a particular thing. Consider this recursive definition of a
	2645	comma-separated sequence of one or more expressions:
	2646
	2647	@example
	2648	@group
	2649	expseq1:  exp
	2650	        | expseq1 ',' exp
	2651	        ;
	2652	@end group
	2653	@end example
	2654
	2655	@cindex left recursion
	2656	@cindex right recursion
	2657	@noindent
	2658	Since the recursive use of @code{expseq1} is the leftmost symbol in the
	2659	right hand side, we call this @dfn{left recursion}. By contrast, here
	2660	the same construct is defined using @dfn{right recursion}:
	2661
	2662	@example
	2663	@group
	2664	expseq1:  exp
	2665	        | exp ',' expseq1
	2666	        ;
	2667	@end group
	2668	@end example
	2669
	2670	@noindent
	2671	Any kind of sequence can be defined using either left recursion or right
	2672	recursion, but you should always use left recursion, because it can
	2673	parse a sequence of any number of elements with bounded stack space.
	2674	Right recursion uses up space on the Bison stack in proportion to the
	2675	number of elements in the sequence, because all the elements must be
	2676	shifted onto the stack before the rule can be applied even once.
	2677	@xref{Algorithm, ,The Bison Parser Algorithm}, for further explanation
	2678	of this.
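
The difference in stack use can be made concrete with a small
simulation. This is a sketch under simplifying assumptions, not a model
of the real parser tables: each @code{exp} is counted as a single stack
entry, and the function names are hypothetical.

```c
#include <assert.h>

/* Largest number of symbols on the stack while parsing N
   comma-separated exp groupings with the left-recursive rules
     expseq1: exp | expseq1 ',' exp ;  */
int
max_depth_left (int n)
{
  assert (n >= 1);
  int depth = 0, max = 0;
  for (int i = 0; i < n; i++)
    {
      depth += (i == 0) ? 1 : 2;    /* Shift exp, or ',' and exp.  */
      if (depth > max)
        max = depth;
      depth = 1;                    /* Reduce at once to one expseq1.  */
    }
  return max;
}

/* The same for the right-recursive rules
     expseq1: exp | exp ',' expseq1 ;  */
int
max_depth_right (int n)
{
  assert (n >= 1);
  int depth = 0, max = 0;
  /* No rule can be applied until the last exp has been shifted.  */
  for (int i = 0; i < n; i++)
    {
      depth += (i == 0) ? 1 : 2;
      if (depth > max)
        max = depth;
    }
  /* The last exp becomes an expseq1 (depth unchanged); each later
     reduction then replaces three symbols by one.  */
  while (depth > 1)
    depth -= 2;
  return max;
}
```

For a sequence of 1000 elements, the left-recursive count stays at 3
while the right-recursive count reaches 1999.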
	2679
	2680	@cindex mutual recursion
	2681	@dfn{Indirect} or @dfn{mutual} recursion occurs when the result of the
	2682	rule does not appear directly on its right hand side, but does appear
	2683	in rules for other nonterminals which do appear on its right hand
	2684	side.
	2685
	2686	For example:
	2687
	2688	@example
	2689	@group
	2690	expr:     primary
	2691	        | primary '+' primary
	2692	        ;
	2693	@end group
	2694
	2695	@group
	2696	primary:  constant
	2697	        | '(' expr ')'
	2698	        ;
	2699	@end group
	2700	@end example
	2701
	2702	@noindent
	2703	defines two mutually-recursive nonterminals, since each refers to the
	2704	other.
	2705
	2706	@node Semantics
	2707	@section Defining Language Semantics
	2708	@cindex defining language semantics
	2709	@cindex language semantics, defining
	2710
	2711	The grammar rules for a language determine only the syntax. The semantics
	2712	are determined by the semantic values associated with various tokens and
	2713	groupings, and by the actions taken when various groupings are recognized.
	2714
	2715	For example, the calculator calculates properly because the value
	2716	associated with each expression is the proper number; it adds properly
	2717	because the action for the grouping @w{@samp{@var{x} + @var{y}}} is to add
	2718	the numbers associated with @var{x} and @var{y}.
	2719
	2720	@menu
	2721	* Value Type:: Specifying one data type for all semantic values.
	2722	* Multiple Types:: Specifying several alternative data types.
	2723	* Actions:: An action is the semantic definition of a grammar rule.
	2724	* Action Types:: Specifying data types for actions to operate on.
	2725	* Mid-Rule Actions:: Most actions go at the end of a rule.
	2726	This says when, why and how to use the exceptional
	2727	action in the middle of a rule.
	2728	@end menu
	2729
	2730	@node Value Type
	2731	@subsection Data Types of Semantic Values
	2732	@cindex semantic value type
	2733	@cindex value type, semantic
	2734	@cindex data types of semantic values
	2735	@cindex default data type
	2736
	2737	In a simple program it may be sufficient to use the same data type for
	2738	the semantic values of all language constructs. This was true in the
	2739	@acronym{RPN} and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish
	2740	Notation Calculator}).
	2741
	2742	Bison's default is to use type @code{int} for all semantic values. To
	2743	specify some other type, define @code{YYSTYPE} as a macro, like this:
	2744
	2745	@example
	2746	#define YYSTYPE double
	2747	@end example
	2748
	2749	@noindent
	2750	This macro definition must go in the prologue of the grammar file
	2751	(@pxref{Grammar Outline, ,Outline of a Bison Grammar}).
	2752
	2753	@node Multiple Types
	2754	@subsection More Than One Value Type
	2755
	2756	In most programs, you will need different data types for different kinds
	2757	of tokens and groupings. For example, a numeric constant may need type
	2758	@code{int} or @code{long int}, while a string constant needs type @code{char *},
	2759	and an identifier might need a pointer to an entry in the symbol table.
	2760
	2761	To use more than one data type for semantic values in one parser, Bison
	2762	requires you to do two things:
	2763
	2764	@itemize @bullet
	2765	@item
	2766	Specify the entire collection of possible data types, with the
	2767	@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
	2768	Value Types}).
	2769
	2770	@item
	2771	Choose one of those types for each symbol (terminal or nonterminal) for
	2772	which semantic values are used. This is done for tokens with the
	2773	@code{%token} Bison declaration (@pxref{Token Decl, ,Token Type Names})
	2774	and for groupings with the @code{%type} Bison declaration (@pxref{Type
	2775	Decl, ,Nonterminal Symbols}).
	2776	@end itemize
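
In C terms, the collection of types behaves like a union with one
member per kind of semantic value. The following sketch uses
hypothetical names (@code{semantic_value}, @code{ival}, @code{sval},
@code{sym}) for the three kinds of value mentioned above; it only
illustrates the idea and is not output produced by Bison.

```c
#include <assert.h>
#include <stddef.h>

struct symrec;                  /* Hypothetical symbol-table entry.  */

/* One member per kind of semantic value.  */
typedef union
{
  long int ival;                /* Numeric constants.  */
  char *sval;                   /* String constants.  */
  struct symrec *sym;           /* Identifiers.  */
} semantic_value;

/* Build the semantic value of a numeric constant.  */
semantic_value
make_int_value (long int n)
{
  semantic_value v;
  v.ival = n;
  return v;
}

/* Build the semantic value of a string constant.  */
semantic_value
make_string_value (char *s)
{
  semantic_value v;
  assert (s != NULL);
  v.sval = s;
  return v;
}
```

Because a union stores only one member at a time, each symbol must be
tied to exactly one member, which is the purpose of the second step
above.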
	2777
	2778	@node Actions
	2779	@subsection Actions
	2780	@cindex action
	2781	@vindex $$
	2782	@vindex $@var{n}
	2783
	2784	An action accompanies a syntactic rule and contains C code to be executed
	2785	each time an instance of that rule is recognized. The task of most actions
	2786	is to compute a semantic value for the grouping built by the rule from the
	2787	semantic values associated with tokens or smaller groupings.
	2788
	2789	An action consists of C statements surrounded by braces, much like a
	2790	compound statement in C@. An action can contain any sequence of C
	2791	statements. Bison does not look for trigraphs, though, so if your C
	2792	code uses trigraphs you should ensure that they do not affect the
	2793	nesting of braces or the boundaries of comments, strings, or character
	2794	literals.
	2795
	2796	An action can be placed at any position in the rule;
	2797	it is executed at that position. Most rules have just one action at the
	2798	end of the rule, following all the components. Actions in the middle of
	2799	a rule are tricky and used only for special purposes (@pxref{Mid-Rule
	2800	Actions, ,Actions in Mid-Rule}).
	2801
	2802	The C code in an action can refer to the semantic values of the components
	2803	matched by the rule with the construct @code{$@var{n}}, which stands for
	2804	the value of the @var{n}th component. The semantic value for the grouping
	2805	being constructed is @code{$$}. Bison translates both of these
	2806	constructs into expressions of the appropriate type when it copies the
	2807	actions into the parser file. @code{$$} is translated to a modifiable
	2808	lvalue, so it can be assigned to.
	2809
	2810	Here is a typical example:
	2811
	2812	@example
	2813	@group
	2814	exp:    @dots{}
	2815	        | exp '+' exp
	2816	            @{ $$ = $1 + $3; @}
	2817	@end group
	2818	@end example
	2819
	2820	@noindent
	2821	This rule constructs an @code{exp} from two smaller @code{exp} groupings
	2822	connected by a plus-sign token. In the action, @code{$1} and @code{$3}
	2823	refer to the semantic values of the two component @code{exp} groupings,
	2824	which are the first and third symbols on the right hand side of the rule.
	2825	The sum is stored into @code{$$} so that it becomes the semantic value of
	2826	the addition-expression just recognized by the rule. If there were a
	2827	useful semantic value associated with the @samp{+} token, it could be
	2828	referred to as @code{$2}.
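
One way to picture the translation is a stack of semantic values whose
top entries belong to the components of the rule being recognized. The
sketch below assumes such a stack of @code{int} values; the name
@code{reduce_addition} is hypothetical, and the code is an illustration
rather than what Bison actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Apply the action of `exp: exp '+' exp' to a stack of int values.
   TOP points at the value of the last component, so $1, $2 and $3
   are top[-2], top[-1] and top[0].  Return the new top of stack.  */
int *
reduce_addition (int *top)
{
  assert (top != NULL);
  int result = top[-2] + top[0];    /* $$ = $1 + $3;  */
  top -= 2;                         /* Pop three values, push one.  */
  *top = result;
  return top;
}
```

After the call, the single entry left in place of the three components
holds the value of the new @code{exp} grouping.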
	2829
	2830	Note that the vertical-bar character @samp{|} is really a rule
	2831	separator, and actions are attached to a single rule. This differs
	2832	from tools like Flex, for which @samp{|} stands for either
	2833	``or'', or ``the same action as that of the next rule''. In the
	2834	following example, the action is triggered only when @samp{b} is found:
	2835
	2836	@example
	2837	@group
	2838	a-or-b: 'a'|'b' @{ a_or_b_found = 1; @};
	2839	@end group
	2840	@end example
	2841
	2842	@cindex default action
	2843	If you don't specify an action for a rule, Bison supplies a default:
	2844	@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule
	2845	becomes the value of the whole rule. Of course, the default action is
	2846	valid only if the two data types match. There is no meaningful default
	2847	action for an empty rule; every empty rule must have an explicit action
	2848	unless the rule's value does not matter.
	2849
	2850	@code{$@var{n}} with @var{n} zero or negative is allowed for reference
	2851	to tokens and groupings on the stack @emph{before} those that match the
	2852	current rule. This is a very risky practice, and to use it reliably
	2853	you must be certain of the context in which the rule is applied. Here
	2854	is a case in which you can use this reliably:
	2855
	2856	@example
	2857	@group
	2858	foo:      expr bar '+' expr  @{ @dots{} @}
	2859	        | expr bar '-' expr  @{ @dots{} @}
	2860	        ;
	2861	@end group
	2862
	2863	@group
	2864	bar:      /* empty */
	2865	        @{ previous_expr = $0; @}
	2866	        ;
	2867	@end group
	2868	@end example
	2869
	2870	As long as @code{bar} is used only in the fashion shown here, @code{$0}
	2871	always refers to the @code{expr} which precedes @code{bar} in the
	2872	definition of @code{foo}.
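
Viewed as a stack of semantic values, @code{$0} is simply the entry
just below the components of the current rule. The sketch below assumes
a stack of @code{int} values and uses the hypothetical name
@code{reduce_bar}; it illustrates the idea and is not generated code.

```c
#include <assert.h>
#include <stddef.h>

static int previous_expr;       /* Set by the action for bar.  */

/* Apply the action of the empty rule for bar.  The rule has no
   components, so nothing is popped and $0 is the value on top of the
   stack: that of the expr preceding bar.  The caller must leave room
   for one more entry, which receives the value of bar.  */
int *
reduce_bar (int *top)
{
  assert (top != NULL);
  previous_expr = top[0];       /* previous_expr = $0;  */
  top++;                        /* Push a slot for bar.  */
  *top = 0;                     /* Placeholder; the action sets no $$.  */
  return top;
}
```

If @code{bar} were used after some symbol other than @code{expr}, the
same code would silently read that symbol's value instead, which is why
the practice is risky.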
	2873
	2874	@node Action Types
	2875	@subsection Data Types of Values in Actions
	2876	@cindex action data types
	2877	@cindex data types in actions
	2878
	2879	If you have chosen a single data type for semantic values, the @code{$$}
	2880	and @code{$@var{n}} constructs always have that data type.
	2881
	2882	If you have used @code{%union} to specify a variety of data types, then you
	2883	must declare a choice among these types for each terminal or nonterminal
	2884	symbol that can have a semantic value. Then each time you use @code{$$} or
	2885	@code{$@var{n}}, its data type is determined by which symbol it refers to
	2886	in the rule. In this example,
	2887
	2888	@example
	2889	@group
	2890	exp:    @dots{}
	2891	        | exp '+' exp
	2892	            @{ $$ = $1 + $3; @}
	2893	@end group
	2894	@end example
	2895
	2896	@noindent
	2897	@code{$1} and @code{$3} refer to instances of @code{exp}, so they both
	2898	have the data type declared for the nonterminal symbol @code{exp}. If
	2899	@code{$2} were used, it would have the data type declared for the
	2900	terminal symbol @code{'+'}, whatever that might be.
	2901
	2902	Alternatively, you can specify the data type when you refer to the value,
	2903	by inserting @samp{<@var{type}>} after the @samp{$} at the beginning of the
	2904	reference. For example, if you have defined types as shown here:
	2905
	2906	@example
	2907	@group
	2908	%union @{
	2909	  int itype;
	2910	  double dtype;
	2911	@}
	2912	@end group
	2913	@end example
	2914
	2915	@noindent
	2916	then you can write @code{$<itype>1} to refer to the first subunit of the
	2917	rule as an integer, or @code{$<dtype>1} to refer to it as a double.
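
In effect, the type written between the angle brackets selects a member
of the union. The following sketch assumes that reading; the names
@code{value_type}, @code{first_as_int} and @code{first_as_double} are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The union from the example above.  */
typedef union
{
  int itype;
  double dtype;
} value_type;

/* With FIRST pointing at the value of the first component,
   $<itype>1 selects the int member.  */
int
first_as_int (const value_type *first)
{
  assert (first != NULL);
  return first->itype;          /* $<itype>1  */
}

/* Likewise, $<dtype>1 selects the double member.  */
double
first_as_double (const value_type *first)
{
  assert (first != NULL);
  return first->dtype;          /* $<dtype>1  */
}
```

Nothing checks that the member read is the one that was last stored, so
an explicit type must agree with how the value was actually set.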
	2918
	2919	@node Mid-Rule Actions
	2920	@subsection Actions in Mid-Rule
	2921	@cindex actions in mid-rule
	2922	@cindex mid-rule actions
	2923
	2924	Occasionally it is useful to put an action in the middle of a rule.
	2925	These actions are written just like usual end-of-rule actions, but they
	2926	are executed before the parser even recognizes the following components.
	2927
	2928	A mid-rule action may refer to the components preceding it using
	2929	@code{$@var{n}}, but it may not refer to subsequent components because
	2930	it is run before they are parsed.
	2931
	2932	The mid-rule action itself counts as one of the components of the rule.
	2933	This makes a difference when there is another action later in the same rule
	2934	(and usually there is another at the end): you have to count the actions
	2935	along with the symbols when working out which number @var{n} to use in
	2936	@code{$@var{n}}.
	2937
	2938	The mid-rule action can also have a semantic value. The action can set
	2939	its value with an assignment to @code{$$}, and actions later in the rule
	2940	can refer to the value using @code{$@var{n}}. Since there is no symbol
	2941	to name the action, there is no way to declare a data type for the value
	2942	in advance, so you must use the @samp{$<@dots{}>@var{n}} construct to
	2943	specify a data type each time you refer to this value.
	2944
	2945	There is no way to set the value of the entire rule with a mid-rule
	2946	action, because assignments to @code{$$} do not have that effect. The
	2947	only way to set the value for the entire rule is with an ordinary action
	2948	at the end of the rule.
	2949
	2950	Here is an example from a hypothetical compiler, handling a @code{let}
	2951	statement that looks like @samp{let (@var{variable}) @var{statement}} and
	2952	serves to create a variable named @var{variable} temporarily for the
	2953	duration of @var{statement}. To parse this construct, we must put
	2954	@var{variable} into the symbol table while @var{statement} is parsed, then
	2955	remove it afterward. Here is how it is done:
	2956
	2957	@example
	2958	@group
	2959	stmt:   LET '(' var ')'
	2960	                @{ $<context>$ = push_context ();
	2961	                  declare_variable ($3); @}
	2962	        stmt    @{ $$ = $6;
	2963	                  pop_context ($<context>5); @}
	2964	@end group
	2965	@end example
	2966
	2967	@noindent
	2968	As soon as @samp{let (@var{variable})} has been recognized, the first
	2969	action is run. It saves a copy of the current semantic context (the
	2970	list of accessible variables) as its semantic value, using alternative
	2971	@code{context} in the data-type union. Then it calls
	2972	@code{declare_variable} to add the new variable to that list. Once the
	2973	first action is finished, the embedded statement @code{stmt} can be
	2974	parsed. Note that the mid-rule action is component number 5, so the
	2975	@samp{stmt} is component number 6.
	2976
	2977	After the embedded statement is parsed, its semantic value becomes the
	2978	value of the entire @code{let}-statement. Then the semantic value from the
	2979	earlier action is used to restore the prior list of variables. This
	2980	removes the temporary @code{let}-variable from the list so that it won't
	2981	appear to exist while the rest of the program is parsed.
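
The functions @code{push_context}, @code{declare_variable} and
@code{pop_context} belong to the hypothetical compiler and are not
defined in this manual. One possible implementation, assuming that the
context is a linked list of visible variables and that variables are
identified by name, is sketched here; @code{struct variable},
@code{visible_vars} and @code{is_visible} are likewise invented for the
illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One visible variable; the head of the list is the current context.  */
struct variable
{
  const char *name;
  struct variable *next;
};

static struct variable *visible_vars = NULL;

/* Return the current context so that it can be restored later.  */
struct variable *
push_context (void)
{
  return visible_vars;
}

/* Make NAME visible by prepending it to the list.  */
void
declare_variable (const char *name)
{
  struct variable *v = malloc (sizeof *v);
  assert (v != NULL);
  v->name = name;
  v->next = visible_vars;
  visible_vars = v;
}

/* Drop every variable declared since SAVED was obtained.  */
void
pop_context (struct variable *saved)
{
  while (visible_vars != saved)
    {
      struct variable *v = visible_vars;
      assert (v != NULL);       /* SAVED must be an earlier context.  */
      visible_vars = v->next;
      free (v);
    }
}

/* Return 1 if NAME is currently visible, 0 otherwise.  */
int
is_visible (const char *name)
{
  for (struct variable *v = visible_vars; v != NULL; v = v->next)
    if (strcmp (v->name, name) == 0)
      return 1;
  return 0;
}
```

With this representation, the @code{context} alternative of the
data-type union would be a @code{struct variable *}.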
	2982
	2983	Taking action before a rule is completely recognized often leads to
	2984	conflicts since the parser must commit to a parse in order to execute the
	2985	action. For example, the following two rules, without mid-rule actions,
	2986	can coexist in a working parser because the parser can shift the open-brace
	2987	token and look at what follows before deciding whether there is a
	2988	declaration or not:
	2989
	2990	@example
	2991	@group
	2992	compound: '@{' declarations statements '@}'
	2993	        | '@{' statements '@}'
	2994	        ;
	2995	@end group
	2996	@end example
	2997
	2998	@noindent
	2999	But when we add a mid-rule action as follows, the rules become nonfunctional:
	3000
	3001	@example
	3002	@group
	3003	compound: @{ prepare_for_local_variables (); @}
	3004	          '@{' declarations statements '@}'
	3005	@end group
	3006	@group
	3007	        | '@{' statements '@}'
	3008	        ;
	3009	@end group
	3010	@end example
	3011
	3012	@noindent
	3013	Now the parser is forced to decide whether to run the mid-rule action
	3014	when it has read no farther than the open-brace. In other words, it
	3015	must commit to using one rule or the other, without sufficient
	3016	information to do it correctly. (The open-brace token is what is called
	3017	the @dfn{look-ahead} token at this time, since the parser is still
	3018	deciding what to do about it. @xref{Look-Ahead, ,Look-Ahead Tokens}.)
	3019
	3020	You might think that you could correct the problem by putting identical
	3021	actions into the two rules, like this:
	3022
	3023	@example
	3024	@group
	3025	compound: @{ prepare_for_local_variables (); @}
	3026	          '@{' declarations statements '@}'
	3027	        | @{ prepare_for_local_variables (); @}
	3028	          '@{' statements '@}'
	3029	        ;
	3030	@end group
	3031	@end example
	3032
	3033	@noindent
	3034	But this does not help, because Bison does not realize that the two actions
	3035	are identical. (Bison never tries to understand the C code in an action.)
	3036
	3037	If the grammar is such that a declaration can be distinguished from a
	3038	statement by the first token (which is true in C), then one solution which
	3039	does work is to put the action after the open-brace, like this:
	3040
	3041	@example
	3042	@group
	3043	compound: '@{' @{ prepare_for_local_variables (); @}
	3044	          declarations statements '@}'
	3045	        | '@{' statements '@}'
	3046	        ;
	3047	@end group
	3048	@end example
	3049
	3050	@noindent
	3051	Now the first token of the following declaration or statement,
	3052	which would in any case tell Bison which rule to use, can still do so.
	3053
	3054	Another solution is to bury the action inside a nonterminal symbol which
	3055	serves as a subroutine:
	3056
	3057	@example
	3058	@group
	3059	subroutine: /* empty */
	3060	          @{ prepare_for_local_variables (); @}
	3061	        ;
	3062
	3063	@end group
	3064
	3065	@group
	3066	compound: subroutine
	3067	          '@{' declarations statements '@}'
	3068	        | subroutine
	3069	          '@{' statements '@}'
	3070	        ;
	3071	@end group
	3072	@end example
	3073
	3074	@noindent
	3075	Now Bison can execute the action in the rule for @code{subroutine} without
	3076	deciding which rule for @code{compound} it will eventually use. Note that
	3077	the action is now at the end of its rule. Any mid-rule action can be
	3078	converted to an end-of-rule action in this way, and this is what Bison
	3079	actually does to implement mid-rule actions.
	3080
	3081	@node Locations
	3082	@section Tracking Locations
	3083	@cindex location
	3084	@cindex textual location
	3085	@cindex location, textual
	3086
	3087	Though grammar rules and semantic actions are enough to write a fully
	3088	functional parser, it can be useful to process some additional information,
	3089	especially symbol locations.
	3090
	3091	The way locations are handled is defined by providing a data type, and
	3092	actions to take when rules are matched.
	3093
	3094	@menu
	3095	* Location Type:: Specifying a data type for locations.
	3096	* Actions and Locations:: Using locations in actions.
	3097	* Location Default Action:: Defining a general way to compute locations.
	3098	@end menu
	3099
	3100	@node Location Type
	3101	@subsection Data Type of Locations
	3102	@cindex data type of locations
	3103	@cindex default location type
	3104
	3105	Defining a data type for locations is much simpler than for semantic values,
	3106	since all tokens and groupings always use the same type.
	3107
	3108	The type of locations is specified by defining a macro called @code{YYLTYPE}.
	3109	When @code{YYLTYPE} is not defined, Bison uses a default structure type with
	3110	four members:
	3111
	3112	@example
	3113	typedef struct YYLTYPE
	3114	@{
	3115	  int first_line;
	3116	  int first_column;
	3117	  int last_line;
	3118	  int last_column;
	3119	@} YYLTYPE;
	3120	@end example
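
As a sketch of a user-supplied location type, the following defines
@code{YYLTYPE} as a structure that also records a file name. The names
@code{struct file_location}, @code{file_name} and
@code{location_precedes} are hypothetical, and the four members of the
default type are kept on the assumption that other code may expect
them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical location type that also records a file name.  */
struct file_location
{
  const char *file_name;
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

#define YYLTYPE struct file_location

/* Return 1 if location A ends strictly before location B begins.
   Both are assumed to refer to the same file.  */
int
location_precedes (const YYLTYPE *a, const YYLTYPE *b)
{
  assert (a != NULL && b != NULL);
  return (a->last_line < b->first_line
          || (a->last_line == b->first_line
              && a->last_column < b->first_column));
}
```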
	3121
	3122	@node Actions and Locations
	3123	@subsection Actions and Locations
	3124	@cindex location actions
	3125	@cindex actions, location
	3126	@vindex @@$
	3127	@vindex @@@var{n}
	3128
	3129	Actions are not only useful for defining language semantics, but also for
	3130	describing the behavior of the output parser with locations.
	3131
	3132	The most obvious way to build locations of syntactic groupings is very
	3133	similar to the way semantic values are computed. In a given rule, several
	3134	constructs can be used to access the locations of the elements being matched.
	3135	The location of the @var{n}th component of the right hand side is
	3136	@code{@@@var{n}}, while the location of the left hand side grouping is
	3137	@code{@@$}.
	3138
	3139	Here is a basic example using the default data type for locations:
	3140
	3141	@example
	3142	@group
	3143	exp:    @dots{}
	3144	        | exp '/' exp
	3145	            @{
	3146	              @@$.first_column = @@1.first_column;
	3147	              @@$.first_line = @@1.first_line;
	3148	              @@$.last_column = @@3.last_column;
	3149	              @@$.last_line = @@3.last_line;
	3150	              if ($3)
	3151	                $$ = $1 / $3;
	3152	              else
	3153	                @{
	3154	                  $$ = 1;
	3155	                  fprintf (stderr,
	3156	                           "Division by zero, l%d,c%d-l%d,c%d",
	3157	                           @@3.first_line, @@3.first_column,
	3158	                           @@3.last_line, @@3.last_column);
	3159	                @}
	3160	            @}
	3161	@end group
	3162	@end example
	3163
	3164	As for semantic values, there is a default action for locations that is
	3165	run each time a rule is matched. It sets the beginning of @code{@@$} to the
	3166	beginning of the first symbol, and the end of @code{@@$} to the end of the
	3167	last symbol.
	3168
	3169	With this default action, the location tracking can be fully automatic. The
	3170	example above can simply be rewritten this way:
	3171
	3172	@example
	3173	@group
	3174	exp:    @dots{}
	3175	        | exp '/' exp
	3176	            @{
	3177	              if ($3)
	3178	                $$ = $1 / $3;
	3179	              else
	3180	                @{
	3181	                  $$ = 1;
	3182	                  fprintf (stderr,
	3183	                           "Division by zero, l%d,c%d-l%d,c%d",
	3184	                           @@3.first_line, @@3.first_column,
	3185	                           @@3.last_line, @@3.last_column);
	3186	                @}
	3187	            @}
	3188	@end group
	3189	@end example
	3190
	3191	@node Location Default Action
	3192	@subsection Default Action for Locations
	3193	@vindex YYLLOC_DEFAULT
	3194
	3195	Actually, actions are not the best place to compute locations. Since
	3196	locations are much more general than semantic values, there is room in
	3197	the output parser to redefine the default action to take for each
	3198	rule. The @code{YYLLOC_DEFAULT} macro is invoked each time a rule is
	3199	matched, before the associated action is run. It is also invoked
	3200	while processing a syntax error, to compute the error's location.
	3201
	3202	Most of the time, this macro is general enough to eliminate
	3203	location-specific code from semantic actions.
	3204
	3205	The @code{YYLLOC_DEFAULT} macro takes three parameters. The first one is
	3206	the location of the grouping (the result of the computation). When a
	3207	rule is matched, the second parameter is an array holding locations of
	3208	all right hand side elements of the rule being matched, and the third
	3209	parameter is the size of the rule's right hand side. When processing
	3210	a syntax error, the second parameter is an array holding locations of
	3211	the symbols that were discarded during error processing, and the third
	3212	parameter is the number of discarded symbols.
	3213
	3214	By default, @code{YYLLOC_DEFAULT} is defined this way for simple
	3215	@acronym{LALR}(1) parsers:
	3216
	3217	@example
	3218	@group
	3219	# define YYLLOC_DEFAULT(Current, Rhs, N) \
	3220	   ((Current).first_line = (Rhs)[1].first_line, \
	3221	    (Current).first_column = (Rhs)[1].first_column, \
	3222	    (Current).last_line = (Rhs)[N].last_line, \
	3223	    (Current).last_column = (Rhs)[N].last_column)
	3224	@end group
	3225	@end example
	3226
	3227	@noindent
	3228	and like this for @acronym{GLR} parsers:
	3229
	3230	@example
	3231	@group
	3232	# define YYLLOC_DEFAULT(yyCurrent, yyRhs, YYN) \
	3233	   ((yyCurrent).first_line = YYRHSLOC(yyRhs, 1).first_line, \
	3234	    (yyCurrent).first_column = YYRHSLOC(yyRhs, 1).first_column, \
	3235	    (yyCurrent).last_line = YYRHSLOC(yyRhs, YYN).last_line, \
	3236	    (yyCurrent).last_column = YYRHSLOC(yyRhs, YYN).last_column)
	3237	@end group
	3238	@end example
	3239
	3240	When defining @code{YYLLOC_DEFAULT}, you should consider that:
	3241
	3242	@itemize @bullet
	3243	@item
	3244	All arguments are free of side-effects. However, only the first one (the
	3245	result) should be modified by @code{YYLLOC_DEFAULT}.
	3246
	3247	@item
	3248	For consistency with semantic actions, valid indexes for the location
	3249	array range from 1 to @var{n}.
	3250
	3251	@item
	3252	Your macro should parenthesize its arguments, if need be, since the
	3253	actual arguments may not be surrounded by parentheses. Also, your
	3254	macro should expand to something that can be used as a single
	3255	statement when it is followed by a semicolon.
	3256	@end itemize
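
To see the indexing convention at work, the sketch below applies the
@acronym{LALR}(1) definition shown above to an ordinary C array whose
entry 0 is unused. The helper @code{grouping_location} is hypothetical,
and the sketch covers only rules with at least one component.

```c
#include <assert.h>
#include <stddef.h>

typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

/* The default definition shown above for LALR(1) parsers.  */
#define YYLLOC_DEFAULT(Current, Rhs, N)                 \
  ((Current).first_line   = (Rhs)[1].first_line,        \
   (Current).first_column = (Rhs)[1].first_column,      \
   (Current).last_line    = (Rhs)[N].last_line,         \
   (Current).last_column  = (Rhs)[N].last_column)

/* Compute the location of a grouping with N components.  RHS[1] to
   RHS[N] hold the component locations; RHS[0] is not used here.  */
YYLTYPE
grouping_location (const YYLTYPE *rhs, int n)
{
  YYLTYPE current;
  assert (rhs != NULL && n >= 1);
  YYLLOC_DEFAULT (current, rhs, n);
  return current;
}
```

The macro expands to a single parenthesized expression, so it can be
used as a statement when followed by a semicolon, as the last item
above requires.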
	3257
	3258	@node Declarations
	3259	@section Bison Declarations
	3260	@cindex declarations, Bison
	3261	@cindex Bison declarations
	3262
	3263	The @dfn{Bison declarations} section of a Bison grammar defines the symbols
	3264	used in formulating the grammar and the data types of semantic values.
	3265	@xref{Symbols}.
	3266
	3267	All token type names (but not single-character literal tokens such as
	3268	@code{'+'} and @code{'*'}) must be declared. Nonterminal symbols must be
	3269	declared if you need to specify which data type to use for the semantic
	3270	value (@pxref{Multiple Types, ,More Than One Value Type}).
	3271
	3272	The first rule in the file also specifies the start symbol, by default.
	3273	If you want some other symbol to be the start symbol, you must declare
	3274	it explicitly (@pxref{Language and Grammar, ,Languages and Context-Free
	3275	Grammars}).
	3276
	3277	@menu
	3278	* Token Decl:: Declaring terminal symbols.
	3279	* Precedence Decl:: Declaring terminals with precedence and associativity.
	3280	* Union Decl:: Declaring the set of all semantic value types.
	3281	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	3282	* Destructor Decl:: Declaring how symbols are freed.
	3283	* Expect Decl:: Suppressing warnings about parsing conflicts.
	3284	* Start Decl:: Specifying the start symbol.
	3285	* Pure Decl:: Requesting a reentrant parser.
	3286	* Decl Summary:: Table of all Bison declarations.
	3287	@end menu
	3288
	3289	@node Token Decl
	3290	@subsection Token Type Names
	3291	@cindex declaring token type names
	3292	@cindex token type names, declaring
	3293	@cindex declaring literal string tokens
	3294	@findex %token
	3295
	3296	The basic way to declare a token type name (terminal symbol) is as follows:
	3297
	3298	@example
	3299	%token @var{name}
	3300	@end example
	3301
	3302	Bison will convert this into a @code{#define} directive in
	3303	the parser, so that the function @code{yylex} (if it is in this file)
	3304	can use the name @var{name} to stand for this token type's code.
	3305
	3306	Alternatively, you can use @code{%left}, @code{%right}, or
	3307	@code{%nonassoc} instead of @code{%token}, if you wish to specify
	3308	associativity and precedence. @xref{Precedence Decl, ,Operator
	3309	Precedence}.
	3310
	3311	You can explicitly specify the numeric code for a token type by appending
	3312	a decimal or hexadecimal integer value in the field immediately
	3313	following the token name:
	3314
	3315	@example
	3316	%token NUM 300
	3317	%token XNUM 0x12d // a GNU extension
	3318	@end example
	3319
	3320	@noindent
	3321	It is generally best, however, to let Bison choose the numeric codes for
	3322	all token types. Bison will automatically select codes that don't conflict
	3323	with each other or with normal characters.
	3324
	3325	In the event that the stack type is a union, you must augment the
	3326	@code{%token} or other token declaration to include the data type
	3327	alternative delimited by angle-brackets (@pxref{Multiple Types, ,More
	3328	Than One Value Type}).
	3329
	3330	For example:
	3331
	3332	@example
	3333	@group
	3334	%union @{ /* define stack type */
	3335	double val;
	3336	symrec *tptr;
	3337	@}
	3338	%token <val> NUM /* define token NUM and its type */
	3339	@end group
	3340	@end example
	3341
	3342	You can associate a literal string token with a token type name by
	3343	writing the literal string at the end of a @code{%token}
	3344	declaration which declares the name. For example:
	3345
	3346	@example
	3347	%token arrow "=>"
	3348	@end example
	3349
	3350	@noindent
	3351	For example, a grammar for the C language might specify these names with
	3352	equivalent literal string tokens:
	3353
	3354	@example
	3355	%token <operator> OR "||"
	3356	%token <operator> LE 134 "<="
	3357	%left OR "<="
	3358	@end example
	3359
	3360	@noindent
	3361	Once you equate the literal string and the token name, you can use them
	3362	interchangeably in further declarations or the grammar rules. The
	3363	@code{yylex} function can use the token name or the literal string to
	3364	obtain the token type code number (@pxref{Calling Convention}).
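		
		As a brief sketch of this interchangeability, the following fragment
		uses the token declared above under both spellings. The nonterminal
		names @code{mapping}, @code{pair}, @code{key} and @code{value} are
		hypothetical, chosen only for illustration, and the rules for
		@code{key} and @code{value} are omitted:
		
		@example
		@group
		%token arrow "=>"
		%%
		mapping: key "=>" value ;   /* literal string form */
		pair:    key arrow value ;  /* symbolic name, same token */
		@end group
		@end example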
	3365
	3366	@node Precedence Decl
	3367	@subsection Operator Precedence
	3368	@cindex precedence declarations
	3369	@cindex declaring operator precedence
	3370	@cindex operator precedence, declaring
	3371
	3372	Use the @code{%left}, @code{%right} or @code{%nonassoc} declaration to
	3373	declare a token and specify its precedence and associativity, all at
	3374	once. These are called @dfn{precedence declarations}.
	3375	@xref{Precedence, ,Operator Precedence}, for general information on
	3376	operator precedence.
	3377
	3378	The syntax of a precedence declaration is the same as that of
	3379	@code{%token}: either
	3380
	3381	@example
	3382	%left @var{symbols}@dots{}
	3383	@end example
	3384
	3385	@noindent
	3386	or
	3387
	3388	@example
	3389	%left <@var{type}> @var{symbols}@dots{}
	3390	@end example
	3391
	3392	And indeed any of these declarations serves the purposes of @code{%token}.
	3393	But in addition, they specify the associativity and relative precedence for
	3394	all the @var{symbols}:
	3395
	3396	@itemize @bullet
	3397	@item
	3398	The associativity of an operator @var{op} determines how repeated uses
	3399	of the operator nest: whether @samp{@var{x} @var{op} @var{y} @var{op}
	3400	@var{z}} is parsed by grouping @var{x} with @var{y} first or by
	3401	grouping @var{y} with @var{z} first. @code{%left} specifies
	3402	left-associativity (grouping @var{x} with @var{y} first) and
	3403	@code{%right} specifies right-associativity (grouping @var{y} with
	3404	@var{z} first). @code{%nonassoc} specifies no associativity, which
	3405	means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is
	3406	considered a syntax error.
	3407
	3408	@item
	3409	The precedence of an operator determines how it nests with other operators.
	3410	All the tokens declared in a single precedence declaration have equal
	3411	precedence and nest together according to their associativity.
	3412	When two tokens declared in different precedence declarations associate,
	3413	the one declared later has the higher precedence and is grouped first.
	3414	@end itemize
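		
		As an illustration, here is a hedged sketch of the precedence
		declarations for a small calculator grammar. The choice of operators
		is an assumption made for this example, and it presumes that the rules
		are written in the ambiguous form @samp{exp: exp '+' exp}:
		
		@example
		@group
		%nonassoc '<' '>'  /* lowest precedence; a < b < c is an error */
		%left '-' '+'
		%left '*' '/'
		%right '^'         /* highest precedence; groups right to left */
		@end group
		@end example
		
		@noindent
		With these declarations, @samp{a - b * c} groups as @samp{a - (b * c)}
		because @code{'*'} is declared later than @code{'-'}, while
		@samp{a - b - c} groups as @samp{(a - b) - c} because @code{'-'} is
		left-associative.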
	3415
	3416	@node Union Decl
	3417	@subsection The Collection of Value Types
	3418	@cindex declaring value types
	3419	@cindex value types, declaring
	3420	@findex %union
	3421
	3422	The @code{%union} declaration specifies the entire collection of possible
	3423	data types for semantic values. The keyword @code{%union} is followed by a
	3424	pair of braces containing the same thing that goes inside a @code{union} in
	3425	C.
	3426
	3427	For example:
	3428
	3429	@example
	3430	@group
	3431	%union @{
	3432	double val;
	3433	symrec *tptr;
	3434	@}
	3435	@end group
	3436	@end example
	3437
	3438	@noindent
	3439	This says that the two alternative types are @code{double} and @code{symrec
	3440	*}. They are given names @code{val} and @code{tptr}; these names are used
	3441	in the @code{%token} and @code{%type} declarations to pick one of the types
	3442	for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
	3443
	3444	As an extension to @acronym{POSIX}, a tag is allowed after the
	3445	@code{union}. For example:
	3446
	3447	@example
	3448	@group
	3449	%union value @{
	3450	double val;
	3451	symrec *tptr;
	3452	@}
	3453	@end group
	3454	@end example
	3455
	3456	specifies the union tag @code{value}, so the corresponding C type is
	3457	@code{union value}. If you do not specify a tag, it defaults to
	3458	@code{YYSTYPE}.
	3459
	3460	Note that, unlike making a @code{union} declaration in C, you need not write
	3461	a semicolon after the closing brace.
	3462
	3463	@node Type Decl
	3464	@subsection Nonterminal Symbols
	3465	@cindex declaring value types, nonterminals
	3466	@cindex value types, nonterminals, declaring
	3467	@findex %type
	3468
	3469	@noindent
	3470	When you use @code{%union} to specify multiple value types, you must
	3471	declare the value type of each nonterminal symbol for which values are
	3472	used. This is done with a @code{%type} declaration, like this:
	3473
	3474	@example
	3475	%type <@var{type}> @var{nonterminal}@dots{}
	3476	@end example
	3477
	3478	@noindent
	3479	Here @var{nonterminal} is the name of a nonterminal symbol, and
	3480	@var{type} is the name given in the @code{%union} to the alternative
	3481	that you want (@pxref{Union Decl, ,The Collection of Value Types}). You
	3482	can give any number of nonterminal symbols in the same @code{%type}
	3483	declaration, if they have the same value type. Use spaces to separate
	3484	the symbol names.
	3485
	3486	You can also declare the value type of a terminal symbol. To do this,
	3487	use the same @code{<@var{type}>} construction in a declaration for the
	3488	terminal symbol. All kinds of token declarations allow
	3489	@code{<@var{type}>}.
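		
		Putting these declarations together, a minimal sketch might look as
		follows. The union and the token @code{NUM} are taken from the
		examples above; the names @code{VAR}, @code{exp} and @code{term} are
		assumptions made for illustration:
		
		@example
		@group
		%union @{
		  double val;
		  symrec *tptr;
		@}
		%token <val> NUM       /* terminal carrying a double */
		%token <tptr> VAR      /* terminal carrying a symrec pointer */
		%type <val> exp term   /* two nonterminals sharing one type */
		@end group
		@end example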
	3490
	3491	@node Destructor Decl
	3492	@subsection Freeing Discarded Symbols
	3493	@cindex freeing discarded symbols
	3494	@findex %destructor
	3495
	3496	Some symbols can be discarded by the parser, typically during error
	3497	recovery (@pxref{Error Recovery}). During error recovery, offending
	3498	symbols already pushed on the stack and offending tokens coming from
	3499	the rest of the file are thrown away until the parser can resume
	3500	normal operation. If these symbols carry heap-allocated data, that
	3501	memory is lost. While this behavior is tolerable for batch parsers,
	3502	such as in compilers, it is unacceptable for parsers that can
	3503	possibly ``never end'', such as shells or implementations of
	3504	communication protocols.
	3505
	3506	The @code{%destructor} directive allows for the definition of code that
	3507	is called when a symbol is thrown away.
	3508
	3509	@deffn {Directive} %destructor @{ @var{code} @} @var{symbols}
	3510	@findex %destructor
	3511	Declare that the @var{code} must be invoked for each of the
	3512	@var{symbols} that will be discarded by the parser. The @var{code}
	3513	should use @code{$$} to designate the semantic value associated with the
	3514	@var{symbols}. The additional parser parameters are also available
	3515	(@pxref{Parser Function, , The Parser Function @code{yyparse}}).
	3516
	3517	@strong{Warning:} as of Bison 1.875, this feature is still considered
	3518	experimental, as there has not been enough user feedback. In particular,
	3519	the syntax might still change.
	3520	@end deffn
	3521
	3522	For instance:
	3523
	3524	@smallexample
	3525	%union
	3526	@{
	3527	char *string;
	3528	@}
	3529	%token <string> STRING
	3530	%type <string> string
	3531	%destructor @{ free ($$); @} STRING string
	3532	@end smallexample
	3533
	3534	@noindent
	3535	guarantees that when a @code{STRING} or a @code{string} is discarded,
	3536	its associated memory is freed.
	3537
	3538	Note that in the future, Bison might also consider that right-hand side
	3539	members that are not mentioned in the action can be destroyed. For
	3540	instance, in:
	3541
	3542	@smallexample
	3543	comment: "/*" STRING "*/";
	3544	@end smallexample
	3545
	3546	@noindent
	3547	the parser is entitled to destroy the semantic value of the
	3548	@code{STRING}. Of course, this will not apply to the default action;
	3549	compare:
	3550
	3551	@smallexample
	3552	typeless: string; // $$ = $1 does not apply; $1 is destroyed.
	3553	typefull: string; // $$ = $1 applies, $1 is not destroyed.
	3554	@end smallexample
	3555
	3556	@node Expect Decl
	3557	@subsection Suppressing Conflict Warnings
	3558	@cindex suppressing conflict warnings
	3559	@cindex preventing warnings about conflicts
	3560	@cindex warnings, preventing
	3561	@cindex conflicts, suppressing warnings of
	3562	@findex %expect
	3563	@findex %expect-rr
	3564
	3565	Bison normally warns if there are any conflicts in the grammar
	3566	(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars
	3567	have harmless shift/reduce conflicts which are resolved in a predictable
	3568	way and would be difficult to eliminate. It is desirable to suppress
	3569	the warning about these conflicts unless the number of conflicts
	3570	changes. You can do this with the @code{%expect} declaration.
	3571
	3572	The declaration looks like this:
	3573
	3574	@example
	3575	%expect @var{n}
	3576	@end example
	3577
	3578	Here @var{n} is a decimal integer. The declaration says there should be
	3579	no warning if there are @var{n} shift/reduce conflicts and no
	3580	reduce/reduce conflicts. The usual warning is
	3581	given if there are either more or fewer conflicts, or if there are any
	3582	reduce/reduce conflicts.
	3583
	3584	For normal @acronym{LALR}(1) parsers, reduce/reduce conflicts are more
	3585	serious, and should be eliminated entirely. Bison will always report
	3586	reduce/reduce conflicts for these parsers. With @acronym{GLR} parsers,
	3587	however, both shift/reduce and reduce/reduce conflicts are routine
	3588	(otherwise, there would be no need to use @acronym{GLR} parsing).
	3589	Therefore, it is also possible to specify an expected number of
	3590	reduce/reduce conflicts in @acronym{GLR} parsers, using the declaration:
	3591
	3592	@example
	3593	%expect-rr @var{n}
	3594	@end example
	3595
	3596	In general, using @code{%expect} involves these steps:
	3597
	3598	@itemize @bullet
	3599	@item
	3600	Compile your grammar without @code{%expect}. Use the @samp{-v} option
	3601	to get a verbose list of where the conflicts occur. Bison will also
	3602	print the number of conflicts.
	3603
	3604	@item
	3605	Check each of the conflicts to make sure that Bison's default
	3606	resolution is what you really want. If not, rewrite the grammar and
	3607	go back to the beginning.
	3608
	3609	@item
	3610	Add an @code{%expect} declaration, copying the number @var{n} from the
	3611	number which Bison printed.
	3612	@end itemize
	3613
	3614	Now Bison will stop annoying you if you do not change the number of
	3615	conflicts, but it will warn you again if changes in the grammar result
	3616	in more or fewer conflicts.
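		
		As a sketch of the end result, consider the classic ``dangling
		@code{else}'' grammar below. The token names are hypothetical and the
		grammar is deliberately minimal. It is expected to produce exactly one
		shift/reduce conflict, which Bison resolves by shifting @code{ELSE},
		so the declaration @code{%expect 1} should silence the warning:
		
		@example
		@group
		%expect 1
		%token IF ELSE EXPR OTHER
		%%
		stmt: IF EXPR stmt
		    | IF EXPR stmt ELSE stmt
		    | OTHER
		    ;
		@end group
		@end example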
	3617
	3618	@node Start Decl
	3619	@subsection The Start-Symbol
	3620	@cindex declaring the start symbol
	3621	@cindex start symbol, declaring
	3622	@cindex default start symbol
	3623	@findex %start
	3624
	3625	Bison assumes by default that the start symbol for the grammar is the first
	3626	nonterminal specified in the grammar specification section. The programmer
	3627	may override this default with the @code{%start} declaration as follows:
	3628
	3629	@example
	3630	%start @var{symbol}
	3631	@end example
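		
		For instance, in the following sketch (all symbol names are
		assumptions made for illustration), @code{declaration} is the first
		nonterminal in the rules section, but @code{%start} makes
		@code{program} the start symbol instead:
		
		@example
		@group
		%token TYPE NAME
		%start program
		%%
		declaration: TYPE NAME ';' ;
		program: /* empty */
		       | program declaration
		       ;
		@end group
		@end example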
	3632
	3633	@node Pure Decl
	3634	@subsection A Pure (Reentrant) Parser
	3635	@cindex reentrant parser
	3636	@cindex pure parser
	3637	@findex %pure-parser
	3638
	3639	A @dfn{reentrant} program is one which does not alter in the course of
	3640	execution; in other words, it consists entirely of @dfn{pure} (read-only)
	3641	code. Reentrancy is important whenever asynchronous execution is possible;
	3642	for example, a non-reentrant program may not be safe to call from a signal
	3643	handler. In systems with multiple threads of control, a non-reentrant
	3644	program must be called only within interlocks.
	3645
	3646	Normally, Bison generates a parser which is not reentrant. This is
	3647	suitable for most uses, and it permits compatibility with Yacc. (The
	3648	standard Yacc interfaces are inherently nonreentrant, because they use
	3649	statically allocated variables for communication with @code{yylex},
	3650	including @code{yylval} and @code{yylloc}.)
	3651
	3652	Alternatively, you can generate a pure, reentrant parser. The Bison
	3653	declaration @code{%pure-parser} says that you want the parser to be
	3654	reentrant. It looks like this:
	3655
	3656	@example
	3657	%pure-parser
	3658	@end example
	3659
	3660	The result is that the communication variables @code{yylval} and
	3661	@code{yylloc} become local variables in @code{yyparse}, and a different
	3662	calling convention is used for the lexical analyzer function
	3663	@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
	3664	Parsers}, for the details of this. The variable @code{yynerrs} also
	3665	becomes local in @code{yyparse} (@pxref{Error Reporting, ,The Error
	3666	Reporting Function @code{yyerror}}). The convention for calling
	3667	@code{yyparse} itself is unchanged.
	3668
	3669	Whether the parser is pure has nothing to do with the grammar rules.
	3670	You can generate either a pure parser or a nonreentrant parser from any
	3671	valid grammar.
	3672
	3673	@node Decl Summary
	3674	@subsection Bison Declaration Summary
	3675	@cindex Bison declaration summary
	3676	@cindex declaration summary
	3677	@cindex summary, Bison declaration
	3678
	3679	Here is a summary of the declarations used to define a grammar:
	3680
	3681	@deffn {Directive} %union
	3682	Declare the collection of data types that semantic values may have
	3683	(@pxref{Union Decl, ,The Collection of Value Types}).
	3684	@end deffn
	3685
	3686	@deffn {Directive} %token
	3687	Declare a terminal symbol (token type name) with no precedence
	3688	or associativity specified (@pxref{Token Decl, ,Token Type Names}).
	3689	@end deffn
	3690
	3691	@deffn {Directive} %right
	3692	Declare a terminal symbol (token type name) that is right-associative
	3693	(@pxref{Precedence Decl, ,Operator Precedence}).
	3694	@end deffn
	3695
	3696	@deffn {Directive} %left
	3697	Declare a terminal symbol (token type name) that is left-associative
	3698	(@pxref{Precedence Decl, ,Operator Precedence}).
	3699	@end deffn
	3700
	3701	@deffn {Directive} %nonassoc
	3702	Declare a terminal symbol (token type name) that is nonassociative
	3703	(@pxref{Precedence Decl, ,Operator Precedence}).
	3704	Using it in a way that would be associative is a syntax error.
	3705	@end deffn
	3706
	3707	@ifset defaultprec
	3708	@deffn {Directive} %default-prec
	3709	Assign a precedence to rules lacking an explicit @code{%prec} modifier
	3710	(@pxref{Contextual Precedence, ,Context-Dependent Precedence}).
	3711	@end deffn
	3712	@end ifset
	3713
	3714	@deffn {Directive} %type
	3715	Declare the type of semantic values for a nonterminal symbol
	3716	(@pxref{Type Decl, ,Nonterminal Symbols}).
	3717	@end deffn
	3718
	3719	@deffn {Directive} %start
	3720	Specify the grammar's start symbol (@pxref{Start Decl, ,The
	3721	Start-Symbol}).
	3722	@end deffn
	3723
	3724	@deffn {Directive} %expect
	3725	Declare the expected number of shift-reduce conflicts
	3726	(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
	3727	@end deffn
	3728
	3729
	3730	@sp 1
	3731	@noindent
	3732	In order to change the behavior of @command{bison}, use the following
	3733	directives:
	3734
	3735	@deffn {Directive} %debug
	3736	In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
	3737	already defined, so that the debugging facilities are compiled.
	3738	@xref{Tracing, ,Tracing Your Parser}.
	3739	@end deffn
	3740
	3741	@deffn {Directive} %defines
	3742	Write an extra output file containing macro definitions for the token
	3743	type names defined in the grammar and the semantic value type
	3744	@code{YYSTYPE}, as well as a few @code{extern} variable declarations.
	3745
	3746	If the parser output file is named @file{@var{name}.c} then this file
	3747	is named @file{@var{name}.h}.
	3748
	3749	This output file is essential if you wish to put the definition of
	3750	@code{yylex} in a separate source file, because @code{yylex} needs to
	3751	be able to refer to token type codes and the variable
	3752	@code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}.
	3753	@end deffn
	3754
	3755	@deffn {Directive} %destructor
	3756	Specify how the parser should reclaim the memory associated with
	3757	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	3758	@end deffn
	3759
	3760	@deffn {Directive} %file-prefix="@var{prefix}"
	3761	Specify a prefix to use for all Bison output file names. The names are
	3762	chosen as if the input file were named @file{@var{prefix}.y}.
	3763	@end deffn
	3764
	3765	@deffn {Directive} %locations
	3766	Generate the code processing the locations (@pxref{Action Features,
	3767	,Special Features for Use in Actions}). This mode is enabled as soon as
	3768	the grammar uses the special @samp{@@@var{n}} tokens, but if your
	3769	grammar does not use them, using @samp{%locations} allows for more
	3770	accurate syntax error messages.
	3771	@end deffn
	3772
	3773	@deffn {Directive} %name-prefix="@var{prefix}"
	3774	Rename the external symbols used in the parser so that they start with
	3775	@var{prefix} instead of @samp{yy}. The precise list of symbols renamed
	3776	is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
	3777	@code{yylval}, @code{yychar}, @code{yydebug}, and (if locations are
	3778	used) @code{yylloc}. For example, if you use
	3779	@samp{%name-prefix="c_"}, the names become @code{c_parse}, @code{c_lex},
	3780	and so on. @xref{Multiple Parsers, ,Multiple Parsers in the Same
	3781	Program}.
	3782	@end deffn
	3783
	3784	@ifset defaultprec
	3785	@deffn {Directive} %no-default-prec
	3786	Do not assign a precedence to rules lacking an explicit @code{%prec}
	3787	modifier (@pxref{Contextual Precedence, ,Context-Dependent
	3788	Precedence}).
	3789	@end deffn
	3790	@end ifset
	3791
	3792	@deffn {Directive} %no-parser
	3793	Do not include any C code in the parser file; generate tables only. The
	3794	parser file contains just @code{#define} directives and static variable
	3795	declarations.
	3796
	3797	This option also tells Bison to write the C code for the grammar actions
	3798	into a file named @file{@var{filename}.act}, in the form of a
	3799	brace-surrounded body fit for a @code{switch} statement.
	3800	@end deffn
	3801
	3802	@deffn {Directive} %no-lines
	3803	Don't generate any @code{#line} preprocessor commands in the parser
	3804	file. Ordinarily Bison writes these commands in the parser file so that
	3805	the C compiler and debuggers will associate errors and object code with
	3806	your source file (the grammar file). This directive causes them to
	3807	associate errors with the parser file, treating it as an independent source
	3808	file in its own right.
	3809	@end deffn
	3810
	3811	@deffn {Directive} %output="@var{filename}"
	3812	Specify the @var{filename} for the parser file.
	3813	@end deffn
	3814
	3815	@deffn {Directive} %pure-parser
	3816	Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure
	3817	(Reentrant) Parser}).
	3818	@end deffn
	3819
	3820	@deffn {Directive} %token-table
	3821	Generate an array of token names in the parser file. The name of the
	3822	array is @code{yytname}; @code{yytname[@var{i}]} is the name of the
	3823	token whose internal Bison token code number is @var{i}. The first
	3824	three elements of @code{yytname} correspond to the predefined tokens
	3825	@code{"$end"},
	3826	@code{"error"}, and @code{"$undefined"}; after these come the symbols
	3827	defined in the grammar file.
	3828
	3829	For single-character literal tokens and literal string tokens, the name
	3830	in the table includes the single-quote or double-quote characters: for
	3831	example, @code{"'+'"} is a single-character literal and @code{"\"<=\""}
	3832	is a literal string token. All the characters of the literal string
	3833	token appear verbatim in the string found in the table; even
	3834	double-quote characters are not escaped. For example, if the token
	3835	consists of three characters @samp{*"*}, its string in @code{yytname}
	3836	contains @samp{"*"*"}. (In C, that would be written as
	3837	@code{"\"*\"*\""}).
	3838
	3839	When you specify @code{%token-table}, Bison also generates macro
	3840	definitions for macros @code{YYNTOKENS}, @code{YYNNTS},
	3841	@code{YYNRULES}, and @code{YYNSTATES}:
	3842
	3843	@table @code
	3844	@item YYNTOKENS
	3845	The highest token number, plus one.
	3846	@item YYNNTS
	3847	The number of nonterminal symbols.
	3848	@item YYNRULES
	3849	The number of grammar rules.
	3850	@item YYNSTATES
	3851	The number of parser states (@pxref{Parser States}).
	3852	@end table
	3853	@end deffn
	3854
	3855	@deffn {Directive} %verbose
	3856	Write an extra output file containing verbose descriptions of the
	3857	parser states and what is done for each type of look-ahead token in
	3858	that state. @xref{Understanding, , Understanding Your Parser}, for more
	3859	information.
	3860	@end deffn
	3861
	3862	@deffn {Directive} %yacc
	3863	Pretend the option @option{--yacc} was given, i.e., imitate Yacc,
	3864	including its naming conventions. @xref{Bison Options}, for more.
	3865	@end deffn
	3866
	3867
	3868	@node Multiple Parsers
	3869	@section Multiple Parsers in the Same Program
	3870
	3871	Most programs that use Bison parse only one language and therefore contain
	3872	only one Bison parser. But what if you want to parse more than one
	3873	language with the same program? Then you need to avoid a name conflict
	3874	between different definitions of @code{yyparse}, @code{yylval}, and so on.
	3875
	3876	The easy way to do this is to use the option @samp{-p @var{prefix}}
	3877	(@pxref{Invocation, ,Invoking Bison}). This renames the interface
	3878	functions and variables of the Bison parser to start with @var{prefix}
	3879	instead of @samp{yy}. You can use this to give each parser distinct
	3880	names that do not conflict.
	3881
	3882	The precise list of symbols renamed is @code{yyparse}, @code{yylex},
	3883	@code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yylloc},
	3884	@code{yychar} and @code{yydebug}. For example, if you use @samp{-p c},
	3885	the names become @code{cparse}, @code{clex}, and so on.
	3886
	3887	@strong{All the other variables and macros associated with Bison are not
	3888	renamed.} These others are not global; there is no conflict if the same
	3889	name is used in different parsers. For example, @code{YYSTYPE} is not
	3890	renamed, but defining this in different ways in different parsers causes
	3891	no trouble (@pxref{Value Type, ,Data Types of Semantic Values}).
	3892
	3893	The @samp{-p} option works by adding macro definitions to the beginning
	3894	of the parser source file, defining @code{yyparse} as
	3895	@code{@var{prefix}parse}, and so on. This effectively substitutes one
	3896	name for the other in the entire parser file.
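		
		For @samp{-p c}, the definitions added at the top of the parser file
		would be roughly of the following form. This is a sketch only; the
		exact set and layout depend on the Bison version and skeleton in use:
		
		@example
		@group
		#define yyparse cparse
		#define yylex   clex
		#define yyerror cerror
		#define yylval  clval
		#define yychar  cchar
		#define yydebug cdebug
		#define yynerrs cnerrs
		#define yylloc  clloc
		@end group
		@end example
		
		@noindent
		Code outside the parser file does not see these macros, so it must
		refer to the renamed symbols directly, for example by calling
		@code{cparse ()} rather than @code{yyparse ()}.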
	3897
	3898	@node Interface
	3899	@chapter Parser C-Language Interface
	3900	@cindex C-language interface
	3901	@cindex interface
	3902
	3903	The Bison parser is actually a C function named @code{yyparse}. Here we
	3904	describe the interface conventions of @code{yyparse} and the other
	3905	functions that it needs to use.
	3906
	3907	Keep in mind that the parser uses many C identifiers starting with
	3908	@samp{yy} and @samp{YY} for internal purposes. If you use such an
	3909	identifier (aside from those in this manual) in an action or in the epilogue
	3910	in the grammar file, you are likely to run into trouble.
	3911
	3912	@menu
	3913	* Parser Function:: How to call @code{yyparse} and what it returns.
	3914	* Lexical:: You must supply a function @code{yylex}
	3915	which reads tokens.
	3916	* Error Reporting:: You must supply a function @code{yyerror}.
	3917	* Action Features:: Special features for use in actions.
	3918	@end menu
	3919
	3920	@node Parser Function
	3921	@section The Parser Function @code{yyparse}
	3922	@findex yyparse
	3923
	3924	You call the function @code{yyparse} to cause parsing to occur. This
	3925	function reads tokens, executes actions, and ultimately returns when it
	3926	encounters end-of-input or an unrecoverable syntax error. You can also
	3927	write an action which directs @code{yyparse} to return immediately
	3928	without reading further.
	3929
	3930
	3931	@deftypefun int yyparse (void)
	3932	The value returned by @code{yyparse} is 0 if parsing was successful (return
	3933	is due to end-of-input).
	3934
	3935	The value is 1 if parsing failed (return is due to a syntax error).
	3936	@end deftypefun
	3937
	3938	In an action, you can cause immediate return from @code{yyparse} by using
	3939	these macros:
	3940
	3941	@defmac YYACCEPT
	3942	@findex YYACCEPT
	3943	Return immediately with value 0 (to report success).
	3944	@end defmac
	3945
	3946	@defmac YYABORT
	3947	@findex YYABORT
	3948	Return immediately with value 1 (to report failure).
	3949	@end defmac
	3950
	3951	If you use a reentrant parser, you can optionally pass additional
	3952	parameter information to it in a reentrant way. To do so, use the
	3953	declaration @code{%parse-param}:
	3954
	3955	@deffn {Directive} %parse-param @{@var{argument-declaration}@}
	3956	@findex %parse-param
	3957	Declare that an argument declared by @var{argument-declaration} is an
	3958	additional @code{yyparse} argument.
	3959	The @var{argument-declaration} is used when declaring
	3960	functions or prototypes. The last identifier in
	3961	@var{argument-declaration} must be the argument name.
	3962	@end deffn
	3963
	3964	Here's an example. Write this in the parser:
	3965
	3966	@example
	3967	%parse-param @{int *nastiness@}
	3968	%parse-param @{int *randomness@}
	3969	@end example
	3970
	3971	@noindent
	3972	Then call the parser like this:
	3973
	3974	@example
	3975	@{
	3976	int nastiness, randomness;
	3977	@dots{} /* @r{Store proper data in @code{nastiness} and @code{randomness}.} */
	3978	value = yyparse (&nastiness, &randomness);
	3979	@dots{}
	3980	@}
	3981	@end example
	3982
	3983	@noindent
	3984	In the grammar actions, use expressions like this to refer to the data:
	3985
	3986	@example
	3987	exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @}
	3988	@end example
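		
		With the two @code{%parse-param} declarations above, the generated
		parser function is expected to have a shape like the following. This
		is a sketch inferred from the call shown earlier, not a declaration
		copied from Bison's output:
		
		@example
		int yyparse (int *nastiness, int *randomness);
		@end example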
	3989
	3990
	3991	@node Lexical
	3992	@section The Lexical Analyzer Function @code{yylex}
	3993	@findex yylex
	3994	@cindex lexical analyzer
	3995
	3996	The @dfn{lexical analyzer} function, @code{yylex}, recognizes tokens from
	3997	the input stream and returns them to the parser. Bison does not create
	3998	this function automatically; you must write it so that @code{yyparse} can
	3999	call it. The function is sometimes referred to as a lexical scanner.
	4000
	4001	In simple programs, @code{yylex} is often defined at the end of the Bison
	4002	grammar file. If @code{yylex} is defined in a separate source file, you
	4003	need to arrange for the token-type macro definitions to be available there.
	4004	To do this, use the @samp{-d} option when you run Bison, so that it will
	4005	write these macro definitions into a separate header file
	4006	@file{@var{name}.tab.h} which you can include in the other source files
	4007	that need it. @xref{Invocation, ,Invoking Bison}.
	4008
	4009	@menu
	4010	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	4011	* Token Values:: How @code{yylex} must return the semantic value
	4012	of the token it has read.
	4013	* Token Locations:: How @code{yylex} must return the text location
	4014	(line number, etc.) of the token, if the
	4015	actions want that.
	4016	* Pure Calling:: How the calling convention differs
	4017	in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	4018	@end menu
	4019
	4020	@node Calling Convention
	4021	@subsection Calling Convention for @code{yylex}
	4022
	4023	The value that @code{yylex} returns must be the positive numeric code
	4024	for the type of token it has just found; a zero or negative value
	4025	signifies end-of-input.
	4026
	4027	When a token is referred to in the grammar rules by a name, that name
	4028	in the parser file becomes a C macro whose definition is the proper
	4029	numeric code for that token type. So @code{yylex} can use the name
	4030	to indicate that type. @xref{Symbols}.
	4031
	4032	When a token is referred to in the grammar rules by a character literal,
	4033	the numeric code for that character is also the code for the token type.
	4034	So @code{yylex} can simply return that character code, possibly converted
	4035	to @code{unsigned char} to avoid sign-extension. The null character
	4036	must not be used this way, because its code is zero and that
	4037	signifies end-of-input.
	4038
	4039	Here is an example showing these things:
	4040
	4041	@example
	4042	int
	4043	yylex (void)
	4044	@{
	4045	@dots{}
	4046	if (c == EOF) /* Detect end-of-input. */
	4047	return 0;
	4048	@dots{}
	4049	if (c == '+' || c == '-')
	4050	return c; /* Assume token type for `+' is '+'. */
	4051	@dots{}
	4052	return INT; /* Return the type of the token. */
	4053	@dots{}
	4054	@}
	4055	@end example
	4056
	4057	@noindent
	4058	This interface has been designed so that the output from the @code{lex}
	4059	utility can be used without change as the definition of @code{yylex}.
	4060
	4061	If the grammar uses literal string tokens, there are two ways that
	4062	@code{yylex} can determine the token type codes for them:
	4063
	4064	@itemize @bullet
	4065	@item
	4066	If the grammar defines symbolic token names as aliases for the
	4067	literal string tokens, @code{yylex} can use these symbolic names like
	4068	all others. In this case, the use of the literal string tokens in
	4069	the grammar file has no effect on @code{yylex}.
	4070
	4071	@item
	4072	@code{yylex} can find the multicharacter token in the @code{yytname}
	4073	table. The index of the token in the table is the token type's code.
	4074	The name of a multicharacter token is recorded in @code{yytname} with a
	4075	double-quote, the token's characters, and another double-quote. The
	4076	token's characters are not escaped in any way; they appear verbatim in
	4077	the contents of the string in the table.
	4078
	4079	Here's code for looking up a token in @code{yytname}, assuming that the
	4080	characters of the token are stored in @code{token_buffer}.
	4081
	4082	@smallexample
	4083	for (i = 0; i < YYNTOKENS; i++)
	4084	@{
	4085	if (yytname[i] != 0
	4086	&& yytname[i][0] == '"'
	4087	&& ! strncmp (yytname[i] + 1, token_buffer,
	4088	strlen (token_buffer))
	4089	&& yytname[i][strlen (token_buffer) + 1] == '"'
	4090	&& yytname[i][strlen (token_buffer) + 2] == 0)
	4091	break;
	4092	@}
	4093	@end smallexample
	4094
	4095	The @code{yytname} table is generated only if you use the
	4096	@code{%token-table} declaration. @xref{Decl Summary}.
	4097	@end itemize
	4098
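The lookup loop above can be wrapped in a function for experimenting outside a generated parser. The following is only a sketch: @code{find_string_token} is a hypothetical helper, and it takes the table and its size as arguments so that it can be tried on a hand-written table laid out like @code{yytname}.

```c
#include <assert.h>
#include <string.h>

/* Return the index of the literal string token TEXT in NAMES (a table
   laid out like yytname, with COUNT entries), or -1 if it is absent.
   An entry matches when it is a double-quote, the characters of TEXT,
   and a closing double-quote, with nothing after it.  */
int
find_string_token (const char *const *names, int count, const char *text)
{
  assert (names != NULL && text != NULL);
  size_t len = strlen (text);

  for (int i = 0; i < count; i++)
    {
      const char *name = names[i];
      if (name != NULL
          && name[0] == '"'
          && strncmp (name + 1, text, len) == 0
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```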
	4099	@node Token Values
	4100	@subsection Semantic Values of Tokens
	4101
	4102	@vindex yylval
	4103	In an ordinary (non-reentrant) parser, the semantic value of the token must
	4104	be stored into the global variable @code{yylval}. When you are using
	4105	just one data type for semantic values, @code{yylval} has that type.
	4106	Thus, if the type is @code{int} (the default), you might write this in
	4107	@code{yylex}:
	4108
	4109	@example
	4110	@group
	4111	@dots{}
	4112	yylval = value; /* Put value onto Bison stack. */
	4113	return INT; /* Return the type of the token. */
	4114	@dots{}
	4115	@end group
	4116	@end example
	4117
	4118	When you are using multiple data types, @code{yylval}'s type is a union
	4119	made from the @code{%union} declaration (@pxref{Union Decl, ,The
	4120	Collection of Value Types}). So when you store a token's value, you
	4121	must use the proper member of the union. If the @code{%union}
	4122	declaration looks like this:
	4123
	4124	@example
	4125	@group
	4126	%union @{
	4127	int intval;
	4128	double val;
	4129	symrec *tptr;
	4130	@}
	4131	@end group
	4132	@end example
	4133
	4134	@noindent
	4135	then the code in @code{yylex} might look like this:
	4136
	4137	@example
	4138	@group
	4139	@dots{}
	4140	yylval.intval = value; /* Put value onto Bison stack. */
	4141	return INT; /* Return the type of the token. */
	4142	@dots{}
	4143	@end group
	4144	@end example
	4145
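To make the choice of union member concrete, here is a small sketch that can be compiled on its own. The type @code{toy_value} stands in for the generated @code{YYSTYPE} (the @code{tptr} member is left out), and the function @code{toy_scan_number} and the token code @code{FLOAT} are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical token codes and value type; Bison would generate both
   from the %token and %union declarations. */
enum { INT = 258, FLOAT = 259 };
typedef union { int intval; double val; } toy_value;

/* Convert the numeral TEXT, store it in the matching member of *OUT,
   and return the token code that tells the parser which member
   holds the value.  */
int
toy_scan_number (const char *text, toy_value *out)
{
  assert (text != NULL && out != NULL);

  char *end;
  long whole = strtol (text, &end, 10);
  if (*end == '.' || *end == 'e' || *end == 'E')
    {
      /* Not a plain integer: reread the text as a double. */
      out->val = strtod (text, NULL);
      return FLOAT;
    }
  out->intval = (int) whole;
  return INT;
}
```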
	4146	@node Token Locations
	4147	@subsection Textual Locations of Tokens
	4148
	4149	@vindex yylloc
	4150	If you are using the @samp{@@@var{n}}-feature (@pxref{Locations, ,
	4151	Tracking Locations}) in actions to keep track of the
	4152	textual locations of tokens and groupings, then you must provide this
	4153	information in @code{yylex}. The function @code{yyparse} expects to
	4154	find the textual location of a token just parsed in the global variable
	4155	@code{yylloc}. So @code{yylex} must store the proper data in that
	4156	variable.
	4157
	4158	By default, the value of @code{yylloc} is a structure and you need only
	4159	initialize the members that are going to be used by the actions. The
	4160	four members are called @code{first_line}, @code{first_column},
	4161	@code{last_line} and @code{last_column}. Note that the use of this
	4162	feature makes the parser noticeably slower.
	4163
	4164	@tindex YYLTYPE
	4165	The data type of @code{yylloc} has the name @code{YYLTYPE}.
	4166
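One possible way for a scanner to maintain these four members is sketched below. The type @code{toy_loc} stands in for the default @code{YYLTYPE}, and the two helper functions are hypothetical. The sketch also assumes that lines and columns are counted from 1 and that the last position is the one just past the token; other conventions are equally possible.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the default YYLTYPE; the member names come from the
   text above, the type name is hypothetical. */
struct toy_loc
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Begin a new token: it starts where the previous token ended. */
void
toy_loc_step (struct toy_loc *loc)
{
  assert (loc != NULL);
  loc->first_line = loc->last_line;
  loc->first_column = loc->last_column;
}

/* Extend the end of the current token over the LEN characters at TEXT. */
void
toy_loc_advance (struct toy_loc *loc, const char *text, size_t len)
{
  assert (loc != NULL && text != NULL);
  for (size_t i = 0; i < len; i++)
    {
      if (text[i] == '\n')
        {
          /* A newline moves to column 1 of the next line. */
          loc->last_line++;
          loc->last_column = 1;
        }
      else
        loc->last_column++;
    }
}
```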
	4167	@node Pure Calling
	4168	@subsection Calling Conventions for Pure Parsers
	4169
	4170	When you use the Bison declaration @code{%pure-parser} to request a
	4171	pure, reentrant parser, the global communication variables @code{yylval}
	4172	and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant)
	4173	Parser}.) In such parsers the two global variables are replaced by
	4174	pointers passed as arguments to @code{yylex}. You must declare them as
	4175	shown here, and pass the information back by storing it through those
	4176	pointers.
	4177
	4178	@example
	4179	int
	4180	yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
	4181	@{
	4182	@dots{}
	4183	*lvalp = value; /* Put value onto Bison stack. */
	4184	return INT; /* Return the type of the token. */
	4185	@dots{}
	4186	@}
	4187	@end example
	4188
	4189	If the grammar file does not use the @samp{@@} constructs to refer to
	4190	textual locations, then the type @code{YYLTYPE} will not be defined. In
	4191	this case, omit the second argument; @code{yylex} will be called with
	4192	only one argument.
	4193
	4194
	4195	If you wish to pass additional parameter data to @code{yylex}, use
	4196	@code{%lex-param} just like @code{%parse-param} (@pxref{Parser
	4197	Function}).
	4198
	4199	@deffn {Directive} lex-param @{@var{argument-declaration}@}
	4200	@findex %lex-param
	4201	Declare that @code{argument-declaration} is an additional @code{yylex}
	4202	argument declaration.
	4203	@end deffn
	4204
	4205	For instance:
	4206
	4207	@example
	4208	%parse-param @{int *nastiness@}
	4209	%lex-param @{int *nastiness@}
	4210	%parse-param @{int *randomness@}
	4211	@end example
	4212
	4213	@noindent
	4214	results in the following signature:
	4215
	4216	@example
	4217	int yylex (int *nastiness);
	4218	int yyparse (int *nastiness, int *randomness);
	4219	@end example
	4220
	4221	If @code{%pure-parser} is added:
	4222
	4223	@example
	4224	int yylex (YYSTYPE *lvalp, int *nastiness);
	4225	int yyparse (int *nastiness, int *randomness);
	4226	@end example
	4227
	4228	@noindent
	4229	and finally, if both @code{%pure-parser} and @code{%locations} are used:
	4230
	4231	@example
	4232	int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
	4233	int yyparse (int *nastiness, int *randomness);
	4234	@end example
	4235
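The following sketch shows what a scanner written to the last of these conventions might look like. It compiles on its own, so the definitions of @code{INT}, @code{YYSTYPE} and @code{YYLTYPE} are stand-ins for what Bison would generate. The function name @code{toy_pure_yylex} and the @code{scan_state} structure (playing the role of an extra argument declared with @code{%lex-param}) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for what Bison would generate. */
enum { INT = 258 };
typedef int YYSTYPE;
typedef struct
{
  int first_line, first_column, last_line, last_column;
} YYLTYPE;

/* Hypothetical extra argument, as %lex-param might declare it. */
struct scan_state
{
  const char *cursor;
  int line;
  int column;
};

/* Pure scanner: all results go out through the pointer arguments,
   and all state lives in *ST rather than in global variables.  */
int
toy_pure_yylex (YYSTYPE *lvalp, YYLTYPE *llocp, struct scan_state *st)
{
  assert (lvalp != NULL && llocp != NULL && st != NULL);

  /* Skip white space, keeping the line and column counters current. */
  for (;; st->cursor++)
    {
      if (*st->cursor == '\n')
        {
          st->line++;
          st->column = 1;
        }
      else if (*st->cursor == ' ')
        st->column++;
      else
        break;
    }

  llocp->first_line = llocp->last_line = st->line;
  llocp->first_column = st->column;

  if (*st->cursor == '\0')      /* Detect end-of-input. */
    {
      llocp->last_column = st->column;
      return 0;
    }

  if (*st->cursor >= '0' && *st->cursor <= '9')
    {
      int n = 0;
      while (*st->cursor >= '0' && *st->cursor <= '9')
        {
          n = n * 10 + (*st->cursor++ - '0');
          st->column++;
        }
      *lvalp = n;               /* Put value onto Bison stack. */
      llocp->last_column = st->column;
      return INT;               /* Return the type of the token. */
    }

  st->column++;
  llocp->last_column = st->column;
  return (unsigned char) *st->cursor++;
}
```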
	4236	@node Error Reporting
	4237	@section The Error Reporting Function @code{yyerror}
	4238	@cindex error reporting function
	4239	@findex yyerror
	4240	@cindex parse error
	4241	@cindex syntax error
	4242
	4243	The Bison parser detects a @dfn{syntax error} or @dfn{parse error}
	4244	whenever it reads a token which cannot satisfy any syntax rule. An
	4245	action in the grammar can also explicitly proclaim an error, using the
	4246	macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use
	4247	in Actions}).
	4248
	4249	The Bison parser expects to report the error by calling an error
	4250	reporting function named @code{yyerror}, which you must supply. It is
	4251	called by @code{yyparse} whenever a syntax error is found, and it
	4252	receives one argument, a message string. For a syntax error, the string is normally
	4253	@w{@code{"syntax error"}}.
	4254
	4255	@findex %error-verbose
	4256	If you invoke the directive @code{%error-verbose} in the Bison
	4257	declarations section (@pxref{Bison Declarations, ,The Bison Declarations
	4258	Section}), then Bison provides a more verbose and specific error message
	4259	string instead of just plain @w{@code{"syntax error"}}.
	4260
	4261	The parser can detect one other kind of error: stack overflow. This
	4262	happens when the input contains constructions that are very deeply
	4263	nested. It isn't likely you will encounter this, since the Bison
	4264	parser extends its stack automatically up to a very large limit. But
	4265	if overflow happens, @code{yyparse} calls @code{yyerror} in the usual
	4266	fashion, except that the argument string is @w{@code{"parser stack
	4267	overflow"}}.
	4268
	4269	The following definition suffices in simple programs:
	4270
	4271	@example
	4272	@group
	4273	void
	4274	yyerror (char const *s)
	4275	@{
	4276	@end group
	4277	@group
	4278	fprintf (stderr, "%s\n", s);
	4279	@}
	4280	@end group
	4281	@end example
	4282
	4283	After @code{yyerror} returns to @code{yyparse}, the latter will attempt
	4284	error recovery if you have written suitable error recovery grammar rules
	4285	(@pxref{Error Recovery}). If recovery is impossible, @code{yyparse} will
	4286	immediately return 1.
	4287
	4288	Obviously, in location tracking pure parsers, @code{yyerror} should have
	4289	access to the current location. This is indeed the case for the GLR
	4290	parsers, but not for the Yacc parser, for historical reasons. I.e., if
	4291	@samp{%locations %pure-parser} is passed then the prototypes for
	4292	@code{yyerror} are:
	4293
	4294	@example
	4295	void yyerror (char const *msg); /* Yacc parsers. */
	4296	void yyerror (YYLTYPE *locp, char const *msg); /* GLR parsers. */
	4297	@end example
	4298
	4299	If @samp{%parse-param @{int *nastiness@}} is used, then:
	4300
	4301	@example
	4302	void yyerror (int *nastiness, char const *msg); /* Yacc parsers. */
	4303	void yyerror (int *nastiness, char const *msg); /* GLR parsers. */
	4304	@end example
	4305
	4306	Finally, GLR and Yacc parsers share the same @code{yyerror} calling
	4307	convention for absolutely pure parsers, i.e., when the calling
	4308	convention of @code{yylex} @emph{and} the calling convention of
	4309	@code{yyparse} are pure. For example:
	4310
	4311	@example
	4312	/* Location tracking. */
	4313	%locations
	4314	/* Pure yylex. */
	4315	%pure-parser
	4316	%lex-param @{int *nastiness@}
	4317	/* Pure yyparse. */
	4318	%parse-param @{int *nastiness@}
	4319	%parse-param @{int *randomness@}
	4320	@end example
	4321
	4322	@noindent
	4323	results in the following signatures for all the parser kinds:
	4324
	4325	@example
	4326	int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
	4327	int yyparse (int *nastiness, int *randomness);
	4328	void yyerror (YYLTYPE *locp,
	4329	int *nastiness, int *randomness,
	4330	char const *msg);
	4331	@end example
	4332
	4333	@noindent
	4334	The prototypes are only indications of how the code produced by Bison
	4335	uses @code{yyerror}. Bison-generated code always ignores the returned
	4336	value, so @code{yyerror} can return any type, including @code{void}.
	4337	Also, @code{yyerror} can be a variadic function; that is why the
	4338	message is always passed last.
	4339
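Because the message comes last, a variadic error function can treat it as a @code{printf}-style format. The sketch below illustrates the idea under some assumptions: @code{toy_loc} stands in for @code{YYLTYPE}, the @samp{LINE.COLUMN: } prefix is just one possible layout, and @code{toy_format_error} and @code{toy_yyerror} are hypothetical names. The formatting is done into a buffer so that it can be checked separately from the printing.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the default YYLTYPE. */
typedef struct
{
  int first_line, first_column, last_line, last_column;
} toy_loc;

/* Format "LINE.COLUMN: MESSAGE" into BUF (SIZE bytes) and return the
   length the full message needs, as vsnprintf does.  The format string
   comes last so that a variable argument list can follow it.  */
int
toy_format_error (char *buf, size_t size, const toy_loc *locp,
                  const char *fmt, ...)
{
  assert (buf != NULL && locp != NULL && fmt != NULL);

  int used = snprintf (buf, size, "%d.%d: ",
                       locp->first_line, locp->first_column);
  if (used < 0)
    return used;

  /* Append the caller's message after the prefix, never writing
     past the end of BUF. */
  size_t offset = (size_t) used < size ? (size_t) used : size;
  va_list args;
  va_start (args, fmt);
  int rest = vsnprintf (buf + offset, size - offset, fmt, args);
  va_end (args);
  return rest < 0 ? rest : used + rest;
}

/* An error function in the GLR style shown above could then print
   the formatted text. */
void
toy_yyerror (const toy_loc *locp, const char *msg)
{
  char buf[256];
  toy_format_error (buf, sizeof buf, locp, "%s", msg);
  fprintf (stderr, "%s\n", buf);
}
```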
	4340	Traditionally @code{yyerror} returns an @code{int} that is always
	4341	ignored, but this is purely for historical reasons, and @code{void} is
	4342	preferable since it more accurately describes the return type for
	4343	@code{yyerror}.
	4344
	4345	@vindex yynerrs
	4346	The variable @code{yynerrs} contains the number of syntax errors
	4347	encountered so far. Normally this variable is global; but if you
	4348	request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser})
	4349	then it is a local variable which only the actions can access.
	4350
	4351	@node Action Features
	4352	@section Special Features for Use in Actions
	4353	@cindex summary, action features
	4354	@cindex action features summary
	4355
	4356	Here is a table of Bison constructs, variables and macros that
	4357	are useful in actions.
	4358
	4359	@deffn {Variable} $$
	4360	Acts like a variable that contains the semantic value for the
	4361	grouping made by the current rule. @xref{Actions}.
	4362	@end deffn
	4363
	4364	@deffn {Variable} $@var{n}
	4365	Acts like a variable that contains the semantic value for the
	4366	@var{n}th component of the current rule. @xref{Actions}.
	4367	@end deffn
	4368
	4369	@deffn {Variable} $<@var{typealt}>$
	4370	Like @code{$$} but specifies alternative @var{typealt} in the union
	4371	specified by the @code{%union} declaration. @xref{Action Types, ,Data
	4372	Types of Values in Actions}.
	4373	@end deffn
	4374
	4375	@deffn {Variable} $<@var{typealt}>@var{n}
	4376	Like @code{$@var{n}} but specifies alternative @var{typealt} in the
	4377	union specified by the @code{%union} declaration.
	4378	@xref{Action Types, ,Data Types of Values in Actions}.
	4379	@end deffn
	4380
	4381	@deffn {Macro} YYABORT;
	4382	Return immediately from @code{yyparse}, indicating failure.
	4383	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	4384	@end deffn
	4385
	4386	@deffn {Macro} YYACCEPT;
	4387	Return immediately from @code{yyparse}, indicating success.
	4388	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	4389	@end deffn
	4390
	4391	@deffn {Macro} YYBACKUP (@var{token}, @var{value});
	4392	@findex YYBACKUP
	4393	Unshift a token. This macro is allowed only for rules that reduce
	4394	a single value, and only when there is no look-ahead token.
	4395	It is also disallowed in @acronym{GLR} parsers.
	4396	It installs a look-ahead token with token type @var{token} and
	4397	semantic value @var{value}; then it discards the value that was
	4398	going to be reduced by this rule.
	4399
	4400	If the macro is used when it is not valid, such as when there is
	4401	a look-ahead token already, then it reports a syntax error with
	4402	a message @samp{cannot back up} and performs ordinary error
	4403	recovery.
	4404
	4405	In either case, the rest of the action is not executed.
	4406	@end deffn
	4407
	4408	@deffn {Macro} YYEMPTY
	4409	@vindex YYEMPTY
	4410	Value stored in @code{yychar} when there is no look-ahead token.
	4411	@end deffn
	4412
	4413	@deffn {Macro} YYERROR;
	4414	@findex YYERROR
	4415	Cause an immediate syntax error. This statement initiates error
	4416	recovery just as if the parser itself had detected an error; however, it
	4417	does not call @code{yyerror}, and does not print any message. If you
	4418	want to print an error message, call @code{yyerror} explicitly before
	4419	the @samp{YYERROR;} statement. @xref{Error Recovery}.
	4420	@end deffn
	4421
	4422	@deffn {Macro} YYRECOVERING
	4423	This macro stands for an expression that has the value 1 when the parser
	4424	is recovering from a syntax error, and 0 the rest of the time.
	4425	@xref{Error Recovery}.
	4426	@end deffn
	4427
	4428	@deffn {Variable} yychar
	4429	Variable containing the current look-ahead token. (In a pure parser,
	4430	this is actually a local variable within @code{yyparse}.) When there is
	4431	no look-ahead token, the value @code{YYEMPTY} is stored in the variable.
	4432	@xref{Look-Ahead, ,Look-Ahead Tokens}.
	4433	@end deffn
	4434
	4435	@deffn {Macro} yyclearin;
	4436	Discard the current look-ahead token. This is useful primarily in
	4437	error rules. @xref{Error Recovery}.
	4438	@end deffn
	4439
	4440	@deffn {Macro} yyerrok;
	4441	Resume generating error messages immediately for subsequent syntax
	4442	errors. This is useful primarily in error rules.
	4443	@xref{Error Recovery}.
	4444	@end deffn
	4445
	4446	@deffn {Value} @@$
	4447	@findex @@$
	4448	Acts like a structure variable containing information on the textual location
	4449	of the grouping made by the current rule. @xref{Locations, ,
	4450	Tracking Locations}.
	4451
	4452	@c Check if those paragraphs are still useful or not.
	4453
	4454	@c @example
	4455	@c struct @{
	4456	@c int first_line, last_line;
	4457	@c int first_column, last_column;
	4458	@c @};
	4459	@c @end example
	4460
	4461	@c Thus, to get the starting line number of the third component, you would
	4462	@c use @samp{@@3.first_line}.
	4463
	4464	@c In order for the members of this structure to contain valid information,
	4465	@c you must make @code{yylex} supply this information about each token.
	4466	@c If you need only certain members, then @code{yylex} need only fill in
	4467	@c those members.
	4468
	4469	@c The use of this feature makes the parser noticeably slower.
	4470	@end deffn
	4471
	4472	@deffn {Value} @@@var{n}
	4473	@findex @@@var{n}
	4474	Acts like a structure variable containing information on the textual location
	4475	of the @var{n}th component of the current rule. @xref{Locations, ,
	4476	Tracking Locations}.
	4477	@end deffn
	4478
	4479
	4480	@node Algorithm
	4481	@chapter The Bison Parser Algorithm
	4482	@cindex Bison parser algorithm
	4483	@cindex algorithm of parser
	4484	@cindex shifting
	4485	@cindex reduction
	4486	@cindex parser stack
	4487	@cindex stack, parser
	4488
	4489	As Bison reads tokens, it pushes them onto a stack along with their
	4490	semantic values. The stack is called the @dfn{parser stack}. Pushing a
	4491	token is traditionally called @dfn{shifting}.
	4492
	4493	For example, suppose the infix calculator has read @samp{1 + 5 *}, with a
	4494	@samp{3} to come. The stack will have four elements, one for each token
	4495	that was shifted.
	4496
	4497	But the stack does not always have an element for each token read. When
	4498	the last @var{n} tokens and groupings shifted match the components of a
	4499	grammar rule, they can be combined according to that rule. This is called
	4500	@dfn{reduction}. Those tokens and groupings are replaced on the stack by a
	4501	single grouping whose symbol is the result (left hand side) of that rule.
	4502	Running the rule's action is part of the process of reduction, because this
	4503	is what computes the semantic value of the resulting grouping.
	4504
	4505	For example, if the infix calculator's parser stack contains this:
	4506
	4507	@example
	4508	1 + 5 * 3
	4509	@end example
	4510
	4511	@noindent
	4512	and the next input token is a newline character, then the last three
	4513	elements can be reduced to 15 via the rule:
	4514
	4515	@example
	4516	expr: expr '*' expr;
	4517	@end example
	4518
	4519	@noindent
	4520	Then the stack contains just these three elements:
	4521
	4522	@example
	4523	1 + 15
	4524	@end example
	4525
	4526	@noindent
	4527	At this point, another reduction can be made, resulting in the single value
	4528	16. Then the newline token can be shifted.
	4529
	4530	The parser tries, by shifts and reductions, to reduce the entire input down
	4531	to a single grouping whose symbol is the grammar's start-symbol
	4532	(@pxref{Language and Grammar, ,Languages and Context-Free Grammars}).
	4533
	4534	This kind of parser is known in the literature as a bottom-up parser.
	4535
	4536	@menu
	4537	* Look-Ahead:: Parser looks one token ahead when deciding what to do.
	4538	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	4539	* Precedence:: Operator precedence works by resolving conflicts.
	4540	* Contextual Precedence:: When an operator's precedence depends on context.
	4541	* Parser States:: The parser is a finite-state-machine with stack.
	4542	* Reduce/Reduce:: When two rules are applicable in the same situation.
	4543	* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
	4544	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	4545	* Stack Overflow:: What happens when stack gets full. How to avoid it.
	4546	@end menu
	4547
	4548	@node Look-Ahead
	4549	@section Look-Ahead Tokens
	4550	@cindex look-ahead token
	4551
	4552	The Bison parser does @emph{not} always reduce immediately as soon as the
	4553	last @var{n} tokens and groupings match a rule. This is because such a
	4554	simple strategy is inadequate to handle most languages. Instead, when a
	4555	reduction is possible, the parser sometimes ``looks ahead'' at the next
	4556	token in order to decide what to do.
	4557
	4558	When a token is read, it is not immediately shifted; first it becomes the
	4559	@dfn{look-ahead token}, which is not on the stack. Now the parser can
	4560	perform one or more reductions of tokens and groupings on the stack, while
	4561	the look-ahead token remains off to the side. When no more reductions
	4562	should take place, the look-ahead token is shifted onto the stack. This
	4563	does not mean that all possible reductions have been done; depending on the
	4564	token type of the look-ahead token, some rules may choose to delay their
	4565	application.
	4566
	4567	Here is a simple case where look-ahead is needed. These three rules define
	4568	expressions which contain binary addition operators and postfix unary
	4569	factorial operators (@samp{!}), and allow parentheses for grouping.
	4570
	4571	@example
	4572	@group
	4573	expr: term '+' expr
	4574	| term
	4575	;
	4576	@end group
	4577
	4578	@group
	4579	term: '(' expr ')'
	4580	| term '!'
	4581	| NUMBER
	4582	;
	4583	@end group
	4584	@end example
	4585
	4586	Suppose that the tokens @w{@samp{1 + 2}} have been read and shifted; what
	4587	should be done? If the following token is @samp{)}, then the first three
	4588	tokens must be reduced to form an @code{expr}. This is the only valid
	4589	course, because shifting the @samp{)} would produce a sequence of symbols
	4590	@w{@code{term ')'}}, and no rule allows this.
	4591
	4592	If the following token is @samp{!}, then it must be shifted immediately so
	4593	that @w{@samp{2 !}} can be reduced to make a @code{term}. If instead the
	4594	parser were to reduce before shifting, @w{@samp{1 + 2}} would become an
	4595	@code{expr}. It would then be impossible to shift the @samp{!} because
	4596	doing so would produce on the stack the sequence of symbols @code{expr
	4597	'!'}. No rule allows that sequence.
	4598
	4599	@vindex yychar
	4600	The current look-ahead token is stored in the variable @code{yychar}.
	4601	@xref{Action Features, ,Special Features for Use in Actions}.
	4602
	4603	@node Shift/Reduce
	4604	@section Shift/Reduce Conflicts
	4605	@cindex conflicts
	4606	@cindex shift/reduce conflicts
	4607	@cindex dangling @code{else}
	4608	@cindex @code{else}, dangling
	4609
	4610	Suppose we are parsing a language which has if-then and if-then-else
	4611	statements, with a pair of rules like this:
	4612
	4613	@example
	4614	@group
	4615	if_stmt:
	4616	IF expr THEN stmt
	4617	| IF expr THEN stmt ELSE stmt
	4618	;
	4619	@end group
	4620	@end example
	4621
	4622	@noindent
	4623	Here we assume that @code{IF}, @code{THEN} and @code{ELSE} are
	4624	terminal symbols for specific keyword tokens.
	4625
	4626	When the @code{ELSE} token is read and becomes the look-ahead token, the
	4627	contents of the stack (assuming the input is valid) are just right for
	4628	reduction by the first rule. But it is also legitimate to shift the
	4629	@code{ELSE}, because that would lead to eventual reduction by the second
	4630	rule.
	4631
	4632	This situation, where either a shift or a reduction would be valid, is
	4633	called a @dfn{shift/reduce conflict}. Bison is designed to resolve
	4634	these conflicts by choosing to shift, unless otherwise directed by
	4635	operator precedence declarations. To see the reason for this, let's
	4636	contrast it with the other alternative.
	4637
	4638	Since the parser prefers to shift the @code{ELSE}, the result is to attach
	4639	the else-clause to the innermost if-statement, making these two inputs
	4640	equivalent:
	4641
	4642	@example
	4643	if x then if y then win (); else lose;
	4644
	4645	if x then do; if y then win (); else lose; end;
	4646	@end example
	4647
	4648	But if the parser chose to reduce when possible rather than shift, the
	4649	result would be to attach the else-clause to the outermost if-statement,
	4650	making these two inputs equivalent:
	4651
	4652	@example
	4653	if x then if y then win (); else lose;
	4654
	4655	if x then do; if y then win (); end; else lose;
	4656	@end example
	4657
	4658	The conflict exists because the grammar as written is ambiguous: either
	4659	parsing of the simple nested if-statement is legitimate. The established
	4660	convention is that these ambiguities are resolved by attaching the
	4661	else-clause to the innermost if-statement; this is what Bison accomplishes
	4662	by choosing to shift rather than reduce. (It would ideally be cleaner to
	4663	write an unambiguous grammar, but that is very hard to do in this case.)
	4664	This particular ambiguity was first encountered in the specifications of
	4665	Algol 60 and is called the ``dangling @code{else}'' ambiguity.
	4666
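C itself follows the same convention, which makes it easy to observe. In the sketch below (the function name @code{toy_dangling} is made up for the example), the statement is written on one line so that the layout gives no hint about which @code{if} owns the @code{else}. Some compilers warn about such code precisely because of the ambiguity described above.

```c
#include <assert.h>

/* The `else' attaches to the innermost `if', so this behaves like
   if (x) { if (y) result = 1; else result = 2; }  */
int
toy_dangling (int x, int y)
{
  int result = 0;
  if (x) if (y) result = 1; else result = 2;
  return result;
}
```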
	4667	To avoid warnings from Bison about predictable, legitimate shift/reduce
	4668	conflicts, use the @code{%expect @var{n}} declaration. There will be no
	4669	warning as long as the number of shift/reduce conflicts is exactly @var{n}.
	4670	@xref{Expect Decl, ,Suppressing Conflict Warnings}.
	4671
	4672	The definition of @code{if_stmt} above is solely to blame for the
	4673	conflict, but the conflict does not actually appear without additional
	4674	rules. Here is a complete Bison input file that actually manifests the
	4675	conflict:
	4676
	4677	@example
	4678	@group
	4679	%token IF THEN ELSE variable
	4680	%%
	4681	@end group
	4682	@group
	4683	stmt: expr
	4684	| if_stmt
	4685	;
	4686	@end group
	4687
	4688	@group
	4689	if_stmt:
	4690	IF expr THEN stmt
	4691	| IF expr THEN stmt ELSE stmt
	4692	;
	4693	@end group
	4694
	4695	expr: variable
	4696	;
	4697	@end example
	4698
	4699	@node Precedence
	4700	@section Operator Precedence
	4701	@cindex operator precedence
	4702	@cindex precedence of operators
	4703
	4704	Another situation where shift/reduce conflicts appear is in arithmetic
	4705	expressions. Here shifting is not always the preferred resolution; the
	4706	Bison declarations for operator precedence allow you to specify when to
	4707	shift and when to reduce.
	4708
	4709	@menu
	4710	* Why Precedence:: An example showing why precedence is needed.
	4711	* Using Precedence:: How to specify precedence in Bison grammars.
	4712	* Precedence Examples:: How these features are used in the previous example.
	4713	* How Precedence:: How they work.
	4714	@end menu
	4715
	4716	@node Why Precedence
	4717	@subsection When Precedence is Needed
	4718
	4719	Consider the following ambiguous grammar fragment (ambiguous because the
	4720	input @w{@samp{1 - 2 * 3}} can be parsed in two different ways):
	4721
	4722	@example
	4723	@group
	4724	expr: expr '-' expr
	4725	| expr '*' expr
	4726	| expr '<' expr
	4727	| '(' expr ')'
	4728	@dots{}
	4729	;
	4730	@end group
	4731	@end example
	4732
	4733	@noindent
	4734	Suppose the parser has seen the tokens @samp{1}, @samp{-} and @samp{2};
	4735	should it reduce them via the rule for the subtraction operator? It
	4736	depends on the next token. Of course, if the next token is @samp{)}, we
	4737	must reduce; shifting is invalid because no single rule can reduce the
	4738	token sequence @w{@samp{- 2 )}} or anything starting with that. But if
	4739	the next token is @samp{*} or @samp{<}, we have a choice: either
	4740	shifting or reduction would allow the parse to complete, but with
	4741	different results.
	4742
	4743	To decide which one Bison should do, we must consider the results. If
	4744	the next operator token @var{op} is shifted, then it must be reduced
	4745	first in order to permit another opportunity to reduce the difference.
	4746	The result is (in effect) @w{@samp{1 - (2 @var{op} 3)}}. On the other
	4747	hand, if the subtraction is reduced before shifting @var{op}, the result
	4748	is @w{@samp{(1 - 2) @var{op} 3}}. Clearly, then, the choice of shift or
	4749	reduce should depend on the relative precedence of the operators
	4750	@samp{-} and @var{op}: @samp{*} should be shifted first, but not
	4751	@samp{<}.
	4752
	4753	@cindex associativity
	4754	What about input such as @w{@samp{1 - 2 - 5}}; should this be
	4755	@w{@samp{(1 - 2) - 5}} or should it be @w{@samp{1 - (2 - 5)}}? For most
	4756	operators we prefer the former, which is called @dfn{left association}.
	4757	The latter alternative, @dfn{right association}, is desirable for
	4758	assignment operators. The choice of left or right association is a
	4759	matter of whether the parser chooses to shift or reduce when the stack
	4760	contains @w{@samp{1 - 2}} and the look-ahead token is @samp{-}: shifting
	4761	makes right-associativity, and reducing makes left-associativity.
	4762
	4763	@node Using Precedence
	4764	@subsection Specifying Operator Precedence
	4765	@findex %left
	4766	@findex %right
	4767	@findex %nonassoc
	4768
	4769	Bison allows you to specify these choices with the operator precedence
	4770	declarations @code{%left} and @code{%right}. Each such declaration
	4771	contains a list of tokens, which are operators whose precedence and
	4772	associativity is being declared. The @code{%left} declaration makes all
	4773	those operators left-associative and the @code{%right} declaration makes
	4774	them right-associative. A third alternative is @code{%nonassoc}, which
	4775	declares that it is a syntax error to find the same operator twice ``in a
	4776	row''.
	4777
	4778	The relative precedence of different operators is controlled by the
	4779	order in which they are declared. The first @code{%left} or
	4780	@code{%right} declaration in the file declares the operators whose
	4781	precedence is lowest, the next such declaration declares the operators
	4782	whose precedence is a little higher, and so on.
	4783
	4784	@node Precedence Examples
	4785	@subsection Precedence Examples
	4786
	4787	In our example, we would want the following declarations:
	4788
	4789	@example
	4790	%left '<'
	4791	%left '-'
	4792	%left '*'
	4793	@end example
	4794
	4795	In a more complete example, which supports other operators as well, we
	4796	would declare them in groups of equal precedence. For example, @code{'+'} is
	4797	declared with @code{'-'}:
	4798
	4799	@example
	4800	%left '<' '>' '=' NE LE GE
	4801	%left '+' '-'
	4802	%left '*' '/'
	4803	@end example
	4804
	4805	@noindent
	4806	(Here @code{NE} and so on stand for the operators for ``not equal''
	4807	and so on. We assume that these tokens are more than one character long
	4808	and therefore are represented by names, not character literals.)
	4809
	4810	@node How Precedence
	4811	@subsection How Precedence Works
	4812
	4813	The first effect of the precedence declarations is to assign precedence
	4814	levels to the terminal symbols declared. The second effect is to assign
	4815	precedence levels to certain rules: each rule gets its precedence from
	4816	the last terminal symbol mentioned in the components. (You can also
	4817	specify explicitly the precedence of a rule. @xref{Contextual
	4818	Precedence, ,Context-Dependent Precedence}.)
	4819
	4820	Finally, the resolution of conflicts works by comparing the precedence
	4821	of the rule being considered with that of the look-ahead token. If the
	4822	token's precedence is higher, the choice is to shift. If the rule's
	4823	precedence is higher, the choice is to reduce. If they have equal
	4824	precedence, the choice is made based on the associativity of that
	4825	precedence level. The verbose output file made by @samp{-v}
	4826	(@pxref{Invocation, ,Invoking Bison}) says how each conflict was
	4827	resolved.
	4828
	4829	Not all rules and not all tokens have precedence. If either the rule or
	4830	the look-ahead token has no precedence, then the default is to shift.
	4831
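The comparison described above can be written down as a small decision function. This is a sketch of the rules as stated in this section, not Bison's own implementation: the names are hypothetical, precedence levels are assumed to be positive integers that grow with declaration order, and 0 is assumed to mean ``no precedence''.

```c
#include <assert.h>

enum toy_assoc { TOY_LEFT, TOY_RIGHT, TOY_NONASSOC };
enum toy_choice { TOY_SHIFT, TOY_REDUCE, TOY_ERROR };

/* Decide a shift/reduce conflict between a rule and the look-ahead
   token.  ASSOC is the associativity of their common precedence level
   and is consulted only when the two levels are equal.  */
enum toy_choice
toy_resolve (int rule_prec, int token_prec, enum toy_assoc assoc)
{
  assert (rule_prec >= 0 && token_prec >= 0);

  if (rule_prec == 0 || token_prec == 0)
    return TOY_SHIFT;           /* No precedence: the default is to shift. */
  if (token_prec > rule_prec)
    return TOY_SHIFT;
  if (rule_prec > token_prec)
    return TOY_REDUCE;

  switch (assoc)
    {
    case TOY_LEFT:
      return TOY_REDUCE;        /* Groups as (1 - 2) - 5. */
    case TOY_RIGHT:
      return TOY_SHIFT;         /* Groups as 1 - (2 - 5). */
    default:
      return TOY_ERROR;         /* %nonassoc: a syntax error. */
    }
}
```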
	4832	@node Contextual Precedence
	4833	@section Context-Dependent Precedence
	4834	@cindex context-dependent precedence
	4835	@cindex unary operator precedence
	4836	@cindex precedence, context-dependent
	4837	@cindex precedence, unary operator
	4838	@findex %prec
	4839
	4840	Often the precedence of an operator depends on the context. This sounds
	4841	outlandish at first, but it is really very common. For example, a minus
	4842	sign typically has a very high precedence as a unary operator, and a
	4843	somewhat lower precedence (lower than multiplication) as a binary operator.
	4844
	4845	The Bison precedence declarations, @code{%left}, @code{%right} and
	4846	@code{%nonassoc}, can only be used once for a given token; so a token has
	4847	only one precedence declared in this way. For context-dependent
	4848	precedence, you need to use an additional mechanism: the @code{%prec}
	4849	modifier for rules.
	4850
	4851	The @code{%prec} modifier declares the precedence of a particular rule by
	4852	specifying a terminal symbol whose precedence should be used for that rule.
	4853	It's not necessary for that symbol to appear otherwise in the rule. The
	4854	modifier's syntax is:
	4855
	4856	@example
	4857	%prec @var{terminal-symbol}
	4858	@end example
	4859
	4860	@noindent
	4861	and it is written after the components of the rule. Its effect is to
	4862	assign the rule the precedence of @var{terminal-symbol}, overriding
	4863	the precedence that would be deduced for it in the ordinary way. The
	4864	altered rule precedence then affects how conflicts involving that rule
	4865	are resolved (@pxref{Precedence, ,Operator Precedence}).
	4866
	4867	Here is how @code{%prec} solves the problem of unary minus. First, declare
	4868	a precedence for a fictitious terminal symbol named @code{UMINUS}. There
	4869	are no tokens of this type, but the symbol serves to stand for its
	4870	precedence:
	4871
	4872	@example
	4873	@dots{}
	4874	%left '+' '-'
	4875	%left '*'
	4876	%left UMINUS
	4877	@end example
	4878
	4879	Now the precedence of @code{UMINUS} can be used in specific rules:
	4880
	4881	@example
	4882	@group
	4883	exp: @dots{}
	4884	\| exp '-' exp
	4885	@dots{}
	4886	\| '-' exp %prec UMINUS
	4887	@end group
	4888	@end example
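The effect of the rule precedence on grouping can also be observed outside
Bison. The following is a minimal sketch using a hand-written
precedence-climbing routine, which is a different technique from the
table-driven parser that Bison generates. All names in it
(@code{struct grouper}, @code{group}, the @code{PREC_} constants) are
hypothetical and chosen for illustration only. It prints the fully
parenthesized grouping that results when the unary minus rule is given a
chosen precedence.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical levels mirroring the declarations above:
   '+' and '-' lowest, then '*', then UMINUS.  */
enum { PREC_ADD = 1, PREC_MUL = 2, PREC_UMINUS = 3 };

struct grouper
{
  const char *in;   /* remaining input: digits, '+', '-', '*' only */
  char out[128];    /* fully parenthesized result */
  size_t len;       /* length of OUT, not counting the null */
  int unary_prec;   /* precedence given to the unary minus rule */
};

/* Insert C at position POS of the output, keeping it null-terminated.
   Characters that do not fit are silently dropped.  */
static void
put (struct grouper *g, size_t pos, char c)
{
  if (g->len + 1 < sizeof g->out)
    {
      memmove (g->out + pos + 1, g->out + pos, g->len - pos + 1);
      g->out[pos] = c;
      g->len++;
    }
}

static int
binary_prec (char c)
{
  if (c == '+' || c == '-')
    return PREC_ADD;
  if (c == '*')
    return PREC_MUL;
  return 0;                     /* not a binary operator */
}

/* Group one expression whose binary operators all have a precedence
   of at least MIN_PREC.  */
static void
group_expr (struct grouper *g, int min_prec)
{
  size_t start = g->len;
  int prec;

  if (*g->in == '-')
    {
      /* Unary minus: its operand absorbs only operators that bind
         more tightly than the unary minus rule itself.  */
      g->in++;
      put (g, g->len, '(');
      put (g, g->len, '-');
      group_expr (g, g->unary_prec + 1);
      put (g, g->len, ')');
    }
  else
    while (isdigit ((unsigned char) *g->in))
      put (g, g->len, *g->in++);

  while ((prec = binary_prec (*g->in)) != 0 && prec >= min_prec)
    {
      put (g, start, '(');
      put (g, g->len, *g->in++);
      group_expr (g, prec + 1); /* left associativity */
      put (g, g->len, ')');
    }
}

/* Store the grouping of SRC in G->out and return it.  */
const char *
group (struct grouper *g, const char *src, int unary_prec)
{
  assert (g != NULL && src != NULL);
  g->in = src;
  g->out[0] = '\0';
  g->len = 0;
  g->unary_prec = unary_prec;
  group_expr (g, PREC_ADD);
  return g->out;
}
```

With @code{PREC_UMINUS}, the input @samp{-2*3} is grouped as
@samp{((-2)*3)}. With @code{PREC_ADD}, which corresponds to leaving out
@code{%prec UMINUS}, the same input is grouped as @samp{(-(2*3))}.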
	4889
	4890	@ifset defaultprec
	4891	If you forget to append @code{%prec UMINUS} to the rule for unary
	4892	minus, Bison silently assumes that minus has its usual precedence.
	4893	This kind of problem can be tricky to debug, since one typically
	4894	discovers the mistake only by testing the code.
	4895
	4896	The @code{%no-default-prec;} declaration makes it easier to discover
	4897	this kind of problem systematically. It causes rules that lack a
	4898	@code{%prec} modifier to have no precedence, even if the last terminal
	4899	symbol mentioned in their components has a declared precedence.
	4900
	4901	If @code{%no-default-prec;} is in effect, you must specify @code{%prec}
	4902	for all rules that participate in precedence conflict resolution.
	4903	Then you will see any shift/reduce conflict until you tell Bison how
	4904	to resolve it, either by changing your grammar or by adding an
	4905	explicit precedence. This will probably add declarations to the
	4906	grammar, but it helps to protect against incorrect rule precedences.
	4907
	4908	The effect of @code{%no-default-prec;} can be reversed by giving
	4909	@code{%default-prec;}, which is the default.
	4910	@end ifset
	4911
	4912	@node Parser States
	4913	@section Parser States
	4914	@cindex finite-state machine
	4915	@cindex parser state
	4916	@cindex state (of parser)
	4917
	4918	The function @code{yyparse} is implemented using a finite-state machine.
	4919	The values pushed on the parser stack are not simply token type codes; they
	4920	represent the entire sequence of terminal and nonterminal symbols at or
	4921	near the top of the stack. The current state collects all the information
	4922	about previous input which is relevant to deciding what to do next.
	4923
	4924	Each time a look-ahead token is read, the current parser state together
	4925	with the type of look-ahead token are looked up in a table. This table
	4926	entry can say, ``Shift the look-ahead token.'' In this case, it also
	4927	specifies the new parser state, which is pushed onto the top of the
	4928	parser stack. Or it can say, ``Reduce using rule number @var{n}.''
	4929	This means that a certain number of tokens or groupings are taken off
	4930	the top of the stack, and replaced by one grouping. In other words,
	4931	that number of states are popped from the stack, and one new state is
	4932	pushed.
	4933
	4934	There is one other alternative: the table can say that the look-ahead token
	4935	is erroneous in the current state. This causes error processing to begin
	4936	(@pxref{Error Recovery}).
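As an illustration of the table lookup described above, here is a minimal
sketch of a table-driven shift/reduce parser for a toy grammar. The grammar,
the table layout and every name (@code{toy_parse}, @code{action},
@code{goto_S}) are assumptions made for this example. The tables were built
by hand and do not reflect the layout of the tables that Bison emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy grammar:
     rule 1:  S -> '(' S ')'
     rule 2:  S -> 'x'          */
enum { T_LPAREN, T_RPAREN, T_X, T_END, N_TOKENS };
enum { ACCEPT = 100, MAX_DEPTH = 64 };

/* action[state][token]: n > 0 shifts and pushes state n, n < 0 reduces
   using rule -n, ACCEPT accepts, and 0 marks an erroneous token.  */
static const int action[6][N_TOKENS] = {
  /* state     '('  ')'  'x'  end    */
  /* 0 */    {  2,   0,   3,  0      },
  /* 1 */    {  0,   0,   0,  ACCEPT },
  /* 2 */    {  2,   0,   3,  0      },
  /* 3 */    {  0,  -2,   0,  -2     },
  /* 4 */    {  0,   5,   0,  0      },
  /* 5 */    {  0,  -1,   0,  -1     },
};

/* State to push after reducing to S, indexed by the state that the
   reduction uncovers on the stack.  Only entries 0 and 2 are used.  */
static const int goto_S[6] = { 1, 0, 4, 0, 0, 0 };

/* Number of right-hand-side symbols of each rule (index 0 unused).  */
static const int rule_length[3] = { 0, 3, 1 };

static int
next_token (const char **p)
{
  switch (**p)
    {
    case '(': ++*p; return T_LPAREN;
    case ')': ++*p; return T_RPAREN;
    case 'x': ++*p; return T_X;
    case '\0': return T_END;
    default: return -1;         /* character outside the toy language */
    }
}

/* Return 0 if INPUT is accepted, 1 on a syntax error, and 2 if the
   state stack would overflow.  */
int
toy_parse (const char *input)
{
  int stack[MAX_DEPTH];
  size_t top = 0;
  int token;

  assert (input != NULL);
  stack[0] = 0;
  token = next_token (&input);
  for (;;)
    {
      int act;

      if (token < 0)
        return 1;
      act = action[stack[top]][token];
      if (act == ACCEPT)
        return 0;
      else if (act > 0)
        {
          /* Shift: push the new state and read the next token.  */
          if (top + 1 == MAX_DEPTH)
            return 2;
          stack[++top] = act;
          token = next_token (&input);
        }
      else if (act < 0)
        {
          /* Reduce: pop one state per right-hand-side symbol, then
             push the state for the resulting grouping.  */
          int exposed;

          top -= (size_t) rule_length[-act];
          exposed = stack[top];
          stack[++top] = goto_S[exposed];
        }
      else
        return 1;
    }
}
```

In this sketch, @code{toy_parse ("((x))")} returns 0, while
@code{toy_parse ("(x")} returns 1 because the end of input is erroneous
in the state reached after @samp{(x}.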
	4937
	4938	@node Reduce/Reduce
	4939	@section Reduce/Reduce Conflicts
	4940	@cindex reduce/reduce conflict
	4941	@cindex conflicts, reduce/reduce
	4942
	4943	A reduce/reduce conflict occurs if there are two or more rules that apply
	4944	to the same sequence of input. This usually indicates a serious error
	4945	in the grammar.
	4946
	4947	For example, here is an erroneous attempt to define a sequence
	4948	of zero or more @code{word} groupings.
	4949
	4950	@example
	4951	sequence: /* empty */
	4952	@{ printf ("empty sequence\n"); @}
	4953	\| maybeword
	4954	\| sequence word
	4955	@{ printf ("added word %s\n", $2); @}
	4956	;
	4957
	4958	maybeword: /* empty */
	4959	@{ printf ("empty maybeword\n"); @}
	4960	\| word
	4961	@{ printf ("single word %s\n", $1); @}
	4962	;
	4963	@end example
	4964
	4965	@noindent
	4966	The error is an ambiguity: there is more than one way to parse a single
	4967	@code{word} into a @code{sequence}. It could be reduced to a
	4968	@code{maybeword} and then into a @code{sequence} via the second rule.
	4969	Alternatively, nothing-at-all could be reduced into a @code{sequence}
	4970	via the first rule, and this could be combined with the @code{word}
	4971	using the third rule for @code{sequence}.
	4972
	4973	There is also more than one way to reduce nothing-at-all into a
	4974	@code{sequence}. This can be done directly via the first rule,
	4975	or indirectly via @code{maybeword} and then the second rule.
	4976
	4977	You might think that this is a distinction without a difference, because it
	4978	does not change whether any particular input is valid or not. But it does
	4979	affect which actions are run. One parsing order runs the second rule's
	4980	action; the other runs the first rule's action and the third rule's action.
	4981	In this example, the output of the program changes.
	4982
	4983	Bison resolves a reduce/reduce conflict by choosing to use the rule that
	4984	appears first in the grammar, but it is very risky to rely on this. Every
	4985	reduce/reduce conflict must be studied and usually eliminated. Here is the
	4986	proper way to define @code{sequence}:
	4987
	4988	@example
	4989	sequence: /* empty */
	4990	@{ printf ("empty sequence\n"); @}
	4991	\| sequence word
	4992	@{ printf ("added word %s\n", $2); @}
	4993	;
	4994	@end example
	4995
	4996	Here is another common error that yields a reduce/reduce conflict:
	4997
	4998	@example
	4999	sequence: /* empty */
	5000	\| sequence words
	5001	\| sequence redirects
	5002	;
	5003
	5004	words: /* empty */
	5005	\| words word
	5006	;
	5007
	5008	redirects:/* empty */
	5009	\| redirects redirect
	5010	;
	5011	@end example
	5012
	5013	@noindent
	5014	The intention here is to define a sequence which can contain either
	5015	@code{word} or @code{redirect} groupings. The individual definitions of
	5016	@code{sequence}, @code{words} and @code{redirects} are error-free, but the
	5017	three together make a subtle ambiguity: even an empty input can be parsed
	5018	in infinitely many ways!
	5019
	5020	Consider: nothing-at-all could be a @code{words}. Or it could be two
	5021	@code{words} in a row, or three, or any number. It could equally well be a
	5022	@code{redirects}, or two, or any number. Or it could be a @code{words}
	5023	followed by three @code{redirects} and another @code{words}. And so on.
	5024
	5025	Here are two ways to correct these rules. First, to make it a single level
	5026	of sequence:
	5027
	5028	@example
	5029	sequence: /* empty */
	5030	\| sequence word
	5031	\| sequence redirect
	5032	;
	5033	@end example
	5034
	5035	Second, to prevent either a @code{words} or a @code{redirects}
	5036	from being empty:
	5037
	5038	@example
	5039	sequence: /* empty */
	5040	\| sequence words
	5041	\| sequence redirects
	5042	;
	5043
	5044	words: word
	5045	\| words word
	5046	;
	5047
	5048	redirects:redirect
	5049	\| redirects redirect
	5050	;
	5051	@end example
	5052
	5053	@node Mystery Conflicts
	5054	@section Mysterious Reduce/Reduce Conflicts
	5055
	5056	Sometimes reduce/reduce conflicts can occur that don't look warranted.
	5057	Here is an example:
	5058
	5059	@example
	5060	@group
	5061	%token ID
	5062
	5063	%%
	5064	def: param_spec return_spec ','
	5065	;
	5066	param_spec:
	5067	type
	5068	\| name_list ':' type
	5069	;
	5070	@end group
	5071	@group
	5072	return_spec:
	5073	type
	5074	\| name ':' type
	5075	;
	5076	@end group
	5077	@group
	5078	type: ID
	5079	;
	5080	@end group
	5081	@group
	5082	name: ID
	5083	;
	5084	name_list:
	5085	name
	5086	\| name ',' name_list
	5087	;
	5088	@end group
	5089	@end example
	5090
	5091	It would seem that this grammar can be parsed with only a single token
	5092	of look-ahead: when a @code{param_spec} is being read, an @code{ID} is
	5093	a @code{name} if a comma or colon follows, or a @code{type} if another
	5094	@code{ID} follows. In other words, this grammar is @acronym{LR}(1).
	5095
	5096	@cindex @acronym{LR}(1)
	5097	@cindex @acronym{LALR}(1)
	5098	However, Bison, like most parser generators, cannot actually handle all
	5099	@acronym{LR}(1) grammars. In this grammar, two contexts, that after
	5100	an @code{ID}
	5101	at the beginning of a @code{param_spec} and likewise at the beginning of
	5102	a @code{return_spec}, are similar enough that Bison assumes they are the
	5103	same. They appear similar because the same set of rules would be
	5104	active---the rule for reducing to a @code{name} and that for reducing to
	5105	a @code{type}. Bison is unable to determine at that stage of processing
	5106	that the rules would require different look-ahead tokens in the two
	5107	contexts, so it makes a single parser state for them both. Combining
	5108	the two contexts causes a conflict later. In parser terminology, this
	5109	occurrence means that the grammar is not @acronym{LALR}(1).
	5110
	5111	In general, it is better to fix deficiencies than to document them. But
	5112	this particular deficiency is intrinsically hard to fix; parser
	5113	generators that can handle @acronym{LR}(1) grammars are hard to write
	5114	and tend to
	5115	produce parsers that are very large. In practice, Bison is more useful
	5116	as it is now.
	5117
	5118	When the problem arises, you can often fix it by identifying the two
	5119	parser states that are being confused, and adding something to make them
	5120	look distinct. In the above example, adding one rule to
	5121	@code{return_spec} as follows makes the problem go away:
	5122
	5123	@example
	5124	@group
	5125	%token BOGUS
	5126	@dots{}
	5127	%%
	5128	@dots{}
	5129	return_spec:
	5130	type
	5131	\| name ':' type
	5132	/* This rule is never used. */
	5133	\| ID BOGUS
	5134	;
	5135	@end group
	5136	@end example
	5137
	5138	This corrects the problem because it introduces the possibility of an
	5139	additional active rule in the context after the @code{ID} at the beginning of
	5140	@code{return_spec}. This rule is not active in the corresponding context
	5141	in a @code{param_spec}, so the two contexts receive distinct parser states.
	5142	As long as the token @code{BOGUS} is never generated by @code{yylex},
	5143	the added rule cannot alter the way actual input is parsed.
	5144
	5145	In this particular example, there is another way to solve the problem:
	5146	rewrite the rule for @code{return_spec} to use @code{ID} directly
	5147	instead of via @code{name}. This also causes the two confusing
	5148	contexts to have different sets of active rules, because the one for
	5149	@code{return_spec} activates the altered rule for @code{return_spec}
	5150	rather than the one for @code{name}.
	5151
	5152	@example
	5153	param_spec:
	5154	type
	5155	\| name_list ':' type
	5156	;
	5157	return_spec:
	5158	type
	5159	\| ID ':' type
	5160	;
	5161	@end example
	5162
	5163	@node Generalized LR Parsing
	5164	@section Generalized @acronym{LR} (@acronym{GLR}) Parsing
	5165	@cindex @acronym{GLR} parsing
	5166	@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
	5167	@cindex ambiguous grammars
	5168	@cindex non-deterministic parsing
	5169
	5170	Bison produces @emph{deterministic} parsers that choose uniquely
	5171	when to reduce and which reduction to apply
	5172	based on a summary of the preceding input and on one extra token of look-ahead.
	5173	As a result, normal Bison handles a proper subset of the family of
	5174	context-free languages.
	5175	Ambiguous grammars, since they have strings with more than one possible
	5176	sequence of reductions, cannot have deterministic parsers in this sense.
	5177	The same is true of languages that require more than one symbol of
	5178	look-ahead, since the parser lacks the information necessary to make a
	5179	decision at the point it must be made in a shift-reduce parser.
	5180	Finally, as previously mentioned (@pxref{Mystery Conflicts}),
	5181	there are languages where Bison's particular choice of how to
	5182	summarize the input seen so far loses necessary information.
	5183
	5184	When you use the @samp{%glr-parser} declaration in your grammar file,
	5185	Bison generates a parser that uses a different algorithm, called
	5186	Generalized @acronym{LR} (or @acronym{GLR}). A Bison @acronym{GLR}
	5187	parser uses the same basic
	5188	algorithm for parsing as an ordinary Bison parser, but behaves
	5189	differently in cases where there is a shift-reduce conflict that has not
	5190	been resolved by precedence rules (@pxref{Precedence}) or a
	5191	reduce-reduce conflict. When a @acronym{GLR} parser encounters such a
	5192	situation, it
	5193	effectively @emph{splits} into several parsers, one for each possible
	5194	shift or reduction. These parsers then proceed as usual, consuming
	5195	tokens in lock-step. Some of the stacks may encounter other conflicts
	5196	and split further, with the result that instead of a sequence of states,
	5197	a Bison @acronym{GLR} parsing stack is in effect a tree of states.
	5198
	5199	In effect, each stack represents a guess as to what the proper parse
	5200	is. Additional input may indicate that a guess was wrong, in which case
	5201	the appropriate stack silently disappears. Otherwise, the semantic
	5202	actions generated in each stack are saved, rather than being executed
	5203	immediately. When a stack disappears, its saved semantic actions never
	5204	get executed. When a reduction causes two stacks to become equivalent,
	5205	their sets of semantic actions are both saved with the state that
	5206	results from the reduction. We say that two stacks are equivalent
	5207	when they both represent the same sequence of states,
	5208	and each pair of corresponding states represents a
	5209	grammar symbol that produces the same segment of the input token
	5210	stream.
	5211
	5212	Whenever the parser makes a transition from having multiple
	5213	states to having one, it reverts to the normal @acronym{LALR}(1) parsing
	5214	algorithm, after resolving and executing the saved-up actions.
	5215	At this transition, some of the states on the stack will have semantic
	5216	values that are sets (actually multisets) of possible actions. The
	5217	parser tries to pick one of the actions by first finding one whose rule
	5218	has the highest dynamic precedence, as set by the @samp{%dprec}
	5219	declaration. Otherwise, if the alternative actions are not ordered by
	5220	precedence, but the same merging function is declared for both
	5221	rules by the @samp{%merge} declaration,
	5222	Bison resolves and evaluates both and then calls the merge function on
	5223	the result. Otherwise, it reports an ambiguity.
	5224
	5225	It is possible to use a data structure for the @acronym{GLR} parsing tree that
	5226	permits the processing of any @acronym{LALR}(1) grammar in linear time (in the
	5227	size of the input), any unambiguous (not necessarily
	5228	@acronym{LALR}(1)) grammar in
	5229	quadratic worst-case time, and any general (possibly ambiguous)
	5230	context-free grammar in cubic worst-case time. However, Bison currently
	5231	uses a simpler data structure that requires time proportional to the
	5232	length of the input times the maximum number of stacks required for any
	5233	prefix of the input. Thus, really ambiguous or non-deterministic
	5234	grammars can require exponential time and space to process. Such badly
	5235	behaving examples, however, are not generally of practical interest.
	5236	Usually, non-determinism in a grammar is local---the parser is ``in
	5237	doubt'' only for a few tokens at a time. Therefore, the current data
	5238	structure should generally be adequate. On @acronym{LALR}(1) portions of a
	5239	grammar, in particular, it is only slightly slower than with the default
	5240	Bison parser.
	5241
	5242	For a more detailed exposition of @acronym{GLR} parsers, please see: Elizabeth
	5243	Scott, Adrian Johnstone and Shamsa Sadaf Hussain, Tomita-Style
	5244	Generalised @acronym{LR} Parsers, Royal Holloway, University of
	5245	London, Department of Computer Science, TR-00-12,
	5246	@uref{http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps},
	5247	(2000-12-24).
	5248
	5249	@node Stack Overflow
	5250	@section Stack Overflow, and How to Avoid It
	5251	@cindex stack overflow
	5252	@cindex parser stack overflow
	5253	@cindex overflow of parser stack
	5254
	5255	The Bison parser stack can overflow if too many tokens are shifted and
	5256	not reduced. When this happens, the parser function @code{yyparse}
	5257	returns a nonzero value, pausing only to call @code{yyerror} to report
	5258	the overflow.
	5259
	5260	Because Bison parsers have growing stacks, hitting the upper limit
	5261	usually results from using a right recursion instead of a left
	5262	recursion (@pxref{Recursion, ,Recursive Rules}).
	5263
	5264	@vindex YYMAXDEPTH
	5265	By defining the macro @code{YYMAXDEPTH}, you can control how deep the
	5266	parser stack can become before a stack overflow occurs. Define the
	5267	macro with a value that is an integer. This value is the maximum number
	5268	of tokens that can be shifted (and not reduced) before overflow.
	5269	It must be a constant expression whose value is known at compile time.
	5270
	5271	The stack space allowed is not necessarily allocated. If you specify a
	5272	large value for @code{YYMAXDEPTH}, the parser actually allocates a small
	5273	stack at first, and then makes it bigger by stages as needed. This
	5274	increasing allocation happens automatically and silently. Therefore,
	5275	you do not need to make @code{YYMAXDEPTH} painfully small merely to save
	5276	space for ordinary inputs that do not need much stack.
	5277
	5278	@cindex default stack limit
	5279	The default value of @code{YYMAXDEPTH}, if you do not define it, is
	5280	10000.
	5281
	5282	@vindex YYINITDEPTH
	5283	You can control how much stack is allocated initially by defining the
	5284	macro @code{YYINITDEPTH}. This value too must be a compile-time
	5285	constant integer. The default is 200.
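The staged growth described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the code Bison
generates: @code{INIT_DEPTH} and @code{MAX_DEPTH} are hypothetical
constants that play the roles of @code{YYINITDEPTH} and
@code{YYMAXDEPTH}, with small values so that the behavior is easy to
follow, and the doubling policy is an assumption of the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical limits standing in for YYINITDEPTH and YYMAXDEPTH.  */
enum { INIT_DEPTH = 4, MAX_DEPTH = 10 };

struct state_stack
{
  int *base;        /* heap storage, NULL until the first push */
  size_t size;      /* number of states currently stored */
  size_t capacity;  /* number of states BASE can hold */
};

/* Push STATE, growing the storage by stages up to MAX_DEPTH.
   Return 0 on success, 1 on overflow or memory exhaustion.  */
int
stack_push (struct state_stack *s, int state)
{
  assert (s != NULL);
  if (s->size == s->capacity)
    {
      size_t new_capacity;
      int *new_base;

      if (s->capacity >= MAX_DEPTH)
        return 1;               /* the configured limit is reached */
      new_capacity = s->capacity ? 2 * s->capacity : INIT_DEPTH;
      if (new_capacity > MAX_DEPTH)
        new_capacity = MAX_DEPTH;
      new_base = realloc (s->base, new_capacity * sizeof *new_base);
      if (new_base == NULL)
        return 1;
      s->base = new_base;
      s->capacity = new_capacity;
    }
  s->base[s->size++] = state;
  return 0;
}

/* Release the storage and return the stack to its empty state.  */
void
stack_free (struct state_stack *s)
{
  assert (s != NULL);
  free (s->base);
  s->base = NULL;
  s->size = s->capacity = 0;
}
```

Starting from an empty stack, the first ten calls to @code{stack_push}
succeed while the capacity grows from 4 to 8 to 10, and the eleventh call
reports an overflow.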
	5286
	5287	@c FIXME: C++ output.
	5288	Because of semantic differences between C and C++, the
	5289	@acronym{LALR}(1) parsers
	5290	in C produced by Bison cannot grow when compiled as C++. In this precise
	5291	case (compiling a C parser as C++), you should increase
	5292	@code{YYINITDEPTH}. In the near future, a C++ output will be
	5293	provided which addresses this issue.
	5294
	5295	@node Error Recovery
	5296	@chapter Error Recovery
	5297	@cindex error recovery
	5298	@cindex recovery from errors
	5299
	5300	It is not usually acceptable to have a program terminate on a syntax
	5301	error. For example, a compiler should recover sufficiently to parse the
	5302	rest of the input file and check it for errors; a calculator should accept
	5303	another expression.
	5304
	5305	In a simple interactive command parser where each input is one line, it may
	5306	be sufficient to allow @code{yyparse} to return 1 on error and have the
	5307	caller ignore the rest of the input line when that happens (and then call
	5308	@code{yyparse} again). But this is inadequate for a compiler, because it
	5309	forgets all the syntactic context leading up to the error. A syntax error
	5310	deep within a function in the compiler input should not cause the compiler
	5311	to treat the following line like the beginning of a source file.
	5312
	5313	@findex error
	5314	You can define how to recover from a syntax error by writing rules to
	5315	recognize the special token @code{error}. This is a terminal symbol that
	5316	is always defined (you need not declare it) and reserved for error
	5317	handling. The Bison parser generates an @code{error} token whenever a
	5318	syntax error happens; if you have provided a rule to recognize this token
	5319	in the current context, the parse can continue.
	5320
	5321	For example:
	5322
	5323	@example
	5324	stmnts: /* empty string */
	5325	\| stmnts '\n'
	5326	\| stmnts exp '\n'
	5327	\| stmnts error '\n'
	5328	@end example
	5329
	5330	The fourth rule in this example says that an error followed by a newline
	5331	makes a valid addition to any @code{stmnts}.
	5332
	5333	What happens if a syntax error occurs in the middle of an @code{exp}? The
	5334	error recovery rule, interpreted strictly, applies to the precise sequence
	5335	of a @code{stmnts}, an @code{error} and a newline. If an error occurs in
	5336	the middle of an @code{exp}, there will probably be some additional tokens
	5337	and subexpressions on the stack after the last @code{stmnts}, and there
	5338	will be tokens to read before the next newline. So the rule is not
	5339	applicable in the ordinary way.
	5340
	5341	But Bison can force the situation to fit the rule, by discarding part of
	5342	the semantic context and part of the input. First it discards states
	5343	and objects from the stack until it gets back to a state in which the
	5344	@code{error} token is acceptable. (This means that the subexpressions
	5345	already parsed are discarded, back to the last complete @code{stmnts}.)
	5346	At this point the @code{error} token can be shifted. Then, if the old
	5347	look-ahead token is not acceptable to be shifted next, the parser reads
	5348	tokens and discards them until it finds a token which is acceptable. In
	5349	this example, Bison reads and discards input until the next newline so
	5350	that the fourth rule can apply. Note that discarded symbols are
	5351	possible sources of memory leaks; see @ref{Destructor Decl, , Freeing
	5352	Discarded Symbols}, for a means to reclaim this memory.
	5353
	5354	The choice of error rules in the grammar is a choice of strategies for
	5355	error recovery. A simple and useful strategy is simply to skip the rest of
	5356	the current input line or current statement if an error is detected:
	5357
	5358	@example
	5359	stmnt: error ';' /* On error, skip until ';' is read. */
	5360	@end example
	5361
	5362	It is also useful to recover to the matching close-delimiter of an
	5363	opening-delimiter that has already been parsed. Otherwise the
	5364	close-delimiter will probably appear to be unmatched, and generate another,
	5365	spurious error message:
	5366
	5367	@example
	5368	primary: '(' expr ')'
	5369	\| '(' error ')'
	5370	@dots{}
	5371	;
	5372	@end example
	5373
	5374	Error recovery strategies are necessarily guesses. When they guess wrong,
	5375	one syntax error often leads to another. In the above example, the error
	5376	recovery rule guesses that an error is due to bad input within one
	5377	@code{stmnt}. Suppose that instead a spurious semicolon is inserted in the
	5378	middle of a valid @code{stmnt}. After the error recovery rule recovers
	5379	from the first error, another syntax error will be found straightaway,
	5380	since the text following the spurious semicolon is also an invalid
	5381	@code{stmnt}.
	5382
	5383	To prevent an outpouring of error messages, the parser will output no error
	5384	message for another syntax error that happens shortly after the first; only
	5385	after three consecutive input tokens have been successfully shifted will
	5386	error messages resume.
	5387
	5388	Note that rules which accept the @code{error} token may have actions, just
	5389	as any other rules can.
	5390
	5391	@findex yyerrok
	5392	You can make error messages resume immediately by using the macro
	5393	@code{yyerrok} in an action. If you do this in the error rule's action, no
	5394	error messages will be suppressed. This macro requires no arguments;
	5395	@samp{yyerrok;} is a valid C statement.
	5396
	5397	@findex yyclearin
	5398	The previous look-ahead token is reanalyzed immediately after an error. If
	5399	this is unacceptable, then the macro @code{yyclearin} may be used to clear
	5400	this token. Write the statement @samp{yyclearin;} in the error rule's
	5401	action.
	5402
	5403	For example, suppose that on a syntax error, an error handling routine is
	5404	called that advances the input stream to some point where parsing should
	5405	once again commence. The next symbol returned by the lexical scanner is
	5406	probably correct. The previous look-ahead token ought to be discarded
	5407	with @samp{yyclearin;}.
	5408
	5409	@vindex YYRECOVERING
	5410	The macro @code{YYRECOVERING} stands for an expression that has the
	5411	value 1 when the parser is recovering from a syntax error, and 0 the
	5412	rest of the time. A value of 1 indicates that error messages are
	5413	currently suppressed for new syntax errors.
	5414
	5415	@node Context Dependency
	5416	@chapter Handling Context Dependencies
	5417
	5418	The Bison paradigm is to parse tokens first, then group them into larger
	5419	syntactic units. In many languages, the meaning of a token is affected by
	5420	its context. Although this violates the Bison paradigm, certain techniques
	5421	(known as @dfn{kludges}) may enable you to write Bison parsers for such
	5422	languages.
	5423
	5424	@menu
	5425	* Semantic Tokens:: Token parsing can depend on the semantic context.
	5426	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	5427	* Tie-in Recovery:: Lexical tie-ins have implications for how
	5428	error recovery rules must be written.
	5429	@end menu
	5430
	5431	(Actually, ``kludge'' means any technique that gets its job done but is
	5432	neither clean nor robust.)
	5433
	5434	@node Semantic Tokens
	5435	@section Semantic Info in Token Types
	5436
	5437	The C language has a context dependency: the way an identifier is used
	5438	depends on what its current meaning is. For example, consider this:
	5439
	5440	@example
	5441	foo (x);
	5442	@end example
	5443
	5444	This looks like a function call statement, but if @code{foo} is a typedef
	5445	name, then this is actually a declaration of @code{x}. How can a Bison
	5446	parser for C decide how to parse this input?
	5447
	5448	The method used in @acronym{GNU} C is to have two different token types,
	5449	@code{IDENTIFIER} and @code{TYPENAME}. When @code{yylex} finds an
	5450	identifier, it looks up the current declaration of the identifier in order
	5451	to decide which token type to return: @code{TYPENAME} if the identifier is
	5452	declared as a typedef, @code{IDENTIFIER} otherwise.
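The lookup performed by @code{yylex} might be sketched as follows. This
is a minimal illustration, not the code used in @acronym{GNU} C: the flat
table without scopes, the token codes, and the functions
@code{declare_typedef} and @code{classify_identifier} are all assumptions
of the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; a real parser takes them from the
   definitions that Bison generates.  */
enum { IDENTIFIER = 258, TYPENAME = 259 };

enum { MAX_TYPEDEFS = 32 };
static const char *typedef_names[MAX_TYPEDEFS];
static size_t n_typedefs;

/* Record NAME as a typedef name.  Return 0 on success, 1 if the table
   is full.  NAME is not copied, so it must outlive the table.  */
int
declare_typedef (const char *name)
{
  assert (name != NULL);
  if (n_typedefs == MAX_TYPEDEFS)
    return 1;
  typedef_names[n_typedefs++] = name;
  return 0;
}

/* Return the token type the lexer should use for identifier NAME.  */
int
classify_identifier (const char *name)
{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < n_typedefs; i++)
    if (strcmp (typedef_names[i], name) == 0)
      return TYPENAME;
  return IDENTIFIER;
}
```

A parser action for a typedef declaration would call
@code{declare_typedef}, after which the lexer reports later uses of that
name as @code{TYPENAME}.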
	5453
	5454	The grammar rules can then express the context dependency by the choice of
	5455	token type to recognize. @code{IDENTIFIER} is accepted as an expression,
	5456	but @code{TYPENAME} is not. @code{TYPENAME} can start a declaration, but
	5457	@code{IDENTIFIER} cannot. In contexts where the meaning of the identifier
	5458	is @emph{not} significant, such as in declarations that can shadow a
	5459	typedef name, either @code{TYPENAME} or @code{IDENTIFIER} is
	5460	accepted---there is one rule for each of the two token types.
	5461
	5462	This technique is simple to use if the decision of which kinds of
	5463	identifiers to allow is made at a place close to where the identifier is
	5464	parsed. But in C this is not always so: C allows a declaration to
	5465	redeclare a typedef name provided an explicit type has been specified
	5466	earlier:
	5467
	5468	@example
	5469	typedef int foo, bar, lose;
	5470	static foo (bar); /* @r{redeclare @code{bar} as static variable} */
	5471	static int foo (lose); /* @r{redeclare @code{foo} as function} */
	5472	@end example
	5473
	5474	Unfortunately, the name being declared is separated from the declaration
	5475	construct itself by a complicated syntactic structure---the ``declarator''.
	5476
	5477	As a result, part of the Bison parser for C needs to be duplicated, with
	5478	all the nonterminal names changed: once for parsing a declaration in
	5479	which a typedef name can be redefined, and once for parsing a
	5480	declaration in which that can't be done. Here is a part of the
	5481	duplication, with actions omitted for brevity:
	5482
	5483	@example
	5484	initdcl:
	5485	declarator maybeasm '='
	5486	init
	5487	\| declarator maybeasm
	5488	;
	5489
	5490	notype_initdcl:
	5491	notype_declarator maybeasm '='
	5492	init
	5493	\| notype_declarator maybeasm
	5494	;
	5495	@end example
	5496
	5497	@noindent
	5498	Here @code{initdcl} can redeclare a typedef name, but @code{notype_initdcl}
	5499	cannot. The distinction between @code{declarator} and
	5500	@code{notype_declarator} is the same sort of thing.
	5501
	5502	There is some similarity between this technique and a lexical tie-in
	5503	(described next), in that information which alters the lexical analysis is
	5504	changed during parsing by other parts of the program. The difference is
	5505	that here the information is global, and is used for other purposes in the
	5506	program. A true lexical tie-in has a special-purpose flag controlled by
	5507	the syntactic context.
	5508
	5509	@node Lexical Tie-ins
	5510	@section Lexical Tie-ins
	5511	@cindex lexical tie-in
	5512
	5513	One way to handle context-dependency is the @dfn{lexical tie-in}: a flag
	5514	which is set by Bison actions, whose purpose is to alter the way tokens are
	5515	parsed.
	5516
	5517	For example, suppose we have a language vaguely like C, but with a special
	5518	construct @samp{hex (@var{hex-expr})}. After the keyword @code{hex} comes
	5519	an expression in parentheses in which all integers are hexadecimal. In
	5520	particular, the token @samp{a1b} must be treated as an integer rather than
	5521	as an identifier if it appears in that context. Here is how you can do it:
	5522
	5523	@example
	5524	@group
	5525	%@{
	5526	int hexflag;
	5527	int yylex (void);
	5528	void yyerror (char const *);
	5529	%@}
	5530	%%
	5531	@dots{}
	5532	@end group
	5533	@group
	5534	expr: IDENTIFIER
	5535	\| constant
	5536	\| HEX '('
	5537	@{ hexflag = 1; @}
	5538	expr ')'
	5539	@{ hexflag = 0;
	5540	$$ = $4; @}
	5541	\| expr '+' expr
	5542	@{ $$ = make_sum ($1, $3); @}
	5543	@dots{}
	5544	;
	5545	@end group
	5546
	5547	@group
	5548	constant:
	5549	INTEGER
	5550	\| STRING
	5551	;
	5552	@end group
	5553	@end example
	5554
	5555	@noindent
	5556	Here we assume that @code{yylex} looks at the value of @code{hexflag}; when
	5557	it is nonzero, all integers are parsed in hexadecimal, and tokens starting
	5558	with letters are parsed as integers if possible.
	5559
	5560	The declaration of @code{hexflag} shown in the prologue of the parser file
	5561	is needed to make it accessible to the actions (@pxref{Prologue, ,The Prologue}).
	5562	You must also write the code in @code{yylex} to obey the flag.
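The part of @code{yylex} that obeys the flag might look like the following
minimal sketch. It assumes that the text of one token has already been
isolated, and the function name @code{classify_token}, the token codes and
the @code{BAD_TOKEN} result are hypothetical. Overflow of the integer
value is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token codes; a real parser takes them from the
   definitions that Bison generates.  */
enum { IDENTIFIER = 258, INTEGER = 259, BAD_TOKEN = 260 };

/* Classify the complete token text TEXT.  When HEXFLAG is nonzero,
   integers are read in hexadecimal, and a token starting with a letter
   is an integer if all of it is a valid hexadecimal number.  The value
   of an INTEGER is stored in *VALUE.  */
int
classify_token (const char *text, int hexflag, long *value)
{
  unsigned char first;

  assert (text != NULL && value != NULL);
  first = (unsigned char) *text;
  if (hexflag ? isxdigit (first) : isdigit (first))
    {
      char *end;
      long n = strtol (text, &end, hexflag ? 16 : 10);

      /* Accept the number only if it covers the whole token.  */
      if (end != text && *end == '\0')
        {
          *value = n;
          return INTEGER;
        }
    }
  if (isalpha (first) || first == '_')
    return IDENTIFIER;
  return BAD_TOKEN;
}
```

With the flag set, @samp{a1b} is classified as an @code{INTEGER} with the
value @code{0xa1b}. With the flag clear, the same text is an
@code{IDENTIFIER}.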
	5563
	5564	@node Tie-in Recovery
	5565	@section Lexical Tie-ins and Error Recovery
	5566
	5567	Lexical tie-ins make strict demands on any error recovery rules you have.
	5568	@xref{Error Recovery}.
	5569
	5570	The reason for this is that the purpose of an error recovery rule is to
	5571	abort the parsing of one construct and resume in some larger construct.
	5572	For example, in C-like languages, a typical error recovery rule is to skip
	5573	tokens until the next semicolon, and then start a new statement, like this:
	5574
	5575	@example
	5576	stmt: expr ';'
	5577	\| IF '(' expr ')' stmt @{ @dots{} @}
	5578	@dots{}
	5579	\| error ';'
	5580	@{ hexflag = 0; @}
	5581	;
	5582	@end example
	5583
	5584	If there is a syntax error in the middle of a @samp{hex (@var{expr})}
	5585	construct, this error rule will apply, and then the action for the
	5586	completed @samp{hex (@var{expr})} will never run. So @code{hexflag} would
	5587	remain set for the entire rest of the input, or until the next @code{hex}
	5588	keyword, causing identifiers to be misinterpreted as integers.
	5589
	5590	To avoid this problem the error recovery rule itself clears @code{hexflag}.
	5591
	5592	There may also be an error recovery rule that works within expressions.
	5593	For example, there could be a rule which applies within parentheses
	5594	and skips to the close-parenthesis:
	5595
	5596	@example
	5597	@group
	5598	expr: @dots{}
	5599	\| '(' expr ')'
	5600	@{ $$ = $2; @}
	5601	\| '(' error ')'
	5602	@dots{}
	5603	@end group
	5604	@end example
	5605
	5606	If this rule acts within the @code{hex} construct, it is not going to abort
	5607	that construct (since it applies to an inner level of parentheses within
	5608	the construct). Therefore, it should not clear the flag: the rest of
	5609	the @code{hex} construct should be parsed with the flag still in effect.
	5610
	5611	What if there is an error recovery rule which might abort out of the
	5612	@code{hex} construct or might not, depending on circumstances? There is no
	5613	way you can write the action to determine whether a @code{hex} construct is
	5614	being aborted or not. So if you are using a lexical tie-in, you had better
	5615	make sure your error recovery rules are not of this kind. Each rule must
	5616	be such that you can be sure that it always will, or always won't, have to
	5617	clear the flag.
	5618
	5619	@c ================================================== Debugging Your Parser
	5620
	5621	@node Debugging
	5622	@chapter Debugging Your Parser
	5623
	5624	Developing a parser can be a challenge, especially if you don't
	5625	understand the algorithm (@pxref{Algorithm, ,The Bison Parser
	5626	Algorithm}). Even so, sometimes a detailed description of the automaton
	5627	can help (@pxref{Understanding, , Understanding Your Parser}), or
	5628	tracing the execution of the parser can give some insight on why it
	5629	behaves improperly (@pxref{Tracing, , Tracing Your Parser}).
	5630
	5631	@menu
	5632	* Understanding:: Understanding the structure of your parser.
	5633	* Tracing:: Tracing the execution of your parser.
	5634	@end menu
	5635
	5636	@node Understanding
	5637	@section Understanding Your Parser
	5638
	5639	As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm})
	5640	Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
	5641	frequent than one would hope), looking at this automaton is required to
	5642	tune or simply fix a parser. Bison provides two different
	5643	representations of it, either textually or graphically (as a @acronym{VCG}
	5644	file).
	5645
	5646	The textual file is generated when the options @option{--report} or
	5647	@option{--verbose} are specified; see @ref{Invocation, , Invoking
	5648	Bison}. Its name is made by removing @samp{.tab.c} or @samp{.c} from
	5649	the parser output file name, and adding @samp{.output} instead.
	5650	Therefore, if the input file is @file{foo.y}, then the parser file is
	5651	called @file{foo.tab.c} by default. As a consequence, the verbose
	5652	output file is called @file{foo.output}.
	5653
	5654	The following grammar file, @file{calc.y}, will be used in the sequel:
	5655
	5656	@example
	5657	%token NUM STR
	5658	%left '+' '-'
	5659	%left '*'
	5660	%%
	5661	exp: exp '+' exp
	5662	| exp '-' exp
	5663	| exp '*' exp
	5664	| exp '/' exp
	5665	| NUM
	5666	;
	5667	useless: STR;
	5668	%%
	5669	@end example
	5670
	5671	@command{bison} reports:
	5672
	5673	@example
	5674	calc.y: warning: 1 useless nonterminal and 1 useless rule
	5675	calc.y:11.1-7: warning: useless nonterminal: useless
	5676	calc.y:11.10-12: warning: useless rule: useless: STR
	5677	calc.y: conflicts: 7 shift/reduce
	5678	@end example
	5679
	5680	When given @option{--report=state}, in addition to @file{calc.tab.c}, it
	5681	creates a file @file{calc.output} with contents detailed below. The
	5682	order of the output and the exact presentation might vary, but the
	5683	interpretation is the same.
	5684
	5685	The first section includes details on conflicts that were solved thanks
	5686	to precedence and/or associativity:
	5687
	5688	@example
	5689	Conflict in state 8 between rule 1 and token '+' resolved as reduce.
	5690	Conflict in state 8 between rule 1 and token '-' resolved as reduce.
	5691	Conflict in state 8 between rule 1 and token '*' resolved as shift.
	5692	@exdent @dots{}
	5693	@end example
	5694
	5695	@noindent
	5696	The next section lists states that still have conflicts.
	5697
	5698	@example
	5699	State 8 conflicts: 1 shift/reduce
	5700	State 9 conflicts: 1 shift/reduce
	5701	State 10 conflicts: 1 shift/reduce
	5702	State 11 conflicts: 4 shift/reduce
	5703	@end example
	5704
	5705	@noindent
	5706	@cindex token, useless
	5707	@cindex useless token
	5708	@cindex nonterminal, useless
	5709	@cindex useless nonterminal
	5710	@cindex rule, useless
	5711	@cindex useless rule
	5712	The next section reports useless tokens, nonterminals and rules. Useless
	5713	nonterminals and rules are removed in order to produce a smaller parser,
	5714	but useless tokens are preserved, since they might be used by the
	5715	scanner (note the difference between ``useless'' and ``not used''
	5716	below):
	5717
	5718	@example
	5719	Useless nonterminals:
	5720	useless
	5721
	5722	Terminals which are not used:
	5723	STR
	5724
	5725	Useless rules:
	5726	#6 useless: STR;
	5727	@end example
	5728
	5729	@noindent
	5730	The next section reproduces the exact grammar that Bison used:
	5731
	5732	@example
	5733	Grammar
	5734
	5735	Number, Line, Rule
	5736	0 5 $accept -> exp $end
	5737	1 5 exp -> exp '+' exp
	5738	2 6 exp -> exp '-' exp
	5739	3 7 exp -> exp '*' exp
	5740	4 8 exp -> exp '/' exp
	5741	5 9 exp -> NUM
	5742	@end example
	5743
	5744	@noindent
	5745	and reports the uses of the symbols:
	5746
	5747	@example
	5748	Terminals, with rules where they appear
	5749
	5750	$end (0) 0
	5751	'*' (42) 3
	5752	'+' (43) 1
	5753	'-' (45) 2
	5754	'/' (47) 4
	5755	error (256)
	5756	NUM (258) 5
	5757
	5758	Nonterminals, with rules where they appear
	5759
	5760	$accept (8)
	5761	on left: 0
	5762	exp (9)
	5763	on left: 1 2 3 4 5, on right: 0 1 2 3 4
	5764	@end example
	5765
	5766	@noindent
	5767	@cindex item
	5768	@cindex pointed rule
	5769	@cindex rule, pointed
	5770	Bison then proceeds onto the automaton itself, describing each state
	5771	with its set of @dfn{items}, also known as @dfn{pointed rules}. Each
	5772	item is a production rule together with a point (marked by @samp{.})
	5773	that marks the location of the input cursor.
	5774
	5775	@example
	5776	state 0
	5777
	5778	$accept -> . exp $ (rule 0)
	5779
	5780	NUM shift, and go to state 1
	5781
	5782	exp go to state 2
	5783	@end example
	5784
	5785	This reads as follows: ``state 0 corresponds to being at the very
	5786	beginning of the parsing, in the initial rule, right before the start
	5787	symbol (here, @code{exp}). When the parser returns to this state right
	5788	after having reduced a rule that produced an @code{exp}, the control
	5789	flow jumps to state 2. If there is no such transition on a nonterminal
	5790	symbol, and the lookahead is a @code{NUM}, then this token is shifted on
	5791	the parse stack, and the control flow jumps to state 1. Any other
	5792	lookahead triggers a syntax error.''
	5793
	5794	@cindex core, item set
	5795	@cindex item set core
	5796	@cindex kernel, item set
	5797	@cindex item set kernel
	5798	Even though the only active rule in state 0 seems to be rule 0, the
	5799	report lists @code{NUM} as a lookahead symbol because @code{NUM} can be
	5800	at the beginning of any rule deriving an @code{exp}. By default Bison
	5801	reports the so-called @dfn{core} or @dfn{kernel} of the item set, but if
	5802	you want to see more detail you can invoke @command{bison} with
	5803	@option{--report=itemset} to list all the items, including those that can
	5804	be derived:
	5805
	5806	@example
	5807	state 0
	5808
	5809	$accept -> . exp $ (rule 0)
	5810	exp -> . exp '+' exp (rule 1)
	5811	exp -> . exp '-' exp (rule 2)
	5812	exp -> . exp '*' exp (rule 3)
	5813	exp -> . exp '/' exp (rule 4)
	5814	exp -> . NUM (rule 5)
	5815
	5816	NUM shift, and go to state 1
	5817
	5818	exp go to state 2
	5819	@end example
	5820
	5821	@noindent
	5822	In the state 1...
	5823
	5824	@example
	5825	state 1
	5826
	5827	exp -> NUM . (rule 5)
	5828
	5829	$default reduce using rule 5 (exp)
	5830	@end example
	5831
	5832	@noindent
	5833	the rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead
	5834	(@samp{$default}), the parser will reduce it. If it was coming from
	5835	state 0, then, after this reduction it will return to state 0, and will
	5836	jump to state 2 (@samp{exp: go to state 2}).
	5837
	5838	@example
	5839	state 2
	5840
	5841	$accept -> exp . $ (rule 0)
	5842	exp -> exp . '+' exp (rule 1)
	5843	exp -> exp . '-' exp (rule 2)
	5844	exp -> exp . '*' exp (rule 3)
	5845	exp -> exp . '/' exp (rule 4)
	5846
	5847	$ shift, and go to state 3
	5848	'+' shift, and go to state 4
	5849	'-' shift, and go to state 5
	5850	'*' shift, and go to state 6
	5851	'/' shift, and go to state 7
	5852	@end example
	5853
	5854	@noindent
	5855	In state 2, the automaton can only shift a symbol. For instance,
	5856	because of the item @samp{exp -> exp . '+' exp}, if the lookahead is
	5857	@samp{+}, it will be shifted on the parse stack, and the automaton
	5858	control will jump to state 4, corresponding to the item @samp{exp -> exp
	5859	'+' . exp}. Since there is no default action, any token other than
	5860	those listed above will trigger a syntax error.
	5861
	5862	The state 3 is named the @dfn{final state}, or the @dfn{accepting
	5863	state}:
	5864
	5865	@example
	5866	state 3
	5867
	5868	$accept -> exp $ . (rule 0)
	5869
	5870	$default accept
	5871	@end example
	5872
	5873	@noindent
	5874	In this state the initial rule is completed (the start symbol and the end
	5875	of input were read), so the parsing exits successfully.
	5876
	5877	The interpretation of states 4 to 7 is straightforward, and is left to
	5878	the reader.
	5879
	5880	@example
	5881	state 4
	5882
	5883	exp -> exp '+' . exp (rule 1)
	5884
	5885	NUM shift, and go to state 1
	5886
	5887	exp go to state 8
	5888
	5889	state 5
	5890
	5891	exp -> exp '-' . exp (rule 2)
	5892
	5893	NUM shift, and go to state 1
	5894
	5895	exp go to state 9
	5896
	5897	state 6
	5898
	5899	exp -> exp '*' . exp (rule 3)
	5900
	5901	NUM shift, and go to state 1
	5902
	5903	exp go to state 10
	5904
	5905	state 7
	5906
	5907	exp -> exp '/' . exp (rule 4)
	5908
	5909	NUM shift, and go to state 1
	5910
	5911	exp go to state 11
	5912	@end example
	5913
	5914	As was announced at the beginning of the report, @samp{State 8 conflicts:
	5915	1 shift/reduce}:
	5916
	5917	@example
	5918	state 8
	5919
	5920	exp -> exp . '+' exp (rule 1)
	5921	exp -> exp '+' exp . (rule 1)
	5922	exp -> exp . '-' exp (rule 2)
	5923	exp -> exp . '*' exp (rule 3)
	5924	exp -> exp . '/' exp (rule 4)
	5925
	5926	'*' shift, and go to state 6
	5927	'/' shift, and go to state 7
	5928
	5929	'/' [reduce using rule 1 (exp)]
	5930	$default reduce using rule 1 (exp)
	5931	@end example
	5932
	5933	Indeed, there are two actions associated with the lookahead @samp{/}:
	5934	either shifting (and going to state 7), or reducing rule 1. The
	5935	conflict means that either the grammar is ambiguous, or the parser lacks
	5936	information to make the right decision. Indeed the grammar is
	5937	ambiguous, as, since we did not specify the precedence of @samp{/}, the
	5938	sentence @samp{NUM + NUM / NUM} can be parsed as @samp{NUM + (NUM /
	5939	NUM)}, which corresponds to shifting @samp{/}, or as @samp{(NUM + NUM) /
	5940	NUM}, which corresponds to reducing rule 1.
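
To make the two readings concrete, the following C sketch evaluates both groupings with integer arithmetic. The function names are invented for this illustration and are not part of any parser Bison generates.

```c
#include <assert.h>

/* NUM + (NUM / NUM): the '/' is shifted, so the division is reduced
   before the addition.  */
int
value_when_shifting (int a, int b, int c)
{
  assert (c != 0);
  return a + (b / c);
}

/* (NUM + NUM) / NUM: rule 1 is reduced first, so the addition is
   performed before the division.  */
int
value_when_reducing (int a, int b, int c)
{
  assert (c != 0);
  return (a + b) / c;
}
```

For the input @samp{8 + 4 / 2}, the first grouping gives 10 and the second gives 6, so the choice between shifting and reducing changes the result.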
	5941
	5942	Because in @acronym{LALR}(1) parsing only a single decision can be made, Bison
	5943	arbitrarily chose to disable the reduction; see @ref{Shift/Reduce, ,
	5944	Shift/Reduce Conflicts}. Discarded actions are reported in between
	5945	square brackets.
	5946
	5947	Note that all the previous states had a single possible action: either
	5948	shifting the next token and going to the corresponding state, or
	5949	reducing a single rule. In the other cases, i.e., when shifting
	5950	@emph{and} reducing is possible or when @emph{several} reductions are
	5951	possible, the lookahead is required to select the action. State 8 is
	5952	one such state: if the lookahead is @samp{*} or @samp{/} then the action
	5953	is shifting, otherwise the action is reducing rule 1. In other words,
	5954	the first two items, corresponding to rule 1, are not eligible when the
	5955	lookahead is @samp{*}, since we specified that @samp{*} has higher
	5956	precedence than @samp{+}. More generally, some items are eligible only
	5957	with some set of possible lookaheads. When run with
	5958	@option{--report=lookahead}, Bison specifies these lookaheads:
	5959
	5960	@example
	5961	state 8
	5962
	5963	exp -> exp . '+' exp [$, '+', '-', '/'] (rule 1)
	5964	exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1)
	5965	exp -> exp . '-' exp (rule 2)
	5966	exp -> exp . '*' exp (rule 3)
	5967	exp -> exp . '/' exp (rule 4)
	5968
	5969	'*' shift, and go to state 6
	5970	'/' shift, and go to state 7
	5971
	5972	'/' [reduce using rule 1 (exp)]
	5973	$default reduce using rule 1 (exp)
	5974	@end example
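
The action part of the state 8 report can be read as a small decision function. The C sketch below restates it that way; the enumeration and the function name are assumptions made for illustration and do not reflect how Bison encodes its tables.

```c
#include <assert.h>

enum action { SHIFT_TO_STATE_6, SHIFT_TO_STATE_7, REDUCE_RULE_1 };

/* Return the action listed for state 8 given LOOKAHEAD, a character
   token.  '$' stands for the end of input, as in the report.  */
enum action
state8_action (int lookahead)
{
  assert (lookahead >= 0);
  switch (lookahead)
    {
    case '*':
      return SHIFT_TO_STATE_6;
    case '/':
      /* The bracketed reduction was discarded in favor of the shift.  */
      return SHIFT_TO_STATE_7;
    default:
      /* $default: every other lookahead reduces rule 1.  */
      return REDUCE_RULE_1;
    }
}
```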
	5975
	5976	The remaining states are similar:
	5977
	5978	@example
	5979	state 9
	5980
	5981	exp -> exp . '+' exp (rule 1)
	5982	exp -> exp . '-' exp (rule 2)
	5983	exp -> exp '-' exp . (rule 2)
	5984	exp -> exp . '*' exp (rule 3)
	5985	exp -> exp . '/' exp (rule 4)
	5986
	5987	'*' shift, and go to state 6
	5988	'/' shift, and go to state 7
	5989
	5990	'/' [reduce using rule 2 (exp)]
	5991	$default reduce using rule 2 (exp)
	5992
	5993	state 10
	5994
	5995	exp -> exp . '+' exp (rule 1)
	5996	exp -> exp . '-' exp (rule 2)
	5997	exp -> exp . '*' exp (rule 3)
	5998	exp -> exp '*' exp . (rule 3)
	5999	exp -> exp . '/' exp (rule 4)
	6000
	6001	'/' shift, and go to state 7
	6002
	6003	'/' [reduce using rule 3 (exp)]
	6004	$default reduce using rule 3 (exp)
	6005
	6006	state 11
	6007
	6008	exp -> exp . '+' exp (rule 1)
	6009	exp -> exp . '-' exp (rule 2)
	6010	exp -> exp . '*' exp (rule 3)
	6011	exp -> exp . '/' exp (rule 4)
	6012	exp -> exp '/' exp . (rule 4)
	6013
	6014	'+' shift, and go to state 4
	6015	'-' shift, and go to state 5
	6016	'*' shift, and go to state 6
	6017	'/' shift, and go to state 7
	6018
	6019	'+' [reduce using rule 4 (exp)]
	6020	'-' [reduce using rule 4 (exp)]
	6021	'*' [reduce using rule 4 (exp)]
	6022	'/' [reduce using rule 4 (exp)]
	6023	$default reduce using rule 4 (exp)
	6024	@end example
	6025
	6026	@noindent
	6027	Observe that state 11 contains conflicts due to the lack of precedence
	6028	of @samp{/} with respect to @samp{+}, @samp{-}, and @samp{*}, but also because the
	6029	associativity of @samp{/} is not specified.
	6030
	6031
	6032	@node Tracing
	6033	@section Tracing Your Parser
	6034	@findex yydebug
	6035	@cindex debugging
	6036	@cindex tracing the parser
	6037
	6038	If a Bison grammar compiles properly but doesn't do what you want when it
	6039	runs, the @code{yydebug} parser-trace feature can help you figure out why.
	6040
	6041	There are several means to enable compilation of trace facilities:
	6042
	6043	@table @asis
	6044	@item the macro @code{YYDEBUG}
	6045	@findex YYDEBUG
	6046	Define the macro @code{YYDEBUG} to a nonzero value when you compile the
	6047	parser. This is compliant with @acronym{POSIX} Yacc. You could use
	6048	@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define
	6049	YYDEBUG 1} in the prologue of the grammar file (@pxref{Prologue, , The
	6050	Prologue}).
	6051
	6052	@item the option @option{-t}, @option{--debug}
	6053	Use the @samp{-t} option when you run Bison (@pxref{Invocation,
	6054	,Invoking Bison}). This is @acronym{POSIX} compliant too.
	6055
	6056	@item the directive @samp{%debug}
	6057	@findex %debug
	6058	Add the @code{%debug} directive (@pxref{Decl Summary, ,Bison
	6059	Declaration Summary}). This is a Bison extension, which will prove
	6060	useful when Bison will output parsers for languages that don't use a
	6061	preprocessor. Unless @acronym{POSIX} and Yacc portability matter to
	6062	you, this is
	6063	the preferred solution.
	6064	@end table
	6065
	6066	We suggest that you always enable the debug option so that debugging is
	6067	always possible.
	6068
	6069	The trace facility outputs messages with macro calls of the form
	6070	@code{YYFPRINTF (stderr, @var{format}, @var{args})} where
	6071	@var{format} and @var{args} are the usual @code{printf} format and
	6072	arguments. If you define @code{YYDEBUG} to a nonzero value but do not
	6073	define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included
	6074	and @code{YYFPRINTF} is defined to @code{fprintf}.
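
Because the trace messages go through @code{YYFPRINTF}, you can supply your own definition to send them somewhere other than @code{stderr}. The sketch below is one possible way to do this; @code{trace_fprintf} and @code{trace_stream} are hypothetical names, and the only thing taken from the text above is the calling convention @code{YYFPRINTF (stderr, @var{format}, @var{args})}.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical destination for trace output.  NULL means "use the
   stream the parser asked for".  */
FILE *trace_stream = NULL;

/* Same interface as fprintf, but write to trace_stream when it is
   set.  Return what vfprintf returns.  */
int
trace_fprintf (FILE *requested, char const *format, ...)
{
  va_list args;
  int n;
  assert (requested != NULL && format != NULL);
  va_start (args, format);
  n = vfprintf (trace_stream ? trace_stream : requested, format, args);
  va_end (args);
  return n;
}

/* This definition belongs in the prologue of the grammar file.  */
#define YYFPRINTF trace_fprintf
```

A program could then open a log file at startup, assign it to @code{trace_stream}, and leave @code{stderr} free for ordinary diagnostics.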
	6075
	6076	Once you have compiled the program with trace facilities, the way to
	6077	request a trace is to store a nonzero value in the variable @code{yydebug}.
	6078	You can do this by making the C code do it (in @code{main}, perhaps), or
	6079	you can alter the value with a C debugger.
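
One way to make the C code set @code{yydebug} is to look for a command-line option before calling @code{yyparse}. The sketch below assumes a hypothetical option named @samp{--trace} and a helper named @code{set_trace_from_args}. It also defines a stand-in @code{yydebug} so that it compiles on its own; in a real program the generated parser defines the variable and you would declare it with @code{extern int yydebug;} instead.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the variable defined by the generated parser.  */
int yydebug = 0;

/* Enable parser tracing if the hypothetical option "--trace" appears
   among the command-line arguments.  */
void
set_trace_from_args (int argc, char **argv)
{
  int i;
  assert (argc >= 0 && argv != NULL);
  for (i = 1; i < argc; i++)
    if (strcmp (argv[i], "--trace") == 0)
      yydebug = 1;
}
```

Calling @code{set_trace_from_args (argc, argv)} at the start of @code{main} has an effect only if the parser was compiled with trace facilities.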
	6080
	6081	Each step taken by the parser when @code{yydebug} is nonzero produces a
	6082	line or two of trace information, written on @code{stderr}. The trace
	6083	messages tell you these things:
	6084
	6085	@itemize @bullet
	6086	@item
	6087	Each time the parser calls @code{yylex}, what kind of token was read.
	6088
	6089	@item
	6090	Each time a token is shifted, the depth and complete contents of the
	6091	state stack (@pxref{Parser States}).
	6092
	6093	@item
	6094	Each time a rule is reduced, which rule it is, and the complete contents
	6095	of the state stack afterward.
	6096	@end itemize
	6097
	6098	To make sense of this information, it helps to refer to the listing file
	6099	produced by the Bison @samp{-v} option (@pxref{Invocation, ,Invoking
	6100	Bison}). This file shows the meaning of each state in terms of
	6101	positions in various rules, and also what each state will do with each
	6102	possible input token. As you read the successive trace messages, you
	6103	can see that the parser is functioning according to its specification in
	6104	the listing file. Eventually you will arrive at the place where
	6105	something undesirable happens, and you will see which parts of the
	6106	grammar are to blame.
	6107
	6108	The parser file is a C program and you can use C debuggers on it, but it's
	6109	not easy to interpret what it is doing. The parser function is a
	6110	finite-state machine interpreter, and aside from the actions it executes
	6111	the same code over and over. Only the values of variables show where in
	6112	the grammar it is working.
	6113
	6114	@findex YYPRINT
	6115	The debugging information normally gives the token type of each token
	6116	read, but not its semantic value. You can optionally define a macro
	6117	named @code{YYPRINT} to provide a way to print the value. If you define
	6118	@code{YYPRINT}, it should take three arguments. The parser will pass a
	6119	standard I/O stream, the numeric code for the token type, and the token
	6120	value (from @code{yylval}).
	6121
	6122	Here is an example of @code{YYPRINT} suitable for the multi-function
	6123	calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}):
	6124
	6125	@smallexample
	6126	%@{
		#include <stdio.h>
	6127	static void print_token_value (FILE *, int, YYSTYPE);
	6128	#define YYPRINT(file, type, value) print_token_value (file, type, value)
	6129	%@}
	6130
	6131	@dots{} %% @dots{} %% @dots{}
	6132
	6133	static void
	6134	print_token_value (FILE *file, int type, YYSTYPE value)
	6135	@{
	6136	if (type == VAR)
	6137	fprintf (file, "%s", value.tptr->name);
	6138	else if (type == NUM)
	6139	fprintf (file, "%g", value.val);
	6140	@}
	6141	@end smallexample
	6142
	6143	@c ================================================= Invoking Bison
	6144
	6145	@node Invocation
	6146	@chapter Invoking Bison
	6147	@cindex invoking Bison
	6148	@cindex Bison invocation
	6149	@cindex options for invoking Bison
	6150
	6151	The usual way to invoke Bison is as follows:
	6152
	6153	@example
	6154	bison @var{infile}
	6155	@end example
	6156
	6157	Here @var{infile} is the grammar file name, which usually ends in
	6158	@samp{.y}. The parser file's name is made by replacing the @samp{.y}
	6159	with @samp{.tab.c}. Thus, @samp{bison foo.y} yields
	6160	@file{foo.tab.c}, and @samp{bison hack/foo.y} yields
	6161	@file{hack/foo.tab.c}. If you are writing C++ code instead of C in
	6162	your grammar file, you can also name it @file{foo.ypp}
	6163	or @file{foo.y++}. The output files then take an extension that
	6164	mirrors that of the input file (respectively @file{foo.tab.cpp} and
	6165	@file{foo.tab.c++}).
	6166	This feature takes effect with all options that manipulate file names, such as
	6167	@samp{-o} or @samp{-d}.
	6168
	6169	For example:
	6170
	6171	@example
	6172	bison -d @var{infile.yxx}
	6173	@end example
	6174	@noindent
	6175	will produce @file{infile.tab.cxx} and @file{infile.tab.hxx}, and
	6176
	6177	@example
	6178	bison -d -o @var{output.c++} @var{infile.y}
	6179	@end example
	6180	@noindent
	6181	will produce @file{output.c++} and @file{output.h++}.
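
The default naming rule for the parser file can be summarized in a few lines of C. This is only a sketch of the rule as stated in this chapter, not Bison's own code; the function name @code{parser_file_name} is invented, and grammar file names whose extension does not start with @samp{y} are simply rejected here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Derive the default parser file name from GRAMMAR by turning an
   extension ".yXX" into ".tab.cXX".  Store the result in BUF, which
   has room for SIZE bytes.  Return 0 on success, -1 if GRAMMAR has no
   such extension or BUF is too small.  */
int
parser_file_name (char const *grammar, char *buf, size_t size)
{
  char const *dot;
  int n;
  assert (grammar != NULL && buf != NULL);
  dot = strrchr (grammar, '.');
  if (dot == NULL || dot[1] != 'y')
    return -1;
  /* Copy everything before the dot, then ".tab.c", then whatever
     followed the 'y' ("", "pp", "++", "xx", ...).  */
  n = snprintf (buf, size, "%.*s.tab.c%s",
                (int) (dot - grammar), grammar, dot + 2);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

Under these assumptions @file{hack/foo.y} maps to @file{hack/foo.tab.c} and @file{foo.ypp} maps to @file{foo.tab.cpp}.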
	6182
	6183	For compatibility with @acronym{POSIX}, the standard Bison
	6184	distribution also contains a shell script called @command{yacc} that
	6185	invokes Bison with the @option{-y} option.
	6186
	6187	@menu
	6188	* Bison Options:: All the options described in detail,
	6189	in alphabetical order by short options.
	6190	* Option Cross Key:: Alphabetical list of long options.
	6191	* Yacc Library:: Yacc-compatible @code{yylex} and @code{main}.
	6192	@end menu
	6193
	6194	@node Bison Options
	6195	@section Bison Options
	6196
	6197	Bison supports both traditional single-letter options and mnemonic long
	6198	option names. Long option names are indicated with @samp{--} instead of
	6199	@samp{-}. Abbreviations for option names are allowed as long as they
	6200	are unique. When a long option takes an argument, like
	6201	@samp{--file-prefix}, connect the option name and the argument with
	6202	@samp{=}.
	6203
	6204	Here is a list of options that can be used with Bison, alphabetized by
	6205	short option. It is followed by a cross key alphabetized by long
	6206	option.
	6207
	6208	@c Please, keep this ordered as in `bison --help'.
	6209	@noindent
	6210	Operations modes:
	6211	@table @option
	6212	@item -h
	6213	@itemx --help
	6214	Print a summary of the command-line options to Bison and exit.
	6215
	6216	@item -V
	6217	@itemx --version
	6218	Print the version number of Bison and exit.
	6219
	6220	@need 1750
	6221	@item -y
	6222	@itemx --yacc
	6223	Equivalent to @samp{-o y.tab.c}; the parser output file is called
	6224	@file{y.tab.c}, and the other outputs are called @file{y.output} and
	6225	@file{y.tab.h}. The purpose of this option is to imitate Yacc's output
	6226	file name conventions. Thus, the following shell script can substitute
	6227	for Yacc, and the Bison distribution contains such a script for
	6228	compatibility with @acronym{POSIX}:
	6229
	6230	@example
	6231	#! /bin/sh
	6232	bison -y "$@@"
	6233	@end example
	6234	@end table
	6235
	6236	@noindent
	6237	Tuning the parser:
	6238
	6239	@table @option
	6240	@item -S @var{file}
	6241	@itemx --skeleton=@var{file}
	6242	Specify the skeleton to use. You probably don't need this option unless
	6243	you are developing Bison.
	6244
	6245	@item -t
	6246	@itemx --debug
	6247	In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
	6248	already defined, so that the debugging facilities are compiled.
	6249	@xref{Tracing, ,Tracing Your Parser}.
	6250
	6251	@item --locations
	6252	Pretend that @code{%locations} was specified. @xref{Decl Summary}.
	6253
	6254	@item -p @var{prefix}
	6255	@itemx --name-prefix=@var{prefix}
	6256	Pretend that @code{%name-prefix="@var{prefix}"} was specified.
	6257	@xref{Decl Summary}.
	6258
	6259	@item -l
	6260	@itemx --no-lines
	6261	Don't put any @code{#line} preprocessor commands in the parser file.
	6262	Ordinarily Bison puts them in the parser file so that the C compiler
	6263	and debuggers will associate errors with your source file, the
	6264	grammar file. This option causes them to associate errors with the
	6265	parser file, treating it as an independent source file in its own right.
	6266
	6267	@item -n
	6268	@itemx --no-parser
	6269	Pretend that @code{%no-parser} was specified. @xref{Decl Summary}.
	6270
	6271	@item -k
	6272	@itemx --token-table
	6273	Pretend that @code{%token-table} was specified. @xref{Decl Summary}.
	6274	@end table
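
The effect of the @code{#line} commands discussed under @option{--no-lines} can be observed in plain C. The sketch below is not Bison output; it places a @code{#line} directive by hand, using the hypothetical file name @file{calc.y}, to show how the compiler then attributes the following code to the grammar file.

```c
#include <assert.h>
#include <string.h>

/* Store in *FILE the file name, and return the line number, that the
   compiler attributes to the statement after the directive.  */
int
reported_line (char const **file)
{
  assert (file != NULL);
#line 42 "calc.y"
  *file = __FILE__; return __LINE__;
}

/* Return nonzero if the statement above is reported as calc.y:42.  */
int
line_directive_works (void)
{
  char const *file;
  return reported_line (&file) == 42 && strcmp (file, "calc.y") == 0;
}
```

Without the directive, the same statement would be attributed to the C source file that contains it, which is what @option{--no-lines} arranges for the parser file.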
	6275
	6276	@noindent
	6277	Adjust the output:
	6278
	6279	@table @option
	6280	@item -d
	6281	@itemx --defines
	6282	Pretend that @code{%defines} was specified, i.e., write an extra output
	6283	file containing macro definitions for the token type names defined in
	6284	the grammar and the semantic value type @code{YYSTYPE}, as well as a few
	6285	@code{extern} variable declarations. @xref{Decl Summary}.
	6286
	6287	@item --defines=@var{defines-file}
	6288	Same as above, but save in the file @var{defines-file}.
	6289
	6290	@item -b @var{file-prefix}
	6291	@itemx --file-prefix=@var{prefix}
	6292	Pretend that @code{%file-prefix} was specified, i.e., specify prefix to use
	6293	for all Bison output file names. @xref{Decl Summary}.
	6294
	6295	@item -r @var{things}
	6296	@itemx --report=@var{things}
	6297	Write an extra output file containing verbose descriptions of the items
	6298	named in @var{things}, a comma-separated list chosen among:
	6299
	6300	@table @code
	6301	@item state
	6302	Description of the grammar, conflicts (resolved and unresolved), and
	6303	@acronym{LALR} automaton.
	6304
	6305	@item lookahead
	6306	Implies @code{state} and augments the description of the automaton with
	6307	each rule's lookahead set.
	6308
	6309	@item itemset
	6310	Implies @code{state} and augments the description of the automaton with
	6311	the full set of items for each state, instead of its core only.
	6312	@end table
	6313
	6314	For a sample report on a small grammar, see @ref{Understanding, , Understanding Your Parser}.
	6315
	6316	@item -v
	6317	@itemx --verbose
	6318	Pretend that @code{%verbose} was specified, i.e., write an extra output
	6319	file containing verbose descriptions of the grammar and
	6320	parser. @xref{Decl Summary}.
	6321
	6322	@item -o @var{filename}
	6323	@itemx --output=@var{filename}
	6324	Specify the @var{filename} for the parser file.
	6325
	6326	The other output files' names are constructed from @var{filename} as
	6327	described under the @samp{-v} and @samp{-d} options.
	6328
	6329	@item -g
	6330	Output a @acronym{VCG} definition of the @acronym{LALR}(1) grammar
	6331	automaton computed by Bison. If the grammar file is @file{foo.y}, the
	6332	@acronym{VCG} output file will
	6333	be @file{foo.vcg}.
	6334
	6335	@item --graph=@var{graph-file}
	6336	The behavior of @option{--graph} is the same as that of @samp{-g}. The only
	6337	difference is that it takes an optional argument, which is the name of
	6338	the output graph file.
	6339	@end table
	6340
	6341	@node Option Cross Key
	6342	@section Option Cross Key
	6343
	6344	Here is a list of options, alphabetized by long option, to help you find
	6345	the corresponding short option.
	6346
	6347	@tex
	6348	\def\leaderfill{\leaders\hbox to 1em{\hss.\hss}\hfill}
	6349
	6350	{\tt
	6351	\line{ --debug \leaderfill -t}
	6352	\line{ --defines \leaderfill -d}
	6353	\line{ --file-prefix \leaderfill -b}
	6354	\line{ --graph \leaderfill -g}
	6355	\line{ --help \leaderfill -h}
	6356	\line{ --name-prefix \leaderfill -p}
	6357	\line{ --no-lines \leaderfill -l}
	6358	\line{ --no-parser \leaderfill -n}
	6359	\line{ --output \leaderfill -o}
	6360	\line{ --token-table \leaderfill -k}
	6361	\line{ --verbose \leaderfill -v}
	6362	\line{ --version \leaderfill -V}
	6363	\line{ --yacc \leaderfill -y}
	6364	}
	6365	@end tex
	6366
	6367	@ifinfo
	6368	@example
	6369	--debug -t
	6370	--defines=@var{defines-file} -d
	6371	--file-prefix=@var{prefix} -b @var{file-prefix}
	6372	--graph=@var{graph-file} -g
	6373	--help -h
	6374	--name-prefix=@var{prefix} -p @var{name-prefix}
	6375	--no-lines -l
	6376	--no-parser -n
	6377	--output=@var{outfile} -o @var{outfile}
	6378	--token-table -k
	6379	--verbose -v
	6380	--version -V
	6381	--yacc -y
	6382	@end example
	6383	@end ifinfo
	6384
	6385	@node Yacc Library
	6386	@section Yacc Library
	6387
	6388	The Yacc library contains default implementations of the
	6389	@code{yyerror} and @code{main} functions. These default
	6390	implementations are normally not useful, but @acronym{POSIX} requires
	6391	them. To use the Yacc library, link your program with the
	6392	@option{-ly} option. Note that Bison's implementation of the Yacc
	6393	library is distributed under the terms of the @acronym{GNU} General
	6394	Public License (@pxref{Copying}).
	6395
	6396	If you use the Yacc library's @code{yyerror} function, you should
	6397	declare @code{yyerror} as follows:
	6398
	6399	@example
	6400	int yyerror (char const *);
	6401	@end example
	6402
	6403	Bison ignores the @code{int} value returned by this @code{yyerror}.
	6404	If you use the Yacc library's @code{main} function, your
	6405	@code{yyparse} function should have the following type signature:
	6406
	6407	@example
	6408	int yyparse (void);
	6409	@end example
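
If you prefer not to link with the Yacc library, a replacement with the same signature is easy to write. The following is a minimal sketch of such a function and is not the library's source; it prints the message on @code{stderr} and returns a value that Bison ignores.

```c
#include <assert.h>
#include <stdio.h>

/* Report MESSAGE on stderr, followed by a newline.  Return 0 on
   success and -1 if the message could not be written.  */
int
yyerror (char const *message)
{
  assert (message != NULL);
  return fprintf (stderr, "%s\n", message) < 0 ? -1 : 0;
}
```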
	6410
	6411	@c ================================================= FAQ
	6412
	6413	@node FAQ
	6414	@chapter Frequently Asked Questions
	6415	@cindex frequently asked questions
	6416	@cindex questions
	6417
	6418	Several questions about Bison come up occasionally. Here some of them
	6419	are addressed.
	6420
	6421	@menu
	6422	* Parser Stack Overflow:: Breaking the Stack Limits
	6423	* How Can I Reset the Parser:: @code{yyparse} Keeps some State
	6424	* Strings are Destroyed:: @code{yylval} Loses Track of Strings
	6425	* C++ Parsers:: Compiling Parsers with C++ Compilers
	6426	* Implementing Loops:: Control Flow in the Calculator
	6427	@end menu
	6428
	6429	@node Parser Stack Overflow
	6430	@section Parser Stack Overflow
	6431
	6432	@display
	6433	My parser returns with error with a @samp{parser stack overflow}
	6434	message. What can I do?
	6435	@end display
	6436
	6437	This question is already addressed elsewhere; see @ref{Recursion,
	6438	,Recursive Rules}.
	6439
	6440	@node How Can I Reset the Parser
	6441	@section How Can I Reset the Parser
	6442
	6443	The following phenomenon has several symptoms, resulting in the
	6444	following typical questions:
	6445
	6446	@display
	6447	I invoke @code{yyparse} several times, and on correct input it works
	6448	properly; but when a parse error is found, all the other calls fail
	6449	too. How can I reset the error flag of @code{yyparse}?
	6450	@end display
	6451
	6452	@noindent
	6453	or
	6454
	6455	@display
	6456	My parser includes support for an @samp{#include}-like feature, in
	6457	which case I run @code{yyparse} from @code{yyparse}. This fails
	6458	although I did specify I needed a @code{%pure-parser}.
	6459	@end display
	6460
	6461	These problems typically come not from Bison itself, but from
	6462	Lex-generated scanners. Because these scanners use large buffers for
	6463	speed, they might not notice a change of input file. As a
	6464	demonstration, consider the following source file,
	6465	@file{first-line.l}:
	6466
	6467	@verbatim
	6468	%{
	6469	#include <stdio.h>
	6470	#include <stdlib.h>
	6471	%}
	6472	%%
	6473	.*\n ECHO; return 1;
	6474	%%
	6475	int
	6476	yyparse (char const *file)
	6477	{
	6478	yyin = fopen (file, "r");
	6479	if (!yyin)
	6480	exit (2);
	6481	/* One token only. */
	6482	yylex ();
	6483	if (fclose (yyin) != 0)
	6484	exit (3);
	6485	return 0;
	6486	}
	6487
	6488	int
	6489	main (void)
	6490	{
	6491	yyparse ("input");
	6492	yyparse ("input");
	6493	return 0;
	6494	}
	6495	@end verbatim
	6496
	6497	@noindent
	6498	If the file @file{input} contains
	6499
	6500	@verbatim
	6501	input:1: Hello,
	6502	input:2: World!
	6503	@end verbatim
	6504
	6505	@noindent
	6506	then instead of getting the first line twice, you get:
	6507
	6508	@example
	6509	$ @kbd{flex -ofirst-line.c first-line.l}
	6510	$ @kbd{gcc -ofirst-line first-line.c -ll}
	6511	$ @kbd{./first-line}
	6512	input:1: Hello,
	6513	input:2: World!
	6514	@end example
	6515
	6516	Therefore, whenever you change @code{yyin}, you must tell the
	6517	Lex-generated scanner to discard its current buffer and switch to the
	6518	new one. This depends upon your implementation of Lex; see its
	6519	documentation for more. For Flex, it suffices to call
	6520	@samp{YY_FLUSH_BUFFER} after each change to @code{yyin}. If your
	6521	Flex-generated scanner needs to read from several input streams to
	6522	handle features like include files, you might consider using Flex
	6523	functions like @samp{yy_switch_to_buffer} that manipulate multiple
	6524	input buffers.
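
The buffering effect can be reproduced without Lex. The toy scanner below is written only for illustration, and all of its names are invented. It reads its input in large chunks, serves lines from its private buffer, and therefore keeps returning stale data until @code{scanner_flush} is called, which plays the role that @samp{YY_FLUSH_BUFFER} plays for Flex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct scanner
{
  FILE *in;           /* plays the role of yyin */
  char buf[4096];     /* private input buffer */
  size_t len, pos;    /* bytes in buf, next unread byte */
};

/* Discard whatever is still buffered.  */
void
scanner_flush (struct scanner *s)
{
  assert (s != NULL);
  s->len = s->pos = 0;
}

/* Copy the next line, without its newline, into LINE, which has room
   for SIZE bytes.  Return 0 at end of input, 1 otherwise.  For
   brevity, a line that straddles two chunks is returned in pieces.  */
int
scanner_next_line (struct scanner *s, char *line, size_t size)
{
  char *start, *nl;
  size_t avail, n;
  assert (s != NULL && line != NULL && size > 0);
  if (s->pos == s->len)
    {
      /* The buffer is empty: only now is the input file consulted.  */
      s->len = fread (s->buf, 1, sizeof s->buf, s->in);
      s->pos = 0;
      if (s->len == 0)
        return 0;
    }
  start = s->buf + s->pos;
  avail = s->len - s->pos;
  nl = memchr (start, '\n', avail);
  n = nl ? (size_t) (nl - start) : avail;
  s->pos += n + (nl != NULL);
  if (n >= size)
    n = size - 1;
  memcpy (line, start, n);
  line[n] = '\0';
  return 1;
}
```

If the input is repositioned or replaced after the first line has been read, the next call still returns the second line of the old contents; after @code{scanner_flush}, it returns the first line of the new input.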
	6525
	6526	If your Flex-generated scanner uses start conditions (@pxref{Start
	6527	conditions, , Start conditions, flex, The Flex Manual}), you might
	6528	also want to reset the scanner's state, i.e., go back to the initial
	6529	start condition, through a call to @samp{BEGIN (0)}.
	6530
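As a minimal sketch of the advice above, and assuming Flex, the
following variant of @file{first-line.l} calls @code{yyrestart}, a
Flex function that points the scanner at a new stream and discards
the buffered input, instead of assigning to @code{yyin} and calling
@samp{YY_FLUSH_BUFFER}. The call to @samp{BEGIN (0)} has no effect
here, since this scanner declares no start condition; it only shows
where such a reset would go.

@verbatim
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
.*\n    ECHO; return 1;
%%
int
yyparse (char const *file)
{
  FILE *in = fopen (file, "r");
  if (!in)
    exit (2);
  /* Assumption: Flex.  Switch the scanner to IN and drop whatever
     is still buffered from the previous stream.  */
  yyrestart (in);
  /* Go back to the initial start condition.  */
  BEGIN (0);
  /* One token only.  */
  yylex ();
  if (fclose (in) != 0)
    exit (3);
  return 0;
}

int
main (void)
{
  yyparse ("input");
  yyparse ("input");
  return 0;
}
@end verbatim

@noindent
With the same @file{input} file, this version is expected to print
the first line twice. Other implementations of Lex may need a
different call.
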
	6531	@node Strings are Destroyed
	6532	@section Strings are Destroyed
	6533
	6534	@display
	6535	My parser seems to destroy old strings, or maybe it loses track of
	6536	them. Instead of reporting @samp{"foo", "bar"}, it reports
	6537	@samp{"bar", "bar"}, or even @samp{"foo\nbar", "bar"}.
	6538	@end display
	6539
	6540	This error is probably the single most frequent ``bug report'' sent to
	6541	Bison lists, but it stems only from a misunderstanding of the role
	6542	of the scanner. Consider the following Lex code:
	6543
	6544	@verbatim
	6545	%{
	6546	#include <stdio.h>
	6547	char *yylval = NULL;
	6548	%}
	6549	%%
	6550	.* yylval = yytext; return 1;
	6551	\n /* IGNORE */
	6552	%%
	6553	int
	6554	main ()
	6555	{
	6556	  /* Similar to using $1, $2 in a Bison action. */
	6557	  char *fst = (yylex (), yylval);
	6558	  char *snd = (yylex (), yylval);
	6559	  printf ("\"%s\", \"%s\"\n", fst, snd);
	6560	  return 0;
	6561	}
	6562	@end verbatim
	6563
	6564	If you compile and run this code, you get:
	6565
	6566	@example
	6567	$ @kbd{flex -osplit-lines.c split-lines.l}
	6568	$ @kbd{gcc -osplit-lines split-lines.c -ll}
	6569	$ @kbd{printf 'one\ntwo\n' | ./split-lines}
	6570	"one
	6571	two", "two"
	6572	@end example
	6573
	6574	@noindent
	6575	this is because @code{yytext} is a buffer provided for @emph{reading}
	6576	in the action, but if you want to keep it, you have to duplicate it
	6577	(e.g., using @code{strdup}). Note that the output may depend on how
	6578	your implementation of Lex handles @code{yytext}. For instance, when
	6579	given the Lex compatibility option @option{-l} (which triggers the
	6580	option @samp{%array}), Flex behaves differently:
	6581
	6582	@example
	6583	$ @kbd{flex -l -osplit-lines.c split-lines.l}
	6584	$ @kbd{gcc -osplit-lines split-lines.c -ll}
	6585	$ @kbd{printf 'one\ntwo\n' | ./split-lines}
	6586	"two", "two"
	6587	@end example
	6588
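One way to keep the text of each token, sketched here under the
assumption that @code{strdup} is available (it is specified by POSIX,
not by ISO C90), is to copy @code{yytext} in the scanner action and to
free the copies once they are no longer needed:

@verbatim
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *yylval = NULL;
%}
%%
.*      yylval = strdup (yytext); return 1;
\n      /* IGNORE */
%%
int
main ()
{
  /* Each value now points to its own copy, not into yytext.  */
  char *fst = (yylex (), yylval);
  char *snd = (yylex (), yylval);
  if (!fst || !snd)
    return 1;
  printf ("\"%s\", \"%s\"\n", fst, snd);
  free (fst);
  free (snd);
  return 0;
}
@end verbatim

@noindent
With the input used above, this variant should report
@samp{"one", "two"}, with or without the @option{-l} option.
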
	6589
	6590	@node C++ Parsers
	6591	@section C++ Parsers
	6592
	6593	@display
	6594	How can I generate parsers in C++?
	6595	@end display
	6596
	6597	We are working on a C++ output for Bison, but unfortunately, for lack
	6598	of time, the skeleton is not finished. It is functional, but in
	6599	numerous respects, it will require additional work which @emph{might}
	6600	break backward compatibility. Since the skeleton for C++ is not
	6601	documented, we do not consider ourselves bound to this interface;
	6602	nevertheless, we will try to keep compatibility as much as possible.
	6603
	6604	Another possibility is to use the regular C parsers, and to compile
	6605	them with a C++ compiler. This works properly, provided that you bear
	6606	some simple C++ rules in mind, such as not including ``real classes''
	6607	(i.e., structures with constructors) in unions. Therefore, in the
	6608	@code{%union}, use pointers to classes, or better yet, a single
	6609	pointer type to the root of your lexical/syntactic hierarchy.
	6610
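The declarations below sketch the pointer-based approach. The class
name @code{Node} and the symbols @code{NUM} and @code{exp} are
hypothetical, chosen only for illustration:

@verbatim
%{
class Node;              /* Hypothetical root of the tree classes.  */
%}

%union
{
  int ival;              /* Plain C types are always safe.  */
  Node *node;            /* A pointer to a class, not a class object.  */
}

%token <ival> NUM
%type <node> exp
@end verbatim

@noindent
Since the union holds only a pointer, the actions, or a
@code{%destructor}, remain responsible for deleting the objects.
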
	6611
	6612	@node Implementing Loops
	6613	@section Implementing Loops
	6614
	6615	@display
	6616	My simple calculator supports variables, assignments, and functions,
	6617	but how can I implement loops?
	6618	@end display
	6619
	6620	Although very pedagogical, the examples included in this manual blur
	6621	the distinction between the parser---whose job is to recover
	6622	the structure of a text and to transmit it to subsequent modules of
	6623	the program---and the processing (such as the execution) of this
	6624	structure. This works well with so-called straight-line programs,
	6625	i.e., precisely those that have a straightforward execution model:
	6626	execute simple instructions one after the other.
	6627
	6628	@cindex abstract syntax tree
	6629	@cindex @acronym{AST}
	6630	If you want a richer model, you will probably need to use the parser
	6631	to construct a tree that does represent the structure it has
	6632	recovered; this tree is usually called the @dfn{abstract syntax tree},
	6633	or @dfn{@acronym{AST}} for short. Then, walking through this tree,
	6634	traversing it in various ways, will enable treatments such as its
	6635	execution or its translation, which will result in an interpreter or a
	6636	compiler.
	6637
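As a rough sketch of this approach, the grammar below builds a tree
for a loop instead of executing anything in its actions. The type
@code{struct ast}, the variable @code{program}, and the constructors
@code{ast_num}, @code{ast_seq} and @code{ast_while} are hypothetical:
they stand for code you would write yourself, as do @code{yylex} and
@code{yyerror}.

@verbatim
%{
  /* Hypothetical tree interface, to be provided by the user.  */
  struct ast;
  struct ast *ast_num (int value);
  struct ast *ast_seq (struct ast *first, struct ast *second);
  struct ast *ast_while (struct ast *condition, struct ast *body);
  struct ast *program;   /* Root of the tree once parsing is done.  */
  int yylex (void);
  void yyerror (char const *);
%}

%union
{
  int ival;
  struct ast *node;
}

%token WHILE
%token <ival> NUM
%type <node> exp stmt stmts

%%
input:
  stmts               { program = $1; }
;

stmts:
  stmt
| stmts stmt          { $$ = ast_seq ($1, $2); }
;

stmt:
  exp ';'             { $$ = $1; }
| WHILE '(' exp ')' stmt
                      { $$ = ast_while ($3, $5); }
;

exp:
  NUM                 { $$ = ast_num ($1); }
;
@end verbatim

@noindent
Once @code{yyparse} has returned, a separate function can walk
@code{program}; on a node built by @code{ast_while} it would evaluate
the condition and run the body repeatedly, which an action executed
during parsing cannot do.
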
	6638	This topic is well beyond the scope of this manual, and the reader is
	6639	invited to consult the dedicated literature.
	6640
	6641
	6642
	6643	@c ================================================= Table of Symbols
	6644
	6645	@node Table of Symbols
	6646	@appendix Bison Symbols
	6647	@cindex Bison symbols, table of
	6648	@cindex symbols in Bison, table of
	6649
	6650	@deffn {Variable} @@$
	6651	In an action, the location of the left-hand side of the rule.
	6652	@xref{Locations, , Locations Overview}.
	6653	@end deffn
	6654
	6655	@deffn {Variable} @@@var{n}
	6656	In an action, the location of the @var{n}-th symbol of the right-hand
	6657	side of the rule. @xref{Locations, , Locations Overview}.
	6658	@end deffn
	6659
	6660	@deffn {Variable} $$
	6661	In an action, the semantic value of the left-hand side of the rule.
	6662	@xref{Actions}.
	6663	@end deffn
	6664
	6665	@deffn {Variable} $@var{n}
	6666	In an action, the semantic value of the @var{n}-th symbol of the
	6667	right-hand side of the rule. @xref{Actions}.
	6668	@end deffn
	6669
	6670	@deffn {Symbol} $accept
	6671	The predefined nonterminal whose only rule is @samp{$accept: @var{start}
	6672	$end}, where @var{start} is the start symbol. @xref{Start Decl, , The
	6673	Start-Symbol}. It cannot be used in the grammar.
	6674	@end deffn
	6675
	6676	@deffn {Symbol} $end
	6677	The predefined token marking the end of the token stream. It cannot be
	6678	used in the grammar.
	6679	@end deffn
	6680
	6681	@deffn {Symbol} $undefined
	6682	The predefined token onto which all undefined values returned by
	6683	@code{yylex} are mapped. It cannot be used in the grammar; use
	6684	@code{error} instead.
	6685	@end deffn
	6686
	6687	@deffn {Symbol} error
	6688	A token name reserved for error recovery. This token may be used in
	6689	grammar rules so as to allow the Bison parser to recognize an error in
	6690	the grammar without halting the process. In effect, a sentence
	6691	containing an error may be recognized as valid. On a syntax error, the
	6692	token @code{error} becomes the current look-ahead token. Actions
	6693	corresponding to @code{error} are then executed, and the look-ahead
	6694	token is reset to the token that originally caused the violation.
	6695	@xref{Error Recovery}.
	6696	@end deffn
	6697
	6698	@deffn {Macro} YYABORT
	6699	Macro to pretend that an unrecoverable syntax error has occurred, by
	6700	making @code{yyparse} return 1 immediately. The error reporting
	6701	function @code{yyerror} is not called. @xref{Parser Function, ,The
	6702	Parser Function @code{yyparse}}.
	6703	@end deffn
	6704
	6705	@deffn {Macro} YYACCEPT
	6706	Macro to pretend that a complete utterance of the language has been
	6707	read, by making @code{yyparse} return 0 immediately.
	6708	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6709	@end deffn
	6710
	6711	@deffn {Macro} YYBACKUP
	6712	Macro to discard a value from the parser stack and fake a look-ahead
	6713	token. @xref{Action Features, ,Special Features for Use in Actions}.
	6714	@end deffn
	6715
	6716	@deffn {Macro} YYDEBUG
	6717	Macro to define to equip the parser with tracing code. @xref{Tracing,
	6718	,Tracing Your Parser}.
	6719	@end deffn
	6720
	6721	@deffn {Macro} YYERROR
	6722	Macro to pretend that a syntax error has just been detected: call
	6723	@code{yyerror} and then perform normal error recovery if possible
	6724	(@pxref{Error Recovery}), or (if recovery is impossible) make
	6725	@code{yyparse} return 1. @xref{Error Recovery}.
	6726	@end deffn
	6727
	6728	@deffn {Macro} YYERROR_VERBOSE
	6729	An obsolete macro that you define with @code{#define} in the prologue
	6730	to request verbose, specific error message strings
	6731	when @code{yyerror} is called. It doesn't matter what definition you
	6732	use for @code{YYERROR_VERBOSE}, just whether you define it. Using
	6733	@code{%error-verbose} is preferred.
	6734	@end deffn
	6735
	6736	@deffn {Macro} YYINITDEPTH
	6737	Macro for specifying the initial size of the parser stack.
	6738	@xref{Stack Overflow}.
	6739	@end deffn
	6740
	6741	@deffn {Macro} YYLEX_PARAM
	6742	An obsolete macro for specifying an extra argument (or list of extra
	6743	arguments) for @code{yyparse} to pass to @code{yylex}. The use of this
	6744	macro is deprecated, and is supported only for Yacc-like parsers.
	6745	@xref{Pure Calling,, Calling Conventions for Pure Parsers}.
	6746	@end deffn
	6747
	6748	@deffn {Type} YYLTYPE
	6749	Data type of @code{yylloc}; by default, a structure with four
	6750	members. @xref{Location Type, , Data Types of Locations}.
	6751	@end deffn
	6752
	6753	@deffn {Macro} YYMAXDEPTH
	6754	Macro for specifying the maximum size of the parser stack. @xref{Stack
	6755	Overflow}.
	6756	@end deffn
	6757
	6758	@deffn {Macro} YYPARSE_PARAM
	6759	An obsolete macro for specifying the name of a parameter that
	6760	@code{yyparse} should accept. The use of this macro is deprecated, and
	6761	is supported only for Yacc-like parsers. @xref{Pure Calling,, Calling
	6762	Conventions for Pure Parsers}.
	6763	@end deffn
	6764
	6765	@deffn {Macro} YYRECOVERING
	6766	Macro whose value indicates whether the parser is recovering from a
	6767	syntax error. @xref{Action Features, ,Special Features for Use in Actions}.
	6768	@end deffn
	6769
	6770	@deffn {Macro} YYSTACK_USE_ALLOCA
	6771	Macro used to control the use of @code{alloca}. If defined to @samp{0},
	6772	the parser will not use @code{alloca} but @code{malloc} when trying to
	6773	grow its internal stacks. Do @emph{not} define @code{YYSTACK_USE_ALLOCA}
	6774	to anything else.
	6775	@end deffn
	6776
	6777	@deffn {Type} YYSTYPE
	6778	Data type of semantic values; @code{int} by default.
	6779	@xref{Value Type, ,Data Types of Semantic Values}.
	6780	@end deffn
	6781
	6782	@deffn {Variable} yychar
	6783	External integer variable that contains the integer value of the current
	6784	look-ahead token. (In a pure parser, it is a local variable within
	6785	@code{yyparse}.) Error-recovery rule actions may examine this variable.
	6786	@xref{Action Features, ,Special Features for Use in Actions}.
	6787	@end deffn
	6788
	6789	@deffn {Macro} yyclearin
	6790	Macro used in error-recovery rule actions. It clears the previous
	6791	look-ahead token. @xref{Error Recovery}.
	6792	@end deffn
	6793
	6794	@deffn {Variable} yydebug
	6795	External integer variable set to zero by default. If @code{yydebug}
	6796	is given a nonzero value, the parser will output information on input
	6797	symbols and parser action. @xref{Tracing, ,Tracing Your Parser}.
	6798	@end deffn
	6799
	6800	@deffn {Macro} yyerrok
	6801	Macro to cause parser to recover immediately to its normal mode
	6802	after a syntax error. @xref{Error Recovery}.
	6803	@end deffn
	6804
	6805	@deffn {Function} yyerror
	6806	User-supplied function to be called by @code{yyparse} on error.
	6807	@xref{Error Reporting, ,The Error
	6808	Reporting Function @code{yyerror}}.
	6809	@end deffn
	6810
	6811	@deffn {Function} yylex
	6812	User-supplied lexical analyzer function, called with no arguments to get
	6813	the next token. @xref{Lexical, ,The Lexical Analyzer Function
	6814	@code{yylex}}.
	6815	@end deffn
	6816
	6817	@deffn {Variable} yylval
	6818	External variable in which @code{yylex} should place the semantic
	6819	value associated with a token. (In a pure parser, it is a local
	6820	variable within @code{yyparse}, and its address is passed to
	6821	@code{yylex}.) @xref{Token Values, ,Semantic Values of Tokens}.
	6822	@end deffn
	6823
	6824	@deffn {Variable} yylloc
	6825	External variable in which @code{yylex} should place the line and column
	6826	numbers associated with a token. (In a pure parser, it is a local
	6827	variable within @code{yyparse}, and its address is passed to
	6828	@code{yylex}.) You can ignore this variable if you don't use the
	6829	@samp{@@} feature in the grammar actions. @xref{Token Locations,
	6830	,Textual Locations of Tokens}.
	6831	@end deffn
	6832
	6833	@deffn {Variable} yynerrs
	6834	Global variable which Bison increments each time there is a syntax error.
	6835	(In a pure parser, it is a local variable within @code{yyparse}.)
	6836	@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
	6837	@end deffn
	6838
	6839	@deffn {Function} yyparse
	6840	The parser function produced by Bison; call this function to start
	6841	parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6842	@end deffn
	6843
	6844	@deffn {Directive} %debug
	6845	Equip the parser for debugging. @xref{Decl Summary}.
	6846	@end deffn
	6847
	6848	@ifset defaultprec
	6849	@deffn {Directive} %default-prec
	6850	Assign a precedence to rules that lack an explicit @samp{%prec}
	6851	modifier. @xref{Contextual Precedence, ,Context-Dependent
	6852	Precedence}.
	6853	@end deffn
	6854	@end ifset
	6855
	6856	@deffn {Directive} %defines
	6857	Bison declaration to create a header file meant for the scanner.
	6858	@xref{Decl Summary}.
	6859	@end deffn
	6860
	6861	@deffn {Directive} %destructor
	6862	Specify how the parser should reclaim the memory associated with
	6863	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	6864	@end deffn
	6865
	6866	@deffn {Directive} %dprec
	6867	Bison declaration to assign a precedence to a rule that is used at parse
	6868	time to resolve reduce/reduce conflicts. @xref{GLR Parsers, ,Writing
	6869	@acronym{GLR} Parsers}.
	6870	@end deffn
	6871
	6872	@deffn {Directive} %error-verbose
	6873	Bison declaration to request verbose, specific error message strings
	6874	when @code{yyerror} is called.
	6875	@end deffn
	6876
	6877	@deffn {Directive} %file-prefix="@var{prefix}"
	6878	Bison declaration to set the prefix of the output files. @xref{Decl
	6879	Summary}.
	6880	@end deffn
	6881
	6882	@deffn {Directive} %glr-parser
	6883	Bison declaration to produce a @acronym{GLR} parser. @xref{GLR
	6884	Parsers, ,Writing @acronym{GLR} Parsers}.
	6885	@end deffn
	6886
	6887	@deffn {Directive} %left
	6888	Bison declaration to assign left associativity to token(s).
	6889	@xref{Precedence Decl, ,Operator Precedence}.
	6890	@end deffn
	6891
	6892	@deffn {Directive} %lex-param @{@var{argument-declaration}@}
	6893	Bison declaration to specify an additional parameter that
	6894	@code{yylex} should accept. @xref{Pure Calling,, Calling Conventions
	6895	for Pure Parsers}.
	6896	@end deffn
	6897
	6898	@deffn {Directive} %merge
	6899	Bison declaration to assign a merging function to a rule. If there is a
	6900	reduce/reduce conflict with a rule having the same merging function, the
	6901	function is applied to the two semantic values to get a single result.
	6902	@xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}.
	6903	@end deffn
	6904
	6905	@deffn {Directive} %name-prefix="@var{prefix}"
	6906	Bison declaration to rename the external symbols. @xref{Decl Summary}.
	6907	@end deffn
	6908
	6909	@ifset defaultprec
	6910	@deffn {Directive} %no-default-prec
	6911	Do not assign a precedence to rules that lack an explicit @samp{%prec}
	6912	modifier. @xref{Contextual Precedence, ,Context-Dependent
	6913	Precedence}.
	6914	@end deffn
	6915	@end ifset
	6916
	6917	@deffn {Directive} %no-lines
	6918	Bison declaration to avoid generating @code{#line} directives in the
	6919	parser file. @xref{Decl Summary}.
	6920	@end deffn
	6921
	6922	@deffn {Directive} %nonassoc
	6923	Bison declaration to assign non-associativity to token(s).
	6924	@xref{Precedence Decl, ,Operator Precedence}.
	6925	@end deffn
	6926
	6927	@deffn {Directive} %output="@var{filename}"
	6928	Bison declaration to set the name of the parser file. @xref{Decl
	6929	Summary}.
	6930	@end deffn
	6931
	6932	@deffn {Directive} %parse-param @{@var{argument-declaration}@}
	6933	Bison declaration to specify an additional parameter that
	6934	@code{yyparse} should accept. @xref{Parser Function,, The Parser
	6935	Function @code{yyparse}}.
	6936	@end deffn
	6937
	6938	@deffn {Directive} %prec
	6939	Bison declaration to assign a precedence to a specific rule.
	6940	@xref{Contextual Precedence, ,Context-Dependent Precedence}.
	6941	@end deffn
	6942
	6943	@deffn {Directive} %pure-parser
	6944	Bison declaration to request a pure (reentrant) parser.
	6945	@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	6946	@end deffn
	6947
	6948	@deffn {Directive} %right
	6949	Bison declaration to assign right associativity to token(s).
	6950	@xref{Precedence Decl, ,Operator Precedence}.
	6951	@end deffn
	6952
	6953	@deffn {Directive} %start
	6954	Bison declaration to specify the start symbol. @xref{Start Decl, ,The
	6955	Start-Symbol}.
	6956	@end deffn
	6957
	6958	@deffn {Directive} %token
	6959	Bison declaration to declare token(s) without specifying precedence.
	6960	@xref{Token Decl, ,Token Type Names}.
	6961	@end deffn
	6962
	6963	@deffn {Directive} %token-table
	6964	Bison declaration to include a token name table in the parser file.
	6965	@xref{Decl Summary}.
	6966	@end deffn
	6967
	6968	@deffn {Directive} %type
	6969	Bison declaration to declare nonterminals. @xref{Type Decl,
	6970	,Nonterminal Symbols}.
	6971	@end deffn
	6972
	6973	@deffn {Directive} %union
	6974	Bison declaration to specify several possible data types for semantic
	6975	values. @xref{Union Decl, ,The Collection of Value Types}.
	6976	@end deffn
	6977
	6978	@sp 1
	6979
	6980	These are the punctuation and delimiters used in Bison input:
	6981
	6982	@deffn {Delimiter} %%
	6983	Delimiter used to separate the grammar rule section from the
	6984	Bison declarations section or the epilogue.
	6985	@xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}.
	6986	@end deffn
	6987
	6988	@c Don't insert spaces, or check the DVI output.
	6989	@deffn {Delimiter} %@{@var{code}%@}
	6990	All code listed between @samp{%@{} and @samp{%@}} is copied directly to
	6991	the output file uninterpreted. Such code forms the prologue of the input
	6992	file. @xref{Grammar Outline, ,Outline of a Bison
	6993	Grammar}.
	6994	@end deffn
	6995
	6996	@deffn {Construct} /*@dots{}*/
	6997	Comment delimiters, as in C.
	6998	@end deffn
	6999
	7000	@deffn {Delimiter} :
	7001	Separates a rule's result from its components. @xref{Rules, ,Syntax of
	7002	Grammar Rules}.
	7003	@end deffn
	7004
	7005	@deffn {Delimiter} ;
	7006	Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}.
	7007	@end deffn
	7008
	7009	@deffn {Delimiter} |
	7010	Separates alternate rules for the same result nonterminal.
	7011	@xref{Rules, ,Syntax of Grammar Rules}.
	7012	@end deffn
	7013
	7014	@node Glossary
	7015	@appendix Glossary
	7016	@cindex glossary
	7017
	7018	@table @asis
	7019	@item Backus-Naur Form (@acronym{BNF}; also called ``Backus Normal Form'')
	7020	Formal method of specifying context-free grammars originally proposed
	7021	by John Backus, and slightly improved by Peter Naur in his 1960-01-02
	7022	committee document contributing to what became the Algol 60 report.
	7023	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	7024
	7025	@item Context-free grammars
	7026	Grammars specified as rules that can be applied regardless of context.
	7027	Thus, if there is a rule which says that an integer can be used as an
	7028	expression, integers are allowed @emph{anywhere} an expression is
	7029	permitted. @xref{Language and Grammar, ,Languages and Context-Free
	7030	Grammars}.
	7031
	7032	@item Dynamic allocation
	7033	Allocation of memory that occurs during execution, rather than at
	7034	compile time or on entry to a function.
	7035
	7036	@item Empty string
	7037	Analogous to the empty set in set theory, the empty string is a
	7038	character string of length zero.
	7039
	7040	@item Finite-state stack machine
	7041	A ``machine'' that has discrete states in which it is said to exist at
	7042	each instant in time. As input to the machine is processed, the
	7043	machine moves from state to state as specified by the logic of the
	7044	machine. In the case of the parser, the input is the language being
	7045	parsed, and the states correspond to various stages in the grammar
	7046	rules. @xref{Algorithm, ,The Bison Parser Algorithm}.
	7047
	7048	@item Generalized @acronym{LR} (@acronym{GLR})
	7049	A parsing algorithm that can handle all context-free grammars, including those
	7050	that are not @acronym{LALR}(1). It resolves situations that Bison's
	7051	usual @acronym{LALR}(1)
	7052	algorithm cannot by effectively splitting off multiple parsers, trying all
	7053	possible parsers, and discarding those that fail in the light of additional
	7054	right context. @xref{Generalized LR Parsing, ,Generalized
	7055	@acronym{LR} Parsing}.
	7056
	7057	@item Grouping
	7058	A language construct that is (in general) grammatically divisible;
	7059	for example, `expression' or `declaration' in C@.
	7060	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	7061
	7062	@item Infix operator
	7063	An arithmetic operator that is placed between the operands on which it
	7064	performs some operation.
	7065
	7066	@item Input stream
	7067	A continuous flow of data between devices or programs.
	7068
	7069	@item Language construct
	7070	One of the typical usage schemas of the language. For example, one of
	7071	the constructs of the C language is the @code{if} statement.
	7072	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	7073
	7074	@item Left associativity
	7075	Operators having left associativity are analyzed from left to right:
	7076	@samp{a+b+c} first computes @samp{a+b} and then combines with
	7077	@samp{c}. @xref{Precedence, ,Operator Precedence}.
	7078
	7079	@item Left recursion
	7080	A rule whose result symbol is also its first component symbol; for
	7081	example, @samp{expseq1 : expseq1 ',' exp;}. @xref{Recursion, ,Recursive
	7082	Rules}.
	7083
	7084	@item Left-to-right parsing
	7085	Parsing a sentence of a language by analyzing it token by token from
	7086	left to right. @xref{Algorithm, ,The Bison Parser Algorithm}.
	7087
	7088	@item Lexical analyzer (scanner)
	7089	A function that reads an input stream and returns tokens one by one.
	7090	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	7091
	7092	@item Lexical tie-in
	7093	A flag, set by actions in the grammar rules, which alters the way
	7094	tokens are parsed. @xref{Lexical Tie-ins}.
	7095
	7096	@item Literal string token
	7097	A token which consists of two or more fixed characters. @xref{Symbols}.
	7098
	7099	@item Look-ahead token
	7100	A token already read but not yet shifted. @xref{Look-Ahead, ,Look-Ahead
	7101	Tokens}.
	7102
	7103	@item @acronym{LALR}(1)
	7104	The class of context-free grammars that Bison (like most other parser
	7105	generators) can handle; a subset of @acronym{LR}(1). @xref{Mystery
	7106	Conflicts, ,Mysterious Reduce/Reduce Conflicts}.
	7107
	7108	@item @acronym{LR}(1)
	7109	The class of context-free grammars in which at most one token of
	7110	look-ahead is needed to disambiguate the parsing of any piece of input.
	7111
	7112	@item Nonterminal symbol
	7113	A grammar symbol standing for a grammatical construct that can
	7114	be expressed through rules in terms of smaller constructs; in other
	7115	words, a construct that is not a token. @xref{Symbols}.
	7116
	7117	@item Parser
	7118	A function that recognizes valid sentences of a language by analyzing
	7119	the syntax structure of a set of tokens passed to it from a lexical
	7120	analyzer.
	7121
	7122	@item Postfix operator
	7123	An arithmetic operator that is placed after the operands upon which it
	7124	performs some operation.
	7125
	7126	@item Reduction
	7127	Replacing a string of nonterminals and/or terminals with a single
	7128	nonterminal, according to a grammar rule. @xref{Algorithm, ,The Bison
	7129	Parser Algorithm}.
	7130
	7131	@item Reentrant
	7132	A reentrant subprogram is a subprogram which can be invoked any
	7133	number of times in parallel, without interference between the various
	7134	invocations. @xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	7135
	7136	@item Reverse Polish notation
	7137	A language in which all operators are postfix operators.
	7138
	7139	@item Right recursion
	7140	A rule whose result symbol is also its last component symbol; for
	7141	example, @samp{expseq1: exp ',' expseq1;}. @xref{Recursion, ,Recursive
	7142	Rules}.
	7143
	7144	@item Semantics
	7145	In computer languages, the semantics are specified by the actions
	7146	taken for each instance of the language, i.e., the meaning of
	7147	each statement. @xref{Semantics, ,Defining Language Semantics}.
	7148
	7149	@item Shift
	7150	A parser is said to shift when it makes the choice of analyzing
	7151	further input from the stream rather than reducing immediately some
	7152	already-recognized rule. @xref{Algorithm, ,The Bison Parser Algorithm}.
	7153
	7154	@item Single-character literal
	7155	A single character that is recognized and interpreted as is.
	7156	@xref{Grammar in Bison, ,From Formal Rules to Bison Input}.
	7157
	7158	@item Start symbol
	7159	The nonterminal symbol that stands for a complete valid utterance in
	7160	the language being parsed. The start symbol is usually listed as the
	7161	first nonterminal symbol in a language specification.
	7162	@xref{Start Decl, ,The Start-Symbol}.
	7163
	7164	@item Symbol table
	7165	A data structure where symbol names and associated data are stored
	7166	during parsing to allow for recognition and use of existing
	7167	information in repeated uses of a symbol. @xref{Multi-function Calc}.
	7168
	7169	@item Syntax error
	7170	An error encountered during parsing of an input stream due to invalid
	7171	syntax. @xref{Error Recovery}.
	7172
	7173	@item Token
	7174	A basic, grammatically indivisible unit of a language. The symbol
	7175	that describes a token in the grammar is a terminal symbol.
	7176	The input of the Bison parser is a stream of tokens which comes from
	7177	the lexical analyzer. @xref{Symbols}.
	7178
	7179	@item Terminal symbol
	7180	A grammar symbol that has no rules in the grammar and therefore is
	7181	grammatically indivisible. The piece of text it represents is a token.
	7182	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	7183	@end table
	7184
	7185	@node Copying This Manual
	7186	@appendix Copying This Manual
	7187
	7188	@menu
	7189	* GNU Free Documentation License:: License for copying this manual.
	7190	@end menu
	7191
	7192	@include fdl.texi
	7193
	7194	@node Index
	7195	@unnumbered Index
	7196
	7197	@printindex cp
	7198
	7199	@bye
	7200
	7201	@c LocalWords: texinfo setfilename settitle setchapternewpage finalout
	7202	@c LocalWords: ifinfo smallbook shorttitlepage titlepage GPL FIXME iftex
	7203	@c LocalWords: akim fn cp syncodeindex vr tp synindex dircategory direntry
	7204	@c LocalWords: ifset vskip pt filll insertcopying sp ISBN Etienne Suvasa
	7205	@c LocalWords: ifnottex yyparse detailmenu GLR RPN Calc var Decls Rpcalc
	7206	@c LocalWords: rpcalc Lexer Gen Comp Expr ltcalc mfcalc Decl Symtab yylex
	7207	@c LocalWords: yyerror pxref LR yylval cindex dfn LALR samp gpl BNF xref
	7208	@c LocalWords: const int paren ifnotinfo AC noindent emph expr stmt findex
	7209	@c LocalWords: glr YYSTYPE TYPENAME prog dprec printf decl init stmtMerge
	7210	@c LocalWords: pre STDC GNUC endif yy YY alloca lf stddef stdlib YYDEBUG
	7211	@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit
	7212	@c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok
	7213	@c LocalWords: longjmp fprintf stderr preg yylloc YYLTYPE cos ln
	7214	@c LocalWords: smallexample symrec val tptr FNCT fnctptr func struct sym
	7215	@c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof
	7216	@c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum
	7217	@c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype
	7218	@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless typefull yynerrs
	7219	@c LocalWords: yychar yydebug msg YYNTOKENS YYNNTS YYNRULES YYNSTATES
	7220	@c LocalWords: cparse clex deftypefun NE defmac YYACCEPT YYABORT param
	7221	@c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP
	7222	@c LocalWords: YYEMPTY YYRECOVERING yyclearin GE def UMINUS maybeword
	7223	@c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH
	7224	@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm VCG notype
	7225	@c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args
	7226	@c LocalWords: YYPRINTF infile ypp yxx outfile itemx vcg tex leaderfill
	7227	@c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll
	7228	@c LocalWords: yyrestart nbar yytext fst snd osplit ntwo strdup AST
	7229	@c LocalWords: YYSTACK DVI fdl printindex