	1	\input texinfo @c --texinfo--
	2	@comment %**start of header
	3	@setfilename bison.info
	4	@include version.texi
	5	@settitle Bison @value{VERSION}
	6	@setchapternewpage odd
	7
	8	@finalout
	9
	10	@c SMALL BOOK version
	11	@c This edition has been formatted so that you can format and print it in
	12	@c the smallbook format.
	13	@c @smallbook
	14
	15	@c Set following if you have the new `shorttitlepage' command
	16	@c @clear shorttitlepage-enabled
	17	@c @set shorttitlepage-enabled
	18
	19	@c ISPELL CHECK: done, 14 Jan 1993 --bob
	20
	21	@c Check COPYRIGHT dates. should be updated in the titlepage, ifinfo
	22	@c titlepage; should NOT be changed in the GPL. --mew
	23
	24	@c FIXME: I don't understand this `iftex'. Obsolete? --akim.
	25	@iftex
	26	@syncodeindex fn cp
	27	@syncodeindex vr cp
	28	@syncodeindex tp cp
	29	@end iftex
	30	@ifinfo
	31	@synindex fn cp
	32	@synindex vr cp
	33	@synindex tp cp
	34	@end ifinfo
	35	@comment %**end of header
	36
	37	@copying
	38
	39	This manual is for @acronym{GNU} Bison (version @value{VERSION},
	40	@value{UPDATED}), the @acronym{GNU} parser generator.
	41
	42	Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998,
	43	1999, 2000, 2001, 2002 Free Software Foundation, Inc.
	44
	45	@quotation
	46	Permission is granted to copy, distribute and/or modify this document
	47	under the terms of the @acronym{GNU} Free Documentation License,
	48	Version 1.1 or any later version published by the Free Software
	49	Foundation; with no Invariant Sections, with the Front-Cover texts
	50	being ``A @acronym{GNU} Manual,'' and with the Back-Cover Texts as in
	51	(a) below. A copy of the license is included in the section entitled
	52	``@acronym{GNU} Free Documentation License.''
	53
	54	(a) The @acronym{FSF}'s Back-Cover Text is: ``You have freedom to copy
	55	and modify this @acronym{GNU} Manual, like @acronym{GNU} software.
	56	Copies published by the Free Software Foundation raise funds for
	57	@acronym{GNU} development.''
	58	@end quotation
	59	@end copying
	60
	61	@dircategory GNU programming tools
	62	@direntry
	63	* bison: (bison). @acronym{GNU} parser generator (Yacc replacement).
	64	@end direntry
	65
	66	@ifset shorttitlepage-enabled
	67	@shorttitlepage Bison
	68	@end ifset
	69	@titlepage
	70	@title Bison
	71	@subtitle The Yacc-compatible Parser Generator
	72	@subtitle @value{UPDATED}, Bison Version @value{VERSION}
	73
	74	@author by Charles Donnelly and Richard Stallman
	75
	76	@page
	77	@vskip 0pt plus 1filll
	78	@insertcopying
	79	@sp 2
	80	Published by the Free Software Foundation @*
	81	59 Temple Place, Suite 330 @*
	82	Boston, MA 02111-1307 USA @*
	83	Printed copies are available from the Free Software Foundation.@*
	84	@acronym{ISBN} 1-882114-44-2
	85	@sp 2
	86	Cover art by Etienne Suvasa.
	87	@end titlepage
	88
	89	@contents
	90
	91	@ifnottex
	92	@node Top
	93	@top Bison
	94	@insertcopying
	95	@end ifnottex
	96
	97	@menu
	98	* Introduction::
	99	* Conditions::
	100	* Copying:: The @acronym{GNU} General Public License says
	101	how you can copy and share Bison
	102
	103	Tutorial sections:
	104	* Concepts:: Basic concepts for understanding Bison.
	105	* Examples:: Three simple explained examples of using Bison.
	106
	107	Reference sections:
	108	* Grammar File:: Writing Bison declarations and rules.
	109	* Interface:: C-language interface to the parser function @code{yyparse}.
	110	* Algorithm:: How the Bison parser works at run-time.
	111	* Error Recovery:: Writing rules for error recovery.
	112	* Context Dependency:: What to do if your language syntax is too
	113	messy for Bison to handle straightforwardly.
	114	* Debugging:: Understanding or debugging Bison parsers.
	115	* Invocation:: How to run Bison (to produce the parser source file).
	116	* Table of Symbols:: All the keywords of the Bison language are explained.
	117	* Glossary:: Basic concepts are explained.
	118	* FAQ:: Frequently Asked Questions
	119	* Copying This Manual:: License for copying this manual.
	120	* Index:: Cross-references to the text.
	121
	122	@detailmenu --- The Detailed Node Listing ---
	123
	124	The Concepts of Bison
	125
	126	* Language and Grammar:: Languages and context-free grammars,
	127	as mathematical ideas.
	128	* Grammar in Bison:: How we represent grammars for Bison's sake.
	129	* Semantic Values:: Each token or syntactic grouping can have
	130	a semantic value (the value of an integer,
	131	the name of an identifier, etc.).
	132	* Semantic Actions:: Each rule can have an action containing C code.
	133	* Bison Parser:: What are Bison's input and output,
	134	how is the output used?
	135	* Stages:: Stages in writing and running Bison grammars.
	136	* Grammar Layout:: Overall structure of a Bison grammar file.
	137
	138	Examples
	139
	140	* RPN Calc:: Reverse polish notation calculator;
	141	a first example with no operator precedence.
	142	* Infix Calc:: Infix (algebraic) notation calculator.
	143	Operator precedence is introduced.
	144	* Simple Error Recovery:: Continuing after syntax errors.
	145	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	146	* Multi-function Calc:: Calculator with memory and trig functions.
	147	It uses multiple data-types for semantic values.
	148	* Exercises:: Ideas for improving the multi-function calculator.
	149
	150	Reverse Polish Notation Calculator
	151
	152	* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc.
	153	* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation.
	154	* Lexer: Rpcalc Lexer. The lexical analyzer.
	155	* Main: Rpcalc Main. The controlling function.
	156	* Error: Rpcalc Error. The error reporting function.
	157	* Gen: Rpcalc Gen. Running Bison on the grammar file.
	158	* Comp: Rpcalc Compile. Run the C compiler on the output code.
	159
	160	Grammar Rules for @code{rpcalc}
	161
	162	* Rpcalc Input::
	163	* Rpcalc Line::
	164	* Rpcalc Expr::
	165
	166	Location Tracking Calculator: @code{ltcalc}
	167
	168	* Decls: Ltcalc Decls. Bison and C declarations for ltcalc.
	169	* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations.
	170	* Lexer: Ltcalc Lexer. The lexical analyzer.
	171
	172	Multi-Function Calculator: @code{mfcalc}
	173
	174	* Decl: Mfcalc Decl. Bison declarations for multi-function calculator.
	175	* Rules: Mfcalc Rules. Grammar rules for the calculator.
	176	* Symtab: Mfcalc Symtab. Symbol table management subroutines.
	177
	178	Bison Grammar Files
	179
	180	* Grammar Outline:: Overall layout of the grammar file.
	181	* Symbols:: Terminal and nonterminal symbols.
	182	* Rules:: How to write grammar rules.
	183	* Recursion:: Writing recursive rules.
	184	* Semantics:: Semantic values and actions.
	185	* Declarations:: All kinds of Bison declarations are described here.
	186	* Multiple Parsers:: Putting more than one Bison parser in one program.
	187
	188	Outline of a Bison Grammar
	189
	190	* Prologue:: Syntax and usage of the prologue (declarations section).
	191	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	192	* Grammar Rules:: Syntax and usage of the grammar rules section.
	193	* Epilogue:: Syntax and usage of the epilogue (additional code section).
	194
	195	Defining Language Semantics
	196
	197	* Value Type:: Specifying one data type for all semantic values.
	198	* Multiple Types:: Specifying several alternative data types.
	199	* Actions:: An action is the semantic definition of a grammar rule.
	200	* Action Types:: Specifying data types for actions to operate on.
	201	* Mid-Rule Actions:: Most actions go at the end of a rule.
	202	This says when, why and how to use the exceptional
	203	action in the middle of a rule.
	204
	205	Bison Declarations
	206
	207	* Token Decl:: Declaring terminal symbols.
	208	* Precedence Decl:: Declaring terminals with precedence and associativity.
	209	* Union Decl:: Declaring the set of all semantic value types.
	210	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	211	* Destructor Decl:: Declaring how symbols are freed.
	212	* Expect Decl:: Suppressing warnings about shift/reduce conflicts.
	213	* Start Decl:: Specifying the start symbol.
	214	* Pure Decl:: Requesting a reentrant parser.
	215	* Decl Summary:: Table of all Bison declarations.
	216
	217	Parser C-Language Interface
	218
	219	* Parser Function:: How to call @code{yyparse} and what it returns.
	220	* Lexical:: You must supply a function @code{yylex}
	221	which reads tokens.
	222	* Error Reporting:: You must supply a function @code{yyerror}.
	223	* Action Features:: Special features for use in actions.
	224
	225	The Lexical Analyzer Function @code{yylex}
	226
	227	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	228	* Token Values:: How @code{yylex} must return the semantic value
	229	of the token it has read.
	230	* Token Positions:: How @code{yylex} must return the text position
	231	(line number, etc.) of the token, if the
	232	actions want that.
	233	* Pure Calling:: How the calling convention differs
	234	in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	235
	236	The Bison Parser Algorithm
	237
	238	* Look-Ahead:: Parser looks one token ahead when deciding what to do.
	239	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	240	* Precedence:: Operator precedence works by resolving conflicts.
	241	* Contextual Precedence:: When an operator's precedence depends on context.
	242	* Parser States:: The parser is a finite-state-machine with stack.
	243	* Reduce/Reduce:: When two rules are applicable in the same situation.
	244	* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
	245	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	246	* Stack Overflow:: What happens when stack gets full. How to avoid it.
	247
	248	Operator Precedence
	249
	250	* Why Precedence:: An example showing why precedence is needed.
	251	* Using Precedence:: How to specify precedence in Bison grammars.
	252	* Precedence Examples:: How these features are used in the previous example.
	253	* How Precedence:: How they work.
	254
	255	Handling Context Dependencies
	256
	257	* Semantic Tokens:: Token parsing can depend on the semantic context.
	258	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	259	* Tie-in Recovery:: Lexical tie-ins have implications for how
	260	error recovery rules must be written.
	261
	262	Understanding or Debugging Your Parser
	263
	264	* Understanding:: Understanding the structure of your parser.
	265	* Tracing:: Tracing the execution of your parser.
	266
	267	Invoking Bison
	268
	269	* Bison Options:: All the options described in detail,
	270	in alphabetical order by short options.
	271	* Option Cross Key:: Alphabetical list of long options.
	272
	273	Frequently Asked Questions
	274
	275	* Parser Stack Overflow:: Breaking the Stack Limits
	276
	277	Copying This Manual
	278
	279	* GNU Free Documentation License:: License for copying this manual.
	280
	281	@end detailmenu
	282	@end menu
	283
	284	@node Introduction
	285	@unnumbered Introduction
	286	@cindex introduction
	287
	288	@dfn{Bison} is a general-purpose parser generator that converts a
	289	grammar description for an @acronym{LALR}(1) context-free grammar into a C
	290	program to parse that grammar. Once you are proficient with Bison,
	291	you may use it to develop a wide range of language parsers, from those
	292	used in simple desk calculators to complex programming languages.
	293
	294	Bison is upward compatible with Yacc: all properly-written Yacc grammars
	295	ought to work with Bison with no change. Anyone familiar with Yacc
	296	should be able to use Bison with little trouble. You need to be fluent in
	297	C programming in order to use Bison or to understand this manual.
	298
	299	We begin with tutorial chapters that explain the basic concepts of using
	300	Bison and show three explained examples, each building on the last. If you
	301	don't know Bison or Yacc, start by reading these chapters. Reference
	302	chapters follow which describe specific aspects of Bison in detail.
	303
	304	Bison was written primarily by Robert Corbett; Richard Stallman made it
	305	Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added
	306	multi-character string literals and other features.
	307
	308	This edition corresponds to version @value{VERSION} of Bison.
	309
	310	@node Conditions
	311	@unnumbered Conditions for Using Bison
	312
	313	As of Bison version 1.24, we have changed the distribution terms for
	314	@code{yyparse} to permit using Bison's output in nonfree programs when
	315	Bison is generating C code for @acronym{LALR}(1) parsers. Formerly, these
	316	parsers could be used only in programs that were free software.
	317
	318	The other @acronym{GNU} programming tools, such as the @acronym{GNU} C
	319	compiler, have never
	320	had such a requirement. They could always be used for nonfree
	321	software. The reason Bison was different was not due to a special
	322	policy decision; it resulted from applying the usual General Public
	323	License to all of the Bison source code.
	324
	325	The output of the Bison utility---the Bison parser file---contains a
	326	verbatim copy of a sizable piece of Bison, which is the code for the
	327	@code{yyparse} function. (The actions from your grammar are inserted
	328	into this function at one point, but the rest of the function is not
	329	changed.) When we applied the @acronym{GPL} terms to the code for
	330	@code{yyparse},
	331	the effect was to restrict the use of Bison output to free software.
	332
	333	We didn't change the terms because of sympathy for people who want to
	334	make software proprietary. @strong{Software should be free.} But we
	335	concluded that limiting Bison's use to free software was doing little to
	336	encourage people to make other software free. So we decided to make the
	337	practical conditions for using Bison match the practical conditions for
	338	using the other @acronym{GNU} tools.
	339
	340	This exception applies only when Bison is generating C code for a
	341	@acronym{LALR}(1) parser; otherwise, the @acronym{GPL} terms operate
	342	as usual. You can
	343	tell whether the exception applies to your @samp{.c} output file by
	344	inspecting it to see whether it says ``As a special exception, when
	345	this file is copied by Bison into a Bison output file, you may use
	346	that output file without restriction.''
	347
	348	@include gpl.texi
	349
	350	@node Concepts
	351	@chapter The Concepts of Bison
	352
	353	This chapter introduces many of the basic concepts without which the
	354	details of Bison will not make sense. If you do not already know how to
	355	use Bison or Yacc, we suggest you start by reading this chapter carefully.
	356
	357	@menu
	358	* Language and Grammar:: Languages and context-free grammars,
	359	as mathematical ideas.
	360	* Grammar in Bison:: How we represent grammars for Bison's sake.
	361	* Semantic Values:: Each token or syntactic grouping can have
	362	a semantic value (the value of an integer,
	363	the name of an identifier, etc.).
	364	* Semantic Actions:: Each rule can have an action containing C code.
	365	* GLR Parsers:: Writing parsers for general context-free languages
	366	* Locations Overview:: Tracking Locations.
	367	* Bison Parser:: What are Bison's input and output,
	368	how is the output used?
	369	* Stages:: Stages in writing and running Bison grammars.
	370	* Grammar Layout:: Overall structure of a Bison grammar file.
	371	@end menu
	372
	373	@node Language and Grammar
	374	@section Languages and Context-Free Grammars
	375
	376	@cindex context-free grammar
	377	@cindex grammar, context-free
	378	In order for Bison to parse a language, it must be described by a
	379	@dfn{context-free grammar}. This means that you specify one or more
	380	@dfn{syntactic groupings} and give rules for constructing them from their
	381	parts. For example, in the C language, one kind of grouping is called an
	382	`expression'. One rule for making an expression might be, ``An expression
	383	can be made of a minus sign and another expression''. Another would be,
	384	``An expression can be an integer''. As you can see, rules are often
	385	recursive, but there must be at least one rule which leads out of the
	386	recursion.
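
For instance, the two informal rules above could be written roughly as
follows in the notation Bison accepts, which the following sections
describe.  This is only a sketch: the token type name @code{INTEGER}
is an assumption made for this illustration.

@example
expr: INTEGER      /* @r{leads out of the recursion} */
    | '-' expr     /* @r{recursive: mentions expr again} */
    ;
@end example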
	387
	388	@cindex @acronym{BNF}
	389	@cindex Backus-Naur form
	390	The most common formal system for presenting such rules for humans to read
	391	is @dfn{Backus-Naur Form} or ``@acronym{BNF}'', which was developed in
	392	order to specify the language Algol 60. Any grammar expressed in
	393	@acronym{BNF} is a context-free grammar. The input to Bison is
	394	essentially machine-readable @acronym{BNF}.
	395
	396	@cindex @acronym{LALR}(1) grammars
	397	@cindex @acronym{LR}(1) grammars
	398	There are various important subclasses of context-free grammar. Although it
	399	can handle almost all context-free grammars, Bison is optimized for what
	400	are called @acronym{LALR}(1) grammars.
	401	In brief, in these grammars, it must be possible to
	402	tell how to parse any portion of an input string with just a single
	403	token of look-ahead. Strictly speaking, that is a description of an
	404	@acronym{LR}(1) grammar, and @acronym{LALR}(1) involves additional
	405	restrictions that are
	406	hard to explain simply; but it is rare in actual practice to find an
	407	@acronym{LR}(1) grammar that fails to be @acronym{LALR}(1).
	408	@xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}, for
	409	more information on this.
	410
	411	@cindex @acronym{GLR} parsing
	412	@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
	413	@cindex ambiguous grammars
	414	@cindex non-deterministic parsing
	415
	416	Parsers for @acronym{LALR}(1) grammars are @dfn{deterministic}, meaning
	417	roughly that the next grammar rule to apply at any point in the input is
	418	uniquely determined by the preceding input and a fixed, finite portion
	419	(called a @dfn{look-ahead}) of the remaining input. A context-free
	420	grammar can be @dfn{ambiguous}, meaning that there are multiple ways to
	421	apply the grammar rules to get the same inputs. Even unambiguous
	422	grammars can be @dfn{non-deterministic}, meaning that no fixed
	423	look-ahead always suffices to determine the next grammar rule to apply.
	424	With the proper declarations, Bison is also able to parse these more
	425	general context-free grammars, using a technique known as @acronym{GLR}
	426	parsing (for Generalized @acronym{LR}). Bison's @acronym{GLR} parsers
	427	are able to handle any context-free grammar for which the number of
	428	possible parses of any given string is finite.
	429
	430	@cindex symbols (abstract)
	431	@cindex token
	432	@cindex syntactic grouping
	433	@cindex grouping, syntactic
	434	In the formal grammatical rules for a language, each kind of syntactic
	435	unit or grouping is named by a @dfn{symbol}. Those which are built by
	436	grouping smaller constructs according to grammatical rules are called
	437	@dfn{nonterminal symbols}; those which can't be subdivided are called
	438	@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
	439	corresponding to a single terminal symbol a @dfn{token}, and a piece
	440	corresponding to a single nonterminal symbol a @dfn{grouping}.
	441
	442	We can use the C language as an example of what symbols, terminal and
	443	nonterminal, mean. The tokens of C are identifiers, constants (numeric
	444	and string), and the various keywords, arithmetic operators and
	445	punctuation marks. So the terminal symbols of a grammar for C include
	446	`identifier', `number', `string', plus one symbol for each keyword,
	447	operator or punctuation mark: `if', `return', `const', `static', `int',
	448	`char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
	449	(These tokens can be subdivided into characters, but that is a matter of
	450	lexicography, not grammar.)
	451
	452	Here is a simple C function subdivided into tokens:
	453
	454	@ifinfo
	455	@example
	456	int /* @r{keyword `int'} */
	457	square (int x) /* @r{identifier, open-paren, keyword `int',}
	458	@r{identifier, close-paren} */
	459	@{ /* @r{open-brace} */
	460	return x * x; /* @r{keyword `return', identifier, asterisk,
	461	identifier, semicolon} */
	462	@} /* @r{close-brace} */
	463	@end example
	464	@end ifinfo
	465	@ifnotinfo
	466	@example
	467	int /* @r{keyword `int'} */
	468	square (int x) /* @r{identifier, open-paren, keyword `int', identifier, close-paren} */
	469	@{ /* @r{open-brace} */
	470	return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */
	471	@} /* @r{close-brace} */
	472	@end example
	473	@end ifnotinfo
	474
	475	The syntactic groupings of C include the expression, the statement, the
	476	declaration, and the function definition. These are represented in the
	477	grammar of C by nonterminal symbols `expression', `statement',
	478	`declaration' and `function definition'. The full grammar uses dozens of
	479	additional language constructs, each with its own nonterminal symbol, in
	480	order to express the meanings of these four. The example above is a
	481	function definition; it contains one declaration, and one statement. In
	482	the statement, each @samp{x} is an expression and so is @samp{x * x}.
	483
	484	Each nonterminal symbol must have grammatical rules showing how it is made
	485	out of simpler constructs. For example, one kind of C statement is the
	486	@code{return} statement; this would be described with a grammar rule which
	487	reads informally as follows:
	488
	489	@quotation
	490	A `statement' can be made of a `return' keyword, an `expression' and a
	491	`semicolon'.
	492	@end quotation
	493
	494	@noindent
	495	There would be many other rules for `statement', one for each kind of
	496	statement in C.
	497
	498	@cindex start symbol
	499	One nonterminal symbol must be distinguished as the special one which
	500	defines a complete utterance in the language. It is called the @dfn{start
	501	symbol}. In a compiler, this means a complete input program. In the C
	502	language, the nonterminal symbol `sequence of definitions and declarations'
	503	plays this role.
	504
	505	For example, @samp{1 + 2} is a valid C expression---a valid part of a C
	506	program---but it is not valid as an @emph{entire} C program. In the
	507	context-free grammar of C, this follows from the fact that `expression' is
	508	not the start symbol.
	509
	510	The Bison parser reads a sequence of tokens as its input, and groups the
	511	tokens using the grammar rules. If the input is valid, the end result is
	512	that the entire token sequence reduces to a single grouping whose symbol is
	513	the grammar's start symbol. If we use a grammar for C, the entire input
	514	must be a `sequence of definitions and declarations'. If not, the parser
	515	reports a syntax error.
	516
	517	@node Grammar in Bison
	518	@section From Formal Rules to Bison Input
	519	@cindex Bison grammar
	520	@cindex grammar, Bison
	521	@cindex formal grammar
	522
	523	A formal grammar is a mathematical construct. To define the language
	524	for Bison, you must write a file expressing the grammar in Bison syntax:
	525	a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}.
	526
	527	A nonterminal symbol in the formal grammar is represented in Bison input
	528	as an identifier, like an identifier in C@. By convention, it should be
	529	in lower case, such as @code{expr}, @code{stmt} or @code{declaration}.
	530
	531	The Bison representation for a terminal symbol is also called a @dfn{token
	532	type}. Token types can also be represented as C-like identifiers. By
	533	convention, these identifiers should be upper case to distinguish them from
	534	nonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or
	535	@code{RETURN}. A terminal symbol that stands for a particular keyword in
	536	the language should be named after that keyword converted to upper case.
	537	The terminal symbol @code{error} is reserved for error recovery.
	538	@xref{Symbols}.
	539
	540	A terminal symbol can also be represented as a character literal, just like
	541	a C character constant. You should do this whenever a token is just a
	542	single character (parenthesis, plus-sign, etc.): use that same character in
	543	a literal as the terminal symbol for that token.
	544
	545	A third way to represent a terminal symbol is with a C string constant
	546	containing several characters. @xref{Symbols}, for more information.
	547
	548	The grammar rules are also expressed in Bison syntax. For example,
	549	here is the Bison rule for a C @code{return} statement. The semicolon in
	550	quotes is a literal character token, representing part of the C syntax for
	551	the statement; the naked semicolon, and the colon, are Bison punctuation
	552	used in every rule.
	553
	554	@example
	555	stmt: RETURN expr ';'
	556	;
	557	@end example
	558
	559	@noindent
	560	@xref{Rules, ,Syntax of Grammar Rules}.
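
To tie the three notations for terminal symbols together, here is a
small sketch that uses all of them in one grammar.  The names
@code{INTEGER}, @code{RETURN} and @code{LE} are hypothetical, chosen
only for this illustration.

@example
%token INTEGER RETURN   /* @r{named token types, upper case} */
%token LE "<="          /* @r{a token type with a string alias} */

%%

stmt: RETURN expr ';'   /* @r{the quoted semicolon is a character literal} */
    ;

expr: INTEGER
    | expr "<=" INTEGER /* @r{the quoted operator is a string literal} */
    ;
@end example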
	561
	562	@node Semantic Values
	563	@section Semantic Values
	564	@cindex semantic value
	565	@cindex value, semantic
	566
	567	A formal grammar selects tokens only by their classifications: for example,
	568	if a rule mentions the terminal symbol `integer constant', it means that
	569	@emph{any} integer constant is grammatically valid in that position. The
	570	precise value of the constant is irrelevant to how to parse the input: if
	571	@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equally
	572	grammatical.
	573
	574	But the precise value is very important for what the input means once it is
	575	parsed. A compiler is useless if it fails to distinguish between 4, 1 and
	576	3989 as constants in the program! Therefore, each token in a Bison grammar
	577	has both a token type and a @dfn{semantic value}. @xref{Semantics,
	578	,Defining Language Semantics},
	579	for details.
	580
	581	The token type is a terminal symbol defined in the grammar, such as
	582	@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everything
	583	you need to know to decide where the token may validly appear and how to
	584	group it with other tokens. The grammar rules know nothing about tokens
	585	except their types.
	586
	587	The semantic value has all the rest of the information about the
	588	meaning of the token, such as the value of an integer, or the name of an
	589	identifier. (A token such as @code{','} which is just punctuation doesn't
	590	need to have any semantic value.)
	591
	592	For example, an input token might be classified as token type
	593	@code{INTEGER} and have the semantic value 4. Another input token might
	594	have the same token type @code{INTEGER} but value 3989. When a grammar
	595	rule says that @code{INTEGER} is allowed, either of these tokens is
	596	acceptable because each is an @code{INTEGER}. When the parser accepts the
	597	token, it keeps track of the token's semantic value.
	598
	599	Each grouping can have a semantic value as well as its nonterminal
	600	symbol. For example, in a calculator, an expression typically has a
	601	semantic value that is a number. In a compiler for a programming
	602	language, an expression typically has a semantic value that is a tree
	603	structure describing the meaning of the expression.
	604
	605	@node Semantic Actions
	606	@section Semantic Actions
	607	@cindex semantic actions
	608	@cindex actions, semantic
	609
	610	In order to be useful, a program must do more than parse input; it must
	611	also produce some output based on the input. In a Bison grammar, a grammar
	612	rule can have an @dfn{action} made up of C statements. Each time the
	613	parser recognizes a match for that rule, the action is executed.
	614	@xref{Actions}.
	615
	616	Most of the time, the purpose of an action is to compute the semantic value
	617	of the whole construct from the semantic values of its parts. For example,
	618	suppose we have a rule which says an expression can be the sum of two
	619	expressions. When the parser recognizes such a sum, each of the
	620	subexpressions has a semantic value which describes how it was built up.
	621	The action for this rule should create a similar sort of value for the
	622	newly recognized larger expression.
	623
	624	For example, here is a rule that says an expression can be the sum of
	625	two subexpressions:
	626
	627	@example
	628	expr: expr '+' expr @{ $$ = $1 + $3; @}
	629	;
	630	@end example
	631
	632	@noindent
	633	The action says how to produce the semantic value of the sum expression
	634	from the values of the two subexpressions.
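
The same notation extends to rules with other numbers of components:
@code{$@var{n}} stands for the semantic value of the @var{n}th
component of the rule, and @code{$$} for the value of the grouping
being built.  Here is a sketch, assuming a token type @code{INTEGER}
whose semantic value is a number; the nonterminal @code{term} is
likewise introduced only for this illustration.

@example
expr: term             @{ $$ = $1; @}
    | expr '+' term    @{ $$ = $1 + $3; @}
    ;

term: INTEGER          @{ $$ = $1; @}
    | '-' term         @{ $$ = -$2; @}
    ;
@end example

@noindent
In the last rule, the minus sign is component 1 and the operand is
component 2, so the action negates @code{$2}.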
	635
	636	@node GLR Parsers
	637	@section Writing @acronym{GLR} Parsers
	638	@cindex @acronym{GLR} parsing
	639	@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
	640	@findex %glr-parser
	641	@cindex conflicts
	642	@cindex shift/reduce conflicts
	643
	644	In some grammars, there will be cases where Bison's standard
	645	@acronym{LALR}(1) parsing algorithm cannot decide whether to apply a
	646	certain grammar rule at a given point. That is, it may not be able to
	647	decide (on the basis of the input read so far) which of two possible
	648	reductions (applications of a grammar rule) applies, or whether to apply
	649	a reduction or read more of the input and apply a reduction later in the
	650	input. These are known respectively as @dfn{reduce/reduce} conflicts
	651	(@pxref{Reduce/Reduce}), and @dfn{shift/reduce} conflicts
	652	(@pxref{Shift/Reduce}).
	653
	654	To use a grammar that is not easily modified to be @acronym{LALR}(1), a
	655	more general parsing algorithm is sometimes necessary. If you include
	656	@code{%glr-parser} among the Bison declarations in your file
	657	(@pxref{Grammar Outline}), the result will be a Generalized @acronym{LR}
	658	(@acronym{GLR}) parser. These parsers handle Bison grammars that
	659	contain no unresolved conflicts (i.e., after applying precedence
	660	declarations) identically to @acronym{LALR}(1) parsers. However, when
	661	faced with unresolved shift/reduce and reduce/reduce conflicts,
	662	@acronym{GLR} parsers use the simple expedient of doing both,
	663	effectively cloning the parser to follow both possibilities. Each of
	664	the resulting parsers can again split, so that at any given time, there
	665	can be any number of possible parses being explored. The parsers
	666	proceed in lockstep; that is, all of them consume (shift) a given input
	667	symbol before any of them proceed to the next. Each of the cloned
	668	parsers eventually meets one of two possible fates: either it runs into
	669	a parsing error, in which case it simply vanishes, or it merges with
	670	another parser, because the two of them have reduced the input to an
	671	identical set of symbols.
	672
	673	During the time that there are multiple parsers, semantic actions are
	674	recorded, but not performed. When a parser disappears, its recorded
	675	semantic actions disappear as well, and are never performed. When a
	676	reduction makes two parsers identical, causing them to merge, Bison
	677	records both sets of semantic actions. Whenever the last two parsers
	678	merge, reverting to the single-parser case, Bison resolves all the
	679	outstanding actions either by precedences given to the grammar rules
	680	involved, or by performing both actions, and then calling a designated
	681	user-defined function on the resulting values to produce an arbitrary
	682	merged result.
	683
	684	Let's consider an example, vastly simplified from a C++ grammar.
	685
	686	@example
	687	%@{
	688	#define YYSTYPE const char*
	689	%@}
	690
	691	%token TYPENAME ID
	692
	693	%right '='
	694	%left '+'
	695
	696	%glr-parser
	697
	698	%%
	699
	700	prog :
	701	     | prog stmt   @{ printf ("\n"); @}
	702	     ;
	703
	704	stmt : expr ';'  %dprec 1
	705	     | decl      %dprec 2
	706	     ;
	707
	708	expr : ID               @{ printf ("%s ", $$); @}
	709	     | TYPENAME '(' expr ')'
	710	                        @{ printf ("%s <cast> ", $1); @}
	711	     | expr '+' expr    @{ printf ("+ "); @}
	712	     | expr '=' expr    @{ printf ("= "); @}
	713	     ;
	714
	715	decl : TYPENAME declarator ';'
	716	                        @{ printf ("%s <declare> ", $1); @}
	717	     | TYPENAME declarator '=' expr ';'
	718	                        @{ printf ("%s <init-declare> ", $1); @}
	719	     ;
	720
	721	declarator : ID         @{ printf ("\"%s\" ", $1); @}
	722	     | '(' declarator ')'
	723	     ;
	724	@end example
	725
	726	@noindent
	727	This models a problematic part of the C++ grammar---the ambiguity between
	728	certain declarations and statements. For example,
	729
	730	@example
	731	T (x) = y+z;
	732	@end example
	733
	734	@noindent
	735	parses as either an @code{expr} or a @code{decl}
	736	(assuming that @samp{T} is recognized as a @code{TYPENAME} and
	737	@samp{x} as an @code{ID}).
	738	Bison detects this as a reduce/reduce conflict between the rules
	739	@code{expr : ID} and @code{declarator : ID}, which it cannot resolve at the
	740	time it encounters @code{x} in the example above. The two @code{%dprec}
	741	declarations, however, give precedence to interpreting the example as a
	742	@code{decl}, which implies that @code{x} is a declarator.
	743	The parser therefore prints
	744
	745	@example
	746	"x" y z + T <init-declare>
	747	@end example
	748
	749	Consider a different input string for this parser:
	750
	751	@example
	752	T (x) + y;
	753	@end example
	754
	755	@noindent
	756	Here, there is no ambiguity (this cannot be parsed as a declaration).
	757	However, at the time the Bison parser encounters @code{x}, it does not
	758	have enough information to resolve the reduce/reduce conflict (again,
	759	between @code{x} as an @code{expr} or a @code{declarator}). In this
	760	case, no precedence declaration is used. Instead, the parser splits
	761	into two, one assuming that @code{x} is an @code{expr}, and the other
	762	assuming @code{x} is a @code{declarator}. The second of these parsers
	763	then vanishes when it sees @code{+}, and the parser prints
	764
	765	@example
	766	x T <cast> y +
	767	@end example
	768
	769	Suppose that instead of resolving the ambiguity, you wanted to see all
	770	the possibilities. For this purpose, you must @dfn{merge} the semantic
	771	actions of the two possible parsers, rather than choosing one over the
	772	other. To do so, you could change the declaration of @code{stmt} as
	773	follows:
	774
	775	@example
	776	stmt : expr ';' %merge <stmtMerge>
	777	| decl %merge <stmtMerge>
	778	;
	779	@end example
	780
	781	@noindent
	783	and define the @code{stmtMerge} function as:
	784
	785	@example
	786	static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1)
	787	@{
	788	printf ("<OR> ");
	789	return "";
	790	@}
	791	@end example
	792
	793	@noindent
	794	with an accompanying forward declaration
	795	in the C declarations at the beginning of the file:
	796
	797	@example
	798	%@{
	799	#define YYSTYPE const char*
	800	static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);
	801	%@}
	802	@end example
	803
	804	@noindent
	805	With these declarations, the resulting parser will parse the first example
	806	as both an @code{expr} and a @code{decl}, and print
	807
	808	@example
	809	"x" y z + T <init-declare> x T <cast> y z + = <OR>
	810	@end example
	811
	812	@sp 1
	813
	814	@cindex @code{inline}
	815	@cindex @acronym{GLR} parsers and @code{inline}
	816	Note that the @acronym{GLR} parsers require an ISO C89 compiler. In
	817	addition, they use the @code{inline} keyword, which is not C89, but a
	818	common extension. It is up to the user of these parsers to handle
	819	portability issues. For instance, if using Autoconf and the Autoconf
	820	macro @code{AC_C_INLINE}, a mere
	821
	822	@example
	823	%@{
	824	#include <config.h>
	825	%@}
	826	@end example
	827
	828	@noindent
	829	will suffice. Otherwise, we suggest
	830
	831	@example
	832	%@{
	833	#if ! defined __GNUC__ && ! defined inline
	834	# define inline
	835	#endif
	836	%@}
	837	@end example
	838
	839	@node Locations Overview
	840	@section Locations
	841	@cindex location
	842	@cindex textual position
	843	@cindex position, textual
	844
	845	Many applications, like interpreters or compilers, have to produce verbose
	846	and useful error messages. To achieve this, one must be able to keep track of
	847	the @dfn{textual position}, or @dfn{location}, of each syntactic construct.
	848	Bison provides a mechanism for handling these locations.
	849
	850	Each token has a semantic value. In a similar fashion, each token has an
	851	associated location, but the type of locations is the same for all tokens and
	852	groupings. Moreover, the output parser is equipped with a default data
	853	structure for storing locations (@pxref{Locations}, for more details).
	854
	855	Like semantic values, locations can be reached in actions using a dedicated
	856	set of constructs. In the example above, the location of the whole grouping
	857	is @code{@@$}, while the locations of the subexpressions are @code{@@1} and
	858	@code{@@3}.
	859
	860	When a rule is matched, a default action is used to compute the semantic value
	861	of its left hand side (@pxref{Actions}). In the same way, another default
	862	action is used for locations. However, the action for locations is general
	863	enough for most cases, meaning there is usually no need to describe for each
	864	rule how @code{@@$} should be formed. When building a new location for a given
	865	grouping, the default behavior of the output parser is to take the beginning
	866	of the first symbol, and the end of the last symbol.
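
As a rough illustration of that default, the sketch below spells out the
computation in plain C. The structure mirrors the four fields of Bison's
default location type (@code{first_line}, @code{first_column},
@code{last_line} and @code{last_column}); the names @code{struct span}
and @code{span_of_rule} are hypothetical and are not part of Bison's
interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the parser's location type.  */
struct span {
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Build the location of a grouping from the locations of its N
   components (N >= 1): it begins where the first component begins
   and ends where the last component ends.  */
static struct span span_of_rule (const struct span *rhs, size_t n) {
  struct span result;

  assert (rhs != NULL && n >= 1);
  result.first_line = rhs[0].first_line;
  result.first_column = rhs[0].first_column;
  result.last_line = rhs[n - 1].last_line;
  result.last_column = rhs[n - 1].last_column;
  return result;
}
```

For a rule with three components, the result plays the role of
@code{@@$}: it starts at the beginning of @code{@@1} and stops at the
end of @code{@@3}.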
	867
	868	@node Bison Parser
	869	@section Bison Output: the Parser File
	870	@cindex Bison parser
	871	@cindex Bison utility
	872	@cindex lexical analyzer, purpose
	873	@cindex parser
	874
	875	When you run Bison, you give it a Bison grammar file as input. The output
	876	is a C source file that parses the language described by the grammar.
	877	This file is called a @dfn{Bison parser}. Keep in mind that the Bison
	878	utility and the Bison parser are two distinct programs: the Bison utility
	879	is a program whose output is the Bison parser that becomes part of your
	880	program.
	881
	882	The job of the Bison parser is to group tokens into groupings according to
	883	the grammar rules---for example, to build identifiers and operators into
	884	expressions. As it does this, it runs the actions for the grammar rules it
	885	uses.
	886
	887	The tokens come from a function called the @dfn{lexical analyzer} that
	888	you must supply in some fashion (such as by writing it in C). The Bison
	889	parser calls the lexical analyzer each time it wants a new token. It
	890	doesn't know what is ``inside'' the tokens (though their semantic values
	891	may reflect this). Typically the lexical analyzer makes the tokens by
	892	parsing characters of text, but Bison does not depend on this.
	893	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	894
	895	The Bison parser file is C code which defines a function named
	896	@code{yyparse} which implements that grammar. This function does not make
	897	a complete C program: you must supply some additional functions. One is
	898	the lexical analyzer. Another is an error-reporting function which the
	899	parser calls to report an error. In addition, a complete C program must
	900	start with a function called @code{main}; you have to provide this, and
	901	arrange for it to call @code{yyparse} or the parser will never run.
	902	@xref{Interface, ,Parser C-Language Interface}.
	903
	904	Aside from the token type names and the symbols in the actions you
	905	write, all symbols defined in the Bison parser file itself
	906	begin with @samp{yy} or @samp{YY}. This includes interface functions
	907	such as the lexical analyzer function @code{yylex}, the error reporting
	908	function @code{yyerror} and the parser function @code{yyparse} itself.
	909	This also includes numerous identifiers used for internal purposes.
	910	Therefore, you should avoid using C identifiers starting with @samp{yy}
	911	or @samp{YY} in the Bison grammar file except for the ones defined in
	912	this manual.
	913
	914	In some cases the Bison parser file includes system headers, and in
	915	those cases your code should respect the identifiers reserved by those
	916	headers. On some non-@acronym{GNU} hosts, @code{<alloca.h>},
	917	@code{<stddef.h>}, and @code{<stdlib.h>} are included as needed to
	918	declare memory allocators and related types. Other system headers may
	919	be included if you define @code{YYDEBUG} to a nonzero value
	920	(@pxref{Tracing, ,Tracing Your Parser}).
	921
	922	@node Stages
	923	@section Stages in Using Bison
	924	@cindex stages in using Bison
	925	@cindex using Bison
	926
	927	The actual language-design process using Bison, from grammar specification
	928	to a working compiler or interpreter, has these parts:
	929
	930	@enumerate
	931	@item
	932	Formally specify the grammar in a form recognized by Bison
	933	(@pxref{Grammar File, ,Bison Grammar Files}). For each grammatical rule
	934	in the language, describe the action that is to be taken when an
	935	instance of that rule is recognized. The action is described by a
	936	sequence of C statements.
	937
	938	@item
	939	Write a lexical analyzer to process input and pass tokens to the parser.
	940	The lexical analyzer may be written by hand in C (@pxref{Lexical, ,The
	941	Lexical Analyzer Function @code{yylex}}). It could also be produced
	942	using Lex, but the use of Lex is not discussed in this manual.
	943
	944	@item
	945	Write a controlling function that calls the Bison-produced parser.
	946
	947	@item
	948	Write error-reporting routines.
	949	@end enumerate
	950
	951	To turn this source code as written into a runnable program, you
	952	must follow these steps:
	953
	954	@enumerate
	955	@item
	956	Run Bison on the grammar to produce the parser.
	957
	958	@item
	959	Compile the code output by Bison, as well as any other source files.
	960
	961	@item
	962	Link the object files to produce the finished product.
	963	@end enumerate
	964
	965	@node Grammar Layout
	966	@section The Overall Layout of a Bison Grammar
	967	@cindex grammar file
	968	@cindex file format
	969	@cindex format of grammar file
	970	@cindex layout of Bison grammar
	971
	972	The input file for the Bison utility is a @dfn{Bison grammar file}. The
	973	general form of a Bison grammar file is as follows:
	974
	975	@example
	976	%@{
	977	@var{Prologue}
	978	%@}
	979
	980	@var{Bison declarations}
	981
	982	%%
	983	@var{Grammar rules}
	984	%%
	985	@var{Epilogue}
	986	@end example
	987
	988	@noindent
	989	The @samp{%%}, @samp{%@{} and @samp{%@}} are punctuation that appears
	990	in every Bison grammar file to separate the sections.
	991
	992	The prologue may define types and variables used in the actions. You can
	993	also use preprocessor commands to define macros used there, and use
	994	@code{#include} to include header files that do any of these things.
	995
	996	The Bison declarations declare the names of the terminal and nonterminal
	997	symbols, and may also describe operator precedence and the data types of
	998	semantic values of various symbols.
	999
	1000	The grammar rules define how to construct each nonterminal symbol from its
	1001	parts.
	1002
	1003	The epilogue can contain any code you want to use. Often the definition of
	1004	the lexical analyzer @code{yylex} goes here, plus subroutines called by the
	1005	actions in the grammar rules. In a simple program, all the rest of the
	1006	program can go here.
	1007
	1008	@node Examples
	1009	@chapter Examples
	1010	@cindex simple examples
	1011	@cindex examples, simple
	1012
	1013	Now we show and explain three sample programs written using Bison: a
	1014	reverse polish notation calculator, an algebraic (infix) notation
	1015	calculator, and a multi-function calculator. All three have been tested
	1016	under BSD Unix 4.3; each produces a usable, though limited, interactive
	1017	desk-top calculator.
	1018
	1019	These examples are simple, but Bison grammars for real programming
	1020	languages are written the same way.
	1021	@ifinfo
	1022	You can copy these examples out of the Info file and into a source file
	1023	to try them.
	1024	@end ifinfo
	1025
	1026	@menu
	1027	* RPN Calc:: Reverse polish notation calculator;
	1028	a first example with no operator precedence.
	1029	* Infix Calc:: Infix (algebraic) notation calculator.
	1030	Operator precedence is introduced.
	1031	* Simple Error Recovery:: Continuing after syntax errors.
	1032	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	1033	* Multi-function Calc:: Calculator with memory and trig functions.
	1034	It uses multiple data-types for semantic values.
	1035	* Exercises:: Ideas for improving the multi-function calculator.
	1036	@end menu
	1037
	1038	@node RPN Calc
	1039	@section Reverse Polish Notation Calculator
	1040	@cindex reverse polish notation
	1041	@cindex polish notation calculator
	1042	@cindex @code{rpcalc}
	1043	@cindex calculator, simple
	1044
	1045	The first example is that of a simple double-precision @dfn{reverse polish
	1046	notation} calculator (a calculator using postfix operators). This example
	1047	provides a good starting point, since operator precedence is not an issue.
	1048	The second example will illustrate how operator precedence is handled.
	1049
	1050	The source code for this calculator is named @file{rpcalc.y}. The
	1051	@samp{.y} extension is a convention used for Bison input files.
	1052
	1053	@menu
	1054	* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc.
	1055	* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation.
	1056	* Lexer: Rpcalc Lexer. The lexical analyzer.
	1057	* Main: Rpcalc Main. The controlling function.
	1058	* Error: Rpcalc Error. The error reporting function.
	1059	* Gen: Rpcalc Gen. Running Bison on the grammar file.
	1060	* Comp: Rpcalc Compile. Run the C compiler on the output code.
	1061	@end menu
	1062
	1063	@node Rpcalc Decls
	1064	@subsection Declarations for @code{rpcalc}
	1065
	1066	Here are the C and Bison declarations for the reverse polish notation
	1067	calculator. As in C, comments are placed between @samp{/*@dots{}*/}.
	1068
	1069	@example
	1070	/* Reverse polish notation calculator. */
	1071
	1072	%@{
	1073	#define YYSTYPE double
	1074	#include <math.h>
	1075	%@}
	1076
	1077	%token NUM
	1078
	1079	%% /* Grammar rules and actions follow. */
	1080	@end example
	1081
	1082	The declarations section (@pxref{Prologue, , The prologue}) contains two
	1083	preprocessor directives.
	1084
	1085	The @code{#define} directive defines the macro @code{YYSTYPE}, thus
	1086	specifying the C data type for semantic values of both tokens and
	1087	groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The
	1088	Bison parser will use whatever type @code{YYSTYPE} is defined as; if you
	1089	don't define it, @code{int} is the default. Because we specify
	1090	@code{double}, each token and each expression has an associated value,
	1091	which is a floating point number.
	1092
	1093	The @code{#include} directive is used to declare the exponentiation
	1094	function @code{pow}.
	1095
	1096	The second section, Bison declarations, provides information to Bison
	1097	about the token types (@pxref{Bison Declarations, ,The Bison
	1098	Declarations Section}). Each terminal symbol that is not a
	1099	single-character literal must be declared here. (Single-character
	1100	literals normally don't need to be declared.) In this example, all the
	1101	arithmetic operators are designated by single-character literals, so the
	1102	only terminal symbol that needs to be declared is @code{NUM}, the token
	1103	type for numeric constants.
	1104
	1105	@node Rpcalc Rules
	1106	@subsection Grammar Rules for @code{rpcalc}
	1107
	1108	Here are the grammar rules for the reverse polish notation calculator.
	1109
	1110	@example
	1111	input: /* empty */
	1112	| input line
	1113	;
	1114
	1115	line: '\n'
	1116	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1117	;
	1118
	1119	exp: NUM @{ $$ = $1; @}
	1120	| exp exp '+' @{ $$ = $1 + $2; @}
	1121	| exp exp '-' @{ $$ = $1 - $2; @}
	1122	| exp exp '*' @{ $$ = $1 * $2; @}
	1123	| exp exp '/' @{ $$ = $1 / $2; @}
	1124	/* Exponentiation */
	1125	| exp exp '^' @{ $$ = pow ($1, $2); @}
	1126	/* Unary minus */
	1127	| exp 'n' @{ $$ = -$1; @}
	1128	;
	1129	%%
	1130	@end example
	1131
	1132	The groupings of the rpcalc ``language'' defined here are the expression
	1133	(given the name @code{exp}), the line of input (@code{line}), and the
	1134	complete input transcript (@code{input}). Each of these nonterminal
	1135	symbols has several alternate rules, joined by the @samp{|} punctuator
	1136	which is read as ``or''. The following sections explain what these rules
	1137	mean.
	1138
	1139	The semantics of the language is determined by the actions taken when a
	1140	grouping is recognized. The actions are the C code that appears inside
	1141	braces. @xref{Actions}.
	1142
	1143	You must specify these actions in C, but Bison provides the means for
	1144	passing semantic values between the rules. In each action, the
	1145	pseudo-variable @code{$$} stands for the semantic value for the grouping
	1146	that the rule is going to construct. Assigning a value to @code{$$} is the
	1147	main job of most actions. The semantic values of the components of the
	1148	rule are referred to as @code{$1}, @code{$2}, and so on.
	1149
	1150	@menu
	1151	* Rpcalc Input::
	1152	* Rpcalc Line::
	1153	* Rpcalc Expr::
	1154	@end menu
	1155
	1156	@node Rpcalc Input
	1157	@subsubsection Explanation of @code{input}
	1158
	1159	Consider the definition of @code{input}:
	1160
	1161	@example
	1162	input: /* empty */
	1163	| input line
	1164	;
	1165	@end example
	1166
	1167	This definition reads as follows: ``A complete input is either an empty
	1168	string, or a complete input followed by an input line''. Notice that
	1169	``complete input'' is defined in terms of itself. This definition is said
	1170	to be @dfn{left recursive} since @code{input} always appears as the
	1171	leftmost symbol in the right hand side. @xref{Recursion, ,Recursive Rules}.
	1172
	1173	The first alternative is empty because there are no symbols between the
	1174	colon and the first @samp{|}; this means that @code{input} can match an
	1175	empty string of input (no tokens). We write the rules this way because it
	1176	is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
	1177	It's conventional to put an empty alternative first and write the comment
	1178	@samp{/* empty */} in it.
	1179
	1180	The second alternate rule (@code{input line}) handles all nontrivial input.
	1181	It means, ``After reading any number of lines, read one more line if
	1182	possible.'' The left recursion makes this rule into a loop. Since the
	1183	first alternative matches empty input, the loop can be executed zero or
	1184	more times.
	1185
	1186	The parser function @code{yyparse} continues to process input until a
	1187	grammatical error is seen or the lexical analyzer says there are no more
	1188	input tokens; we will arrange for the latter to happen at end-of-input.
	1189
	1190	@node Rpcalc Line
	1191	@subsubsection Explanation of @code{line}
	1192
	1193	Now consider the definition of @code{line}:
	1194
	1195	@example
	1196	line: '\n'
	1197	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1198	;
	1199	@end example
	1200
	1201	The first alternative is a token which is a newline character; this means
	1202	that rpcalc accepts a blank line (and ignores it, since there is no
	1203	action). The second alternative is an expression followed by a newline.
	1204	This is the alternative that makes rpcalc useful. The semantic value of
	1205	the @code{exp} grouping is the value of @code{$1} because the @code{exp} in
	1206	question is the first symbol in the alternative. The action prints this
	1207	value, which is the result of the computation the user asked for.
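
The @code{%.10g} conversion in this action prints at most ten significant
digits and drops trailing zeros, so whole-number results appear without
a decimal point. The following is a minimal sketch of that formatting in
isolation; @code{format_result} is a hypothetical helper, not part of
rpcalc, and it assumes a C99 library that provides @code{snprintf}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format VALUE the way the action for `line'
   prints it, into BUF of SIZE bytes.  Return the number of
   characters written, not counting the terminating null.  */
static int format_result (char *buf, size_t size, double value) {
  int n;

  assert (buf != NULL && size > 0);
  n = snprintf (buf, size, "\t%.10g\n", value);
  /* snprintf reports the length it needed; check that it fit.  */
  assert (n >= 0 && (size_t) n < size);
  return n;
}
```

Called with the value 13, the helper produces a tab, the text @samp{13}
and a newline, with no fractional part.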
	1208
	1209	This action is unusual because it does not assign a value to @code{$$}. As
	1210	a consequence, the semantic value associated with the @code{line} is
	1211	uninitialized (its value will be unpredictable). This would be a bug if
	1212	that value were ever used, but we don't use it: once rpcalc has printed the
	1213	value of the user's input line, that value is no longer needed.
	1214
	1215	@node Rpcalc Expr
	1216	@subsubsection Explanation of @code{exp}
	1217
	1218	The @code{exp} grouping has several rules, one for each kind of expression.
	1219	The first rule handles the simplest expressions: those that are just numbers.
	1220	The second handles an addition-expression, which looks like two expressions
	1221	followed by a plus-sign. The third handles subtraction, and so on.
	1222
	1223	@example
	1224	exp: NUM
	1225	| exp exp '+' @{ $$ = $1 + $2; @}
	1226	| exp exp '-' @{ $$ = $1 - $2; @}
	1227	@dots{}
	1228	;
	1229	@end example
	1230
	1231	We have used @samp{|} to join all the rules for @code{exp}, but we could
	1232	equally well have written them separately:
	1233
	1234	@example
	1235	exp: NUM ;
	1236	exp: exp exp '+' @{ $$ = $1 + $2; @} ;
	1237	exp: exp exp '-' @{ $$ = $1 - $2; @} ;
	1238	@dots{}
	1239	@end example
	1240
	1241	Most of the rules have actions that compute the value of the expression in
	1242	terms of the value of its parts. For example, in the rule for addition,
	1243	@code{$1} refers to the first component @code{exp} and @code{$2} refers to
	1244	the second one. The third component, @code{'+'}, has no meaningful
	1245	associated semantic value, but if it had one you could refer to it as
	1246	@code{$3}. When @code{yyparse} recognizes a sum expression using this
	1247	rule, the sum of the two subexpressions' values is produced as the value of
	1248	the entire expression. @xref{Actions}.
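
To see how the values flow, it can help to imitate these actions by hand.
The sketch below is not the code Bison generates. It is a small
hand-written evaluator that keeps semantic values on an explicit stack:
for each operator it pops the two operand values (the roles of @code{$1}
and @code{$2}) and pushes the result (the role of @code{$$}). The names
@code{rpn_eval} and @code{MAX_DEPTH} are hypothetical, and exponentiation
is left out so that the sketch does not depend on the math library.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define MAX_DEPTH 64   /* Hypothetical limit on pending operands.  */

/* Evaluate the RPN expression in TEXT and store the result in
   *RESULT.  Return 0 on success, -1 on malformed input.  */
static int rpn_eval (const char *text, double *result) {
  double stack[MAX_DEPTH];
  int depth = 0;
  const char *p = text;

  assert (text != NULL && result != NULL);
  for (;;) {
    while (*p == ' ' || *p == '\t')
      p++;
    if (*p == '\0')
      break;
    if (*p == '.' || isdigit ((unsigned char) *p)) {
      /* exp: NUM -- push the number's value.  */
      char *end;
      double value = strtod (p, &end);
      if (end == p || depth == MAX_DEPTH)
        return -1;
      stack[depth++] = value;
      p = end;
    } else if (*p == 'n') {
      /* exp: exp 'n' -- negate the top value in place.  */
      if (depth < 1)
        return -1;
      stack[depth - 1] = -stack[depth - 1];
      p++;
    } else {
      /* exp: exp exp OP -- pop $2, then $1, and push $$.  */
      double lhs, rhs;
      if (depth < 2)
        return -1;
      rhs = stack[--depth];
      lhs = stack[--depth];
      switch (*p++) {
        case '+': stack[depth++] = lhs + rhs; break;
        case '-': stack[depth++] = lhs - rhs; break;
        case '*': stack[depth++] = lhs * rhs; break;
        case '/': stack[depth++] = lhs / rhs; break;
        default: return -1;
      }
    }
  }
  /* A complete expression leaves exactly one value.  */
  if (depth != 1)
    return -1;
  *result = stack[0];
  return 0;
}
```

Given the text @samp{4 9 +}, the function stores 13 in @code{*result}.
Input such as @samp{4 +}, which no rule for @code{exp} matches, is
rejected.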
	1249
	1250	You don't have to give an action for every rule. When a rule has no
	1251	action, Bison by default copies the value of @code{$1} into @code{$$}.
	1252	This is what happens in the first rule (the one that uses @code{NUM}).
	1253
	1254	The formatting shown here is the recommended convention, but Bison does
	1255	not require it. You can add or change white space as much as you wish.
	1256	For example, this:
	1257
	1258	@example
	1259	exp : NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{}
	1260	@end example
	1261
	1262	@noindent
	1263	means the same thing as this:
	1264
	1265	@example
	1266	exp: NUM
	1267	| exp exp '+' @{ $$ = $1 + $2; @}
	1268	| @dots{}
	1269	@end example
	1270
	1271	@noindent
	1272	The latter, however, is much more readable.
	1273
	1274	@node Rpcalc Lexer
	1275	@subsection The @code{rpcalc} Lexical Analyzer
	1276	@cindex writing a lexical analyzer
	1277	@cindex lexical analyzer, writing
	1278
	1279	The lexical analyzer's job is low-level parsing: converting characters
	1280	or sequences of characters into tokens. The Bison parser gets its
	1281	tokens by calling the lexical analyzer. @xref{Lexical, ,The Lexical
	1282	Analyzer Function @code{yylex}}.
	1283
	1284	Only a simple lexical analyzer is needed for the @acronym{RPN}
	1285	calculator. This
	1286	lexical analyzer skips blanks and tabs, then reads in numbers as
	1287	@code{double} and returns them as @code{NUM} tokens. Any other character
	1288	that isn't part of a number is a separate token. Note that the token-code
	1289	for such a single-character token is the character itself.
	1290
	1291	The return value of the lexical analyzer function is a numeric code which
	1292	represents a token type. The same text used in Bison rules to stand for
	1293	this token type is also a C expression for the numeric code for the type.
	1294	This works in two ways. If the token type is a character literal, then its
	1295	numeric code is that of the character; you can use the same
	1296	character literal in the lexical analyzer to express the number. If the
	1297	token type is an identifier, that identifier is defined by Bison as a C
	1298	macro whose definition is the appropriate number. In this example,
	1299	therefore, @code{NUM} becomes a macro for @code{yylex} to use.
	1300
	1301	The semantic value of the token (if it has one) is stored into the
	1302	global variable @code{yylval}, which is where the Bison parser will look
	1303	for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was
	1304	defined at the beginning of the grammar; @pxref{Rpcalc Decls,
	1305	,Declarations for @code{rpcalc}}.)
	1306
	1307	A token type code of zero is returned if the end-of-input is encountered.
	1308	(Bison recognizes any nonpositive value as indicating end-of-input.)
	1309
	1310	Here is the code for the lexical analyzer:
	1311
	1312	@example
	1313	@group
	1314	/* The lexical analyzer stores a double floating point
	1315	number in yylval and returns the token NUM, or returns the
	1316	numeric code of the character read if not a number. It skips
	1317	all blanks and tabs, and returns 0 for end-of-input. */
	1318
	1319	#include <ctype.h>
	1320	@end group
	1321
	1322	@group
	1323	int
	1324	yylex (void)
	1325	@{
	1326	int c;
	1327
	1328	/* Skip white space. */
	1329	while ((c = getchar ()) == ' ' || c == '\t')
	1330	;
	1331	@end group
	1332	@group
	1333	/* Process numbers. */
	1334	if (c == '.' || isdigit (c))
	1335	@{
	1336	ungetc (c, stdin);
	1337	scanf ("%lf", &yylval);
	1338	return NUM;
	1339	@}
	1340	@end group
	1341	@group
	1342	/* Return end-of-input. */
	1343	if (c == EOF)
	1344	return 0;
	1345	/* Return a single char. */
	1346	return c;
	1347	@}
	1348	@end group
	1349	@end example
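
For experimenting with the lexical analyzer outside a generated parser,
the same logic can be written against a string instead of @code{stdin}.
The following is a sketch under stated assumptions: @code{NUM} is given
the arbitrary value 258 here (in a real parser, Bison supplies the
definition), and @code{struct scanner}, @code{scan_token} and the
@code{value} field are hypothetical stand-ins for the input stream,
@code{yylex} and @code{yylval}.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Assumption: stands in for the token code that Bison would define
   for NUM; any value above the character range would do.  */
enum { NUM = 258 };

/* Hypothetical scanner state: the unread input and the semantic
   value of the most recent NUM token (the role of yylval).  */
struct scanner {
  const char *input;
  double value;
};

/* Return the next token code: NUM for a number, the character
   itself for any other character, and 0 at end of input.  */
static int scan_token (struct scanner *s) {
  assert (s != NULL && s->input != NULL);

  /* Skip white space.  */
  while (*s->input == ' ' || *s->input == '\t')
    s->input++;

  /* Return end-of-input.  */
  if (*s->input == '\0')
    return 0;

  /* Process numbers.  */
  if (*s->input == '.' || isdigit ((unsigned char) *s->input)) {
    char *end;
    double value = strtod (s->input, &end);
    if (end != s->input) {
      s->value = value;
      s->input = end;
      return NUM;
    }
  }

  /* Return a single char.  */
  return (unsigned char) *s->input++;
}
```

Scanning the text @samp{4 9 +} followed by a newline yields @code{NUM},
@code{NUM}, @code{'+'}, @code{'\n'} and finally 0. This shows both kinds
of token code: a macro-like name for numbers, and the character itself
for everything else.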
	1350
	1351	@node Rpcalc Main
	1352	@subsection The Controlling Function
	1353	@cindex controlling function
	1354	@cindex main function in simple example
	1355
	1356	In keeping with the spirit of this example, the controlling function is
	1357	kept to the bare minimum. The only requirement is that it call
	1358	@code{yyparse} to start the process of parsing.
	1359
	1360	@example
	1361	@group
	1362	int
	1363	main (void)
	1364	@{
	1365	return yyparse ();
	1366	@}
	1367	@end group
	1368	@end example
	1369
	1370	@node Rpcalc Error
	1371	@subsection The Error Reporting Routine
	1372	@cindex error reporting routine
	1373
	1374	When @code{yyparse} detects a syntax error, it calls the error reporting
	1375	function @code{yyerror} to print an error message (usually but not
	1376	always @code{"syntax error"}). It is up to the programmer to supply
	1377	@code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so
	1378	here is the definition we will use:
	1379
	1380	@example
	1381	@group
	1382	#include <stdio.h>
	1383
	1384	void
	1385	yyerror (const char *s) /* Called by yyparse on error. */
	1386	@{
	1387	printf ("%s\n", s);
	1388	@}
	1389	@end group
	1390	@end example
	1391
	1392	After @code{yyerror} returns, the Bison parser may recover from the error
	1393	and continue parsing if the grammar contains a suitable error rule
	1394	(@pxref{Error Recovery}). Otherwise, @code{yyparse} returns nonzero. We
	1395	have not written any error rules in this example, so any invalid input will
	1396	cause the calculator program to exit. This is not clean behavior for a
	1397	real calculator, but it is adequate for the first example.
	1398
	1399	@node Rpcalc Gen
	1400	@subsection Running Bison to Make the Parser
	1401	@cindex running Bison (introduction)
	1402
	1403	Before running Bison to produce a parser, we need to decide how to
	1404	arrange all the source code in one or more source files. For such a
	1405	simple example, the easiest thing is to put everything in one file. The
	1406	definitions of @code{yylex}, @code{yyerror} and @code{main} go at the
	1407	end, in the epilogue of the file
	1408	(@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}).
	1409
	1410	For a large project, you would probably have several source files, and use
	1411	@code{make} to arrange to recompile them.
	1412
	1413	With all the source in a single file, you use the following command to
	1414	convert it into a parser file:
	1415
	1416	@example
	1417	bison @var{file_name}.y
	1418	@end example
	1419
	1420	@noindent
	1421	In this example the file was called @file{rpcalc.y} (for ``Reverse Polish
	1422	@sc{calc}ulator''). Bison produces a file named @file{@var{file_name}.tab.c},
	1423	removing the @samp{.y} from the original file name. The file output by
	1424	Bison contains the source code for @code{yyparse}. The additional
	1425	functions in the input file (@code{yylex}, @code{yyerror} and @code{main})
	1426	are copied verbatim to the output.
	1427
	1428	@node Rpcalc Compile
	1429	@subsection Compiling the Parser File
	1430	@cindex compiling the parser
	1431
	1432	Here is how to compile and run the parser file:
	1433
	1434	@example
	1435	@group
	1436	# @r{List files in current directory.}
	1437	$ @kbd{ls}
	1438	rpcalc.tab.c rpcalc.y
	1439	@end group
	1440
	1441	@group
	1442	# @r{Compile the Bison parser.}
	1443	# @r{@samp{-lm} tells compiler to search math library for @code{pow}.}
	1444	$ @kbd{cc rpcalc.tab.c -lm -o rpcalc}
	1445	@end group
	1446
	1447	@group
	1448	# @r{List files again.}
	1449	$ @kbd{ls}
	1450	rpcalc rpcalc.tab.c rpcalc.y
	1451	@end group
	1452	@end example
	1453
	1454	The file @file{rpcalc} now contains the executable code. Here is an
	1455	example session using @code{rpcalc}.
	1456
	1457	@example
	1458	$ @kbd{rpcalc}
	1459	@kbd{4 9 +}
	1460	13
	1461	@kbd{3 7 + 3 4 5 *+-}
	1462	-13
	1463	@kbd{3 7 + 3 4 5 * + - n} @r{Note the unary minus, @samp{n}}
	1464	13
	1465	@kbd{5 6 / 4 n +}
	1466	-3.166666667
	1467	@kbd{3 4 ^} @r{Exponentiation}
	1468	81
	1469	@kbd{^D} @r{End-of-file indicator}
	1470	$
	1471	@end example
	1472
	1473	@node Infix Calc
	1474	@section Infix Notation Calculator: @code{calc}
	1475	@cindex infix notation calculator
	1476	@cindex @code{calc}
	1477	@cindex calculator, infix notation
	1478
	1479	We now modify rpcalc to handle infix operators instead of postfix. Infix
	1480	notation involves the concept of operator precedence and the need for
	1481	parentheses nested to arbitrary depth. Here is the Bison code for
	1482	@file{calc.y}, an infix desk-top calculator.
	1483
	1484	@example
	1485	/* Infix notation calculator--calc */
	1486
	1487	%@{
	1488	#define YYSTYPE double
	1489	#include <math.h>
	1490	%@}
	1491
	1492	/* Bison Declarations */
	1493	%token NUM
	1494	%left '-' '+'
	1495	%left '*' '/'
	1496	%left NEG /* negation--unary minus */
	1497	%right '^' /* exponentiation */
	1498
	1499	/* Grammar follows */
	1500	%%
	1501	input: /* empty string */
	1502	| input line
	1503	;
	1504
	1505	line: '\n'
	1506	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1507	;
	1508
	1509	exp: NUM @{ $$ = $1; @}
	1510	| exp '+' exp @{ $$ = $1 + $3; @}
	1511	| exp '-' exp @{ $$ = $1 - $3; @}
	1512	| exp '*' exp @{ $$ = $1 * $3; @}
	1513	| exp '/' exp @{ $$ = $1 / $3; @}
	1514	| '-' exp %prec NEG @{ $$ = -$2; @}
	1515	| exp '^' exp @{ $$ = pow ($1, $3); @}
	1516	| '(' exp ')' @{ $$ = $2; @}
	1517	;
	1518	%%
	1519	@end example
	1520
	1521	@noindent
	1522	The functions @code{yylex}, @code{yyerror} and @code{main} can be the
	1523	same as before.
	1524
	1525	There are two important new features shown in this code.
	1526
	1527	In the second section (Bison declarations), @code{%left} declares token
	1528	types and says they are left-associative operators. The declarations
	1529	@code{%left} and @code{%right} (right associativity) take the place of
	1530	@code{%token} which is used to declare a token type name without
	1531	associativity. (These tokens are single-character literals, which
	1532	ordinarily don't need to be declared. We declare them here to specify
	1533	the associativity.)
	1534
	1535	Operator precedence is determined by the line ordering of the
	1536	declarations; the higher the line number of the declaration (lower on
	1537	the page or screen), the higher the precedence. Hence, exponentiation
	1538	has the highest precedence, unary minus (@code{NEG}) is next, followed
	1539	by @samp{*} and @samp{/}, and so on. @xref{Precedence, ,Operator
	1540	Precedence}.
	1541
	1542	The other important new feature is the @code{%prec} in the grammar
	1543	section for the unary minus operator. The @code{%prec} simply instructs
	1544	Bison that the rule @samp{| '-' exp} has the same precedence as
	1545	@code{NEG}---in this case the next-to-highest. @xref{Contextual
	1546	Precedence, ,Context-Dependent Precedence}.
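
The effect of these declarations can be checked by hand. The following
is a minimal sketch, not the parser Bison generates: a hand-written
precedence-climbing evaluator in C whose levels are assumed to mirror the
declaration order above (1 for @samp{+} and @samp{-}, 2 for @samp{*} and
@samp{/}, 3 for @code{NEG}, 4 for @samp{^}). All function names are
hypothetical, and exponents are truncated to integers so that the sketch
does not depend on the math library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical hand-written evaluator; Bison generates table-driven code.
   Levels: 1 = '+' '-', 2 = '*' '/', 3 = NEG, 4 = '^'.  */
static const char *cursor;

static double parse_exp (int min_level);

static void
skip_blanks (void)
{
  while (*cursor == ' ' || *cursor == '\t')
    ++cursor;
}

/* Level of a binary operator, or 0 if OP is not one.  */
static int
binary_level (char op)
{
  if (op == '+' || op == '-') return 1;
  if (op == '*' || op == '/') return 2;
  return op == '^' ? 4 : 0;
}

/* Integer power; stands in for pow so that libm is not needed.  */
static double
int_pow (double base, double exponent)
{
  long n = (long) exponent;
  double result = 1.0;
  for (long i = (n < 0 ? -n : n); i > 0; --i)
    result *= base;
  return n < 0 ? 1.0 / result : result;
}

/* Parse a number, a parenthesized expression, or a unary minus.  */
static double
parse_primary (void)
{
  double value;
  char *end;
  skip_blanks ();
  if (*cursor == '(')
    {
      ++cursor;
      value = parse_exp (1);
      skip_blanks ();
      if (*cursor == ')')
        ++cursor;
      return value;
    }
  if (*cursor == '-')
    {
      /* NEG is level 3: the operand may absorb '^' but not '*'.  */
      ++cursor;
      return -parse_exp (3);
    }
  value = strtod (cursor, &end);
  cursor = end;
  return value;
}

/* Parse operators whose level is at least MIN_LEVEL.  */
static double
parse_exp (int min_level)
{
  double lhs = parse_primary ();
  for (;;)
    {
      skip_blanks ();
      char op = *cursor;
      int level = binary_level (op);
      if (level == 0 || level < min_level)
        return lhs;
      ++cursor;
      /* Right-associative '^' lets the same level recur on its right;
         left-associative operators demand a strictly higher level.  */
      double rhs = parse_exp (op == '^' ? level : level + 1);
      if (op == '+') lhs += rhs;
      else if (op == '-') lhs -= rhs;
      else if (op == '*') lhs *= rhs;
      else if (op == '/') lhs /= rhs;
      else lhs = int_pow (lhs, rhs);
    }
}

double
evaluate (const char *text)
{
  assert (text != NULL);
  cursor = text;
  return parse_exp (1);
}
```

With these levels, @samp{2 ^ 3 ^ 2} groups as @samp{2 ^ (3 ^ 2)} because
@samp{^} is right-associative, @samp{8 - 3 - 2} groups as @samp{(8 - 3) - 2},
and @samp{-2 ^ 2} negates the result of the exponentiation because
@code{NEG} ranks below @samp{^}.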
	1547
	1548	Here is a sample run of @file{calc.y}:
	1549
	1550	@need 500
	1551	@example
	1552	$ @kbd{calc}
	1553	@kbd{4 + 4.5 - (34/(8*3+-3))}
	1554	6.880952381
	1555	@kbd{-56 + 2}
	1556	-54
	1557	@kbd{3 ^ 2}
	1558	9
	1559	@end example
	1560
	1561	@node Simple Error Recovery
	1562	@section Simple Error Recovery
	1563	@cindex error recovery, simple
	1564
	1565	Up to this point, this manual has not addressed the issue of @dfn{error
	1566	recovery}---how to continue parsing after the parser detects a syntax
	1567	error. All we have handled is error reporting with @code{yyerror}.
	1568	Recall that by default @code{yyparse} returns after calling
	1569	@code{yyerror}. This means that an erroneous input line causes the
	1570	calculator program to exit. Now we show how to rectify this deficiency.
	1571
	1572	The Bison language itself includes the reserved word @code{error}, which
	1573	may be included in the grammar rules. In the example below it has
	1574	been added to one of the alternatives for @code{line}:
	1575
	1576	@example
	1577	@group
	1578	line: '\n'
	1579	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1580	| error '\n' @{ yyerrok; @}
	1581	;
	1582	@end group
	1583	@end example
	1584
	1585	This addition to the grammar allows for simple error recovery in the
	1586	event of a syntax error. If an expression that cannot be evaluated is
	1587	read, the error will be recognized by the third rule for @code{line},
	1588	and parsing will continue. (The @code{yyerror} function is still called
	1589	upon to print its message as well.) The action executes the statement
	1590	@code{yyerrok}, a macro defined automatically by Bison; its meaning is
	1591	that error recovery is complete (@pxref{Error Recovery}). Note the
	1592	difference between @code{yyerrok} and @code{yyerror}; neither one is a
	1593	misprint.
	1594
	1595	This form of error recovery deals with syntax errors. There are other
	1596	kinds of errors; for example, division by zero, which raises an exception
	1597	signal that is normally fatal. A real calculator program must handle this
	1598	signal and use @code{longjmp} to return to @code{main} and resume parsing
	1599	input lines; it would also have to discard the rest of the current line of
	1600	input. We won't discuss this issue further because it is not specific to
	1601	Bison programs.
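
The @code{longjmp} idea mentioned above can be sketched without any
signal handling. This is only an illustration under stated assumptions:
the divisor is tested explicitly instead of catching a signal, and the
names @code{recover_point}, @code{checked_divide} and
@code{evaluate_line} are hypothetical and not part of Bison.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical recovery point, set once per input line.  */
static jmp_buf recover_point;

/* Divide, or abandon the current line when the divisor is zero.  */
static int
checked_divide (int dividend, int divisor)
{
  if (divisor == 0)
    longjmp (recover_point, 1);
  return dividend / divisor;
}

/* Evaluate (a / b) / c for one "line".  Return 0 and store the
   quotient in *RESULT on success, or -1 if the line was abandoned.  */
int
evaluate_line (int a, int b, int c, int *result)
{
  assert (result != 0);
  if (setjmp (recover_point) != 0)
    return -1;              /* Arrived here through longjmp.  */
  *result = checked_divide (checked_divide (a, b), c);
  return 0;
}
```

A caller in the position of @code{main} would loop over input lines,
calling such a function once per line and discarding the rest of a line
whenever it reports failure.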
	1602
	1603	@node Location Tracking Calc
	1604	@section Location Tracking Calculator: @code{ltcalc}
	1605	@cindex location tracking calculator
	1606	@cindex @code{ltcalc}
	1607	@cindex calculator, location tracking
	1608
	1609	This example extends the infix notation calculator with location
	1610	tracking. This feature will be used to improve the error messages. For
	1611	the sake of clarity, this example is a simple integer calculator, since
	1612	most of the work needed to use locations will be done in the lexical
	1613	analyzer.
	1614
	1615	@menu
	1616	* Decls: Ltcalc Decls. Bison and C declarations for ltcalc.
	1617	* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations.
	1618	* Lexer: Ltcalc Lexer. The lexical analyzer.
	1619	@end menu
	1620
	1621	@node Ltcalc Decls
	1622	@subsection Declarations for @code{ltcalc}
	1623
	1624	The C and Bison declarations for the location tracking calculator match
	1625	those of the infix notation calculator, except that @code{YYSTYPE} is @code{int}.
	1626
	1627	@example
	1628	/* Location tracking calculator. */
	1629
	1630	%@{
	1631	#define YYSTYPE int
	1632	#include <math.h>
	1633	%@}
	1634
	1635	/* Bison declarations. */
	1636	%token NUM
	1637
	1638	%left '-' '+'
	1639	%left '*' '/'
	1640	%left NEG
	1641	%right '^'
	1642
	1643	%% /* Grammar follows */
	1644	@end example
	1645
	1646	@noindent
	1647	Note there are no declarations specific to locations. Defining a data
	1648	type for storing locations is not needed: we will use the type provided
	1649	by default (@pxref{Location Type, ,Data Types of Locations}), which is a
	1650	four member structure with the following integer fields:
	1651	@code{first_line}, @code{first_column}, @code{last_line} and
	1652	@code{last_column}.
	1653
	1654	@node Ltcalc Rules
	1655	@subsection Grammar Rules for @code{ltcalc}
	1656
	1657	Whether or not you track locations has no effect on the syntax of your
	1658	language. Therefore, grammar rules for this example will be very close
	1659	to those of the previous example: we will only modify them to benefit
	1660	from the new information.
	1661
	1662	Here, we will use locations to report divisions by zero, and locate the
	1663	wrong expressions or subexpressions.
	1664
	1665	@example
	1666	@group
	1667	input : /* empty */
	1668	| input line
	1669	;
	1670	@end group
	1671
	1672	@group
	1673	line : '\n'
	1674	| exp '\n' @{ printf ("%d\n", $1); @}
	1675	;
	1676	@end group
	1677
	1678	@group
	1679	exp : NUM @{ $$ = $1; @}
	1680	| exp '+' exp @{ $$ = $1 + $3; @}
	1681	| exp '-' exp @{ $$ = $1 - $3; @}
	1682	| exp '*' exp @{ $$ = $1 * $3; @}
	1683	@end group
	1684	@group
	1685	| exp '/' exp
	1686	@{
	1687	if ($3)
	1688	$$ = $1 / $3;
	1689	else
	1690	@{
	1691	$$ = 1;
	1692	fprintf (stderr, "%d.%d-%d.%d: division by zero\n",
	1693	@@3.first_line, @@3.first_column,
	1694	@@3.last_line, @@3.last_column);
	1695	@}
	1696	@}
	1697	@end group
	1698	@group
	1699	| '-' exp %prec NEG @{ $$ = -$2; @}
	1700	| exp '^' exp @{ $$ = pow ($1, $3); @}
	1701	| '(' exp ')' @{ $$ = $2; @}
	1702	@end group
	1703	@end example
	1704
	1705	This code shows how to reach locations inside of semantic actions, by
	1706	using the pseudo-variables @code{@@@var{n}} for rule components, and the
	1707	pseudo-variable @code{@@$} for groupings.
	1708
	1709	We don't need to assign a value to @code{@@$}: the output parser does it
	1710	automatically. By default, before executing the C code of each action,
	1711	@code{@@$} is set to range from the beginning of @code{@@1} to the end
	1712	of @code{@@@var{n}}, for a rule with @var{n} components. This behavior
	1713	can be redefined (@pxref{Location Default Action, , Default Action for
	1714	Locations}), and for very specific rules, @code{@@$} can be computed by
	1715	hand.
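
The default computation described above can be written out in C. This is
a sketch of the idea rather than the code Bison emits: the type name
@code{location} stands in for @code{YYLTYPE}, the function name
@code{span_location} is hypothetical, and rules with no components are
left out of scope.

```c
#include <assert.h>

/* Stand-in for the default YYLTYPE, with the four fields listed
   earlier.  The type name `location' is hypothetical.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Location of a grouping built from the N components rhs[0] ..
   rhs[n - 1]: it starts where the first component starts and ends
   where the last one ends.  Requires n >= 1.  */
location
span_location (const location *rhs, int n)
{
  location result;
  assert (rhs != 0 && n >= 1);
  result.first_line = rhs[0].first_line;
  result.first_column = rhs[0].first_column;
  result.last_line = rhs[n - 1].last_line;
  result.last_column = rhs[n - 1].last_column;
  return result;
}
```

For the rule @samp{exp '/' exp}, the resulting location would thus run
from the start of the dividend to the end of the divisor.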
	1716
	1717	@node Ltcalc Lexer
	1718	@subsection The @code{ltcalc} Lexical Analyzer
	1719
	1720	Until now, we relied on Bison's defaults to enable location
	1721	tracking. The next step is to rewrite the lexical analyzer, and make it
	1722	able to feed the parser with the token locations, as it already does for
	1723	semantic values.
	1724
	1725	To this end, we must take into account every single character of the
	1726	input text, so that the computed locations are never fuzzy or wrong:
	1727
	1728	@example
	1729	@group
	1730	int
	1731	yylex (void)
	1732	@{
	1733	int c;
	1734	@end group
	1735
	1736	@group
	1737	/* Skip white space. */
	1738	while ((c = getchar ()) == ' ' || c == '\t')
	1739	++yylloc.last_column;
	1740	@end group
	1741
	1742	@group
	1743	/* Step. */
	1744	yylloc.first_line = yylloc.last_line;
	1745	yylloc.first_column = yylloc.last_column;
	1746	@end group
	1747
	1748	@group
	1749	/* Process numbers. */
	1750	if (isdigit (c))
	1751	@{
	1752	yylval = c - '0';
	1753	++yylloc.last_column;
	1754	while (isdigit (c = getchar ()))
	1755	@{
	1756	++yylloc.last_column;
	1757	yylval = yylval * 10 + c - '0';
	1758	@}
	1759	ungetc (c, stdin);
	1760	return NUM;
	1761	@}
	1762	@end group
	1763
	1764	/* Return end-of-input. */
	1765	if (c == EOF)
	1766	return 0;
	1767
	1768	/* Return a single char, and update location. */
	1769	if (c == '\n')
	1770	@{
	1771	++yylloc.last_line;
	1772	yylloc.last_column = 0;
	1773	@}
	1774	else
	1775	++yylloc.last_column;
	1776	return c;
	1777	@}
	1778	@end example
	1779
	1780	Basically, the lexical analyzer performs the same processing as before:
	1781	it skips blanks and tabs, and reads numbers or single-character tokens.
	1782	In addition, it updates @code{yylloc}, the global variable (of type
	1783	@code{YYLTYPE}) containing the token's location.
	1784
	1785	Now, each time this function returns a token, the parser has its number
	1786	as well as its semantic value, and its location in the text. The last
	1787	needed change is to initialize @code{yylloc}, for example in the
	1788	controlling function:
	1789
	1790	@example
	1791	@group
	1792	int
	1793	main (void)
	1794	@{
	1795	yylloc.first_line = yylloc.last_line = 1;
	1796	yylloc.first_column = yylloc.last_column = 0;
	1797	return yyparse ();
	1798	@}
	1799	@end group
	1800	@end example
	1801
	1802	Remember that computing locations is not a matter of syntax. Every
	1803	character must be associated with a location update, whether it is in
	1804	valid input, in comments, in literal strings, and so on.
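
One way to make sure no character is skipped is to route every character
read through a single update routine. The sketch below assumes the same
conventions as the @code{ltcalc} lexer (a newline starts a new line at
column 0, any other character advances one column); the names
@code{location}, @code{advance_location} and @code{end_of_text} are
hypothetical.

```c
#include <assert.h>

/* Stand-in for the default YYLTYPE.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Advance the end of LOC past the character C, whatever C is:
   blanks, comment text and string contents all count.  */
void
advance_location (location *loc, int c)
{
  assert (loc != 0);
  if (c == '\n')
    {
      ++loc->last_line;
      loc->last_column = 0;
    }
  else
    ++loc->last_column;
}

/* Return the location reached after reading all of TEXT, starting
   from line 1, column 0.  */
location
end_of_text (const char *text)
{
  location loc = { 1, 0, 1, 0 };
  for (; *text != '\0'; ++text)
    advance_location (&loc, (unsigned char) *text);
  return loc;
}
```

A lexer that discards comments would still call the update routine for
each discarded character, so that later tokens keep accurate positions.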
	1805
	1806	@node Multi-function Calc
	1807	@section Multi-Function Calculator: @code{mfcalc}
	1808	@cindex multi-function calculator
	1809	@cindex @code{mfcalc}
	1810	@cindex calculator, multi-function
	1811
	1812	Now that the basics of Bison have been discussed, it is time to move on to
	1813	a more advanced problem. The above calculators provided only five
	1814	functions, @samp{+}, @samp{-}, @samp{*}, @samp{/} and @samp{^}. It would
	1815	be nice to have a calculator that provides other mathematical functions such
	1816	as @code{sin}, @code{cos}, etc.
	1817
	1818	It is easy to add new operators to the infix calculator as long as they are
	1819	only single-character literals. The lexical analyzer @code{yylex} passes
	1820	back all nonnumber characters as tokens, so new grammar rules suffice for
	1821	adding a new operator. But we want something more flexible: built-in
	1822	functions whose syntax has this form:
	1823
	1824	@example
	1825	@var{function_name} (@var{argument})
	1826	@end example
	1827
	1828	@noindent
	1829	At the same time, we will add memory to the calculator, by allowing you
	1830	to create named variables, store values in them, and use them later.
	1831	Here is a sample session with the multi-function calculator:
	1832
	1833	@example
	1834	$ @kbd{mfcalc}
	1835	@kbd{pi = 3.141592653589}
	1836	3.1415926536
	1837	@kbd{sin(pi)}
	1838	0.0000000000
	1839	@kbd{alpha = beta1 = 2.3}
	1840	2.3000000000
	1841	@kbd{alpha}
	1842	2.3000000000
	1843	@kbd{ln(alpha)}
	1844	0.8329091229
	1845	@kbd{exp(ln(beta1))}
	1846	2.3000000000
	1847	$
	1848	@end example
	1849
	1850	Note that multiple assignment and nested function calls are permitted.
	1851
	1852	@menu
	1853	* Decl: Mfcalc Decl. Bison declarations for multi-function calculator.
	1854	* Rules: Mfcalc Rules. Grammar rules for the calculator.
	1855	* Symtab: Mfcalc Symtab. Symbol table management subroutines.
	1856	@end menu
	1857
	1858	@node Mfcalc Decl
	1859	@subsection Declarations for @code{mfcalc}
	1860
	1861	Here are the C and Bison declarations for the multi-function calculator.
	1862
	1863	@smallexample
	1864	@group
	1865	%@{
	1866	#include <math.h> /* For math functions, cos(), sin(), etc. */
	1867	#include "calc.h" /* Contains definition of `symrec' */
	1868	%@}
	1869	@end group
	1870	@group
	1871	%union @{
	1872	double val; /* For returning numbers. */
	1873	symrec *tptr; /* For returning symbol-table pointers. */
	1874	@}
	1875	@end group
	1876	%token <val> NUM /* Simple double precision number. */
	1877	%token <tptr> VAR FNCT /* Variable and Function. */
	1878	%type <val> exp
	1879
	1880	@group
	1881	%right '='
	1882	%left '-' '+'
	1883	%left '*' '/'
	1884	%left NEG /* Negation--unary minus */
	1885	%right '^' /* Exponentiation */
	1886	@end group
	1887	/* Grammar follows */
	1888	%%
	1889	@end smallexample
	1890
	1891	The above grammar introduces only two new features of the Bison language.
	1892	These features allow semantic values to have various data types
	1893	(@pxref{Multiple Types, ,More Than One Value Type}).
	1894
	1895	The @code{%union} declaration specifies the entire list of possible types;
	1896	this is instead of defining @code{YYSTYPE}. The allowable types are now
	1897	double-floats (for @code{exp} and @code{NUM}) and pointers to entries in
	1898	the symbol table. @xref{Union Decl, ,The Collection of Value Types}.
	1899
	1900	Since values can now have various types, it is necessary to associate a
	1901	type with each grammar symbol whose semantic value is used. These symbols
	1902	are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their
	1903	declarations are augmented with information about their data type (placed
	1904	between angle brackets).
	1905
	1906	The Bison construct @code{%type} is used for declaring nonterminal
	1907	symbols, just as @code{%token} is used for declaring token types. We
	1908	have not used @code{%type} before because nonterminal symbols are
	1909	normally declared implicitly by the rules that define them. But
	1910	@code{exp} must be declared explicitly so we can specify its value type.
	1911	@xref{Type Decl, ,Nonterminal Symbols}.
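
In C terms, the typed declarations select which union member each
@code{$@var{n}} refers to. The following sketch expands one action by
hand to show this; it is an illustration, not the code Bison generates.
The names @code{semantic_value} (standing in for @code{YYSTYPE}) and
@code{assign_action} are hypothetical, and @code{symrec} is reduced to
the two fields the example needs.

```c
#include <assert.h>

/* Reduced stand-in for the `symrec' type from calc.h.  */
typedef struct symrec
{
  const char *name;
  double var;
} symrec;

/* What `%union' amounts to in C: one union type with a member per
   declared value type.  */
typedef union
{
  double val;      /* selected by <val>: NUM and exp */
  symrec *tptr;    /* selected by <tptr>: VAR and FNCT */
} semantic_value;

/* Hand-expanded action for an assignment rule `exp: VAR '=' exp'.
   Because VAR is declared <tptr> and exp is declared <val>, $1 reads
   the tptr member while $3 and $$ use the val member.  */
semantic_value
assign_action (semantic_value var, semantic_value rhs)
{
  semantic_value result;
  assert (var.tptr != 0);
  result.val = rhs.val;        /* $$ = $3; */
  var.tptr->var = rhs.val;     /* store $3 in the variable named by $1 */
  return result;
}
```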
	1912
	1913	@node Mfcalc Rules
	1914	@subsection Grammar Rules for @code{mfcalc}
	1915
	1916	Here are the grammar rules for the multi-function calculator.
	1917	Most of them are copied directly from @code{calc}; three rules,
	1918	those which mention @code{VAR} or @code{FNCT}, are new.
	1919
	1920	@smallexample
	1921	@group
	1922	input: /* empty */
	1923	| input line
	1924	;
	1925	@end group
	1926
	1927	@group
	1928	line:
	1929	'\n'
	1930	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1931	| error '\n' @{ yyerrok; @}
	1932	;
	1933	@end group
	1934
	1935	@group
	1936	exp: NUM @{ $$ = $1; @}
	1937	| VAR @{ $$ = $1->value.var; @}
	1938	| VAR '=' exp @{ $$ = $3; $1->value.var = $3; @}
	1939	| FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @}
	1940	| exp '+' exp @{ $$ = $1 + $3; @}
	1941	| exp '-' exp @{ $$ = $1 - $3; @}
	1942	| exp '*' exp @{ $$ = $1 * $3; @}
	1943	| exp '/' exp @{ $$ = $1 / $3; @}
	1944	| '-' exp %prec NEG @{ $$ = -$2; @}
	1945	| exp '^' exp @{ $$ = pow ($1, $3); @}
	1946	| '(' exp ')' @{ $$ = $2; @}
	1947	;
	1948	@end group
	1949	/* End of grammar */
	1950	%%
	1951	@end smallexample
	1952
	1953	@node Mfcalc Symtab
	1954	@subsection The @code{mfcalc} Symbol Table
	1955	@cindex symbol table example
	1956
	1957	The multi-function calculator requires a symbol table to keep track of the
	1958	names and meanings of variables and functions. This doesn't affect the
	1959	grammar rules (except for the actions) or the Bison declarations, but it
	1960	requires some additional C functions for support.
	1961
	1962	The symbol table itself consists of a linked list of records. Its
	1963	definition, which is kept in the header @file{calc.h}, is as follows. It
	1964	provides for either functions or variables to be placed in the table.
	1965
	1966	@smallexample
	1967	@group
	1968	/* Function type. */
	1969	typedef double (*func_t) (double);
	1970	@end group
	1971
	1972	@group
	1973	/* Data type for links in the chain of symbols. */
	1974	struct symrec
	1975	@{
	1976	char *name; /* name of symbol */
	1977	int type; /* type of symbol: either VAR or FNCT */
	1978	union
	1979	@{
	1980	double var; /* value of a VAR */
	1981	func_t fnctptr; /* value of a FNCT */
	1982	@} value;
	1983	struct symrec *next; /* link field */
	1984	@};
	1985	@end group
	1986
	1987	@group
	1988	typedef struct symrec symrec;
	1989
	1990	/* The symbol table: a chain of `struct symrec'. */
	1991	extern symrec *sym_table;
	1992
	1993	symrec *putsym (const char *, int);
	1994	symrec *getsym (const char *);
	1995	@end group
	1996	@end smallexample
	1997
	1998	The new version of @code{main} includes a call to @code{init_table}, a
	1999	function that initializes the symbol table. Here it is, and
	2000	@code{init_table} as well:
	2001
	2002	@smallexample
	2003	#include <stdio.h>
	2004
	2005	@group
	2006	int
	2007	main (void)
	2008	@{
	2009	init_table ();
	2010	return yyparse ();
	2011	@}
	2012	@end group
	2013
	2014	@group
	2015	void
	2016	yyerror (const char *s) /* Called by yyparse on error. */
	2017	@{
	2018	printf ("%s\n", s);
	2019	@}
	2020	@end group
	2021
	2022	@group
	2023	struct init
	2024	@{
	2025	char *fname;
	2026	double (*fnct)(double);
	2027	@};
	2028	@end group
	2029
	2030	@group
	2031	struct init arith_fncts[] =
	2032	@{
	2033	"sin", sin,
	2034	"cos", cos,
	2035	"atan", atan,
	2036	"ln", log,
	2037	"exp", exp,
	2038	"sqrt", sqrt,
	2039	0, 0
	2040	@};
	2041	@end group
	2042
	2043	@group
	2044	/* The symbol table: a chain of `struct symrec'. */
	2045	symrec *sym_table = (symrec *) 0;
	2046	@end group
	2047
	2048	@group
	2049	/* Put arithmetic functions in table. */
	2050	void
	2051	init_table (void)
	2052	@{
	2053	int i;
	2054	symrec *ptr;
	2055	for (i = 0; arith_fncts[i].fname != 0; i++)
	2056	@{
	2057	ptr = putsym (arith_fncts[i].fname, FNCT);
	2058	ptr->value.fnctptr = arith_fncts[i].fnct;
	2059	@}
	2060	@}
	2061	@end group
	2062	@end smallexample
	2063
	2064	By simply editing the initialization list and adding the necessary include
	2065	files, you can add additional functions to the calculator.
	2066
	2067	Two important functions allow look-up and installation of symbols in the
	2068	symbol table. The function @code{putsym} is passed a name and the type
	2069	(@code{VAR} or @code{FNCT}) of the object to be installed. The object is
	2070	linked to the front of the list, and a pointer to the object is returned.
	2071	The function @code{getsym} is passed the name of the symbol to look up. If
	2072	found, a pointer to that symbol is returned; otherwise zero is returned.
	2073
	2074	@smallexample
	2075	symrec *
	2076	putsym (const char *sym_name, int sym_type)
	2077	@{
	2078	symrec *ptr;
	2079	ptr = (symrec *) malloc (sizeof (symrec));
	2080	ptr->name = (char *) malloc (strlen (sym_name) + 1);
	2081	strcpy (ptr->name,sym_name);
	2082	ptr->type = sym_type;
	2083	ptr->value.var = 0; /* Set value to 0 even if fctn. */
	2084	ptr->next = (struct symrec *)sym_table;
	2085	sym_table = ptr;
	2086	return ptr;
	2087	@}
	2088
	2089	symrec *
	2090	getsym (const char *sym_name)
	2091	@{
	2092	symrec *ptr;
	2093	for (ptr = sym_table; ptr != (symrec *) 0;
	2094	ptr = (symrec *)ptr->next)
	2095	if (strcmp (ptr->name,sym_name) == 0)
	2096	return ptr;
	2097	return 0;
	2098	@}
	2099	@end smallexample
	2100
	2101	The function @code{yylex} must now recognize variables, numeric values, and
	2102	the single-character arithmetic operators. Strings of alphanumeric
	2103	characters with a leading non-digit are recognized as either variables or
	2104	functions depending on what the symbol table says about them.
	2105
	2106	The string is passed to @code{getsym} for look up in the symbol table. If
	2107	the name appears in the table, a pointer to its location and its type
	2108	(@code{VAR} or @code{FNCT}) is returned to @code{yyparse}. If it is not
	2109	already in the table, then it is installed as a @code{VAR} using
	2110	@code{putsym}. Again, a pointer and its type (which must be @code{VAR}) is
	2111	returned to @code{yyparse}.
	2112
	2113	No change is needed in the handling of numeric values and arithmetic
	2114	operators in @code{yylex}.
	2115
	2116	@smallexample
	2117	@group
	2118	#include <ctype.h>
	2119	@end group
	2120
	2121	@group
	2122	int
	2123	yylex (void)
	2124	@{
	2125	int c;
	2126
	2127	/* Ignore white space, get first nonwhite character. */
	2128	while ((c = getchar ()) == ' ' || c == '\t');
	2129
	2130	if (c == EOF)
	2131	return 0;
	2132	@end group
	2133
	2134	@group
	2135	/* Char starts a number => parse the number. */
	2136	if (c == '.' || isdigit (c))
	2137	@{
	2138	ungetc (c, stdin);
	2139	scanf ("%lf", &yylval.val);
	2140	return NUM;
	2141	@}
	2142	@end group
	2143
	2144	@group
	2145	/* Char starts an identifier => read the name. */
	2146	if (isalpha (c))
	2147	@{
	2148	symrec *s;
	2149	static char *symbuf = 0;
	2150	static int length = 0;
	2151	int i;
	2152	@end group
	2153
	2154	@group
	2155	/* Initially make the buffer long enough
	2156	for a 40-character symbol name. */
	2157	if (length == 0)
	2158	length = 40, symbuf = (char *)malloc (length + 1);
	2159
	2160	i = 0;
	2161	do
	2162	@end group
	2163	@group
	2164	@{
	2165	/* If buffer is full, make it bigger. */
	2166	if (i == length)
	2167	@{
	2168	length *= 2;
	2169	symbuf = (char *) realloc (symbuf, length + 1);
	2170	@}
	2171	/* Add this character to the buffer. */
	2172	symbuf[i++] = c;
	2173	/* Get another character. */
	2174	c = getchar ();
	2175	@}
	2176	@end group
	2177	@group
	2178	while (isalnum (c));
	2179
	2180	ungetc (c, stdin);
	2181	symbuf[i] = '\0';
	2182	@end group
	2183
	2184	@group
	2185	s = getsym (symbuf);
	2186	if (s == 0)
	2187	s = putsym (symbuf, VAR);
	2188	yylval.tptr = s;
	2189	return s->type;
	2190	@}
	2191
	2192	/* Any other character is a token by itself. */
	2193	return c;
	2194	@}
	2195	@end group
	2196	@end smallexample
	2197
	2198	This program is both powerful and flexible. You may easily add new
	2199	functions, and it is a simple job to modify this code to install
	2200	predefined variables such as @code{pi} or @code{e} as well.
	2201
	2202	@node Exercises
	2203	@section Exercises
	2204	@cindex exercises
	2205
	2206	@enumerate
	2207	@item
	2208	Add some new functions from @file{math.h} to the initialization list.
	2209
	2210	@item
	2211	Add another array that contains constants and their values. Then
	2212	modify @code{init_table} to add these constants to the symbol table.
	2213	It will be easiest to give the constants type @code{VAR}.
	2214
	2215	@item
	2216	Make the program report an error if the user refers to an
	2217	uninitialized variable in any way except to store a value in it.
	2218	@end enumerate
	2219
	2220	@node Grammar File
	2221	@chapter Bison Grammar Files
	2222
	2223	Bison takes as input a context-free grammar specification and produces a
	2224	C-language function that recognizes correct instances of the grammar.
	2225
	2226	The Bison grammar input file conventionally has a name ending in @samp{.y}.
	2227	@xref{Invocation, ,Invoking Bison}.
	2228
	2229	@menu
	2230	* Grammar Outline:: Overall layout of the grammar file.
	2231	* Symbols:: Terminal and nonterminal symbols.
	2232	* Rules:: How to write grammar rules.
	2233	* Recursion:: Writing recursive rules.
	2234	* Semantics:: Semantic values and actions.
	2235	* Locations:: Locations and actions.
	2236	* Declarations:: All kinds of Bison declarations are described here.
	2237	* Multiple Parsers:: Putting more than one Bison parser in one program.
	2238	@end menu
	2239
	2240	@node Grammar Outline
	2241	@section Outline of a Bison Grammar
	2242
	2243	A Bison grammar file has four main sections, shown here with the
	2244	appropriate delimiters:
	2245
	2246	@example
	2247	%@{
	2248	@var{Prologue}
	2249	%@}
	2250
	2251	@var{Bison declarations}
	2252
	2253	%%
	2254	@var{Grammar rules}
	2255	%%
	2256
	2257	@var{Epilogue}
	2258	@end example
	2259
	2260	Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections.
	2261	As a @acronym{GNU} extension, @samp{//} introduces a comment that
	2262	continues until end of line.
	2263
	2264	@menu
	2265	* Prologue:: Syntax and usage of the prologue.
	2266	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	2267	* Grammar Rules:: Syntax and usage of the grammar rules section.
	2268	* Epilogue:: Syntax and usage of the epilogue.
	2269	@end menu
	2270
	2271	@node Prologue, Bison Declarations, , Grammar Outline
	2272	@subsection The prologue
	2273	@cindex declarations section
	2274	@cindex Prologue
	2275	@cindex declarations
	2276
	2277	The @var{Prologue} section contains macro definitions and
	2278	declarations of functions and variables that are used in the actions in the
	2279	grammar rules. These are copied to the beginning of the parser file so
	2280	that they precede the definition of @code{yyparse}. You can use
	2281	@samp{#include} to get the declarations from a header file. If you don't
	2282	need any C declarations, you may omit the @samp{%@{} and @samp{%@}}
	2283	delimiters that bracket this section.
	2284
	2285	You may have more than one @var{Prologue} section, intermixed with the
	2286	@var{Bison declarations}. This allows you to have C and Bison
	2287	declarations that refer to each other. For example, the @code{%union}
	2288	declaration may use types defined in a header file, and you may wish to
	2289	prototype functions that take arguments of type @code{YYSTYPE}. This
	2290	can be done with two @var{Prologue} blocks, one before and one after the
	2291	@code{%union} declaration.
	2292
	2293	@smallexample
	2294	%@{
	2295	#include <stdio.h>
	2296	#include "ptypes.h"
	2297	%@}
	2298
	2299	%union @{
	2300	long n;
	2301	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2302	@}
	2303
	2304	%@{
	2305	static void print_token_value (FILE *, int, YYSTYPE);
	2306	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2307	%@}
	2308
	2309	@dots{}
	2310	@end smallexample
	2311
	2312	@node Bison Declarations
	2313	@subsection The Bison Declarations Section
	2314	@cindex Bison declarations (introduction)
	2315	@cindex declarations, Bison (introduction)
	2316
	2317	The @var{Bison declarations} section contains declarations that define
	2318	terminal and nonterminal symbols, specify precedence, and so on.
	2319	In some simple grammars you may not need any declarations.
	2320	@xref{Declarations, ,Bison Declarations}.
	2321
	2322	@node Grammar Rules
	2323	@subsection The Grammar Rules Section
	2324	@cindex grammar rules section
	2325	@cindex rules section for grammar
	2326
	2327	The @dfn{grammar rules} section contains one or more Bison grammar
	2328	rules, and nothing else. @xref{Rules, ,Syntax of Grammar Rules}.
	2329
	2330	There must always be at least one grammar rule, and the first
	2331	@samp{%%} (which precedes the grammar rules) may never be omitted even
	2332	if it is the first thing in the file.
	2333
	2334	@node Epilogue, , Grammar Rules, Grammar Outline
	2335	@subsection The epilogue
	2336	@cindex additional C code section
	2337	@cindex epilogue
	2338	@cindex C code, section for additional
	2339
	2340	The @var{Epilogue} is copied verbatim to the end of the parser file, just as
	2341	the @var{Prologue} is copied to the beginning. This is the most convenient
	2342	place to put anything that you want to have in the parser file but which need
	2343	not come before the definition of @code{yyparse}. For example, the
	2344	definitions of @code{yylex} and @code{yyerror} often go here.
	2345	@xref{Interface, ,Parser C-Language Interface}.
	2346
	2347	If the last section is empty, you may omit the @samp{%%} that separates it
	2348	from the grammar rules.
	2349
	2350	The Bison parser itself contains many static variables whose names start
	2351	with @samp{yy} and many macros whose names start with @samp{YY}. It is a
	2352	good idea to avoid using any such names (except those documented in this
	2353	manual) in the epilogue of the grammar file.
	2354
	2355	@node Symbols
	2356	@section Symbols, Terminal and Nonterminal
	2357	@cindex nonterminal symbol
	2358	@cindex terminal symbol
	2359	@cindex token type
	2360	@cindex symbol
	2361
	2362	@dfn{Symbols} in Bison grammars represent the grammatical classifications
	2363	of the language.
	2364
	2365	A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
	2366	class of syntactically equivalent tokens. You use the symbol in grammar
	2367	rules to mean that a token in that class is allowed. The symbol is
	2368	represented in the Bison parser by a numeric code, and the @code{yylex}
	2369	function returns a token type code to indicate what kind of token has been
	2370	read. You don't need to know what the code value is; you can use the
	2371	symbol to stand for it.
	2372
	2373	A @dfn{nonterminal symbol} stands for a class of syntactically equivalent
	2374	groupings. The symbol name is used in writing grammar rules. By convention,
	2375	it should be all lower case.
	2376
	2377	Symbol names can contain letters, digits (not at the beginning),
	2378	underscores and periods. Periods make sense only in nonterminals.
	2379
	2380	There are three ways of writing terminal symbols in the grammar:
	2381
	2382	@itemize @bullet
	2383	@item
	2384	A @dfn{named token type} is written with an identifier, like an
	2385	identifier in C@. By convention, it should be all upper case. Each
	2386	such name must be defined with a Bison declaration such as
	2387	@code{%token}. @xref{Token Decl, ,Token Type Names}.
	2388
	2389	@item
	2390	@cindex character token
	2391	@cindex literal token
	2392	@cindex single-character literal
	2393	A @dfn{character token type} (or @dfn{literal character token}) is
	2394	written in the grammar using the same syntax used in C for character
	2395	constants; for example, @code{'+'} is a character token type. A
	2396	character token type doesn't need to be declared unless you need to
	2397	specify its semantic value data type (@pxref{Value Type, ,Data Types of
	2398	Semantic Values}), associativity, or precedence (@pxref{Precedence,
	2399	,Operator Precedence}).
	2400
	2401	By convention, a character token type is used only to represent a
	2402	token that consists of that particular character. Thus, the token
	2403	type @code{'+'} is used to represent the character @samp{+} as a
	2404	token. Nothing enforces this convention, but if you depart from it,
	2405	your program will confuse other readers.
	2406
	2407	All the usual escape sequences used in character literals in C can be
	2408	used in Bison as well, but you must not use the null character as a
	2409	character literal because its numeric code, zero, signifies
	2410	end-of-input (@pxref{Calling Convention, ,Calling Convention
	2411	for @code{yylex}}). Also, unlike standard C, trigraphs have no
	2412	special meaning in Bison character literals, nor is backslash-newline
	2413	allowed.
	2414
	2415	@item
	2416	@cindex string token
	2417	@cindex literal string token
	2418	@cindex multicharacter literal
	2419	A @dfn{literal string token} is written like a C string constant; for
	2420	example, @code{"<="} is a literal string token. A literal string token
	2421	doesn't need to be declared unless you need to specify its semantic
	2422	value data type (@pxref{Value Type}), associativity, or precedence
	2423	(@pxref{Precedence}).
	2424
	2425	You can associate the literal string token with a symbolic name as an
	2426	alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token
	2427	Declarations}). If you don't do that, the lexical analyzer has to
	2428	retrieve the token number for the literal string token from the
	2429	@code{yytname} table (@pxref{Calling Convention}).
	2430
	2431	@strong{Warning}: literal string tokens do not work in Yacc.
	2432
	2433	By convention, a literal string token is used only to represent a token
	2434	that consists of that particular string. Thus, you should use the token
	2435	type @code{"<="} to represent the string @samp{<=} as a token. Bison
	2436	does not enforce this convention, but if you depart from it, people who
	2437	read your program will be confused.
	2438
	2439	All the escape sequences used in string literals in C can be used in
	2440	Bison as well. However, unlike Standard C, trigraphs have no special
	2441	meaning in Bison string literals, nor is backslash-newline allowed. A
	2442	literal string token must contain two or more characters; for a token
	2443	containing just one character, use a character token (see above).
	2444	@end itemize
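
For a literal string token that has no symbolic alias, the table lookup
mentioned above might look like the following sketch. It assumes that
the table stores such a token as a double quote, the token's characters,
and a closing double quote. The array @code{token_names} is a
hypothetical, null-terminated stand-in for @code{yytname}, and
@code{find_string_token} is a hypothetical helper; how an index maps to
the code @code{yylex} must return is not shown here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the generated `yytname' table.  */
static const char *const token_names[] =
{
  "$end", "error", "NUM", "\"<=\"", "\">=\"", "'+'", 0
};

/* Return the index of the literal string token TEXT in token_names,
   or -1 if there is no such token.  */
int
find_string_token (const char *text)
{
  size_t len;
  int i;
  assert (text != 0);
  len = strlen (text);
  for (i = 0; token_names[i] != 0; ++i)
    {
      const char *name = token_names[i];
      /* Match an opening quote, exactly TEXT, and a closing quote.  */
      if (name[0] == '"'
          && strncmp (name + 1, text, len) == 0
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

Declaring an alias with @code{%token} avoids this search entirely, since
the lexical analyzer can then use the symbolic name.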
	2445
	2446	How you choose to write a terminal symbol has no effect on its
	2447	grammatical meaning. That depends only on where it appears in rules and
	2448	on when the lexical analyzer function returns that symbol.
	2449
	2450	The value returned by @code{yylex} is always one of the terminal
	2451	symbols, except that a zero or negative value signifies end-of-input.
	2452	Whichever way you write the token type in the grammar rules, you write
	2453	it the same way in the definition of @code{yylex}. The numeric code
	2454	for a character token type is simply the positive numeric code of the
	2455	character, so @code{yylex} can use the identical value to generate the
	2456	requisite code, though you may need to convert it to @code{unsigned
	2457	char} to avoid sign-extension on hosts where @code{char} is signed.
	2458	Each named token type becomes a C macro in
	2459	the parser file, so @code{yylex} can use the name to stand for the code.
	2460	(This is why periods don't make sense in terminal symbols.)
	2461	@xref{Calling Convention, ,Calling Convention for @code{yylex}}.
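
As a minimal sketch of the sign-extension point above, a lexical analyzer that reads from a @code{char} buffer can convert each character through @code{unsigned char} before returning it as a character token. The helper name @code{next_char_token} and its interface are assumptions made for illustration; they are not part of Bison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Bison): return the token code for
   the next character of S, or 0 at the end of the string, the way a
   very simple yylex might for character tokens.  */
static int
next_char_token (const char *s, size_t *pos)
{
  assert (s != NULL && pos != NULL);
  if (s[*pos] == '\0')
    return 0;                   /* zero signifies end-of-input */
  /* Convert through unsigned char so that the code stays positive
     even on hosts where plain char is signed.  */
  return (unsigned char) s[(*pos)++];
}
```

Without the conversion, a byte with its high bit set could be returned as a negative value on such hosts, which the parser would treat as end-of-input.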
	2462
	2463	If @code{yylex} is defined in a separate file, you need to arrange for the
	2464	token-type macro definitions to be available there. Use the @samp{-d}
	2465	option when you run Bison, so that it will write these macro definitions
	2466	into a separate header file @file{@var{name}.tab.h} which you can include
	2467	in the other source files that need it. @xref{Invocation, ,Invoking Bison}.
	2468
	2469	If you want to write a grammar that is portable to any Standard C
	2470	host, you must use only non-null character tokens taken from the basic
	2471	execution character set of Standard C@. This set consists of the ten
	2472	digits, the 52 lower- and upper-case English letters, and the
	2473	characters in the following C-language string:
	2474
	2475	@example
	2476	"\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~"
	2477	@end example
	2478
	2479	The @code{yylex} function and Bison must use a consistent character
	2480	set and encoding for character tokens. For example, if you run Bison in an
	2481	@acronym{ASCII} environment, but then compile and run the resulting program
	2482	in an environment that uses an incompatible character set like
	2483	@acronym{EBCDIC}, the resulting program may not work because the
	2484	tables generated by Bison will assume @acronym{ASCII} numeric values for
	2485	character tokens. It is standard
	2486	practice for software distributions to contain C source files that
	2487	were generated by Bison in an @acronym{ASCII} environment, so installers on
	2488	platforms that are incompatible with @acronym{ASCII} must rebuild those
	2489	files before compiling them.
	2490
	2491	The symbol @code{error} is a terminal symbol reserved for error recovery
	2492	(@pxref{Error Recovery}); you shouldn't use it for any other purpose.
	2493	In particular, @code{yylex} should never return this value. The default
	2494	value of the error token is 256, unless you explicitly assigned 256 to
	2495	one of your tokens with a @code{%token} declaration.
	2496
	2497	@node Rules
	2498	@section Syntax of Grammar Rules
	2499	@cindex rule syntax
	2500	@cindex grammar rule syntax
	2501	@cindex syntax of grammar rules
	2502
	2503	A Bison grammar rule has the following general form:
	2504
	2505	@example
	2506	@group
	2507	@var{result}: @var{components}@dots{}
	2508	;
	2509	@end group
	2510	@end example
	2511
	2512	@noindent
	2513	where @var{result} is the nonterminal symbol that this rule describes,
	2514	and @var{components} are various terminal and nonterminal symbols that
	2515	are put together by this rule (@pxref{Symbols}).
	2516
	2517	For example,
	2518
	2519	@example
	2520	@group
	2521	exp: exp '+' exp
	2522	;
	2523	@end group
	2524	@end example
	2525
	2526	@noindent
	2527	says that two groupings of type @code{exp}, with a @samp{+} token in between,
	2528	can be combined into a larger grouping of type @code{exp}.
	2529
	2530	White space in rules is significant only to separate symbols. You can add
	2531	extra white space as you wish.
	2532
	2533	Scattered among the components can be @var{actions} that determine
	2534	the semantics of the rule. An action looks like this:
	2535
	2536	@example
	2537	@{@var{C statements}@}
	2538	@end example
	2539
	2540	@noindent
	2541	Usually there is only one action and it follows the components.
	2542	@xref{Actions}.
	2543
	2544	@findex |
	2545	Multiple rules for the same @var{result} can be written separately or can
	2546	be joined with the vertical-bar character @samp{|} as follows:
	2547
	2548	@ifinfo
	2549	@example
	2550	@var{result}:    @var{rule1-components}@dots{}
	2551	        | @var{rule2-components}@dots{}
	2552	        @dots{}
	2553	        ;
	2554	@end example
	2555	@end ifinfo
	2556	@iftex
	2557	@example
	2558	@group
	2559	@var{result}:    @var{rule1-components}@dots{}
	2560	        | @var{rule2-components}@dots{}
	2561	        @dots{}
	2562	        ;
	2563	@end group
	2564	@end example
	2565	@end iftex
	2566
	2567	@noindent
	2568	They are still considered distinct rules even when joined in this way.
	2569
	2570	If @var{components} in a rule is empty, it means that @var{result} can
	2571	match the empty string. For example, here is how to define a
	2572	comma-separated sequence of zero or more @code{exp} groupings:
	2573
	2574	@example
	2575	@group
	2576	expseq:   /* empty */
	2577	        | expseq1
	2578	        ;
	2579	@end group
	2580
	2581	@group
	2582	expseq1:  exp
	2583	        | expseq1 ',' exp
	2584	        ;
	2585	@end group
	2586	@end example
	2587
	2588	@noindent
	2589	It is customary to write a comment @samp{/* empty */} in each rule
	2590	with no components.
	2591
	2592	@node Recursion
	2593	@section Recursive Rules
	2594	@cindex recursive rule
	2595
	2596	A rule is called @dfn{recursive} when its @var{result} nonterminal appears
	2597	also on its right hand side. Nearly all Bison grammars need to use
	2598	recursion, because that is the only way to define a sequence of any number
	2599	of a particular thing. Consider this recursive definition of a
	2600	comma-separated sequence of one or more expressions:
	2601
	2602	@example
	2603	@group
	2604	expseq1:  exp
	2605	        | expseq1 ',' exp
	2606	        ;
	2607	@end group
	2608	@end example
	2609
	2610	@cindex left recursion
	2611	@cindex right recursion
	2612	@noindent
	2613	Since the recursive use of @code{expseq1} is the leftmost symbol in the
	2614	right hand side, we call this @dfn{left recursion}. By contrast, here
	2615	the same construct is defined using @dfn{right recursion}:
	2616
	2617	@example
	2618	@group
	2619	expseq1:  exp
	2620	        | exp ',' expseq1
	2621	        ;
	2622	@end group
	2623	@end example
	2624
	2625	@noindent
	2626	Any kind of sequence can be defined using either left recursion or right
	2627	recursion, but you should always use left recursion, because it can
	2628	parse a sequence of any number of elements with bounded stack space.
	2629	Right recursion uses up space on the Bison stack in proportion to the
	2630	number of elements in the sequence, because all the elements must be
	2631	shifted onto the stack before the rule can be applied even once.
	2632	@xref{Algorithm, ,The Bison Parser Algorithm}, for further explanation
	2633	of this.
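
The difference in stack usage can be illustrated with a small simulation. The sketch below counts only how many symbols sit on the parser stack while a sequence of @var{n} comma-separated @code{exp} groupings is parsed, treating each @code{exp} as one symbol. It is a simplified model written for this illustration, not code taken from Bison, and both function names are hypothetical.

```c
#include <assert.h>

/* expseq1: exp | expseq1 ',' exp   (left recursion)
   Return the largest number of symbols on the stack while parsing
   N elements, N >= 1.  */
static int
max_depth_left (int n)
{
  int depth = 0, max = 0, i;
  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      if (i > 0)
        depth++;                /* shift ',' */
      depth++;                  /* the next `exp' */
      if (depth > max)
        max = depth;
      depth = 1;                /* reduce at once to a single expseq1 */
    }
  return max;
}

/* expseq1: exp | exp ',' expseq1   (right recursion)
   No reduction to expseq1 can happen until the last element has
   been seen, so every symbol stays on the stack until then.  */
static int
max_depth_right (int n)
{
  int depth = 0, max = 0, i;
  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      if (i > 0)
        depth++;                /* shift ',' */
      depth++;                  /* the next `exp' */
      if (depth > max)
        max = depth;
    }
  return max;
}
```

In this model the left-recursive form never holds more than three symbols, while the right-recursive form holds 2@var{n}-1 symbols just before its first reduction.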
	2634
	2635	@cindex mutual recursion
	2636	@dfn{Indirect} or @dfn{mutual} recursion occurs when the result of the
	2637	rule does not appear directly on its right hand side, but does appear
	2638	in rules for other nonterminals which do appear on its right hand
	2639	side.
	2640
	2641	For example:
	2642
	2643	@example
	2644	@group
	2645	expr:     primary
	2646	        | primary '+' primary
	2647	        ;
	2648	@end group
	2649
	2650	@group
	2651	primary:  constant
	2652	        | '(' expr ')'
	2653	        ;
	2654	@end group
	2655	@end example
	2656
	2657	@noindent
	2658	defines two mutually-recursive nonterminals, since each refers to the
	2659	other.
	2660
	2661	@node Semantics
	2662	@section Defining Language Semantics
	2663	@cindex defining language semantics
	2664	@cindex language semantics, defining
	2665
	2666	The grammar rules for a language determine only the syntax. The semantics
	2667	are determined by the semantic values associated with various tokens and
	2668	groupings, and by the actions taken when various groupings are recognized.
	2669
	2670	For example, the calculator calculates properly because the value
	2671	associated with each expression is the proper number; it adds properly
	2672	because the action for the grouping @w{@samp{@var{x} + @var{y}}} is to add
	2673	the numbers associated with @var{x} and @var{y}.
	2674
	2675	@menu
	2676	* Value Type:: Specifying one data type for all semantic values.
	2677	* Multiple Types:: Specifying several alternative data types.
	2678	* Actions:: An action is the semantic definition of a grammar rule.
	2679	* Action Types:: Specifying data types for actions to operate on.
	2680	* Mid-Rule Actions:: Most actions go at the end of a rule.
	2681	This says when, why and how to use the exceptional
	2682	action in the middle of a rule.
	2683	@end menu
	2684
	2685	@node Value Type
	2686	@subsection Data Types of Semantic Values
	2687	@cindex semantic value type
	2688	@cindex value type, semantic
	2689	@cindex data types of semantic values
	2690	@cindex default data type
	2691
	2692	In a simple program it may be sufficient to use the same data type for
	2693	the semantic values of all language constructs. This was true in the
	2694	@acronym{RPN} and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish
	2695	Notation Calculator}).
	2696
	2697	Bison's default is to use type @code{int} for all semantic values. To
	2698	specify some other type, define @code{YYSTYPE} as a macro, like this:
	2699
	2700	@example
	2701	#define YYSTYPE double
	2702	@end example
	2703
	2704	@noindent
	2705	This macro definition must go in the prologue of the grammar file
	2706	(@pxref{Grammar Outline, ,Outline of a Bison Grammar}).
	2707
	2708	@node Multiple Types
	2709	@subsection More Than One Value Type
	2710
	2711	In most programs, you will need different data types for different kinds
	2712	of tokens and groupings. For example, a numeric constant may need type
	2713	@code{int} or @code{long}, while a string constant needs type @code{char *},
	2714	and an identifier might need a pointer to an entry in the symbol table.
	2715
	2716	To use more than one data type for semantic values in one parser, Bison
	2717	requires you to do two things:
	2718
	2719	@itemize @bullet
	2720	@item
	2721	Specify the entire collection of possible data types, with the
	2722	@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
	2723	Value Types}).
	2724
	2725	@item
	2726	Choose one of those types for each symbol (terminal or nonterminal) for
	2727	which semantic values are used. This is done for tokens with the
	2728	@code{%token} Bison declaration (@pxref{Token Decl, ,Token Type Names})
	2729	and for groupings with the @code{%type} Bison declaration (@pxref{Type
	2730	Decl, ,Nonterminal Symbols}).
	2731	@end itemize
	2732
	2733	@node Actions
	2734	@subsection Actions
	2735	@cindex action
	2736	@vindex $$
	2737	@vindex $@var{n}
	2738
	2739	An action accompanies a syntactic rule and contains C code to be executed
	2740	each time an instance of that rule is recognized. The task of most actions
	2741	is to compute a semantic value for the grouping built by the rule from the
	2742	semantic values associated with tokens or smaller groupings.
	2743
	2744	An action consists of C statements surrounded by braces, much like a
	2745	compound statement in C@. An action can contain any sequence of C
	2746	statements. Bison does not look for trigraphs, though, so if your C
	2747	code uses trigraphs you should ensure that they do not affect the
	2748	nesting of braces or the boundaries of comments, strings, or character
	2749	literals.
	2750
	2751	An action can be placed at any position in the rule;
	2752	it is executed at that position. Most rules have just one action at the
	2753	end of the rule, following all the components. Actions in the middle of
	2754	a rule are tricky and used only for special purposes (@pxref{Mid-Rule
	2755	Actions, ,Actions in Mid-Rule}).
	2756
	2757	The C code in an action can refer to the semantic values of the components
	2758	matched by the rule with the construct @code{$@var{n}}, which stands for
	2759	the value of the @var{n}th component. The semantic value for the grouping
	2760	being constructed is @code{$$}. (Bison translates both of these constructs
	2761	into array element references when it copies the actions into the parser
	2762	file.)
	2763
	2764	Here is a typical example:
	2765
	2766	@example
	2767	@group
	2768	exp:    @dots{}
	2769	        | exp '+' exp
	2770	            @{ $$ = $1 + $3; @}
	2771	@end group
	2772	@end example
	2773
	2774	@noindent
	2775	This rule constructs an @code{exp} from two smaller @code{exp} groupings
	2776	connected by a plus-sign token. In the action, @code{$1} and @code{$3}
	2777	refer to the semantic values of the two component @code{exp} groupings,
	2778	which are the first and third symbols on the right hand side of the rule.
	2779	The sum is stored into @code{$$} so that it becomes the semantic value of
	2780	the addition-expression just recognized by the rule. If there were a
	2781	useful semantic value associated with the @samp{+} token, it could be
	2782	referred to as @code{$2}.
	2783
	2784	Note that the vertical-bar character @samp{|} is really a rule
	2785	separator, and actions are attached to a single rule. This differs
	2786	from tools like Flex, for which @samp{|} stands for either
	2787	``or'', or ``the same action as that of the next rule''. In the
	2788	following example, the action is triggered only when @samp{b} is found:
	2789
	2790	@example
	2791	@group
	2792	a-or-b: 'a'|'b' @{ a_or_b_found = 1; @};
	2793	@end group
	2794	@end example
	2795
	2796	@cindex default action
	2797	If you don't specify an action for a rule, Bison supplies a default:
	2798	@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule
	2799	becomes the value of the whole rule. Of course, the default action is
	2800	valid only if the two data types match. There is no meaningful default
	2801	action for an empty rule; every empty rule must have an explicit action
	2802	unless the rule's value does not matter.
	2803
	2804	@code{$@var{n}} with @var{n} zero or negative is allowed for reference
	2805	to tokens and groupings on the stack @emph{before} those that match the
	2806	current rule. This is a very risky practice, and to use it reliably
	2807	you must be certain of the context in which the rule is applied. Here
	2808	is a case in which you can use this reliably:
	2809
	2810	@example
	2811	@group
	2812	foo:      expr bar '+' expr  @{ @dots{} @}
	2813	        | expr bar '-' expr  @{ @dots{} @}
	2814	        ;
	2815	@end group
	2816
	2817	@group
	2818	bar: /* empty */
	2819	@{ previous_expr = $0; @}
	2820	;
	2821	@end group
	2822	@end example
	2823
	2824	As long as @code{bar} is used only in the fashion shown here, @code{$0}
	2825	always refers to the @code{expr} which precedes @code{bar} in the
	2826	definition of @code{foo}.
	2827
	2828	@node Action Types
	2829	@subsection Data Types of Values in Actions
	2830	@cindex action data types
	2831	@cindex data types in actions
	2832
	2833	If you have chosen a single data type for semantic values, the @code{$$}
	2834	and @code{$@var{n}} constructs always have that data type.
	2835
	2836	If you have used @code{%union} to specify a variety of data types, then you
	2837	must declare a choice among these types for each terminal or nonterminal
	2838	symbol that can have a semantic value. Then each time you use @code{$$} or
	2839	@code{$@var{n}}, its data type is determined by which symbol it refers to
	2840	in the rule. In this example,
	2841
	2842	@example
	2843	@group
	2844	exp:    @dots{}
	2845	        | exp '+' exp
	2846	            @{ $$ = $1 + $3; @}
	2847	@end group
	2848	@end example
	2849
	2850	@noindent
	2851	@code{$1} and @code{$3} refer to instances of @code{exp}, so they all
	2852	have the data type declared for the nonterminal symbol @code{exp}. If
	2853	@code{$2} were used, it would have the data type declared for the
	2854	terminal symbol @code{'+'}, whatever that might be.
	2855
	2856	Alternatively, you can specify the data type when you refer to the value,
	2857	by inserting @samp{<@var{type}>} after the @samp{$} at the beginning of the
	2858	reference. For example, if you have defined types as shown here:
	2859
	2860	@example
	2861	@group
	2862	%union @{
	2863	int itype;
	2864	double dtype;
	2865	@}
	2866	@end group
	2867	@end example
	2868
	2869	@noindent
	2870	then you can write @code{$<itype>1} to refer to the first subunit of the
	2871	rule as an integer, or @code{$<dtype>1} to refer to it as a double.
	2872
	2873	@node Mid-Rule Actions
	2874	@subsection Actions in Mid-Rule
	2875	@cindex actions in mid-rule
	2876	@cindex mid-rule actions
	2877
	2878	Occasionally it is useful to put an action in the middle of a rule.
	2879	These actions are written just like usual end-of-rule actions, but they
	2880	are executed before the parser even recognizes the following components.
	2881
	2882	A mid-rule action may refer to the components preceding it using
	2883	@code{$@var{n}}, but it may not refer to subsequent components because
	2884	it is run before they are parsed.
	2885
	2886	The mid-rule action itself counts as one of the components of the rule.
	2887	This makes a difference when there is another action later in the same rule
	2888	(and usually there is another at the end): you have to count the actions
	2889	along with the symbols when working out which number @var{n} to use in
	2890	@code{$@var{n}}.
	2891
	2892	The mid-rule action can also have a semantic value. The action can set
	2893	its value with an assignment to @code{$$}, and actions later in the rule
	2894	can refer to the value using @code{$@var{n}}. Since there is no symbol
	2895	to name the action, there is no way to declare a data type for the value
	2896	in advance, so you must use the @samp{$<@dots{}>@var{n}} construct to
	2897	specify a data type each time you refer to this value.
	2898
	2899	There is no way to set the value of the entire rule with a mid-rule
	2900	action, because assignments to @code{$$} do not have that effect. The
	2901	only way to set the value for the entire rule is with an ordinary action
	2902	at the end of the rule.
	2903
	2904	Here is an example from a hypothetical compiler, handling a @code{let}
	2905	statement that looks like @samp{let (@var{variable}) @var{statement}} and
	2906	serves to create a variable named @var{variable} temporarily for the
	2907	duration of @var{statement}. To parse this construct, we must put
	2908	@var{variable} into the symbol table while @var{statement} is parsed, then
	2909	remove it afterward. Here is how it is done:
	2910
	2911	@example
	2912	@group
	2913	stmt:   LET '(' var ')'
	2914	                @{ $<context>$ = push_context ();
	2915	                  declare_variable ($3); @}
	2916	        stmt    @{ $$ = $6;
	2917	                  pop_context ($<context>5); @}
	2918	@end group
	2919	@end example
	2920
	2921	@noindent
	2922	As soon as @samp{let (@var{variable})} has been recognized, the first
	2923	action is run. It saves a copy of the current semantic context (the
	2924	list of accessible variables) as its semantic value, using alternative
	2925	@code{context} in the data-type union. Then it calls
	2926	@code{declare_variable} to add the new variable to that list. Once the
	2927	first action is finished, the embedded statement @code{stmt} can be
	2928	parsed. Note that the mid-rule action is component number 5, so the
	2929	@samp{stmt} is component number 6.
	2930
	2931	After the embedded statement is parsed, its semantic value becomes the
	2932	value of the entire @code{let}-statement. Then the semantic value from the
	2933	earlier action is used to restore the prior list of variables. This
	2934	removes the temporary @code{let}-variable from the list so that it won't
	2935	appear to exist while the rest of the program is parsed.
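
The manual does not define @code{push_context}, @code{declare_variable} or @code{pop_context}. The following is one possible sketch of them, assuming that the @code{context} alternative of the union is an @code{int} recording how many variables were visible, and that the semantic value of @code{var} is the variable's name. The table @code{vars}, the limit @code{MAX_VARS} and the helper @code{is_declared} are likewise hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 64             /* arbitrary limit for this sketch */

static const char *vars[MAX_VARS];  /* names of the visible variables */
static int nvars;                   /* how many of them are in use */

/* Save the current context: here, simply the number of visible
   variables.  */
static int
push_context (void)
{
  return nvars;
}

/* Make NAME visible until the enclosing context is popped.  */
static void
declare_variable (const char *name)
{
  assert (name != NULL && nvars < MAX_VARS);
  vars[nvars++] = name;
}

/* Restore a context saved by push_context, forgetting every variable
   declared since then.  */
static void
pop_context (int saved)
{
  assert (0 <= saved && saved <= nvars);
  nvars = saved;
}

/* Return 1 if NAME is currently visible, 0 otherwise.  */
static int
is_declared (const char *name)
{
  int i;
  for (i = nvars - 1; i >= 0; i--)
    if (strcmp (vars[i], name) == 0)
      return 1;
  return 0;
}
```

With these definitions, a variable declared by the mid-rule action is visible while the embedded statement is parsed and disappears once the end-of-rule action calls @code{pop_context}.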
	2936
	2937	Taking action before a rule is completely recognized often leads to
	2938	conflicts since the parser must commit to a parse in order to execute the
	2939	action. For example, the following two rules, without mid-rule actions,
	2940	can coexist in a working parser because the parser can shift the open-brace
	2941	token and look at what follows before deciding whether there is a
	2942	declaration or not:
	2943
	2944	@example
	2945	@group
	2946	compound: '@{' declarations statements '@}'
	2947	        | '@{' statements '@}'
	2948	        ;
	2949	@end group
	2950	@end example
	2951
	2952	@noindent
	2953	But when we add a mid-rule action as follows, the rules become nonfunctional:
	2954
	2955	@example
	2956	@group
	2957	compound: @{ prepare_for_local_variables (); @}
	2958	          '@{' declarations statements '@}'
	2959	@end group
	2960	@group
	2961	        | '@{' statements '@}'
	2962	        ;
	2963	@end group
	2964	@end example
	2965
	2966	@noindent
	2967	Now the parser is forced to decide whether to run the mid-rule action
	2968	when it has read no farther than the open-brace. In other words, it
	2969	must commit to using one rule or the other, without sufficient
	2970	information to do it correctly. (The open-brace token is what is called
	2971	the @dfn{look-ahead} token at this time, since the parser is still
	2972	deciding what to do about it. @xref{Look-Ahead, ,Look-Ahead Tokens}.)
	2973
	2974	You might think that you could correct the problem by putting identical
	2975	actions into the two rules, like this:
	2976
	2977	@example
	2978	@group
	2979	compound: @{ prepare_for_local_variables (); @}
	2980	          '@{' declarations statements '@}'
	2981	        | @{ prepare_for_local_variables (); @}
	2982	          '@{' statements '@}'
	2983	        ;
	2984	@end group
	2985	@end example
	2986
	2987	@noindent
	2988	But this does not help, because Bison does not realize that the two actions
	2989	are identical. (Bison never tries to understand the C code in an action.)
	2990
	2991	If the grammar is such that a declaration can be distinguished from a
	2992	statement by the first token (which is true in C), then one solution which
	2993	does work is to put the action after the open-brace, like this:
	2994
	2995	@example
	2996	@group
	2997	compound: '@{' @{ prepare_for_local_variables (); @}
	2998	          declarations statements '@}'
	2999	        | '@{' statements '@}'
	3000	        ;
	3001	@end group
	3002	@end example
	3003
	3004	@noindent
	3005	Now the first token of the following declaration or statement,
	3006	which would in any case tell Bison which rule to use, can still do so.
	3007
	3008	Another solution is to bury the action inside a nonterminal symbol which
	3009	serves as a subroutine:
	3010
	3011	@example
	3012	@group
	3013	subroutine: /* empty */
	3014	@{ prepare_for_local_variables (); @}
	3015	;
	3016
	3017	@end group
	3018
	3019	@group
	3020	compound: subroutine
	3021	          '@{' declarations statements '@}'
	3022	        | subroutine
	3023	          '@{' statements '@}'
	3024	        ;
	3025	@end group
	3026	@end example
	3027
	3028	@noindent
	3029	Now Bison can execute the action in the rule for @code{subroutine} without
	3030	deciding which rule for @code{compound} it will eventually use. Note that
	3031	the action is now at the end of its rule. Any mid-rule action can be
	3032	converted to an end-of-rule action in this way, and this is what Bison
	3033	actually does to implement mid-rule actions.
	3034
	3035	@node Locations
	3036	@section Tracking Locations
	3037	@cindex location
	3038	@cindex textual position
	3039	@cindex position, textual
	3040
	3041	Though grammar rules and semantic actions are enough to write a fully
	3042	functional parser, it can be useful to process some additional information,
	3043	especially symbol locations.
	3044
	3045	@c (terminal or not) ?
	3046
	3047	The way locations are handled is defined by providing a data type, and
	3048	actions to take when rules are matched.
	3049
	3050	@menu
	3051	* Location Type:: Specifying a data type for locations.
	3052	* Actions and Locations:: Using locations in actions.
	3053	* Location Default Action:: Defining a general way to compute locations.
	3054	@end menu
	3055
	3056	@node Location Type
	3057	@subsection Data Type of Locations
	3058	@cindex data type of locations
	3059	@cindex default location type
	3060
	3061	Defining a data type for locations is much simpler than for semantic values,
	3062	since all tokens and groupings always use the same type.
	3063
	3064	The type of locations is specified by defining a macro called @code{YYLTYPE}.
	3065	When @code{YYLTYPE} is not defined, Bison uses a default structure type with
	3066	four members:
	3067
	3068	@example
	3069	struct
	3070	@{
	3071	int first_line;
	3072	int first_column;
	3073	int last_line;
	3074	int last_column;
	3075	@}
	3076	@end example
	3077
	3078	@node Actions and Locations
	3079	@subsection Actions and Locations
	3080	@cindex location actions
	3081	@cindex actions, location
	3082	@vindex @@$
	3083	@vindex @@@var{n}
	3084
	3085	Actions are not only useful for defining language semantics, but also for
	3086	describing the behavior of the output parser with locations.
	3087
	3088	The most obvious way to build the locations of syntactic groupings is very
	3089	similar to the way semantic values are computed. In a given rule, several
	3090	constructs can be used to access the locations of the elements being matched.
	3091	The location of the @var{n}th component of the right hand side is
	3092	@code{@@@var{n}}, while the location of the left hand side grouping is
	3093	@code{@@$}.
	3094
	3095	Here is a basic example using the default data type for locations:
	3096
	3097	@example
	3098	@group
	3099	exp:    @dots{}
	3100	        | exp '/' exp
	3101	            @{
	3102	              @@$.first_column = @@1.first_column;
	3103	              @@$.first_line = @@1.first_line;
	3104	              @@$.last_column = @@3.last_column;
	3105	              @@$.last_line = @@3.last_line;
	3106	              if ($3)
	3107	                $$ = $1 / $3;
	3108	              else
	3109	                @{
	3110	                  $$ = 1;
	3111	                  printf("Division by zero, l%d,c%d-l%d,c%d",
	3112	                         @@3.first_line, @@3.first_column,
	3113	                         @@3.last_line, @@3.last_column);
	3114	                @}
	3115	            @}
	3116	@end group
	3117	@end example
	3118
	3119	As for semantic values, there is a default action for locations that is
	3120	run each time a rule is matched. It sets the beginning of @code{@@$} to the
	3121	beginning of the first symbol, and the end of @code{@@$} to the end of the
	3122	last symbol.
	3123
	3124	With this default action, location tracking can be fully automatic. The
	3125	example above can then be written more simply as follows:
	3126
	3127	@example
	3128	@group
	3129	exp:    @dots{}
	3130	        | exp '/' exp
	3131	            @{
	3132	              if ($3)
	3133	                $$ = $1 / $3;
	3134	              else
	3135	                @{
	3136	                  $$ = 1;
	3137	                  printf("Division by zero, l%d,c%d-l%d,c%d",
	3138	                         @@3.first_line, @@3.first_column,
	3139	                         @@3.last_line, @@3.last_column);
	3140	                @}
	3141	            @}
	3142	@end group
	3143	@end example
	3144
	3145	@node Location Default Action
	3146	@subsection Default Action for Locations
	3147	@vindex YYLLOC_DEFAULT
	3148
	3149	Actually, actions are not the best place to compute locations. Since
	3150	locations are much more general than semantic values, there is room in
	3151	the output parser to redefine the default action to take for each
	3152	rule. The @code{YYLLOC_DEFAULT} macro is invoked each time a rule is
	3153	matched, before the associated action is run.
	3154
	3155	Most of the time, this macro is general enough that semantic actions
	3156	need no code dedicated to locations.
	3157
	3158	The @code{YYLLOC_DEFAULT} macro takes three parameters. The first one is
	3159	the location of the grouping (the result of the computation). The second one
	3160	is an array holding locations of all right hand side elements of the rule
	3161	being matched. The last one is the number of elements on the right hand side of the rule.
	3162
	3163	By default, it is defined this way for simple @acronym{LALR}(1) parsers:
	3164
	3165	@example
	3166	@group
	3167	#define YYLLOC_DEFAULT(Current, Rhs, N) \
	3168	Current.first_line = Rhs[1].first_line; \
	3169	Current.first_column = Rhs[1].first_column; \
	3170	Current.last_line = Rhs[N].last_line; \
	3171	Current.last_column = Rhs[N].last_column;
	3172	@end group
	3173	@end example
	3174
	3175	@noindent
	3176	and like this for @acronym{GLR} parsers:
	3177
	3178	@example
	3179	@group
	3180	#define YYLLOC_DEFAULT(Current, Rhs, N) \
	3181	Current.first_line = YYRHSLOC(Rhs,1).first_line; \
	3182	Current.first_column = YYRHSLOC(Rhs,1).first_column; \
	3183	Current.last_line = YYRHSLOC(Rhs,N).last_line; \
	3184	Current.last_column = YYRHSLOC(Rhs,N).last_column;
	3185	@end group
	3186	@end example
	3187
	3188	When defining @code{YYLLOC_DEFAULT}, you should consider that:
	3189
	3190	@itemize @bullet
	3191	@item
	3192	All arguments are free of side-effects. However, only the first one (the
	3193	result) should be modified by @code{YYLLOC_DEFAULT}.
	3194
	3195	@item
	3196	For consistency with semantic actions, valid indexes for the location
	3197	array range from 1 to @var{n}.
	3198	@end itemize
	3199
	3200	@node Declarations
	3201	@section Bison Declarations
	3202	@cindex declarations, Bison
	3203	@cindex Bison declarations
	3204
	3205	The @dfn{Bison declarations} section of a Bison grammar defines the symbols
	3206	used in formulating the grammar and the data types of semantic values.
	3207	@xref{Symbols}.
	3208
	3209	All token type names (but not single-character literal tokens such as
	3210	@code{'+'} and @code{'*'}) must be declared. Nonterminal symbols must be
	3211	declared if you need to specify which data type to use for the semantic
	3212	value (@pxref{Multiple Types, ,More Than One Value Type}).
	3213
	3214	The first rule in the file also specifies the start symbol, by default.
	3215	If you want some other symbol to be the start symbol, you must declare
	3216	it explicitly (@pxref{Language and Grammar, ,Languages and Context-Free
	3217	Grammars}).
	3218
	3219	@menu
	3220	* Token Decl:: Declaring terminal symbols.
	3221	* Precedence Decl:: Declaring terminals with precedence and associativity.
	3222	* Union Decl:: Declaring the set of all semantic value types.
	3223	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	3224	* Destructor Decl:: Declaring how symbols are freed.
	3225	* Expect Decl:: Suppressing warnings about shift/reduce conflicts.
	3226	* Start Decl:: Specifying the start symbol.
	3227	* Pure Decl:: Requesting a reentrant parser.
	3228	* Decl Summary:: Table of all Bison declarations.
	3229	@end menu
	3230
	3231	@node Token Decl
	3232	@subsection Token Type Names
	3233	@cindex declaring token type names
	3234	@cindex token type names, declaring
	3235	@cindex declaring literal string tokens
	3236	@findex %token
	3237
	3238	The basic way to declare a token type name (terminal symbol) is as follows:
	3239
	3240	@example
	3241	%token @var{name}
	3242	@end example
	3243
	3244	Bison will convert this into a @code{#define} directive in
	3245	the parser, so that the function @code{yylex} (if it is in this file)
	3246	can use the name @var{name} to stand for this token type's code.
	3247
	3248	Alternatively, you can use @code{%left}, @code{%right}, or
	3249	@code{%nonassoc} instead of @code{%token}, if you wish to specify
	3250	associativity and precedence. @xref{Precedence Decl, ,Operator
	3251	Precedence}.
	3252
	3253	You can explicitly specify the numeric code for a token type by appending
	3254	an integer value in the field immediately following the token name:
	3255
	3256	@example
	3257	%token NUM 300
	3258	@end example
	3259
	3260	@noindent
	3261	It is generally best, however, to let Bison choose the numeric codes for
	3262	all token types. Bison will automatically select codes that don't conflict
	3263	with each other or with normal characters.
	3264
	3265	In the event that the stack type is a union, you must augment the
	3266	@code{%token} or other token declaration to include the data type
	3267	alternative delimited by angle-brackets (@pxref{Multiple Types, ,More
	3268	Than One Value Type}).
	3269
	3270	For example:
	3271
	3272	@example
	3273	@group
	3274	%union @{ /* define stack type */
	3275	double val;
	3276	symrec *tptr;
	3277	@}
	3278	%token <val> NUM /* define token NUM and its type */
	3279	@end group
	3280	@end example
	3281
	3282	You can associate a literal string token with a token type name by
	3283	writing the literal string at the end of a @code{%token}
	3284	declaration which declares the name. For example:
	3285
	3286	@example
	3287	%token arrow "=>"
	3288	@end example
	3289
	3290	@noindent
	3291	Similarly, a grammar for the C language might specify these names with
	3292	equivalent literal string tokens:
	3293
	3294	@example
	3295	%token <operator> OR "||"
	3296	%token <operator> LE 134 "<="
	3297	%left OR "<="
	3298	@end example
	3299
	3300	@noindent
	3301	Once you equate the literal string and the token name, you can use them
	3302	interchangeably in further declarations or the grammar rules. The
	3303	@code{yylex} function can use the token name or the literal string to
	3304	obtain the token type code number (@pxref{Calling Convention}).
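
		As an illustration, the following sketch declares a name for a literal
		string token and then uses both spellings in the rules. The symbols
		@code{ARROW}, @code{KEY}, @code{VALUE} and @code{pair} are hypothetical
		names chosen for this example.

		@example
		@group
		%token ARROW "=>"
		%token KEY VALUE
		%%
		pair:     KEY "=>" VALUE     /* the token, written as its literal string */
		        | VALUE ARROW KEY    /* the same token, written by name */
		        ;
		@end group
		@end example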
	3305
	3306	@node Precedence Decl
	3307	@subsection Operator Precedence
	3308	@cindex precedence declarations
	3309	@cindex declaring operator precedence
	3310	@cindex operator precedence, declaring
	3311
	3312	Use the @code{%left}, @code{%right} or @code{%nonassoc} declaration to
	3313	declare a token and specify its precedence and associativity, all at
	3314	once. These are called @dfn{precedence declarations}.
	3315	@xref{Precedence, ,Operator Precedence}, for general information on
	3316	operator precedence.
	3317
	3318	The syntax of a precedence declaration is the same as that of
	3319	@code{%token}: either
	3320
	3321	@example
	3322	%left @var{symbols}@dots{}
	3323	@end example
	3324
	3325	@noindent
	3326	or
	3327
	3328	@example
	3329	%left <@var{type}> @var{symbols}@dots{}
	3330	@end example
	3331
	3332	And indeed any of these declarations serves the purposes of @code{%token}.
	3333	But in addition, they specify the associativity and relative precedence for
	3334	all the @var{symbols}:
	3335
	3336	@itemize @bullet
	3337	@item
	3338	The associativity of an operator @var{op} determines how repeated uses
	3339	of the operator nest: whether @samp{@var{x} @var{op} @var{y} @var{op}
	3340	@var{z}} is parsed by grouping @var{x} with @var{y} first or by
	3341	grouping @var{y} with @var{z} first. @code{%left} specifies
	3342	left-associativity (grouping @var{x} with @var{y} first) and
	3343	@code{%right} specifies right-associativity (grouping @var{y} with
	3344	@var{z} first). @code{%nonassoc} specifies no associativity, which
	3345	means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is
	3346	considered a syntax error.
	3347
	3348	@item
	3349	The precedence of an operator determines how it nests with other operators.
	3350	All the tokens declared in a single precedence declaration have equal
	3351	precedence and nest together according to their associativity.
	3352	When two tokens declared in different precedence declarations associate,
	3353	the one declared later has the higher precedence and is grouped first.
	3354	@end itemize
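
		For instance, a calculator grammar might declare its operators as in
		the following sketch (the choice of operators is an assumption made
		for illustration):

		@example
		@group
		%left '+' '-'
		%left '*' '/'
		%right '^'
		@end group
		@end example

		@noindent
		With these declarations, @samp{1 - 2 - 3} groups as @samp{(1 - 2) - 3},
		@samp{2 ^ 3 ^ 2} groups as @samp{2 ^ (3 ^ 2)}, and @samp{1 + 2 * 3}
		groups as @samp{1 + (2 * 3)} because @code{'*'} is declared later than
		@code{'+'}.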
	3355
	3356	@node Union Decl
	3357	@subsection The Collection of Value Types
	3358	@cindex declaring value types
	3359	@cindex value types, declaring
	3360	@findex %union
	3361
	3362	The @code{%union} declaration specifies the entire collection of possible
	3363	data types for semantic values. The keyword @code{%union} is followed by a
	3364	pair of braces containing the same thing that goes inside a @code{union} in
	3365	C.
	3366
	3367	For example:
	3368
	3369	@example
	3370	@group
	3371	%union @{
	3372	  double val;
	3373	  symrec *tptr;
	3374	@}
	3375	@end group
	3376	@end example
	3377
	3378	@noindent
	3379	This says that the two alternative types are @code{double} and @code{symrec
	3380	*}. They are given names @code{val} and @code{tptr}; these names are used
	3381	in the @code{%token} and @code{%type} declarations to pick one of the types
	3382	for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
	3383
	3384	Note that, unlike making a @code{union} declaration in C, you do not write
	3385	a semicolon after the closing brace.
	3386
	3387	@node Type Decl
	3388	@subsection Nonterminal Symbols
	3389	@cindex declaring value types, nonterminals
	3390	@cindex value types, nonterminals, declaring
	3391	@findex %type
	3392
	3393	@noindent
	3394	When you use @code{%union} to specify multiple value types, you must
	3395	declare the value type of each nonterminal symbol for which values are
	3396	used. This is done with a @code{%type} declaration, like this:
	3397
	3398	@example
	3399	%type <@var{type}> @var{nonterminal}@dots{}
	3400	@end example
	3401
	3402	@noindent
	3403	Here @var{nonterminal} is the name of a nonterminal symbol, and
	3404	@var{type} is the name given in the @code{%union} to the alternative
	3405	that you want (@pxref{Union Decl, ,The Collection of Value Types}). You
	3406	can give any number of nonterminal symbols in the same @code{%type}
	3407	declaration, if they have the same value type. Use spaces to separate
	3408	the symbol names.
	3409
	3410	You can also declare the value type of a terminal symbol. To do this,
	3411	use the same @code{<@var{type}>} construction in a declaration for the
	3412	terminal symbol. All kinds of token declarations allow
	3413	@code{<@var{type}>}.
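
		A short sketch combining these declarations follows; the symbols
		@code{NUM}, @code{VAR}, @code{exp} and @code{term} are hypothetical:

		@example
		@group
		%union @{
		  double val;
		  symrec *tptr;
		@}
		%token <val> NUM       /* terminal whose value is a double */
		%token <tptr> VAR      /* terminal whose value is a symrec * */
		%type <val> exp term   /* two nonterminals sharing one value type */
		@end group
		@end example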
	3414
	3415	@node Destructor Decl
	3416	@subsection Freeing Discarded Symbols
	3417	@cindex freeing discarded symbols
	3418	@findex %destructor
	3419
	3420	Some symbols can be discarded by the parser, typically during error
	3421	recovery (@pxref{Error Recovery}). During error recovery, offending
	3422	symbols already pushed on the stack and offending tokens coming from
	3423	the rest of the input are thrown away until the parser
	3424	is back on its feet. If these symbols carry heap-based information, this
	3425	memory is lost. While this behavior is tolerable for batch parsers,
	3426	such as in compilers, it is unacceptable for parsers that can
	3427	possibly ``never end'', such as shells or implementations of
	3428	communication protocols.
	3429
	3430	The @code{%destructor} directive allows for the definition of code that
	3431	is called when a symbol is thrown away.
	3432
	3433	@deffn {Directive} %destructor @{ @var{code} @} @var{symbols}
	3434	@findex %destructor
	3435	Declare that the @var{code} must be invoked for each of the
	3436	@var{symbols} that will be discarded by the parser. The @var{code}
	3437	should use @code{$$} to designate the semantic value associated with the
	3438	@var{symbols}. The additional parser parameters are also available
	3439	(@pxref{Parser Function, , The Parser Function @code{yyparse}}).
	3440
	3441	@strong{Warning:} as of Bison 1.875, this feature is still considered
	3442	experimental, as there has not been enough user feedback. In particular,
	3443	the syntax might still change.
	3444	@end deffn
	3445
	3446	For instance:
	3447
	3448	@smallexample
	3449	%union
	3450	@{
	3451	  char *string;
	3452	@}
	3453	%token <string> STRING
	3454	%type <string> string
	3455	%destructor @{ free ($$); @} STRING string
	3456	@end smallexample
	3457
	3458	@noindent
	3459	guarantees that when a @code{STRING} or a @code{string} is discarded,
	3460	its associated memory is freed.
	3461
	3462	Note that in the future, Bison might also consider that right hand side
	3463	members that are not mentioned in the action can be destroyed. For
	3464	instance, in:
	3465
	3466	@smallexample
	3467	comment: "/" STRING "/";
	3468	@end smallexample
	3469
	3470	@noindent
	3471	the parser is entitled to destroy the semantic value of the
	3472	@code{STRING}. Of course, this will not apply to the default action;
	3473	compare:
	3474
	3475	@smallexample
	3476	typeless: string; // $$ = $1 does not apply; $1 is destroyed.
	3477	typefull: string; // $$ = $1 applies, $1 is not destroyed.
	3478	@end smallexample
	3479
	3480	@node Expect Decl
	3481	@subsection Suppressing Conflict Warnings
	3482	@cindex suppressing conflict warnings
	3483	@cindex preventing warnings about conflicts
	3484	@cindex warnings, preventing
	3485	@cindex conflicts, suppressing warnings of
	3486	@findex %expect
	3487
	3488	Bison normally warns if there are any conflicts in the grammar
	3489	(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars
	3490	have harmless shift/reduce conflicts which are resolved in a predictable
	3491	way and would be difficult to eliminate. It is desirable to suppress
	3492	the warning about these conflicts unless the number of conflicts
	3493	changes. You can do this with the @code{%expect} declaration.
	3494
	3495	The declaration looks like this:
	3496
	3497	@example
	3498	%expect @var{n}
	3499	@end example
	3500
	3501	Here @var{n} is a decimal integer. The declaration says there should be
	3502	no warning if there are @var{n} shift/reduce conflicts and no
	3503	reduce/reduce conflicts. An error, instead of the usual warning, is
	3504	given if there are either more or fewer conflicts, or if there are any
	3505	reduce/reduce conflicts.
	3506
	3507	In general, using @code{%expect} involves these steps:
	3508
	3509	@itemize @bullet
	3510	@item
	3511	Compile your grammar without @code{%expect}. Use the @samp{-v} option
	3512	to get a verbose list of where the conflicts occur. Bison will also
	3513	print the number of conflicts.
	3514
	3515	@item
	3516	Check each of the conflicts to make sure that Bison's default
	3517	resolution is what you really want. If not, rewrite the grammar and
	3518	go back to the beginning.
	3519
	3520	@item
	3521	Add an @code{%expect} declaration, copying the number @var{n} from the
	3522	number which Bison printed.
	3523	@end itemize
	3524
	3525	Now Bison will stop annoying you about the conflicts you have checked, but
	3526	it will report an error if changes in the grammar result in a different
	3527	number of conflicts.
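
		As a sketch, the classic ``dangling @code{else}'' grammar below has
		one shift/reduce conflict, which Bison resolves by shifting
		@code{ELSE}. If that resolution is what you want, the conflict can be
		declared as expected. The token and rule names are illustrative.

		@example
		@group
		%token IF THEN ELSE variable
		%expect 1
		%%
		stmt:     expr
		        | if_stmt
		        ;
		@end group

		@group
		if_stmt:  IF expr THEN stmt
		        | IF expr THEN stmt ELSE stmt
		        ;
		@end group

		expr:     variable
		        ;
		@end example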
	3528
	3529	@node Start Decl
	3530	@subsection The Start-Symbol
	3531	@cindex declaring the start symbol
	3532	@cindex start symbol, declaring
	3533	@cindex default start symbol
	3534	@findex %start
	3535
	3536	Bison assumes by default that the start symbol for the grammar is the first
	3537	nonterminal specified in the grammar specification section. The programmer
	3538	may override this default with the @code{%start} declaration as follows:
	3539
	3540	@example
	3541	%start @var{symbol}
	3542	@end example
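
		For instance, in the following sketch the hypothetical nonterminal
		@code{program} is made the start symbol even though @code{statement}
		appears first in the rules section:

		@example
		@group
		%token WORD
		%start program
		%%
		statement: WORD ';'
		         ;
		program:   /* empty */
		         | program statement
		         ;
		@end group
		@end example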
	3543
	3544	@node Pure Decl
	3545	@subsection A Pure (Reentrant) Parser
	3546	@cindex reentrant parser
	3547	@cindex pure parser
	3548	@findex %pure-parser
	3549
	3550	A @dfn{reentrant} program is one which does not alter in the course of
	3551	execution; in other words, it consists entirely of @dfn{pure} (read-only)
	3552	code. Reentrancy is important whenever asynchronous execution is possible;
	3553	for example, a non-reentrant program may not be safe to call from a signal
	3554	handler. In systems with multiple threads of control, a non-reentrant
	3555	program must be called only within interlocks.
	3556
	3557	Normally, Bison generates a parser which is not reentrant. This is
	3558	suitable for most uses, and it permits compatibility with Yacc. (The
	3559	standard Yacc interfaces are inherently nonreentrant, because they use
	3560	statically allocated variables for communication with @code{yylex},
	3561	including @code{yylval} and @code{yylloc}.)
	3562
	3563	Alternatively, you can generate a pure, reentrant parser. The Bison
	3564	declaration @code{%pure-parser} says that you want the parser to be
	3565	reentrant. It looks like this:
	3566
	3567	@example
	3568	%pure-parser
	3569	@end example
	3570
	3571	The result is that the communication variables @code{yylval} and
	3572	@code{yylloc} become local variables in @code{yyparse}, and a different
	3573	calling convention is used for the lexical analyzer function
	3574	@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
	3575	Parsers}, for the details of this. The variable @code{yynerrs} also
	3576	becomes local in @code{yyparse} (@pxref{Error Reporting, ,The Error
	3577	Reporting Function @code{yyerror}}). The convention for calling
	3578	@code{yyparse} itself is unchanged.
	3579
	3580	Whether the parser is pure has nothing to do with the grammar rules.
	3581	You can generate either a pure parser or a nonreentrant parser from any
	3582	valid grammar.
	3583
	3584	@node Decl Summary
	3585	@subsection Bison Declaration Summary
	3586	@cindex Bison declaration summary
	3587	@cindex declaration summary
	3588	@cindex summary, Bison declaration
	3589
	3590	Here is a summary of the declarations used to define a grammar:
	3591
	3592	@deffn {Directive} %union
	3593	Declare the collection of data types that semantic values may have
	3594	(@pxref{Union Decl, ,The Collection of Value Types}).
	3595	@end deffn
	3596
	3597	@deffn {Directive} %token
	3598	Declare a terminal symbol (token type name) with no precedence
	3599	or associativity specified (@pxref{Token Decl, ,Token Type Names}).
	3600	@end deffn
	3601
	3602	@deffn {Directive} %right
	3603	Declare a terminal symbol (token type name) that is right-associative
	3604	(@pxref{Precedence Decl, ,Operator Precedence}).
	3605	@end deffn
	3606
	3607	@deffn {Directive} %left
	3608	Declare a terminal symbol (token type name) that is left-associative
	3609	(@pxref{Precedence Decl, ,Operator Precedence}).
	3610	@end deffn
	3611
	3612	@deffn {Directive} %nonassoc
	3613	Declare a terminal symbol (token type name) that is nonassociative.
	3614	Using it in a way that would be associative is a syntax error
	3615	(@pxref{Precedence Decl, ,Operator Precedence}).
	3616	@end deffn
	3617
	3618	@deffn {Directive} %type
	3619	Declare the type of semantic values for a nonterminal symbol
	3620	(@pxref{Type Decl, ,Nonterminal Symbols}).
	3621	@end deffn
	3622
	3623	@deffn {Directive} %start
	3624	Specify the grammar's start symbol (@pxref{Start Decl, ,The
	3625	Start-Symbol}).
	3626	@end deffn
	3627
	3628	@deffn {Directive} %expect
	3629	Declare the expected number of shift/reduce conflicts
	3630	(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
	3631	@end deffn
	3632
	3633
	3634	@sp 1
	3635	@noindent
	3636	In order to change the behavior of @command{bison}, use the following
	3637	directives:
	3638
	3639	@deffn {Directive} %debug
	3640	In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
	3641	already defined, so that the debugging facilities are compiled.
	3642	@xref{Tracing, ,Tracing Your Parser}.
	3643	@end deffn
	3644
	3645	@deffn {Directive} %defines
	3646	Write an extra output file containing macro definitions for the token
	3647	type names defined in the grammar and the semantic value type
	3648	@code{YYSTYPE}, as well as a few @code{extern} variable declarations.
	3649
	3650	If the parser output file is named @file{@var{name}.c} then this file
	3651	is named @file{@var{name}.h}.
	3652
	3653	This output file is essential if you wish to put the definition of
	3654	@code{yylex} in a separate source file, because @code{yylex} needs to
	3655	be able to refer to token type codes and the variable
	3656	@code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}.
	3657	@end deffn
	3658
	3659	@deffn {Directive} %destructor
	3660	Specify how the parser should reclaim the memory associated with
	3661	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	3662	@end deffn
	3663
	3664	@deffn {Directive} %file-prefix="@var{prefix}"
	3665	Specify a prefix to use for all Bison output file names. The names are
	3666	chosen as if the input file were named @file{@var{prefix}.y}.
	3667	@end deffn
	3668
	3669	@deffn {Directive} %locations
	3670	Generate the code processing the locations (@pxref{Action Features,
	3671	,Special Features for Use in Actions}). This mode is enabled as soon as
	3672	the grammar uses the special @samp{@@@var{n}} tokens, but if your
	3673	grammar does not use it, using @samp{%locations} allows for more
	3674	accurate syntax error messages.
	3675	@end deffn
	3676
	3677	@deffn {Directive} %name-prefix="@var{prefix}"
	3678	Rename the external symbols used in the parser so that they start with
	3679	@var{prefix} instead of @samp{yy}. The precise list of symbols renamed
	3680	is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
	3681	@code{yylval}, @code{yychar}, @code{yydebug}, and, if locations are
	3682	used, @code{yylloc}. For example, if you use
	3683	@samp{%name-prefix="c_"}, the names become @code{c_parse}, @code{c_lex},
	3684	and so on. @xref{Multiple Parsers, ,Multiple Parsers in the Same
	3685	Program}.
	3686	@end deffn
	3687
	3688	@deffn {Directive} %no-parser
	3689	Do not include any C code in the parser file; generate tables only. The
	3690	parser file contains just @code{#define} directives and static variable
	3691	declarations.
	3692
	3693	This option also tells Bison to write the C code for the grammar actions
	3694	into a file named @file{@var{filename}.act}, in the form of a
	3695	brace-surrounded body fit for a @code{switch} statement.
	3696	@end deffn
	3697
	3698	@deffn {Directive} %no-lines
	3699	Don't generate any @code{#line} preprocessor commands in the parser
	3700	file. Ordinarily Bison writes these commands in the parser file so that
	3701	the C compiler and debuggers will associate errors and object code with
	3702	your source file (the grammar file). This directive causes them to
	3703	associate errors with the parser file, treating it as an independent source
	3704	file in its own right.
	3705	@end deffn
	3706
	3707	@deffn {Directive} %output="@var{filename}"
	3708	Specify the @var{filename} for the parser file.
	3709	@end deffn
	3710
	3711	@deffn {Directive} %pure-parser
	3712	Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure
	3713	(Reentrant) Parser}).
	3714	@end deffn
	3715
	3716	@deffn {Directive} %token-table
	3717	Generate an array of token names in the parser file. The name of the
	3718	array is @code{yytname}; @code{yytname[@var{i}]} is the name of the
	3719	token whose internal Bison token code number is @var{i}. The first
	3720	three elements of @code{yytname} are always @code{"$end"},
	3721	@code{"error"}, and @code{"$undefined"}; after these come the symbols
	3722	defined in the grammar file.
	3723
	3724	For single-character literal tokens and literal string tokens, the name
	3725	in the table includes the single-quote or double-quote characters: for
	3726	example, @code{"'+'"} is a single-character literal and @code{"\"<=\""}
	3727	is a literal string token. All the characters of the literal string
	3728	token appear verbatim in the string found in the table; even
	3729	double-quote characters are not escaped. For example, if the token
	3730	consists of three characters @samp{*"*}, its string in @code{yytname}
	3731	contains @samp{"*"*"}. (In C, that would be written as
	3732	@code{"\"*\"*\""}).
	3733
	3734	When you specify @code{%token-table}, Bison also generates macro
	3735	definitions for macros @code{YYNTOKENS}, @code{YYNNTS},
	3736	@code{YYNRULES}, and @code{YYNSTATES}:
	3737
	3738	@table @code
	3739	@item YYNTOKENS
	3740	The highest token number, plus one.
	3741	@item YYNNTS
	3742	The number of nonterminal symbols.
	3743	@item YYNRULES
	3744	The number of grammar rules.
	3745	@item YYNSTATES
	3746	The number of parser states (@pxref{Parser States}).
	3747	@end table
	3748	@end deffn
	3749
	3750	@deffn {Directive} %verbose
	3751	Write an extra output file containing verbose descriptions of the
	3752	parser states and what is done for each type of look-ahead token in
	3753	that state. @xref{Understanding, , Understanding Your Parser}, for more
	3754	information.
	3755	@end deffn
	3756
	3757	@deffn {Directive} %yacc
	3758	Pretend the option @option{--yacc} was given, i.e., imitate Yacc,
	3759	including its naming conventions. @xref{Bison Options}, for more.
	3760	@end deffn
	3761
	3762
	3763	@node Multiple Parsers
	3764	@section Multiple Parsers in the Same Program
	3765
	3766	Most programs that use Bison parse only one language and therefore contain
	3767	only one Bison parser. But what if you want to parse more than one
	3768	language with the same program? Then you need to avoid a name conflict
	3769	between different definitions of @code{yyparse}, @code{yylval}, and so on.
	3770
	3771	The easy way to do this is to use the option @samp{-p @var{prefix}}
	3772	(@pxref{Invocation, ,Invoking Bison}). This renames the interface
	3773	functions and variables of the Bison parser to start with @var{prefix}
	3774	instead of @samp{yy}. You can use this to give each parser distinct
	3775	names that do not conflict.
	3776
	3777	The precise list of symbols renamed is @code{yyparse}, @code{yylex},
	3778	@code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yylloc},
	3779	@code{yychar} and @code{yydebug}. For example, if you use @samp{-p c},
	3780	the names become @code{cparse}, @code{clex}, and so on.
	3781
	3782	@strong{All the other variables and macros associated with Bison are not
	3783	renamed.} These others are not global; there is no conflict if the same
	3784	name is used in different parsers. For example, @code{YYSTYPE} is not
	3785	renamed, but defining this in different ways in different parsers causes
	3786	no trouble (@pxref{Value Type, ,Data Types of Semantic Values}).
	3787
	3788	The @samp{-p} option works by adding macro definitions to the beginning
	3789	of the parser source file, defining @code{yyparse} as
	3790	@code{@var{prefix}parse}, and so on. This effectively substitutes one
	3791	name for the other in the entire parser file.
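
		A minimal sketch of the effect, assuming @samp{-p c} was given; the
		exact set and order of definitions that Bison emits may differ:

		@example
		#define yyparse cparse
		#define yylex   clex
		#define yyerror cerror
		#define yylval  clval
		#define yychar  cchar
		#define yydebug cdebug
		#define yynerrs cnerrs
		@end example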
	3792
	3793	@node Interface
	3794	@chapter Parser C-Language Interface
	3795	@cindex C-language interface
	3796	@cindex interface
	3797
	3798	The Bison parser is actually a C function named @code{yyparse}. Here we
	3799	describe the interface conventions of @code{yyparse} and the other
	3800	functions that it needs to use.
	3801
	3802	Keep in mind that the parser uses many C identifiers starting with
	3803	@samp{yy} and @samp{YY} for internal purposes. If you use such an
	3804	identifier (aside from those in this manual) in an action or in the epilogue
	3805	in the grammar file, you are likely to run into trouble.
	3806
	3807	@menu
	3808	* Parser Function:: How to call @code{yyparse} and what it returns.
	3809	* Lexical:: You must supply a function @code{yylex}
	3810	which reads tokens.
	3811	* Error Reporting:: You must supply a function @code{yyerror}.
	3812	* Action Features:: Special features for use in actions.
	3813	@end menu
	3814
	3815	@node Parser Function
	3816	@section The Parser Function @code{yyparse}
	3817	@findex yyparse
	3818
	3819	You call the function @code{yyparse} to cause parsing to occur. This
	3820	function reads tokens, executes actions, and ultimately returns when it
	3821	encounters end-of-input or an unrecoverable syntax error. You can also
	3822	write an action which directs @code{yyparse} to return immediately
	3823	without reading further.
	3824
	3825
	3826	@deftypefun int yyparse (void)
	3827	The value returned by @code{yyparse} is 0 if parsing was successful (return
	3828	is due to end-of-input).
	3829
	3830	The value is 1 if parsing failed (return is due to a syntax error).
	3831	@end deftypefun
	3832
	3833	In an action, you can cause immediate return from @code{yyparse} by using
	3834	these macros:
	3835
	3836	@defmac YYACCEPT
	3837	@findex YYACCEPT
	3838	Return immediately with value 0 (to report success).
	3839	@end defmac
	3840
	3841	@defmac YYABORT
	3842	@findex YYABORT
	3843	Return immediately with value 1 (to report failure).
	3844	@end defmac
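
		For instance, an action might give up on the whole parse when it meets
		a condition it cannot recover from. In this sketch the rule and the
		message are hypothetical, @code{exp} is assumed to have a numeric
		semantic value, and @code{yyerror} is assumed to take a single string
		argument:

		@example
		@group
		exp: exp '/' exp
		       @{
		         if ($3 == 0)
		           @{
		             yyerror ("division by zero");
		             YYABORT;        /* yyparse returns 1 */
		           @}
		         $$ = $1 / $3;
		       @}
		   ;
		@end group
		@end example

		@noindent
		The caller can then test the value returned by @code{yyparse} to find
		out whether the parse succeeded.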
	3845
	3846	If you use a reentrant parser, you can optionally pass additional
	3847	parameter information to it in a reentrant way. To do so, use the
	3848	declaration @code{%parse-param}:
	3849
	3850	@deffn {Directive} %parse-param @{@var{argument-declaration}@}
	3851	@findex %parse-param
	3852	Declare that an argument declared by @var{argument-declaration} is an
	3853	additional @code{yyparse} argument. This argument is also passed to
	3854	@code{yyerror}. The @var{argument-declaration} is used when declaring
	3855	functions or prototypes. The last identifier in
	3856	@var{argument-declaration} must be the argument name.
	3857	@end deffn
	3858
	3859	Here's an example. Write this in the parser:
	3860
	3861	@example
	3862	%parse-param @{int *nastiness@}
	3863	%parse-param @{int *randomness@}
	3864	@end example
	3865
	3866	@noindent
	3867	Then call the parser like this:
	3868
	3869	@example
	3870	@{
	3871	  int nastiness, randomness;
	3872	  @dots{} /* @r{Store proper data in @code{nastiness} and @code{randomness}.} */
	3873	  value = yyparse (&nastiness, &randomness);
	3874	  @dots{}
	3875	@}
	3876	@end example
	3877
	3878	@noindent
	3879	In the grammar actions, use expressions like this to refer to the data:
	3880
	3881	@example
	3882	exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @}
	3883	@end example
	3884
	3885
	3886	@node Lexical
	3887	@section The Lexical Analyzer Function @code{yylex}
	3888	@findex yylex
	3889	@cindex lexical analyzer
	3890
	3891	The @dfn{lexical analyzer} function, @code{yylex}, recognizes tokens from
	3892	the input stream and returns them to the parser. Bison does not create
	3893	this function automatically; you must write it so that @code{yyparse} can
	3894	call it. The function is sometimes referred to as a lexical scanner.
	3895
	3896	In simple programs, @code{yylex} is often defined at the end of the Bison
	3897	grammar file. If @code{yylex} is defined in a separate source file, you
	3898	need to arrange for the token-type macro definitions to be available there.
	3899	To do this, use the @samp{-d} option when you run Bison, so that it will
	3900	write these macro definitions into a separate header file
	3901	@file{@var{name}.tab.h} which you can include in the other source files
	3902	that need it. @xref{Invocation, ,Invoking Bison}.
	3903
	3904	@menu
	3905	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	3906	* Token Values:: How @code{yylex} must return the semantic value
	3907	of the token it has read.
	3908	* Token Positions:: How @code{yylex} must return the text position
	3909	(line number, etc.) of the token, if the
	3910	actions want that.
	3911	* Pure Calling:: How the calling convention differs
	3912	in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	3913	@end menu
	3914
	3915	@node Calling Convention
	3916	@subsection Calling Convention for @code{yylex}
	3917
	3918	The value that @code{yylex} returns must be the positive numeric code
	3919	for the type of token it has just found; a zero or negative value
	3920	signifies end-of-input.
	3921
	3922	When a token is referred to in the grammar rules by a name, that name
	3923	in the parser file becomes a C macro whose definition is the proper
	3924	numeric code for that token type. So @code{yylex} can use the name
	3925	to indicate that type. @xref{Symbols}.
	3926
	3927	When a token is referred to in the grammar rules by a character literal,
	3928	the numeric code for that character is also the code for the token type.
	3929	So @code{yylex} can simply return that character code, possibly converted
	3930	to @code{unsigned char} to avoid sign-extension. The null character
	3931	must not be used this way, because its code is zero and that
	3932	signifies end-of-input.
	3933
	3934	Here is an example showing these things:
	3935
	3936	@example
	3937	int
	3938	yylex (void)
	3939	@{
	3940	  @dots{}
	3941	  if (c == EOF) /* Detect end-of-input. */
	3942	    return 0;
	3943	  @dots{}
	3944	  if (c == '+' || c == '-')
	3945	    return c; /* Assume token type for `+' is '+'. */
	3946	  @dots{}
	3947	  return INT; /* Return the type of the token. */
	3948	  @dots{}
	3949	@}
	3950	@end example
	3951
	3952	@noindent
	3953	This interface has been designed so that the output from the @code{lex}
	3954	utility can be used without change as the definition of @code{yylex}.
	3955
	3956	If the grammar uses literal string tokens, there are two ways that
	3957	@code{yylex} can determine the token type codes for them:
	3958
	3959	@itemize @bullet
	3960	@item
	3961	If the grammar defines symbolic token names as aliases for the
	3962	literal string tokens, @code{yylex} can use these symbolic names like
	3963	all others. In this case, the use of the literal string tokens in
	3964	the grammar file has no effect on @code{yylex}.
	3965
	3966	@item
	3967	@code{yylex} can find the multicharacter token in the @code{yytname}
	3968	table. The index of the token in the table is the token type's code.
	3969	The name of a multicharacter token is recorded in @code{yytname} with a
	3970	double-quote, the token's characters, and another double-quote. The
	3971	token's characters are not escaped in any way; they appear verbatim in
	3972	the contents of the string in the table.
	3973
	3974	Here's code for looking up a token in @code{yytname}, assuming that the
	3975	characters of the token are stored in @code{token_buffer}.
	3976
	3977	@smallexample
	3978	for (i = 0; i < YYNTOKENS; i++)
	3979	  @{
	3980	    if (yytname[i] != 0
	3981	        && yytname[i][0] == '"'
	3982	        && ! strncmp (yytname[i] + 1, token_buffer,
	3983	                      strlen (token_buffer))
	3984	        && yytname[i][strlen (token_buffer) + 1] == '"'
	3985	        && yytname[i][strlen (token_buffer) + 2] == 0)
	3986	      break;
	3987	  @}
	3988	@end smallexample
	3989
	3990	The @code{yytname} table is generated only if you use the
	3991	@code{%token-table} declaration. @xref{Decl Summary}.
	3992	@end itemize
	3993
	3994	@node Token Values
	3995	@subsection Semantic Values of Tokens
	3996
	3997	@vindex yylval
	3998	In an ordinary (non-reentrant) parser, the semantic value of the token must
	3999	be stored into the global variable @code{yylval}. When you are using
	4000	just one data type for semantic values, @code{yylval} has that type.
	4001	Thus, if the type is @code{int} (the default), you might write this in
	4002	@code{yylex}:
	4003
	4004	@example
	4005	@group
	4006	  @dots{}
	4007	  yylval = value; /* Put value onto Bison stack. */
	4008	  return INT; /* Return the type of the token. */
	4009	  @dots{}
	4010	@end group
	4011	@end example
	4012
	4013	When you are using multiple data types, @code{yylval}'s type is a union
	4014	made from the @code{%union} declaration (@pxref{Union Decl, ,The
	4015	Collection of Value Types}). So when you store a token's value, you
	4016	must use the proper member of the union. If the @code{%union}
	4017	declaration looks like this:
	4018
	4019	@example
	4020	@group
	4021	%union @{
	4022	  int intval;
	4023	  double val;
	4024	  symrec *tptr;
	4025	@}
	4026	@end group
	4027	@end example
	4028
	4029	@noindent
	4030	then the code in @code{yylex} might look like this:
	4031
	4032	@example
	4033	@group
	4034	  @dots{}
	4035	  yylval.intval = value; /* Put value onto Bison stack. */
	4036	  return INT; /* Return the type of the token. */
	4037	  @dots{}
	4038	@end group
	4039	@end example
	4040
	4041	@node Token Positions
	4042	@subsection Textual Positions of Tokens
	4043
	4044	@vindex yylloc
	4045	If you are using the @samp{@@@var{n}}-feature (@pxref{Locations, ,
	4046	Tracking Locations}) in actions to keep track of the
	4047	textual locations of tokens and groupings, then you must provide this
	4048	information in @code{yylex}. The function @code{yyparse} expects to
	4049	find the textual location of a token just parsed in the global variable
	4050	@code{yylloc}. So @code{yylex} must store the proper data in that
	4051	variable.
	4052
	4053	By default, the value of @code{yylloc} is a structure and you need only
	4054	initialize the members that are going to be used by the actions. The
	4055	four members are called @code{first_line}, @code{first_column},
	4056	@code{last_line} and @code{last_column}. Note that the use of this
	4057	feature makes the parser noticeably slower.
	4058
	4059	@tindex YYLTYPE
	4060	The data type of @code{yylloc} has the name @code{YYLTYPE}.
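
		As a sketch, a hand-written @code{yylex} might record the position of
		each token as follows. The variables @code{lineno}, @code{start_col}
		and @code{end_col} are hypothetical counters that the scanner is
		assumed to maintain itself:

		@example
		@group
		  @dots{}
		  yylloc.first_line   = lineno;
		  yylloc.first_column = start_col;
		  yylloc.last_line    = lineno;    /* Token does not span lines. */
		  yylloc.last_column  = end_col;
		  return INT;                      /* Return the type of the token. */
		  @dots{}
		@end group
		@end example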
	4061
	4062	@node Pure Calling
	4063	@subsection Calling Conventions for Pure Parsers
	4064
	4065	When you use the Bison declaration @code{%pure-parser} to request a
	4066	pure, reentrant parser, the global communication variables @code{yylval}
	4067	and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant)
	4068	Parser}.) In such parsers the two global variables are replaced by
	4069	pointers passed as arguments to @code{yylex}. You must declare them as
	4070	shown here, and pass the information back by storing it through those
	4071	pointers.
	4072
	4073	@example
	4074	int
	4075	yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
	4076	@{
	4077	  @dots{}
	4078	  *lvalp = value; /* Put value onto Bison stack. */
	4079	  return INT; /* Return the type of the token. */
	4080	  @dots{}
	4081	@}
	4082	@end example
	4083
	4084	If the grammar file does not use the @samp{@@} constructs to refer to
	4085	textual positions, then the type @code{YYLTYPE} will not be defined. In
	4086	this case, omit the second argument; @code{yylex} will be called with
	4087	only one argument.
	4088
	4089
	4090	If you wish to pass additional parameter data to @code{yylex}, use
	4091	@code{%lex-param} just like @code{%parse-param} (@pxref{Parser
	4092	Function}).
	4093
	4094	@deffn {Directive} lex-param @{@var{argument-declaration}@}
	4095	@findex %lex-param
	4096	Declare that @code{argument-declaration} is an additional @code{yylex}
	4097	argument declaration.
	4098	@end deffn
	4099
	4100	For instance:
	4101
	4102	@example
	4103	%parse-param @{int *nastiness@}
	4104	%lex-param @{int *nastiness@}
	4105	%parse-param @{int *randomness@}
	4106	@end example
	4107
	4108	@noindent
	4109	results in the following signatures:
	4110
	4111	@example
	4112	int yylex (int *nastiness);
	4113	int yyparse (int *nastiness, int *randomness);
	4114	@end example
	4115
	4116	If @code{%pure-parser} is added:
	4117
	4118	@example
	4119	int yylex (YYSTYPE *lvalp, int *nastiness);
	4120	int yyparse (int *nastiness, int *randomness);
	4121	@end example
	4122
	4123	@noindent
	4124	and finally, if both @code{%pure-parser} and @code{%locations} are used:
	4125
	4126	@example
	4127	int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
	4128	int yyparse (int *nastiness, int *randomness);
	4129	@end example
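A pure, location-tracking @code{yylex} with one extra argument might be written along the following lines. This is only a sketch: @code{demo_stype}, @code{demo_ltype}, @code{DEMO_INT} and @code{demo_yylex} are hypothetical stand-ins for the names Bison would generate, and the @code{cursor} argument plays the role of a parameter added with @code{%lex-param}. The point is that every result travels through the pointer arguments rather than through global variables.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-ins for names that Bison would normally generate.  */
enum { DEMO_INT = 258 };
typedef union { int intval; } demo_stype;
typedef struct
{
  int first_line, first_column, last_line, last_column;
} demo_ltype;

/* Pure-style lexer: the semantic value and the location are stored
   through LVALP and LLOCP.  *CURSOR is the unread input.  */
int
demo_yylex (demo_stype *lvalp, demo_ltype *llocp, const char **cursor)
{
  const char *p;

  assert (lvalp && llocp && cursor && *cursor);
  p = *cursor;

  /* Skip blanks, keeping the column up to date.  */
  while (*p == ' ')
    {
      p++;
      llocp->last_column++;
    }
  llocp->first_line = llocp->last_line;
  llocp->first_column = llocp->last_column;

  if (*p == '\0')
    {
      *cursor = p;
      return 0;                 /* End of input.  */
    }

  if (isdigit ((unsigned char) *p))
    {
      int value = 0;
      while (isdigit ((unsigned char) *p))
        {
          value = value * 10 + (*p - '0');
          p++;
          llocp->last_column++;
        }
      lvalp->intval = value;    /* Put value onto Bison stack.  */
      *cursor = p;
      return DEMO_INT;          /* Return the type of the token.  */
    }

  /* Any other character is returned as its own token type.  */
  llocp->last_column++;
  *cursor = p + 1;
  return (unsigned char) *p;
}
```

Because the lexer keeps no state of its own, several parsers can call it at once, each with its own value, location and cursor.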
	4130
	4131	@node Error Reporting
	4132	@section The Error Reporting Function @code{yyerror}
	4133	@cindex error reporting function
	4134	@findex yyerror
	4135	@cindex parse error
	4136	@cindex syntax error
	4137
	4138	The Bison parser detects a @dfn{syntax error} or @dfn{parse error}
	4139	whenever it reads a token which cannot satisfy any syntax rule. An
	4140	action in the grammar can also explicitly proclaim an error, using the
	4141	macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use
	4142	in Actions}).
	4143
	4144	The Bison parser expects to report the error by calling an error
	4145	reporting function named @code{yyerror}, which you must supply. It is
	4146	called by @code{yyparse} whenever a syntax error is found, and it
	4147	receives one argument. For a syntax error, the string is normally
	4148	@w{@code{"syntax error"}}.
	4149
	4150	@findex %error-verbose
	4151	If you invoke the directive @code{%error-verbose} in the Bison
	4152	declarations section (@pxref{Bison Declarations, ,The Bison Declarations
	4153	Section}), then Bison provides a more verbose and specific error message
	4154	string instead of just plain @w{@code{"syntax error"}}.
	4155
	4156	The parser can detect one other kind of error: stack overflow. This
	4157	happens when the input contains constructions that are very deeply
	4158	nested. It isn't likely you will encounter this, since the Bison
	4159	parser extends its stack automatically up to a very large limit. But
	4160	if overflow happens, @code{yyparse} calls @code{yyerror} in the usual
	4161	fashion, except that the argument string is @w{@code{"parser stack
	4162	overflow"}}.
	4163
	4164	The following definition suffices in simple programs:
	4165
	4166	@example
	4167	@group
	4168	void
	4169	yyerror (const char *s)
	4170	@{
	4171	@end group
	4172	@group
	4173	fprintf (stderr, "%s\n", s);
	4174	@}
	4175	@end group
	4176	@end example
	4177
	4178	After @code{yyerror} returns to @code{yyparse}, the latter will attempt
	4179	error recovery if you have written suitable error recovery grammar rules
	4180	(@pxref{Error Recovery}). If recovery is impossible, @code{yyparse} will
	4181	immediately return 1.
	4182
	4183	In pure parsers that track locations, @code{yyerror} should have
	4184	access to the current location. This is indeed the case for the GLR
	4185	parsers, but not for the Yacc parser, for historical reasons. That is, if
	4186	@samp{%locations %pure-parser} is passed, then the prototypes for
	4187	@code{yyerror} are:
	4188
	4189	@example
	4190	void yyerror (const char *msg); /* Yacc parsers. */
	4191	void yyerror (YYLTYPE *locp, const char *msg); /* GLR parsers. */
	4192	@end example
	4193
	4194	If @samp{%parse-param @{int *nastiness@}} is used, then:
	4195
	4196	@example
	4197	void yyerror (int *nastiness, const char *msg); /* Yacc parsers. */
	4198	void yyerror (int *nastiness, const char *msg); /* GLR parsers. */
	4199	@end example
	4200
	4201	Finally, GLR and Yacc parsers share the same @code{yyerror} calling
	4202	convention for absolutely pure parsers, i.e., when the calling
	4203	convention of @code{yylex} @emph{and} the calling convention of
	4204	@code{%pure-parser} are pure. I.e.:
	4205
	4206	@example
	4207	/* Location tracking. */
	4208	%locations
	4209	/* Pure yylex. */
	4210	%pure-parser
	4211	%lex-param @{int *nastiness@}
	4212	/* Pure yyparse. */
	4213	%parse-param @{int *nastiness@}
	4214	%parse-param @{int *randomness@}
	4215	@end example
	4216
	4217	@noindent
	4218	results in the following signatures for all the parser kinds:
	4219
	4220	@example
	4221	int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
	4222	int yyparse (int *nastiness, int *randomness);
	4223	void yyerror (YYLTYPE *locp,
	4224	              int *nastiness, int *randomness,
	4225	              const char *msg);
	4226	@end example
	4227
	4228	@noindent
	4229	Please note that the prototypes only indicate how the code
	4230	produced by Bison will call @code{yyerror}; you are still free to choose
	4231	the return type, and even to make @code{yyerror} a variadic function. It
	4232	is precisely to enable this that the message is always passed last.
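For illustration, a location-aware @code{yyerror} for a pure parser without @code{%parse-param} arguments could be sketched as follows. The names @code{demo_ltype}, @code{demo_format_error} and @code{demo_yyerror} are hypothetical, and the @samp{@var{line}.@var{column}: @var{message}} layout is simply one possible choice, not a format that Bison prescribes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the default YYLTYPE.  */
typedef struct
{
  int first_line, first_column, last_line, last_column;
} demo_ltype;

/* Write "LINE.COLUMN: MSG" into BUF (at most SIZE bytes, including the
   terminating null) and return what snprintf returns.  */
int
demo_format_error (char *buf, size_t size,
                   const demo_ltype *locp, const char *msg)
{
  assert (buf && locp && msg);
  return snprintf (buf, size, "%d.%d: %s",
                   locp->first_line, locp->first_column, msg);
}

/* Shaped like the location-tracking prototype: location first,
   message last.  */
void
demo_yyerror (const demo_ltype *locp, const char *msg)
{
  char buf[256];

  demo_format_error (buf, sizeof buf, locp, msg);
  fprintf (stderr, "%s\n", buf);
}
```

Keeping the formatting in a separate function makes it easy to redirect messages elsewhere, for example into a log kept by the caller of @code{yyparse}.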
	4233
	4234	@vindex yynerrs
	4235	The variable @code{yynerrs} contains the number of syntax errors
	4236	encountered so far. Normally this variable is global; but if you
	4237	request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser})
	4238	then it is a local variable which only the actions can access.
	4239
	4240	@node Action Features
	4241	@section Special Features for Use in Actions
	4242	@cindex summary, action features
	4243	@cindex action features summary
	4244
	4245	Here is a table of Bison constructs, variables and macros that
	4246	are useful in actions.
	4247
	4248	@deffn {Variable} $$
	4249	Acts like a variable that contains the semantic value for the
	4250	grouping made by the current rule. @xref{Actions}.
	4251	@end deffn
	4252
	4253	@deffn {Variable} $@var{n}
	4254	Acts like a variable that contains the semantic value for the
	4255	@var{n}th component of the current rule. @xref{Actions}.
	4256	@end deffn
	4257
	4258	@deffn {Variable} $<@var{typealt}>$
	4259	Like @code{$$} but specifies alternative @var{typealt} in the union
	4260	specified by the @code{%union} declaration. @xref{Action Types, ,Data
	4261	Types of Values in Actions}.
	4262	@end deffn
	4263
	4264	@deffn {Variable} $<@var{typealt}>@var{n}
	4265	Like @code{$@var{n}} but specifies alternative @var{typealt} in the
	4266	union specified by the @code{%union} declaration.
	4267	@xref{Action Types, ,Data Types of Values in Actions}.
	4268	@end deffn
	4269
	4270	@deffn {Macro} YYABORT;
	4271	Return immediately from @code{yyparse}, indicating failure.
	4272	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	4273	@end deffn
	4274
	4275	@deffn {Macro} YYACCEPT;
	4276	Return immediately from @code{yyparse}, indicating success.
	4277	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	4278	@end deffn
	4279
	4280	@deffn {Macro} YYBACKUP (@var{token}, @var{value});
	4281	@findex YYBACKUP
	4282	Unshift a token. This macro is allowed only for rules that reduce
	4283	a single value, and only when there is no look-ahead token.
	4284	It is also disallowed in @acronym{GLR} parsers.
	4285	It installs a look-ahead token with token type @var{token} and
	4286	semantic value @var{value}; then it discards the value that was
	4287	going to be reduced by this rule.
	4288
	4289	If the macro is used when it is not valid, such as when there is
	4290	a look-ahead token already, then it reports a syntax error with
	4291	a message @samp{cannot back up} and performs ordinary error
	4292	recovery.
	4293
	4294	In either case, the rest of the action is not executed.
	4295	@end deffn
	4296
	4297	@deffn {Macro} YYEMPTY
	4298	@vindex YYEMPTY
	4299	Value stored in @code{yychar} when there is no look-ahead token.
	4300	@end deffn
	4301
	4302	@deffn {Macro} YYERROR;
	4303	@findex YYERROR
	4304	Cause an immediate syntax error. This statement initiates error
	4305	recovery just as if the parser itself had detected an error; however, it
	4306	does not call @code{yyerror}, and does not print any message. If you
	4307	want to print an error message, call @code{yyerror} explicitly before
	4308	the @samp{YYERROR;} statement. @xref{Error Recovery}.
	4309	@end deffn
	4310
	4311	@deffn {Macro} YYRECOVERING
	4312	This macro stands for an expression that has the value 1 when the parser
	4313	is recovering from a syntax error, and 0 the rest of the time.
	4314	@xref{Error Recovery}.
	4315	@end deffn
	4316
	4317	@deffn {Variable} yychar
	4318	Variable containing the current look-ahead token. (In a pure parser,
	4319	this is actually a local variable within @code{yyparse}.) When there is
	4320	no look-ahead token, the value @code{YYEMPTY} is stored in the variable.
	4321	@xref{Look-Ahead, ,Look-Ahead Tokens}.
	4322	@end deffn
	4323
	4324	@deffn {Macro} yyclearin;
	4325	Discard the current look-ahead token. This is useful primarily in
	4326	error rules. @xref{Error Recovery}.
	4327	@end deffn
	4328
	4329	@deffn {Macro} yyerrok;
	4330	Resume generating error messages immediately for subsequent syntax
	4331	errors. This is useful primarily in error rules.
	4332	@xref{Error Recovery}.
	4333	@end deffn
	4334
	4335	@deffn {Value} @@$
	4336	@findex @@$
	4337	Acts like a structure variable containing information on the textual position
	4338	of the grouping made by the current rule. @xref{Locations, ,
	4339	Tracking Locations}.
	4340
	4341	@c Check if those paragraphs are still useful or not.
	4342
	4343	@c @example
	4344	@c struct @{
	4345	@c int first_line, last_line;
	4346	@c int first_column, last_column;
	4347	@c @};
	4348	@c @end example
	4349
	4350	@c Thus, to get the starting line number of the third component, you would
	4351	@c use @samp{@@3.first_line}.
	4352
	4353	@c In order for the members of this structure to contain valid information,
	4354	@c you must make @code{yylex} supply this information about each token.
	4355	@c If you need only certain members, then @code{yylex} need only fill in
	4356	@c those members.
	4357
	4358	@c The use of this feature makes the parser noticeably slower.
	4359	@end deffn
	4360
	4361	@deffn {Value} @@@var{n}
	4362	@findex @@@var{n}
	4363	Acts like a structure variable containing information on the textual position
	4364	of the @var{n}th component of the current rule. @xref{Locations, ,
	4365	Tracking Locations}.
	4366	@end deffn
	4367
	4368
	4369	@node Algorithm
	4370	@chapter The Bison Parser Algorithm
	4371	@cindex Bison parser algorithm
	4372	@cindex algorithm of parser
	4373	@cindex shifting
	4374	@cindex reduction
	4375	@cindex parser stack
	4376	@cindex stack, parser
	4377
	4378	As Bison reads tokens, it pushes them onto a stack along with their
	4379	semantic values. The stack is called the @dfn{parser stack}. Pushing a
	4380	token is traditionally called @dfn{shifting}.
	4381
	4382	For example, suppose the infix calculator has read @samp{1 + 5 *}, with a
	4383	@samp{3} to come. The stack will have four elements, one for each token
	4384	that was shifted.
	4385
	4386	But the stack does not always have an element for each token read. When
	4387	the last @var{n} tokens and groupings shifted match the components of a
	4388	grammar rule, they can be combined according to that rule. This is called
	4389	@dfn{reduction}. Those tokens and groupings are replaced on the stack by a
	4390	single grouping whose symbol is the result (left hand side) of that rule.
	4391	Running the rule's action is part of the process of reduction, because this
	4392	is what computes the semantic value of the resulting grouping.
	4393
	4394	For example, if the infix calculator's parser stack contains this:
	4395
	4396	@example
	4397	1 + 5 * 3
	4398	@end example
	4399
	4400	@noindent
	4401	and the next input token is a newline character, then the last three
	4402	elements can be reduced to 15 via the rule:
	4403
	4404	@example
	4405	expr: expr '*' expr;
	4406	@end example
	4407
	4408	@noindent
	4409	Then the stack contains just these three elements:
	4410
	4411	@example
	4412	1 + 15
	4413	@end example
	4414
	4415	@noindent
	4416	At this point, another reduction can be made, resulting in the single value
	4417	16. Then the newline token can be shifted.
	4418
	4419	The parser tries, by shifts and reductions, to reduce the entire input down
	4420	to a single grouping whose symbol is the grammar's start-symbol
	4421	(@pxref{Language and Grammar, ,Languages and Context-Free Grammars}).
	4422
	4423	This kind of parser is known in the literature as a bottom-up parser.
	4424
	4425	@menu
	4426	* Look-Ahead:: Parser looks one token ahead when deciding what to do.
	4427	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	4428	* Precedence:: Operator precedence works by resolving conflicts.
	4429	* Contextual Precedence:: When an operator's precedence depends on context.
	4430	* Parser States:: The parser is a finite-state-machine with stack.
	4431	* Reduce/Reduce:: When two rules are applicable in the same situation.
	4432	* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
	4433	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	4434	* Stack Overflow:: What happens when stack gets full. How to avoid it.
	4435	@end menu
	4436
	4437	@node Look-Ahead
	4438	@section Look-Ahead Tokens
	4439	@cindex look-ahead token
	4440
	4441	The Bison parser does @emph{not} always reduce immediately as soon as the
	4442	last @var{n} tokens and groupings match a rule. This is because such a
	4443	simple strategy is inadequate to handle most languages. Instead, when a
	4444	reduction is possible, the parser sometimes ``looks ahead'' at the next
	4445	token in order to decide what to do.
	4446
	4447	When a token is read, it is not immediately shifted; first it becomes the
	4448	@dfn{look-ahead token}, which is not on the stack. Now the parser can
	4449	perform one or more reductions of tokens and groupings on the stack, while
	4450	the look-ahead token remains off to the side. When no more reductions
	4451	should take place, the look-ahead token is shifted onto the stack. This
	4452	does not mean that all possible reductions have been done; depending on the
	4453	token type of the look-ahead token, some rules may choose to delay their
	4454	application.
	4455
	4456	Here is a simple case where look-ahead is needed. These three rules define
	4457	expressions which contain binary addition operators and postfix unary
	4458	factorial operators (@samp{!}), and allow parentheses for grouping.
	4459
	4460	@example
	4461	@group
	4462	expr: term '+' expr
	4463	    | term
	4464	    ;
	4465	@end group
	4466
	4467	@group
	4468	term: '(' expr ')'
	4469	    | term '!'
	4470	    | NUMBER
	4471	    ;
	4472	@end group
	4473	@end example
	4474
	4475	Suppose that the tokens @w{@samp{1 + 2}} have been read and shifted; what
	4476	should be done? If the following token is @samp{)}, then the first three
	4477	tokens must be reduced to form an @code{expr}. This is the only valid
	4478	course, because shifting the @samp{)} would produce a sequence of symbols
	4479	@w{@code{term ')'}}, and no rule allows this.
	4480
	4481	If the following token is @samp{!}, then it must be shifted immediately so
	4482	that @w{@samp{2 !}} can be reduced to make a @code{term}. If instead the
	4483	parser were to reduce before shifting, @w{@samp{1 + 2}} would become an
	4484	@code{expr}. It would then be impossible to shift the @samp{!} because
	4485	doing so would produce on the stack the sequence of symbols @code{expr
	4486	'!'}. No rule allows that sequence.
	4487
	4488	@vindex yychar
	4489	The current look-ahead token is stored in the variable @code{yychar}.
	4490	@xref{Action Features, ,Special Features for Use in Actions}.
	4491
	4492	@node Shift/Reduce
	4493	@section Shift/Reduce Conflicts
	4494	@cindex conflicts
	4495	@cindex shift/reduce conflicts
	4496	@cindex dangling @code{else}
	4497	@cindex @code{else}, dangling
	4498
	4499	Suppose we are parsing a language which has if-then and if-then-else
	4500	statements, with a pair of rules like this:
	4501
	4502	@example
	4503	@group
	4504	if_stmt:
	4505	      IF expr THEN stmt
	4506	    | IF expr THEN stmt ELSE stmt
	4507	    ;
	4508	@end group
	4509	@end example
	4510
	4511	@noindent
	4512	Here we assume that @code{IF}, @code{THEN} and @code{ELSE} are
	4513	terminal symbols for specific keyword tokens.
	4514
	4515	When the @code{ELSE} token is read and becomes the look-ahead token, the
	4516	contents of the stack (assuming the input is valid) are just right for
	4517	reduction by the first rule. But it is also legitimate to shift the
	4518	@code{ELSE}, because that would lead to eventual reduction by the second
	4519	rule.
	4520
	4521	This situation, where either a shift or a reduction would be valid, is
	4522	called a @dfn{shift/reduce conflict}. Bison is designed to resolve
	4523	these conflicts by choosing to shift, unless otherwise directed by
	4524	operator precedence declarations. To see the reason for this, let's
	4525	contrast it with the other alternative.
	4526
	4527	Since the parser prefers to shift the @code{ELSE}, the result is to attach
	4528	the else-clause to the innermost if-statement, making these two inputs
	4529	equivalent:
	4530
	4531	@example
	4532	if x then if y then win (); else lose;
	4533
	4534	if x then do; if y then win (); else lose; end;
	4535	@end example
	4536
	4537	But if the parser chose to reduce when possible rather than shift, the
	4538	result would be to attach the else-clause to the outermost if-statement,
	4539	making these two inputs equivalent:
	4540
	4541	@example
	4542	if x then if y then win (); else lose;
	4543
	4544	if x then do; if y then win (); end; else lose;
	4545	@end example
	4546
	4547	The conflict exists because the grammar as written is ambiguous: either
	4548	parsing of the simple nested if-statement is legitimate. The established
	4549	convention is that these ambiguities are resolved by attaching the
	4550	else-clause to the innermost if-statement; this is what Bison accomplishes
	4551	by choosing to shift rather than reduce. (It would ideally be cleaner to
	4552	write an unambiguous grammar, but that is very hard to do in this case.)
	4553	This particular ambiguity was first encountered in the specifications of
	4554	Algol 60 and is called the ``dangling @code{else}'' ambiguity.
	4555
	4556	To avoid warnings from Bison about predictable, legitimate shift/reduce
	4557	conflicts, use the @code{%expect @var{n}} declaration. There will be no
	4558	warning as long as the number of shift/reduce conflicts is exactly @var{n}.
	4559	@xref{Expect Decl, ,Suppressing Conflict Warnings}.
	4560
	4561	The definition of @code{if_stmt} above is solely to blame for the
	4562	conflict, but the conflict does not actually appear without additional
	4563	rules. Here is a complete Bison input file that actually manifests the
	4564	conflict:
	4565
	4566	@example
	4567	@group
	4568	%token IF THEN ELSE variable
	4569	%%
	4570	@end group
	4571	@group
	4572	stmt: expr
	4573	    | if_stmt
	4574	    ;
	4575	@end group
	4576
	4577	@group
	4578	if_stmt:
	4579	      IF expr THEN stmt
	4580	    | IF expr THEN stmt ELSE stmt
	4581	    ;
	4582	@end group
	4583
	4584	expr: variable
	4585	;
	4586	@end example
	4587
	4588	@node Precedence
	4589	@section Operator Precedence
	4590	@cindex operator precedence
	4591	@cindex precedence of operators
	4592
	4593	Another situation where shift/reduce conflicts appear is in arithmetic
	4594	expressions. Here shifting is not always the preferred resolution; the
	4595	Bison declarations for operator precedence allow you to specify when to
	4596	shift and when to reduce.
	4597
	4598	@menu
	4599	* Why Precedence:: An example showing why precedence is needed.
	4600	* Using Precedence:: How to specify precedence in Bison grammars.
	4601	* Precedence Examples:: How these features are used in the previous example.
	4602	* How Precedence:: How they work.
	4603	@end menu
	4604
	4605	@node Why Precedence
	4606	@subsection When Precedence is Needed
	4607
	4608	Consider the following ambiguous grammar fragment (ambiguous because the
	4609	input @w{@samp{1 - 2 * 3}} can be parsed in two different ways):
	4610
	4611	@example
	4612	@group
	4613	expr: expr '-' expr
	4614	    | expr '*' expr
	4615	    | expr '<' expr
	4616	    | '(' expr ')'
	4617	    @dots{}
	4618	    ;
	4619	@end group
	4620	@end example
	4621
	4622	@noindent
	4623	Suppose the parser has seen the tokens @samp{1}, @samp{-} and @samp{2};
	4624	should it reduce them via the rule for the subtraction operator? It
	4625	depends on the next token. Of course, if the next token is @samp{)}, we
	4626	must reduce; shifting is invalid because no single rule can reduce the
	4627	token sequence @w{@samp{- 2 )}} or anything starting with that. But if
	4628	the next token is @samp{*} or @samp{<}, we have a choice: either
	4629	shifting or reduction would allow the parse to complete, but with
	4630	different results.
	4631
	4632	To decide which one Bison should do, we must consider the results. If
	4633	the next operator token @var{op} is shifted, then it must be reduced
	4634	first in order to permit another opportunity to reduce the difference.
	4635	The result is (in effect) @w{@samp{1 - (2 @var{op} 3)}}. On the other
	4636	hand, if the subtraction is reduced before shifting @var{op}, the result
	4637	is @w{@samp{(1 - 2) @var{op} 3}}. Clearly, then, the choice of shift or
	4638	reduce should depend on the relative precedence of the operators
	4639	@samp{-} and @var{op}: @samp{*} should be shifted first, but not
	4640	@samp{<}.
	4641
	4642	@cindex associativity
	4643	What about input such as @w{@samp{1 - 2 - 5}}; should this be
	4644	@w{@samp{(1 - 2) - 5}} or should it be @w{@samp{1 - (2 - 5)}}? For most
	4645	operators we prefer the former, which is called @dfn{left association}.
	4646	The latter alternative, @dfn{right association}, is desirable for
	4647	assignment operators. The choice of left or right association is a
	4648	matter of whether the parser chooses to shift or reduce when the stack
	4649	contains @w{@samp{1 - 2}} and the look-ahead token is @samp{-}: shifting
	4650	yields right associativity, and reducing yields left associativity.
	4651
	4652	@node Using Precedence
	4653	@subsection Specifying Operator Precedence
	4654	@findex %left
	4655	@findex %right
	4656	@findex %nonassoc
	4657
	4658	Bison allows you to specify these choices with the operator precedence
	4659	declarations @code{%left} and @code{%right}. Each such declaration
	4660	contains a list of tokens, which are operators whose precedence and
	4661	associativity are being declared. The @code{%left} declaration makes all
	4662	those operators left-associative and the @code{%right} declaration makes
	4663	them right-associative. A third alternative is @code{%nonassoc}, which
	4664	declares that it is a syntax error to find the same operator twice ``in a
	4665	row''.
	4666
	4667	The relative precedence of different operators is controlled by the
	4668	order in which they are declared. The first @code{%left} or
	4669	@code{%right} declaration in the file declares the operators whose
	4670	precedence is lowest, the next such declaration declares the operators
	4671	whose precedence is a little higher, and so on.
	4672
	4673	@node Precedence Examples
	4674	@subsection Precedence Examples
	4675
	4676	In our example, we would want the following declarations:
	4677
	4678	@example
	4679	%left '<'
	4680	%left '-'
	4681	%left '*'
	4682	@end example
	4683
	4684	In a more complete example, which supports other operators as well, we
	4685	would declare them in groups of equal precedence. For example, @code{'+'} is
	4686	declared with @code{'-'}:
	4687
	4688	@example
	4689	%left '<' '>' '=' NE LE GE
	4690	%left '+' '-'
	4691	%left '*' '/'
	4692	@end example
	4693
	4694	@noindent
	4695	(Here @code{NE} and so on stand for the operators for ``not equal''
	4696	and so on. We assume that these tokens are more than one character long
	4697	and therefore are represented by names, not character literals.)
	4698
	4699	@node How Precedence
	4700	@subsection How Precedence Works
	4701
	4702	The first effect of the precedence declarations is to assign precedence
	4703	levels to the terminal symbols declared. The second effect is to assign
	4704	precedence levels to certain rules: each rule gets its precedence from
	4705	the last terminal symbol mentioned in the components. (You can also
	4706	specify explicitly the precedence of a rule. @xref{Contextual
	4707	Precedence, ,Context-Dependent Precedence}.)
	4708
	4709	Finally, the resolution of conflicts works by comparing the precedence
	4710	of the rule being considered with that of the look-ahead token. If the
	4711	token's precedence is higher, the choice is to shift. If the rule's
	4712	precedence is higher, the choice is to reduce. If they have equal
	4713	precedence, the choice is made based on the associativity of that
	4714	precedence level. The verbose output file made by @samp{-v}
	4715	(@pxref{Invocation, ,Invoking Bison}) says how each conflict was
	4716	resolved.
	4717
	4718	Not all rules and not all tokens have precedence. If either the rule or
	4719	the look-ahead token has no precedence, then the default is to shift.
	4720
	4721	@node Contextual Precedence
	4722	@section Context-Dependent Precedence
	4723	@cindex context-dependent precedence
	4724	@cindex unary operator precedence
	4725	@cindex precedence, context-dependent
	4726	@cindex precedence, unary operator
	4727	@findex %prec
	4728
	4729	Often the precedence of an operator depends on the context. This sounds
	4730	outlandish at first, but it is really very common. For example, a minus
	4731	sign typically has a very high precedence as a unary operator, and a
	4732	somewhat lower precedence (lower than multiplication) as a binary operator.
	4733
	4734	The Bison precedence declarations, @code{%left}, @code{%right} and
	4735	@code{%nonassoc}, can only be used once for a given token; so a token has
	4736	only one precedence declared in this way. For context-dependent
	4737	precedence, you need to use an additional mechanism: the @code{%prec}
	4738	modifier for rules.
	4739
	4740	The @code{%prec} modifier declares the precedence of a particular rule by
	4741	specifying a terminal symbol whose precedence should be used for that rule.
	4742	It's not necessary for that symbol to appear otherwise in the rule. The
	4743	modifier's syntax is:
	4744
	4745	@example
	4746	%prec @var{terminal-symbol}
	4747	@end example
	4748
	4749	@noindent
	4750	and it is written after the components of the rule. Its effect is to
	4751	assign the rule the precedence of @var{terminal-symbol}, overriding
	4752	the precedence that would be deduced for it in the ordinary way. The
	4753	altered rule precedence then affects how conflicts involving that rule
	4754	are resolved (@pxref{Precedence, ,Operator Precedence}).
	4755
	4756	Here is how @code{%prec} solves the problem of unary minus. First, declare
	4757	a precedence for a fictitious terminal symbol named @code{UMINUS}. There
	4758	are no tokens of this type, but the symbol serves to stand for its
	4759	precedence:
	4760
	4761	@example
	4762	@dots{}
	4763	%left '+' '-'
	4764	%left '*'
	4765	%left UMINUS
	4766	@end example
	4767
	4768	Now the precedence of @code{UMINUS} can be used in specific rules:
	4769
	4770	@example
	4771	@group
	4772	exp: @dots{}
	4773	   | exp '-' exp
	4774	   @dots{}
	4775	   | '-' exp %prec UMINUS
	4776	@end group
	4777	@end example
	4778
	4779	@node Parser States
	4780	@section Parser States
	4781	@cindex finite-state machine
	4782	@cindex parser state
	4783	@cindex state (of parser)
	4784
	4785	The function @code{yyparse} is implemented using a finite-state machine.
	4786	The values pushed on the parser stack are not simply token type codes; they
	4787	represent the entire sequence of terminal and nonterminal symbols at or
	4788	near the top of the stack. The current state collects all the information
	4789	about previous input which is relevant to deciding what to do next.
	4790
	4791	Each time a look-ahead token is read, the current parser state together
	4792	with the type of look-ahead token are looked up in a table. This table
	4793	entry can say, ``Shift the look-ahead token.'' In this case, it also
	4794	specifies the new parser state, which is pushed onto the top of the
	4795	parser stack. Or it can say, ``Reduce using rule number @var{n}.''
	4796	This means that a certain number of tokens or groupings are taken off
	4797	the top of the stack, and replaced by one grouping. In other words,
	4798	that number of states are popped from the stack, and one new state is
	4799	pushed.
	4800
	4801	There is one other alternative: the table can say that the look-ahead token
	4802	is erroneous in the current state. This causes error processing to begin
	4803	(@pxref{Error Recovery}).
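To make the table lookup concrete, here is a hand-built sketch of such a machine for the tiny grammar @samp{list: 'x' | list ',' 'x'}. The tables were worked out by hand for this illustration and are not Bison output; the encoding (positive entries shift, negative entries reduce, zero is an error) and all the @code{demo_} names are assumptions of the sketch and do not reflect the layout of the tables Bison actually generates.

```c
#include <assert.h>

enum { TOK_X, TOK_COMMA, TOK_END, TOK_BAD };
enum { DEMO_ACCEPT = 100, DEMO_STACK_MAX = 128 };

/* Rule 1: list -> 'x'        Rule 2: list -> list ',' 'x'
   Entry > 0: shift and go to that state.  Entry < 0: reduce by that
   rule.  0: the look-ahead is erroneous in this state.  */
static const int demo_action[5][3] = {
  /*            'x'  ','  end */
  /* state 0 */ { 1,  0,  0 },
  /* state 1 */ { 0, -1, -1 },
  /* state 2 */ { 0,  3,  DEMO_ACCEPT },
  /* state 3 */ { 4,  0,  0 },
  /* state 4 */ { 0, -2, -2 },
};
static const int demo_rule_length[3] = { 0, 1, 3 };
/* State to enter after reducing to `list'; only state 0 is ever
   uncovered by such a reduction.  */
static const int demo_goto_list[5] = { 2, 0, 0, 0, 0 };

static int
demo_token (char c)
{
  return c == 'x' ? TOK_X : c == ',' ? TOK_COMMA
         : c == '\0' ? TOK_END : TOK_BAD;
}

/* Return the number of reductions performed (one per item of the
   list), or -1 on a syntax error or stack overflow.  */
int
demo_parse (const char *input)
{
  int stack[DEMO_STACK_MAX];
  int top = 0, items = 0, lookahead;

  assert (input != 0);
  stack[0] = 0;
  lookahead = demo_token (*input);
  for (;;)
    {
      int act;

      if (lookahead == TOK_BAD)
        return -1;
      act = demo_action[stack[top]][lookahead];
      if (act == DEMO_ACCEPT)
        return items;
      if (act == 0)
        return -1;              /* Error entry.  */
      if (act > 0)
        {
          /* Shift: push the new state and read the next token.  */
          if (top + 1 >= DEMO_STACK_MAX)
            return -1;
          stack[++top] = act;
          lookahead = demo_token (*++input);
        }
      else
        {
          /* Reduce: pop one state per component, push one state.  */
          top -= demo_rule_length[-act];
          stack[top + 1] = demo_goto_list[stack[top]];
          top++;
          items++;
        }
    }
}
```

Note that the stack holds only state numbers: each state already summarizes which symbols lie beneath it.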
	4804
	4805	@node Reduce/Reduce
	4806	@section Reduce/Reduce Conflicts
	4807	@cindex reduce/reduce conflict
	4808	@cindex conflicts, reduce/reduce
	4809
	4810	A reduce/reduce conflict occurs if there are two or more rules that apply
	4811	to the same sequence of input. This usually indicates a serious error
	4812	in the grammar.
	4813
	4814	For example, here is an erroneous attempt to define a sequence
	4815	of zero or more @code{word} groupings.
	4816
	4817	@example
	4818	sequence: /* empty */
	4819	            @{ printf ("empty sequence\n"); @}
	4820	        | maybeword
	4821	        | sequence word
	4822	            @{ printf ("added word %s\n", $2); @}
	4823	        ;
	4824
	4825	maybeword: /* empty */
	4826	            @{ printf ("empty maybeword\n"); @}
	4827	        | word
	4828	            @{ printf ("single word %s\n", $1); @}
	4829	        ;
	4830	@end example
	4831
	4832	@noindent
	4833	The error is an ambiguity: there is more than one way to parse a single
	4834	@code{word} into a @code{sequence}. It could be reduced to a
	4835	@code{maybeword} and then into a @code{sequence} via the second rule.
	4836	Alternatively, nothing-at-all could be reduced into a @code{sequence}
	4837	via the first rule, and this could be combined with the @code{word}
	4838	using the third rule for @code{sequence}.
	4839
	4840	There is also more than one way to reduce nothing-at-all into a
	4841	@code{sequence}. This can be done directly via the first rule,
	4842	or indirectly via @code{maybeword} and then the second rule.
	4843
	4844	You might think that this is a distinction without a difference, because it
	4845	does not change whether any particular input is valid or not. But it does
	4846	affect which actions are run. One parsing order runs the second rule's
	4847	action; the other runs the first rule's action and the third rule's action.
	4848	In this example, the output of the program changes.
	4849
	4850	Bison resolves a reduce/reduce conflict by choosing to use the rule that
	4851	appears first in the grammar, but it is very risky to rely on this. Every
	4852	reduce/reduce conflict must be studied and usually eliminated. Here is the
	4853	proper way to define @code{sequence}:
	4854
	4855	@example
	4856	sequence: /* empty */
	4857	            @{ printf ("empty sequence\n"); @}
	4858	        | sequence word
	4859	            @{ printf ("added word %s\n", $2); @}
	4860	        ;
	4861	@end example
	4862
	4863	Here is another common error that yields a reduce/reduce conflict:
	4864
	4865	@example
	4866	sequence: /* empty */
	4867	        | sequence words
	4868	        | sequence redirects
	4869	        ;
	4870
	4871	words: /* empty */
	4872	        | words word
	4873	        ;
	4874
	4875	redirects: /* empty */
	4876	        | redirects redirect
	4877	        ;
	4878	@end example
	4879
	4880	@noindent
	4881	The intention here is to define a sequence which can contain either
	4882	@code{word} or @code{redirect} groupings. The individual definitions of
	4883	@code{sequence}, @code{words} and @code{redirects} are error-free, but the
	4884	three together make a subtle ambiguity: even an empty input can be parsed
	4885	in infinitely many ways!
	4886
	4887	Consider: nothing-at-all could be a @code{words}. Or it could be two
	4888	@code{words} in a row, or three, or any number. It could equally well be a
	4889	@code{redirects}, or two, or any number. Or it could be a @code{words}
	4890	followed by three @code{redirects} and another @code{words}. And so on.
	4891
	4892	Here are two ways to correct these rules. First, to make it a single level
	4893	of sequence:
	4894
	4895	@example
	4896	sequence: /* empty */
	4897	| sequence word
	4898	| sequence redirect
	4899	;
	4900	@end example
	4901
	4902	Second, to prevent either a @code{words} or a @code{redirects}
	4903	from being empty:
	4904
	4905	@example
	4906	sequence: /* empty */
	4907	| sequence words
	4908	| sequence redirects
	4909	;
	4910
	4911	words: word
	4912	| words word
	4913	;
	4914
	4915	redirects:redirect
	4916	| redirects redirect
	4917	;
	4918	@end example
	4919
	4920	@node Mystery Conflicts
	4921	@section Mysterious Reduce/Reduce Conflicts
	4922
	4923	Sometimes reduce/reduce conflicts can occur that don't look warranted.
	4924	Here is an example:
	4925
	4926	@example
	4927	@group
	4928	%token ID
	4929
	4930	%%
	4931	def: param_spec return_spec ','
	4932	;
	4933	param_spec:
	4934	type
	4935	| name_list ':' type
	4936	;
	4937	@end group
	4938	@group
	4939	return_spec:
	4940	type
	4941	| name ':' type
	4942	;
	4943	@end group
	4944	@group
	4945	type: ID
	4946	;
	4947	@end group
	4948	@group
	4949	name: ID
	4950	;
	4951	name_list:
	4952	name
	4953	| name ',' name_list
	4954	;
	4955	@end group
	4956	@end example
	4957
	4958	It would seem that this grammar can be parsed with only a single token
	4959	of look-ahead: when a @code{param_spec} is being read, an @code{ID} is
	4960	a @code{name} if a comma or colon follows, or a @code{type} if another
	4961	@code{ID} follows. In other words, this grammar is @acronym{LR}(1).
	4962
	4963	@cindex @acronym{LR}(1)
	4964	@cindex @acronym{LALR}(1)
	4965	However, Bison, like most parser generators, cannot actually handle all
	4966	@acronym{LR}(1) grammars. In this grammar, two contexts, that after
	4967	an @code{ID}
	4968	at the beginning of a @code{param_spec} and likewise at the beginning of
	4969	a @code{return_spec}, are similar enough that Bison assumes they are the
	4970	same. They appear similar because the same set of rules would be
	4971	active---the rule for reducing to a @code{name} and that for reducing to
	4972	a @code{type}. Bison is unable to determine at that stage of processing
	4973	that the rules would require different look-ahead tokens in the two
	4974	contexts, so it makes a single parser state for them both. Combining
	4975	the two contexts causes a conflict later. In parser terminology, this
	4976	occurrence means that the grammar is not @acronym{LALR}(1).
	4977
	4978	In general, it is better to fix deficiencies than to document them. But
	4979	this particular deficiency is intrinsically hard to fix; parser
	4980	generators that can handle @acronym{LR}(1) grammars are hard to write
	4981	and tend to
	4982	produce parsers that are very large. In practice, Bison is more useful
	4983	as it is now.
	4984
	4985	When the problem arises, you can often fix it by identifying the two
	4986	parser states that are being confused, and adding something to make them
	4987	look distinct. In the above example, adding one rule to
	4988	@code{return_spec} as follows makes the problem go away:
	4989
	4990	@example
	4991	@group
	4992	%token BOGUS
	4993	@dots{}
	4994	%%
	4995	@dots{}
	4996	return_spec:
	4997	type
	4998	| name ':' type
	4999	/* This rule is never used. */
	5000	| ID BOGUS
	5001	;
	5002	@end group
	5003	@end example
	5004
	5005	This corrects the problem because it introduces the possibility of an
	5006	additional active rule in the context after the @code{ID} at the beginning of
	5007	@code{return_spec}. This rule is not active in the corresponding context
	5008	in a @code{param_spec}, so the two contexts receive distinct parser states.
	5009	As long as the token @code{BOGUS} is never generated by @code{yylex},
	5010	the added rule cannot alter the way actual input is parsed.
	5011
	5012	In this particular example, there is another way to solve the problem:
	5013	rewrite the rule for @code{return_spec} to use @code{ID} directly
	5014	instead of via @code{name}. This also causes the two confusing
	5015	contexts to have different sets of active rules, because the one for
	5016	@code{return_spec} activates the altered rule for @code{return_spec}
	5017	rather than the one for @code{name}.
	5018
	5019	@example
	5020	param_spec:
	5021	type
	5022	| name_list ':' type
	5023	;
	5024	return_spec:
	5025	type
	5026	| ID ':' type
	5027	;
	5028	@end example
	5029
	5030	@node Generalized LR Parsing
	5031	@section Generalized @acronym{LR} (@acronym{GLR}) Parsing
	5032	@cindex @acronym{GLR} parsing
	5033	@cindex generalized @acronym{LR} (@acronym{GLR}) parsing
	5034	@cindex ambiguous grammars
	5035	@cindex non-deterministic parsing
	5036
	5037	Bison produces @emph{deterministic} parsers that choose uniquely
	5038	when to reduce and which reduction to apply
	5039	based on a summary of the preceding input and on one extra token of lookahead.
	5040	As a result, normal Bison handles a proper subset of the family of
	5041	context-free languages.
	5042	Ambiguous grammars, since they have strings with more than one possible
	5043	sequence of reductions, cannot have deterministic parsers in this sense.
	5044	The same is true of languages that require more than one symbol of
	5045	lookahead, since the parser lacks the information necessary to make a
	5046	decision at the point it must be made in a shift-reduce parser.
	5047	Finally, as previously mentioned (@pxref{Mystery Conflicts}),
	5048	there are languages where Bison's particular choice of how to
	5049	summarize the input seen so far loses necessary information.
	5050
	5051	When you use the @samp{%glr-parser} declaration in your grammar file,
	5052	Bison generates a parser that uses a different algorithm, called
	5053	Generalized @acronym{LR} (or @acronym{GLR}). A Bison @acronym{GLR}
	5054	parser uses the same basic
	5055	algorithm for parsing as an ordinary Bison parser, but behaves
	5056	differently in cases where there is a shift-reduce conflict that has not
	5057	been resolved by precedence rules (@pxref{Precedence}) or a
	5058	reduce-reduce conflict. When a @acronym{GLR} parser encounters such a
	5059	situation, it
	5060	effectively @emph{splits} into several parsers, one for each possible
	5061	shift or reduction. These parsers then proceed as usual, consuming
	5062	tokens in lock-step. Some of the stacks may encounter other conflicts
	5063	and split further, with the result that instead of a sequence of states,
	5064	a Bison @acronym{GLR} parsing stack is in effect a tree of states.
	5065
	5066	In effect, each stack represents a guess as to what the proper parse
	5067	is. Additional input may indicate that a guess was wrong, in which case
	5068	the appropriate stack silently disappears. Otherwise, the semantic
	5069	actions generated in each stack are saved, rather than being executed
	5070	immediately. When a stack disappears, its saved semantic actions never
	5071	get executed. When a reduction causes two stacks to become equivalent,
	5072	their sets of semantic actions are both saved with the state that
	5073	results from the reduction. We say that two stacks are equivalent
	5074	when they both represent the same sequence of states,
	5075	and each pair of corresponding states represents a
	5076	grammar symbol that produces the same segment of the input token
	5077	stream.
	5078
	5079	Whenever the parser makes a transition from having multiple
	5080	states to having one, it reverts to the normal @acronym{LALR}(1) parsing
	5081	algorithm, after resolving and executing the saved-up actions.
	5082	At this transition, some of the states on the stack will have semantic
	5083	values that are sets (actually multisets) of possible actions. The
	5084	parser tries to pick one of the actions by first finding one whose rule
	5085	has the highest dynamic precedence, as set by the @samp{%dprec}
	5086	declaration. Otherwise, if the alternative actions are not ordered by
	5087	precedence, but the same merging function is declared for both
	5088	rules by the @samp{%merge} declaration,
	5089	Bison resolves and evaluates both and then calls the merge function on
	5090	the result. Otherwise, it reports an ambiguity.
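
		As a minimal sketch of these two declarations, assume a hypothetical
		grammar in which a @code{stmt} can be parsed either as an expression or
		as a declaration. The nonterminal names and the function name
		@code{stmtMerge} are illustrative only. With @samp{%dprec}, the rule
		with the higher number is preferred when both parses survive:

		@example
		%glr-parser
		%%
		stmt: expr ';'   %dprec 1
		    | decl       %dprec 2
		    ;
		@end example

		@noindent
		To keep both parses instead, each rule can name the same merge function.
		That function would be declared in the prologue; it takes the two
		@code{YYSTYPE} semantic values and returns the combined one:

		@example
		stmt: expr ';'   %merge <stmtMerge>
		    | decl       %merge <stmtMerge>
		    ;
		@end example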
	5091
	5092	It is possible to use a data structure for the @acronym{GLR} parsing tree that
	5093	permits the processing of any @acronym{LALR}(1) grammar in linear time (in the
	5094	size of the input), any unambiguous (not necessarily
	5095	@acronym{LALR}(1)) grammar in
	5096	quadratic worst-case time, and any general (possibly ambiguous)
	5097	context-free grammar in cubic worst-case time. However, Bison currently
	5098	uses a simpler data structure that requires time proportional to the
	5099	length of the input times the maximum number of stacks required for any
	5100	prefix of the input. Thus, really ambiguous or non-deterministic
	5101	grammars can require exponential time and space to process. Such badly
	5102	behaving examples, however, are not generally of practical interest.
	5103	Usually, non-determinism in a grammar is local---the parser is ``in
	5104	doubt'' only for a few tokens at a time. Therefore, the current data
	5105	structure should generally be adequate. On @acronym{LALR}(1) portions of a
	5106	grammar, in particular, it is only slightly slower than with the default
	5107	Bison parser.
	5108
	5109	@node Stack Overflow
	5110	@section Stack Overflow, and How to Avoid It
	5111	@cindex stack overflow
	5112	@cindex parser stack overflow
	5113	@cindex overflow of parser stack
	5114
	5115	The Bison parser stack can overflow if too many tokens are shifted and
	5116	not reduced. When this happens, the parser function @code{yyparse}
	5117	returns a nonzero value, pausing only to call @code{yyerror} to report
	5118	the overflow.
	5119
	5120	Because Bison parsers have growing stacks, hitting the upper limit
	5121	usually results from using a right recursion instead of a left
	5122	recursion; see @ref{Recursion, ,Recursive Rules}.
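
		For illustration, here are two hypothetical definitions of a
		comma-separated list; the symbol names are assumptions. With the
		left-recursive form, each @code{item} is reduced as soon as it is read,
		so the stack stays shallow. With the right-recursive form, every
		@code{item} must be shifted before the first reduction can happen, so
		stack use grows with the length of the list.

		@example
		/* Left recursion: bounded stack use.  */
		list: item
		    | list ',' item
		    ;

		/* Right recursion: stack depth grows with the input.  */
		rlist: item
		     | item ',' rlist
		     ;
		@end example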
	5123
	5124	@vindex YYMAXDEPTH
	5125	By defining the macro @code{YYMAXDEPTH}, you can control how deep the
	5126	parser stack can become before a stack overflow occurs. Define the
	5127	macro with a value that is an integer. This value is the maximum number
	5128	of tokens that can be shifted (and not reduced) before overflow.
	5129	It must be a constant expression whose value is known at compile time.
	5130
	5131	The stack space allowed is not necessarily allocated. If you specify a
	5132	large value for @code{YYMAXDEPTH}, the parser actually allocates a small
	5133	stack at first, and then makes it bigger by stages as needed. This
	5134	increasing allocation happens automatically and silently. Therefore,
	5135	you do not need to make @code{YYMAXDEPTH} painfully small merely to save
	5136	space for ordinary inputs that do not need much stack.
	5137
	5138	@cindex default stack limit
	5139	The default value of @code{YYMAXDEPTH}, if you do not define it, is
	5140	10000.
	5141
	5142	@vindex YYINITDEPTH
	5143	You can control how much stack is allocated initially by defining the
	5144	macro @code{YYINITDEPTH}. This value too must be a compile-time
	5145	constant integer. The default is 200.
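
		A minimal sketch of how these macros might be set. The values shown are
		arbitrary examples, not recommendations. The definitions go in the
		prologue so that they are seen before the parser code:

		@example
		%@{
		#define YYINITDEPTH 500    /* initial stack size */
		#define YYMAXDEPTH  50000  /* limit before overflow is reported */
		%@}
		@end example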
	5146
	5147	@c FIXME: C++ output.
	5148	Because of semantic differences between C and C++, the
	5149	@acronym{LALR}(1) parsers
	5150	in C produced by Bison cannot grow when compiled as C++. In this precise
	5151	case (compiling a C parser as C++) you should increase
	5152	@code{YYINITDEPTH}. In the near future, a C++ output will be
	5153	provided which addresses this issue.
	5154
	5155	@node Error Recovery
	5156	@chapter Error Recovery
	5157	@cindex error recovery
	5158	@cindex recovery from errors
	5159
	5160	It is not usually acceptable to have a program terminate on a syntax
	5161	error. For example, a compiler should recover sufficiently to parse the
	5162	rest of the input file and check it for errors; a calculator should accept
	5163	another expression.
	5164
	5165	In a simple interactive command parser where each input is one line, it may
	5166	be sufficient to allow @code{yyparse} to return 1 on error and have the
	5167	caller ignore the rest of the input line when that happens (and then call
	5168	@code{yyparse} again). But this is inadequate for a compiler, because it
	5169	forgets all the syntactic context leading up to the error. A syntax error
	5170	deep within a function in the compiler input should not cause the compiler
	5171	to treat the following line like the beginning of a source file.
	5172
	5173	@findex error
	5174	You can define how to recover from a syntax error by writing rules to
	5175	recognize the special token @code{error}. This is a terminal symbol that
	5176	is always defined (you need not declare it) and reserved for error
	5177	handling. The Bison parser generates an @code{error} token whenever a
	5178	syntax error happens; if you have provided a rule to recognize this token
	5179	in the current context, the parse can continue.
	5180
	5181	For example:
	5182
	5183	@example
	5184	stmnts: /* empty string */
	5185	| stmnts '\n'
	5186	| stmnts exp '\n'
	5187	| stmnts error '\n'
	5188	@end example
	5189
	5190	The fourth rule in this example says that an error followed by a newline
	5191	makes a valid addition to any @code{stmnts}.
	5192
	5193	What happens if a syntax error occurs in the middle of an @code{exp}? The
	5194	error recovery rule, interpreted strictly, applies to the precise sequence
	5195	of a @code{stmnts}, an @code{error} and a newline. If an error occurs in
	5196	the middle of an @code{exp}, there will probably be some additional tokens
	5197	and subexpressions on the stack after the last @code{stmnts}, and there
	5198	will be tokens to read before the next newline. So the rule is not
	5199	applicable in the ordinary way.
	5200
	5201	But Bison can force the situation to fit the rule, by discarding part of
	5202	the semantic context and part of the input. First it discards states
	5203	and objects from the stack until it gets back to a state in which the
	5204	@code{error} token is acceptable. (This means that the subexpressions
	5205	already parsed are discarded, back to the last complete @code{stmnts}.)
	5206	At this point the @code{error} token can be shifted. Then, if the old
	5207	look-ahead token is not acceptable to be shifted next, the parser reads
	5208	tokens and discards them until it finds a token which is acceptable. In
	5209	this example, Bison reads and discards input until the next newline so
	5210	that the fourth rule can apply. Note that discarded symbols are
	5211	possible sources of memory leaks; see @ref{Destructor Decl, , Freeing
	5212	Discarded Symbols}, for a means to reclaim this memory.
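
		As a sketch of that mechanism, suppose a hypothetical token
		@code{STRING} carries a heap-allocated @code{char *}. The union member
		name and the use of @code{free} are assumptions about how the scanner
		allocates the value, and the prologue is assumed to include
		@file{stdlib.h}. A destructor declaration lets the parser release the
		value when it discards the symbol during error recovery:

		@example
		%union @{ char *string; @}
		%token <string> STRING
		%destructor @{ free ($$); @} STRING
		@end example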
	5213
	5214	The choice of error rules in the grammar is a choice of strategies for
	5215	error recovery. A simple and useful strategy is simply to skip the rest of
	5216	the current input line or current statement if an error is detected:
	5217
	5218	@example
	5219	stmnt: error ';' /* On error, skip until ';' is read. */
	5220	@end example
	5221
	5222	It is also useful to recover to the matching close-delimiter of an
	5223	opening-delimiter that has already been parsed. Otherwise the
	5224	close-delimiter will probably appear to be unmatched, and generate another,
	5225	spurious error message:
	5226
	5227	@example
	5228	primary: '(' expr ')'
	5229	| '(' error ')'
	5230	@dots{}
	5231	;
	5232	@end example
	5233
	5234	Error recovery strategies are necessarily guesses. When they guess wrong,
	5235	one syntax error often leads to another. In the above example, the error
	5236	recovery rule guesses that an error is due to bad input within one
	5237	@code{stmnt}. Suppose that instead a spurious semicolon is inserted in the
	5238	middle of a valid @code{stmnt}. After the error recovery rule recovers
	5239	from the first error, another syntax error will be found straightaway,
	5240	since the text following the spurious semicolon is also an invalid
	5241	@code{stmnt}.
	5242
	5243	To prevent an outpouring of error messages, the parser will output no error
	5244	message for another syntax error that happens shortly after the first; only
	5245	after three consecutive input tokens have been successfully shifted will
	5246	error messages resume.
	5247
	5248	Note that rules which accept the @code{error} token may have actions, just
	5249	as any other rules can.
	5250
	5251	@findex yyerrok
	5252	You can make error messages resume immediately by using the macro
	5253	@code{yyerrok} in an action. If you do this in the error rule's action, no
	5254	error messages will be suppressed. This macro requires no arguments;
	5255	@samp{yyerrok;} is a valid C statement.
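
		For instance, here is a sketch based on the @code{stmnts} rule shown
		earlier, with the other alternatives elided. It re-enables error
		messages as soon as the newline that ends the bad line has been read:

		@example
		stmnts: @dots{}
		      | stmnts error '\n'
		            @{ yyerrok; @}
		      ;
		@end example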
	5256
	5257	@findex yyclearin
	5258	The previous look-ahead token is reanalyzed immediately after an error. If
	5259	this is unacceptable, then the macro @code{yyclearin} may be used to clear
	5260	this token. Write the statement @samp{yyclearin;} in the error rule's
	5261	action.
	5262
	5263	For example, suppose that on a syntax error, an error handling routine is
	5264	called that advances the input stream to some point where parsing should
	5265	once again commence. The next symbol returned by the lexical scanner is
	5266	probably correct. The previous look-ahead token ought to be discarded
	5267	with @samp{yyclearin;}.
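
		A sketch of that situation, assuming a hypothetical routine
		@code{skip_to_sync_point} that you supply to advance the input stream:

		@example
		stmnt: error
		         @{
		           skip_to_sync_point ();  /* hypothetical: reposition input */
		           yyclearin;              /* discard the stale look-ahead */
		         @}
		     ;
		@end example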
	5268
	5269	@vindex YYRECOVERING
	5270	The macro @code{YYRECOVERING} stands for an expression that has the
	5271	value 1 when the parser is recovering from a syntax error, and 0 the
	5272	rest of the time. A value of 1 indicates that error messages are
	5273	currently suppressed for new syntax errors.
	5274
	5275	@node Context Dependency
	5276	@chapter Handling Context Dependencies
	5277
	5278	The Bison paradigm is to parse tokens first, then group them into larger
	5279	syntactic units. In many languages, the meaning of a token is affected by
	5280	its context. Although this violates the Bison paradigm, certain techniques
	5281	(known as @dfn{kludges}) may enable you to write Bison parsers for such
	5282	languages.
	5283
	5284	@menu
	5285	* Semantic Tokens:: Token parsing can depend on the semantic context.
	5286	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	5287	* Tie-in Recovery:: Lexical tie-ins have implications for how
	5288	error recovery rules must be written.
	5289	@end menu
	5290
	5291	(Actually, ``kludge'' means any technique that gets its job done but is
	5292	neither clean nor robust.)
	5293
	5294	@node Semantic Tokens
	5295	@section Semantic Info in Token Types
	5296
	5297	The C language has a context dependency: the way an identifier is used
	5298	depends on what its current meaning is. For example, consider this:
	5299
	5300	@example
	5301	foo (x);
	5302	@end example
	5303
	5304	This looks like a function call statement, but if @code{foo} is a typedef
	5305	name, then this is actually a declaration of @code{x}. How can a Bison
	5306	parser for C decide how to parse this input?
	5307
	5308	The method used in @acronym{GNU} C is to have two different token types,
	5309	@code{IDENTIFIER} and @code{TYPENAME}. When @code{yylex} finds an
	5310	identifier, it looks up the current declaration of the identifier in order
	5311	to decide which token type to return: @code{TYPENAME} if the identifier is
	5312	declared as a typedef, @code{IDENTIFIER} otherwise.
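
		A minimal sketch of that decision, assuming a hypothetical symbol-table
		query @code{is_typedef_name} and a helper that @code{yylex} would call
		once it has scanned an identifier. Neither name is taken from
		@acronym{GNU} C:

		@example
		/* Hypothetical: nonzero if NAME is currently declared as a typedef.  */
		extern int is_typedef_name (const char *name);

		/* Choose the token type for an identifier yylex has just scanned.  */
		static int
		identifier_token (const char *name)
		@{
		  return is_typedef_name (name) ? TYPENAME : IDENTIFIER;
		@}
		@end example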
	5313
	5314	The grammar rules can then express the context dependency by the choice of
	5315	token type to recognize. @code{IDENTIFIER} is accepted as an expression,
	5316	but @code{TYPENAME} is not. @code{TYPENAME} can start a declaration, but
	5317	@code{IDENTIFIER} cannot. In contexts where the meaning of the identifier
	5318	is @emph{not} significant, such as in declarations that can shadow a
	5319	typedef name, either @code{TYPENAME} or @code{IDENTIFIER} is
	5320	accepted---there is one rule for each of the two token types.
	5321
	5322	This technique is simple to use if the decision of which kinds of
	5323	identifiers to allow is made at a place close to where the identifier is
	5324	parsed. But in C this is not always so: C allows a declaration to
	5325	redeclare a typedef name provided an explicit type has been specified
	5326	earlier:
	5327
	5328	@example
	5329	typedef int foo, bar, lose;
	5330	static foo (bar); /* @r{redeclare @code{bar} as static variable} */
	5331	static int foo (lose); /* @r{redeclare @code{foo} as function} */
	5332	@end example
	5333
	5334	Unfortunately, the name being declared is separated from the declaration
	5335	construct itself by a complicated syntactic structure---the ``declarator''.
	5336
	5337	As a result, part of the Bison parser for C needs to be duplicated, with
	5338	all the nonterminal names changed: once for parsing a declaration in
	5339	which a typedef name can be redefined, and once for parsing a
	5340	declaration in which that can't be done. Here is a part of the
	5341	duplication, with actions omitted for brevity:
	5342
	5343	@example
	5344	initdcl:
	5345	declarator maybeasm '='
	5346	init
	5347	| declarator maybeasm
	5348	;
	5349
	5350	notype_initdcl:
	5351	notype_declarator maybeasm '='
	5352	init
	5353	| notype_declarator maybeasm
	5354	;
	5355	@end example
	5356
	5357	@noindent
	5358	Here @code{initdcl} can redeclare a typedef name, but @code{notype_initdcl}
	5359	cannot. The distinction between @code{declarator} and
	5360	@code{notype_declarator} is the same sort of thing.
	5361
	5362	There is some similarity between this technique and a lexical tie-in
	5363	(described next), in that information which alters the lexical analysis is
	5364	changed during parsing by other parts of the program. The difference is
	5365	that here the information is global, and is used for other purposes in the
	5366	program. A true lexical tie-in has a special-purpose flag controlled by
	5367	the syntactic context.
	5368
	5369	@node Lexical Tie-ins
	5370	@section Lexical Tie-ins
	5371	@cindex lexical tie-in
	5372
	5373	One way to handle context-dependency is the @dfn{lexical tie-in}: a flag
	5374	which is set by Bison actions, whose purpose is to alter the way tokens are
	5375	parsed.
	5376
	5377	For example, suppose we have a language vaguely like C, but with a special
	5378	construct @samp{hex (@var{hex-expr})}. After the keyword @code{hex} comes
	5379	an expression in parentheses in which all integers are hexadecimal. In
	5380	particular, the token @samp{a1b} must be treated as an integer rather than
	5381	as an identifier if it appears in that context. Here is how you can do it:
	5382
	5383	@example
	5384	@group
	5385	%@{
	5386	int hexflag;
	5387	%@}
	5388	%%
	5389	@dots{}
	5390	@end group
	5391	@group
	5392	expr: IDENTIFIER
	5393	| constant
	5394	| HEX '('
	5395	@{ hexflag = 1; @}
	5396	expr ')'
	5397	@{ hexflag = 0;
	5398	$$ = $4; @}
	5399	| expr '+' expr
	5400	@{ $$ = make_sum ($1, $3); @}
	5401	@dots{}
	5402	;
	5403	@end group
	5404
	5405	@group
	5406	constant:
	5407	INTEGER
	5408	| STRING
	5409	;
	5410	@end group
	5411	@end example
	5412
	5413	@noindent
	5414	Here we assume that @code{yylex} looks at the value of @code{hexflag}; when
	5415	it is nonzero, all integers are parsed in hexadecimal, and tokens starting
	5416	with letters are parsed as integers if possible.
	5417
	5418	The declaration of @code{hexflag} shown in the prologue of the parser file
	5419	is needed to make it accessible to the actions (@pxref{Prologue, ,The Prologue}).
	5420	You must also write the code in @code{yylex} to obey the flag.
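
		The following is only a sketch of the part of @code{yylex} that obeys
		the flag. It assumes the scanner has already collected an alphanumeric
		word in @code{text}; the helper name @code{classify_word} and its
		interface are hypothetical.

		@example
		#include <stdlib.h>

		extern int hexflag;

		/* Return INTEGER and store the value if TEXT is a number in the
		   current base; return IDENTIFIER otherwise.  */
		static int
		classify_word (const char *text, long *value)
		@{
		  char *end;
		  long n = strtol (text, &end, hexflag ? 16 : 10);

		  if (end != text && *end == '\0')
		    @{
		      *value = n;
		      return INTEGER;
		    @}
		  return IDENTIFIER;
		@}
		@end example

		@noindent
		With @code{hexflag} set, @samp{a1b} is returned as an @code{INTEGER};
		with it clear, the same text is an @code{IDENTIFIER}.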
	5421
	5422	@node Tie-in Recovery
	5423	@section Lexical Tie-ins and Error Recovery
	5424
	5425	Lexical tie-ins make strict demands on any error recovery rules you have.
	5426	@xref{Error Recovery}.
	5427
	5428	The reason for this is that the purpose of an error recovery rule is to
	5429	abort the parsing of one construct and resume in some larger construct.
	5430	For example, in C-like languages, a typical error recovery rule is to skip
	5431	tokens until the next semicolon, and then start a new statement, like this:
	5432
	5433	@example
	5434	stmt: expr ';'
	5435	| IF '(' expr ')' stmt @{ @dots{} @}
	5436	@dots{}
	5437	| error ';'
	5438	@{ hexflag = 0; @}
	5439	;
	5440	@end example
	5441
	5442	If there is a syntax error in the middle of a @samp{hex (@var{expr})}
	5443	construct, this error rule will apply, and then the action for the
	5444	completed @samp{hex (@var{expr})} will never run. So @code{hexflag} would
	5445	remain set for the entire rest of the input, or until the next @code{hex}
	5446	keyword, causing identifiers to be misinterpreted as integers.
	5447
	5448	To avoid this problem the error recovery rule itself clears @code{hexflag}.
	5449
	5450	There may also be an error recovery rule that works within expressions.
	5451	For example, there could be a rule which applies within parentheses
	5452	and skips to the close-parenthesis:
	5453
	5454	@example
	5455	@group
	5456	expr: @dots{}
	5457	| '(' expr ')'
	5458	@{ $$ = $2; @}
	5459	| '(' error ')'
	5460	@dots{}
	5461	@end group
	5462	@end example
	5463
	5464	If this rule acts within the @code{hex} construct, it is not going to abort
	5465	that construct (since it applies to an inner level of parentheses within
	5466	the construct). Therefore, it should not clear the flag: the rest of
	5467	the @code{hex} construct should be parsed with the flag still in effect.
	5468
	5469	What if there is an error recovery rule which might abort out of the
	5470	@code{hex} construct or might not, depending on circumstances? There is no
	5471	way you can write the action to determine whether a @code{hex} construct is
	5472	being aborted or not. So if you are using a lexical tie-in, you had better
	5473	make sure your error recovery rules are not of this kind. Each rule must
	5474	be such that you can be sure that it always will, or always won't, have to
	5475	clear the flag.
	5476
	5477	@c ================================================== Debugging Your Parser
	5478
	5479	@node Debugging
	5480	@chapter Debugging Your Parser
	5481
	5482	Developing a parser can be a challenge, especially if you don't
	5483	understand the algorithm (@pxref{Algorithm, ,The Bison Parser
	5484	Algorithm}). Even so, sometimes a detailed description of the automaton
	5485	can help (@pxref{Understanding, , Understanding Your Parser}), or
	5486	tracing the execution of the parser can give some insight on why it
	5487	behaves improperly (@pxref{Tracing, , Tracing Your Parser}).
	5488
	5489	@menu
	5490	* Understanding:: Understanding the structure of your parser.
	5491	* Tracing:: Tracing the execution of your parser.
	5492	@end menu
	5493
	5494	@node Understanding
	5495	@section Understanding Your Parser
	5496
	5497	As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm})
	5498	Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
	5499	frequent than one would hope), looking at this automaton is required to
	5500	tune or simply fix a parser. Bison provides two different
	5501	representations of it, either textually or graphically (as a @acronym{VCG}
	5502	file).
	5503
	5504	The textual file is generated when the option @option{--report} or
	5505	@option{--verbose} is specified; see @ref{Invocation, , Invoking
	5506	Bison}. Its name is made by removing @samp{.tab.c} or @samp{.c} from
	5507	the parser output file name, and adding @samp{.output} instead.
	5508	Therefore, if the input file is @file{foo.y}, then the parser file is
	5509	called @file{foo.tab.c} by default. As a consequence, the verbose
	5510	output file is called @file{foo.output}.
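
		For example, assuming a grammar file named @file{foo.y} in the current
		directory:

		@example
		bison --verbose foo.y
		@end example

		@noindent
		With the default naming described above, this writes the parser to
		@file{foo.tab.c} and the report to @file{foo.output}.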
	5511
	5512	The following grammar file, @file{calc.y}, will be used in the sequel:
	5513
	5514	@example
	5515	%token NUM STR
	5516	%left '+' '-'
	5517	%left '*'
	5518	%%
	5519	exp: exp '+' exp
	5520	| exp '-' exp
	5521	| exp '*' exp
	5522	| exp '/' exp
	5523	| NUM
	5524	;
	5525	useless: STR;
	5526	%%
	5527	@end example
	5528
	5529	@command{bison} reports:
	5530
	5531	@example
	5532	calc.y: warning: 1 useless nonterminal and 1 useless rule
	5533	calc.y:11.1-7: warning: useless nonterminal: useless
	5534	calc.y:11.8-12: warning: useless rule: useless: STR
	5535	calc.y contains 7 shift/reduce conflicts.
	5536	@end example
	5537
	5538	When given @option{--report=state}, in addition to @file{calc.tab.c}, it
	5539	creates a file @file{calc.output} with contents detailed below. The
	5540	order of the output and the exact presentation might vary, but the
	5541	interpretation is the same.
	5542
	5543	The first section includes details on conflicts that were solved thanks
	5544	to precedence and/or associativity:
	5545
	5546	@example
	5547	Conflict in state 8 between rule 2 and token '+' resolved as reduce.
	5548	Conflict in state 8 between rule 2 and token '-' resolved as reduce.
	5549	Conflict in state 8 between rule 2 and token '*' resolved as shift.
	5550	@exdent @dots{}
	5551	@end example
	5552
	5553	@noindent
	5554	The next section lists states that still have conflicts.
	5555
	5556	@example
	5557	State 8 contains 1 shift/reduce conflict.
	5558	State 9 contains 1 shift/reduce conflict.
	5559	State 10 contains 1 shift/reduce conflict.
	5560	State 11 contains 4 shift/reduce conflicts.
	5561	@end example
	5562
	5563	@noindent
	5564	@cindex token, useless
	5565	@cindex useless token
	5566	@cindex nonterminal, useless
	5567	@cindex useless nonterminal
	5568	@cindex rule, useless
	5569	@cindex useless rule
	5570	The next section reports useless tokens, nonterminals and rules. Useless
	5571	nonterminals and rules are removed in order to produce a smaller parser,
	5572	but useless tokens are preserved, since they might be used by the
	5573	scanner (note the difference between ``useless'' and ``not used''
	5574	below):
	5575
	5576	@example
	5577	Useless nonterminals:
	5578	useless
	5579
	5580	Terminals which are not used:
	5581	STR
	5582
	5583	Useless rules:
	5584	#6 useless: STR;
	5585	@end example
	5586
	5587	@noindent
	5588	The next section reproduces the exact grammar that Bison used:
	5589
	5590	@example
	5591	Grammar
	5592
	5593	Number, Line, Rule
	5594	0 5 $accept -> exp $end
	5595	1 5 exp -> exp '+' exp
	5596	2 6 exp -> exp '-' exp
	5597	3 7 exp -> exp '*' exp
	5598	4 8 exp -> exp '/' exp
	5599	5 9 exp -> NUM
	5600	@end example
	5601
	5602	@noindent
	5603	and reports the uses of the symbols:
	5604
	5605	@example
	5606	Terminals, with rules where they appear
	5607
	5608	$end (0) 0
	5609	'*' (42) 3
	5610	'+' (43) 1
	5611	'-' (45) 2
	5612	'/' (47) 4
	5613	error (256)
	5614	NUM (258) 5
	5615
	5616	Nonterminals, with rules where they appear
	5617
	5618	$accept (8)
	5619	on left: 0
	5620	exp (9)
	5621	on left: 1 2 3 4 5, on right: 0 1 2 3 4
	5622	@end example
	5623
	5624	@noindent
	5625	@cindex item
	5626	@cindex pointed rule
	5627	@cindex rule, pointed
	5628	Bison then proceeds onto the automaton itself, describing each state
	5629	with its set of @dfn{items}, also known as @dfn{pointed rules}. Each
	5630	item is a production rule together with a point (marked by @samp{.})
	5631	that marks the position of the input cursor.
	5632
	5633	@example
	5634	state 0
	5635
	5636	$accept -> . exp $ (rule 0)
	5637
	5638	NUM shift, and go to state 1
	5639
	5640	exp go to state 2
	5641	@end example
	5642
	5643	This reads as follows: ``state 0 corresponds to being at the very
	5644	beginning of the parsing, in the initial rule, right before the start
	5645	symbol (here, @code{exp}). When the parser returns to this state right
	5646	after having reduced a rule that produced an @code{exp}, the control
	5647	flow jumps to state 2. If there is no such transition on a nonterminal
	5648	symbol, and the lookahead is a @code{NUM}, then this token is shifted on
	5649	the parse stack, and the control flow jumps to state 1. Any other
	5650	lookahead triggers a syntax error.''
	5651
	5652	@cindex core, item set
	5653	@cindex item set core
	5654	@cindex kernel, item set
	5655	@cindex item set kernel
	5656	Even though the only active rule in state 0 seems to be rule 0, the
	5657	report lists @code{NUM} as a lookahead symbol because @code{NUM} can be
	5658	at the beginning of any rule deriving an @code{exp}. By default Bison
	5659	reports the so-called @dfn{core} or @dfn{kernel} of the item set, but if
	5660	you want to see more detail you can invoke @command{bison} with
	5661	@option{--report=itemset} to list all the items, including those that can
	5662	be derived:
	5663
	5664	@example
	5665	state 0
	5666
	5667	$accept -> . exp $ (rule 0)
	5668	exp -> . exp '+' exp (rule 1)
	5669	exp -> . exp '-' exp (rule 2)
	5670	exp -> . exp '*' exp (rule 3)
	5671	exp -> . exp '/' exp (rule 4)
	5672	exp -> . NUM (rule 5)
	5673
	5674	NUM shift, and go to state 1
	5675
	5676	exp go to state 2
	5677	@end example
	5678
	5679	@noindent
	5680	In the state 1...
	5681
	5682	@example
	5683	state 1
	5684
	5685	exp -> NUM . (rule 5)
	5686
	5687	$default reduce using rule 5 (exp)
	5688	@end example
	5689
	5690	@noindent
	5691	the rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead
	5692	(@samp{$default}), the parser will reduce it. If it was coming from
	5693	state 0, then, after this reduction it will return to state 0, and will
	5694	jump to state 2 (@samp{exp: go to state 2}).
	5695
	5696	@example
	5697	state 2
	5698
	5699	$accept -> exp . $ (rule 0)
	5700	exp -> exp . '+' exp (rule 1)
	5701	exp -> exp . '-' exp (rule 2)
	5702	exp -> exp . '*' exp (rule 3)
	5703	exp -> exp . '/' exp (rule 4)
	5704
	5705	$ shift, and go to state 3
	5706	'+' shift, and go to state 4
	5707	'-' shift, and go to state 5
	5708	'*' shift, and go to state 6
	5709	'/' shift, and go to state 7
	5710	@end example
	5711
	5712	@noindent
	5713	In state 2, the automaton can only shift a symbol. For instance,
	5714	because of the item @samp{exp -> exp . '+' exp}, if the lookahead is
	5715	@samp{+}, it will be shifted on the parse stack, and the automaton
	5716	control will jump to state 4, corresponding to the item @samp{exp -> exp
	5717	'+' . exp}. Since there is no default action, any other token than
	5718	those listed above will trigger a syntax error.
	5719
	5720	The state 3 is named the @dfn{final state}, or the @dfn{accepting
	5721	state}:
	5722
	5723	@example
	5724	state 3
	5725
	5726	$accept -> exp $ . (rule 0)
	5727
	5728	$default accept
	5729	@end example
	5730
	5731	@noindent
	5732	the initial rule is completed (the start symbol and the end
	5733	of input were read), the parsing exits successfully.
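
One way to picture the actions of states 0 to 3 is as a small lookup
function.  The following C sketch is only an aid for reading the report,
not the code that Bison generates: the token codes, the
@code{struct action} type and the function names are invented for this
example, and the gotos on nonterminals (such as @samp{exp go to state 2})
are left out.

@example
#include <stdio.h>

/* Token codes invented for this sketch; TOK_END stands for `$'.  */
enum token @{ TOK_END, TOK_NUM, TOK_PLUS, TOK_MINUS, TOK_STAR, TOK_SLASH @};

enum kind @{ ACT_ERROR, ACT_SHIFT, ACT_REDUCE, ACT_ACCEPT @};

struct action
@{
  enum kind kind;
  int arg;              /* target state (shift) or rule number (reduce) */
@};

static struct action
make_action (enum kind kind, int arg)
@{
  struct action a;
  a.kind = kind;
  a.arg = arg;
  return a;
@}

/* Actions of states 0 to 3, copied from the report above.  */
static struct action
lookup (int state, enum token tok)
@{
  switch (state)
    @{
    case 0:
      if (tok == TOK_NUM)
        return make_action (ACT_SHIFT, 1);
      break;
    case 1:
      return make_action (ACT_REDUCE, 5);     /* $default */
    case 2:
      switch (tok)
        @{
        case TOK_END:   return make_action (ACT_SHIFT, 3);
        case TOK_PLUS:  return make_action (ACT_SHIFT, 4);
        case TOK_MINUS: return make_action (ACT_SHIFT, 5);
        case TOK_STAR:  return make_action (ACT_SHIFT, 6);
        case TOK_SLASH: return make_action (ACT_SHIFT, 7);
        default:        break;
        @}
      break;
    case 3:
      return make_action (ACT_ACCEPT, 0);     /* $default */
    @}
  return make_action (ACT_ERROR, 0);          /* syntax error */
@}

int
main (void)
@{
  struct action a = lookup (2, TOK_PLUS);
  printf ("state 2, '+': kind %d, arg %d\n", (int) a.kind, a.arg);
  return 0;
@}
@end example

@noindent
Any pair that the report does not list, such as state 0 with a
lookahead other than @code{NUM}, falls through to the error action.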
	5734
	5735	The interpretation of states 4 to 7 is straightforward, and is left to
	5736	the reader.
	5737
	5738	@example
	5739	state 4
	5740
	5741	exp -> exp '+' . exp (rule 1)
	5742
	5743	NUM shift, and go to state 1
	5744
	5745	exp go to state 8
	5746
	5747	state 5
	5748
	5749	exp -> exp '-' . exp (rule 2)
	5750
	5751	NUM shift, and go to state 1
	5752
	5753	exp go to state 9
	5754
	5755	state 6
	5756
	5757	exp -> exp '*' . exp (rule 3)
	5758
	5759	NUM shift, and go to state 1
	5760
	5761	exp go to state 10
	5762
	5763	state 7
	5764
	5765	exp -> exp '/' . exp (rule 4)
	5766
	5767	NUM shift, and go to state 1
	5768
	5769	exp go to state 11
	5770	@end example
	5771
	5772	As was announced in the beginning of the report, @samp{State 8 contains 1
	5773	shift/reduce conflict}:
	5774
	5775	@example
	5776	state 8
	5777
	5778	exp -> exp . '+' exp (rule 1)
	5779	exp -> exp '+' exp . (rule 1)
	5780	exp -> exp . '-' exp (rule 2)
	5781	exp -> exp . '*' exp (rule 3)
	5782	exp -> exp . '/' exp (rule 4)
	5783
	5784	'*' shift, and go to state 6
	5785	'/' shift, and go to state 7
	5786
	5787	'/' [reduce using rule 1 (exp)]
	5788	$default reduce using rule 1 (exp)
	5789	@end example
	5790
	5791	Indeed, there are two actions associated with the lookahead @samp{/}:
	5792	either shifting (and going to state 7), or reducing rule 1. The
	5793	conflict means that either the grammar is ambiguous, or the parser lacks
	5794	information to make the right decision. Here the grammar is
	5795	ambiguous: since we did not specify the precedence of @samp{/}, the
	5796	sentence @samp{NUM + NUM / NUM} can be parsed as @samp{NUM + (NUM /
	5797	NUM)}, which corresponds to shifting @samp{/}, or as @samp{(NUM + NUM) /
	5798	NUM}, which corresponds to reducing rule 1.
	5799
	5800	Because in @acronym{LALR}(1) parsing only a single decision can be made, Bison
	5801	arbitrarily chose to disable the reduction; see @ref{Shift/Reduce, ,
	5802	Shift/Reduce Conflicts}. Discarded actions are reported in between
	5803	square brackets.
	5804
	5805	Note that all the previous states had a single possible action: either
	5806	shifting the next token and going to the corresponding state, or
	5807	reducing a single rule. In the other cases, i.e., when shifting
	5808	@emph{and} reducing is possible or when @emph{several} reductions are
	5809	possible, the lookahead is required to select the action. State 8 is
	5810	one such state: if the lookahead is @samp{*} or @samp{/} then the action
	5811	is shifting, otherwise the action is reducing rule 1. In other words,
	5812	the first two items, corresponding to rule 1, are not eligible when the
	5813	lookahead is @samp{*}, since we specified that @samp{*} has higher
	5814	precedence than @samp{+}. More generally, some items are eligible only
	5815	with some set of possible lookaheads. When run with
	5816	@option{--report=lookahead}, Bison specifies these lookaheads:
	5817
	5818	@example
	5819	state 8
	5820
	5821	exp -> exp . '+' exp [$, '+', '-', '/'] (rule 1)
	5822	exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1)
	5823	exp -> exp . '-' exp (rule 2)
	5824	exp -> exp . '*' exp (rule 3)
	5825	exp -> exp . '/' exp (rule 4)
	5826
	5827	'*' shift, and go to state 6
	5828	'/' shift, and go to state 7
	5829
	5830	'/' [reduce using rule 1 (exp)]
	5831	$default reduce using rule 1 (exp)
	5832	@end example
	5833
	5834	The remaining states are similar:
	5835
	5836	@example
	5837	state 9
	5838
	5839	exp -> exp . '+' exp (rule 1)
	5840	exp -> exp . '-' exp (rule 2)
	5841	exp -> exp '-' exp . (rule 2)
	5842	exp -> exp . '*' exp (rule 3)
	5843	exp -> exp . '/' exp (rule 4)
	5844
	5845	'*' shift, and go to state 6
	5846	'/' shift, and go to state 7
	5847
	5848	'/' [reduce using rule 2 (exp)]
	5849	$default reduce using rule 2 (exp)
	5850
	5851	state 10
	5852
	5853	exp -> exp . '+' exp (rule 1)
	5854	exp -> exp . '-' exp (rule 2)
	5855	exp -> exp . '*' exp (rule 3)
	5856	exp -> exp '*' exp . (rule 3)
	5857	exp -> exp . '/' exp (rule 4)
	5858
	5859	'/' shift, and go to state 7
	5860
	5861	'/' [reduce using rule 3 (exp)]
	5862	$default reduce using rule 3 (exp)
	5863
	5864	state 11
	5865
	5866	exp -> exp . '+' exp (rule 1)
	5867	exp -> exp . '-' exp (rule 2)
	5868	exp -> exp . '*' exp (rule 3)
	5869	exp -> exp . '/' exp (rule 4)
	5870	exp -> exp '/' exp . (rule 4)
	5871
	5872	'+' shift, and go to state 4
	5873	'-' shift, and go to state 5
	5874	'*' shift, and go to state 6
	5875	'/' shift, and go to state 7
	5876
	5877	'+' [reduce using rule 4 (exp)]
	5878	'-' [reduce using rule 4 (exp)]
	5879	'*' [reduce using rule 4 (exp)]
	5880	'/' [reduce using rule 4 (exp)]
	5881	$default reduce using rule 4 (exp)
	5882	@end example
	5883
	5884	@noindent
	5885	Observe that state 11 contains conflicts due to the lack of precedence
	5886	of @samp{/} with respect to @samp{+}, @samp{-}, and @samp{*}, but also because the
	5887	associativity of @samp{/} is not specified.
	5888
	5889
	5890	@node Tracing
	5891	@section Tracing Your Parser
	5892	@findex yydebug
	5893	@cindex debugging
	5894	@cindex tracing the parser
	5895
	5896	If a Bison grammar compiles properly but doesn't do what you want when it
	5897	runs, the @code{yydebug} parser-trace feature can help you figure out why.
	5898
	5899	There are several means to enable compilation of trace facilities:
	5900
	5901	@table @asis
	5902	@item the macro @code{YYDEBUG}
	5903	@findex YYDEBUG
	5904	Define the macro @code{YYDEBUG} to a nonzero value when you compile the
	5905	parser. This is compliant with @acronym{POSIX} Yacc. You could use
	5906	@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define
	5907	YYDEBUG 1} in the prologue of the grammar file (@pxref{Prologue, , The
	5908	Prologue}).
	5909
	5910	@item the option @option{-t}, @option{--debug}
	5911	Use the @samp{-t} option when you run Bison (@pxref{Invocation,
	5912	,Invoking Bison}). This is @acronym{POSIX} compliant too.
	5913
	5914	@item the directive @samp{%debug}
	5915	@findex %debug
	5916	Add the @code{%debug} directive (@pxref{Decl Summary, ,Bison
	5917	Declaration Summary}). This is a Bison extension, which will prove
	5918	useful when Bison outputs parsers for languages that don't use a
	5919	preprocessor. Unless @acronym{POSIX} and Yacc portability matter to
	5920	you, this is
	5921	the preferred solution.
	5922	@end table
	5923
	5924	We suggest that you always enable the debug option so that debugging is
	5925	always possible.
	5926
	5927	The trace facility outputs messages with macro calls of the form
	5928	@code{YYFPRINTF (stderr, @var{format}, @var{args})} where
	5929	@var{format} and @var{args} are the usual @code{printf} format and
	5930	arguments. If you define @code{YYDEBUG} to a nonzero value but do not
	5931	define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included
	5932	and @code{YYFPRINTF} is defined to @code{fprintf}.
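
As a minimal sketch of how the trace output could be customized, the
following C code defines @code{YYFPRINTF} before the parser code would
see it.  The helper @code{trace_printf} and the @samp{trace: } prefix are
hypothetical names chosen for this example, and the call in @code{main}
merely stands in for a call made by the generated parser.

@example
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: prefix every trace message with "trace: ".
   Return the number of characters written for the message itself.  */
static int
trace_printf (FILE *out, const char *format, ...)
@{
  va_list args;
  int n;

  fputs ("trace: ", out);
  va_start (args, format);
  n = vfprintf (out, format, args);
  va_end (args);
  return n;
@}

/* In a grammar file this definition would go in the prologue.  */
#define YYFPRINTF trace_printf

int
main (void)
@{
  /* Stand-in for a trace call made by the parser.  */
  YYFPRINTF (stderr, "Reading a token: %s\n", "NUM");
  return 0;
@}
@end example

@noindent
A variadic function is used here rather than a function-like macro so
that calls with a format string and no further arguments also work.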
	5933
	5934	Once you have compiled the program with trace facilities, the way to
	5935	request a trace is to store a nonzero value in the variable @code{yydebug}.
	5936	You can do this by making the C code do it (in @code{main}, perhaps), or
	5937	you can alter the value with a C debugger.
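
For instance, a program might switch tracing on from @code{main}, along
the lines of the sketch below.  The definitions of @code{yydebug} and
@code{yyparse} are stand-ins that let the example compile on its own (in
a real program both come from the parser file generated by Bison), and
@code{run_parser} as well as the convention that any command-line
argument enables tracing are assumptions made for this example.

@example
#include <stdio.h>

/* Stand-ins so that the sketch compiles on its own.  In a real program
   yydebug and yyparse come from the Bison-generated parser file.  */
int yydebug = 0;

static int
yyparse (void)
@{
  if (yydebug)
    fprintf (stderr, "Starting parse\n");
  return 0;
@}

/* Hypothetical helper: run the parser, tracing it when TRACE is nonzero.  */
static int
run_parser (int trace)
@{
  yydebug = trace;
  return yyparse ();
@}

int
main (int argc, char **argv)
@{
  /* Assumption of this sketch: any argument turns tracing on.  */
  (void) argv;
  return run_parser (argc > 1);
@}
@end example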
	5938
	5939	Each step taken by the parser when @code{yydebug} is nonzero produces a
	5940	line or two of trace information, written on @code{stderr}. The trace
	5941	messages tell you these things:
	5942
	5943	@itemize @bullet
	5944	@item
	5945	Each time the parser calls @code{yylex}, what kind of token was read.
	5946
	5947	@item
	5948	Each time a token is shifted, the depth and complete contents of the
	5949	state stack (@pxref{Parser States}).
	5950
	5951	@item
	5952	Each time a rule is reduced, which rule it is, and the complete contents
	5953	of the state stack afterward.
	5954	@end itemize
	5955
	5956	To make sense of this information, it helps to refer to the listing file
	5957	produced by the Bison @samp{-v} option (@pxref{Invocation, ,Invoking
	5958	Bison}). This file shows the meaning of each state in terms of
	5959	positions in various rules, and also what each state will do with each
	5960	possible input token. As you read the successive trace messages, you
	5961	can see that the parser is functioning according to its specification in
	5962	the listing file. Eventually you will arrive at the place where
	5963	something undesirable happens, and you will see which parts of the
	5964	grammar are to blame.
	5965
	5966	The parser file is a C program and you can use C debuggers on it, but it's
	5967	not easy to interpret what it is doing. The parser function is a
	5968	finite-state machine interpreter, and aside from the actions it executes
	5969	the same code over and over. Only the values of variables show where in
	5970	the grammar it is working.
	5971
	5972	@findex YYPRINT
	5973	The debugging information normally gives the token type of each token
	5974	read, but not its semantic value. You can optionally define a macro
	5975	named @code{YYPRINT} to provide a way to print the value. If you define
	5976	@code{YYPRINT}, it should take three arguments. The parser will pass a
	5977	standard I/O stream, the numeric code for the token type, and the token
	5978	value (from @code{yylval}).
	5979
	5980	Here is an example of @code{YYPRINT} suitable for the multi-function
	5981	calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}):
	5982
	5983	@smallexample
	5984	#define YYPRINT(file, type, value) print_token_value (file, type, value)
	5985
	5986	static void
	5987	print_token_value (FILE *file, int type, YYSTYPE value)
	5988	@{
	5989	  if (type == VAR)
	5990	    fprintf (file, "%s", value.tptr->name);
	5991	  else if (type == NUM)
	5992	    fprintf (file, "%g", value.val);  /* val is a double in mfcalc */
	5993	@}
	5994	@end smallexample
	5995
	5996	@c ================================================= Invoking Bison
	5997
	5998	@node Invocation
	5999	@chapter Invoking Bison
	6000	@cindex invoking Bison
	6001	@cindex Bison invocation
	6002	@cindex options for invoking Bison
	6003
	6004	The usual way to invoke Bison is as follows:
	6005
	6006	@example
	6007	bison @var{infile}
	6008	@end example
	6009
	6010	Here @var{infile} is the grammar file name, which usually ends in
	6011	@samp{.y}. The parser file's name is made by replacing the @samp{.y}
	6012	with @samp{.tab.c}. Thus, @samp{bison foo.y} yields
	6013	@file{foo.tab.c}, and @samp{bison hack/foo.y} yields
	6014	@file{hack/foo.tab.c}. It's also possible, in case you are writing
	6015	C++ code instead of C in your grammar file, to name it @file{foo.ypp}
	6016	or @file{foo.y++}. Then, the output files will take an extension like
	6017	the given one as input (respectively @file{foo.tab.cpp} and
	6018	@file{foo.tab.c++}).
	6019	This feature takes effect with all options that manipulate filenames like
	6020	@samp{-o} or @samp{-d}.
	6021
	6022	For example:
	6023
	6024	@example
	6025	bison -d @var{infile.yxx}
	6026	@end example
	6027	@noindent
	6028	will produce @file{infile.tab.cxx} and @file{infile.tab.hxx}, and
	6029
	6030	@example
	6031	bison -d -o @var{output.c++} @var{infile.y}
	6032	@end example
	6033	@noindent
	6034	will produce @file{output.c++} and @file{output.h++}.
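
The naming rule for the default parser file can be sketched in C as
follows.  This is only an illustration of the cases mentioned above
(@file{.y}, @file{.ypp}, @file{.y++} and @file{.yxx}), not Bison's
actual implementation; the function name @code{parser_file_name} is
hypothetical, and names without such an extension are simply rejected.

@example
#include <stdio.h>
#include <string.h>

/* Write into OUT (of size SIZE) the parser file name derived from the
   grammar file name INFILE.  Return 0 on success, -1 if INFILE has no
   extension starting with ".y" or if OUT is too small.  */
static int
parser_file_name (const char *infile, char *out, size_t size)
@{
  const char *dot = strrchr (infile, '.');
  int n;

  /* The last dot must belong to the file name, not to a directory.  */
  if (!dot || dot[1] != 'y' || strchr (dot, '/'))
    return -1;

  /* Keep what precedes the extension, add ".tab.c", then append
     whatever followed the `y' ("", "pp", "++", "xx").  */
  n = snprintf (out, size, "%.*s.tab.c%s",
                (int) (dot - infile), infile, dot + 2);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
@}

int
main (void)
@{
  const char *names[] = @{ "foo.y", "hack/foo.y", "foo.ypp", "foo.y++" @};
  char out[64];
  size_t i;

  for (i = 0; i < sizeof names / sizeof names[0]; i++)
    if (parser_file_name (names[i], out, sizeof out) == 0)
      printf ("%s -> %s\n", names[i], out);
  return 0;
@}
@end example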
	6035
	6036	@menu
	6037	* Bison Options:: All the options described in detail,
	6038	in alphabetical order by short options.
	6039	* Option Cross Key:: Alphabetical list of long options.
	6040	@end menu
	6041
	6042	@node Bison Options
	6043	@section Bison Options
	6044
	6045	Bison supports both traditional single-letter options and mnemonic long
	6046	option names. Long option names are indicated with @samp{--} instead of
	6047	@samp{-}. Abbreviations for option names are allowed as long as they
	6048	are unique. When a long option takes an argument, like
	6049	@samp{--file-prefix}, connect the option name and the argument with
	6050	@samp{=}.
	6051
	6052	Here is a list of options that can be used with Bison, alphabetized by
	6053	short option. It is followed by a cross key alphabetized by long
	6054	option.
	6055
	6056	@c Please, keep this ordered as in `bison --help'.
	6057	@noindent
	6058	Operation modes:
	6059	@table @option
	6060	@item -h
	6061	@itemx --help
	6062	Print a summary of the command-line options to Bison and exit.
	6063
	6064	@item -V
	6065	@itemx --version
	6066	Print the version number of Bison and exit.
	6067
	6068	@need 1750
	6069	@item -y
	6070	@itemx --yacc
	6071	Equivalent to @samp{-o y.tab.c}; the parser output file is called
	6072	@file{y.tab.c}, and the other outputs are called @file{y.output} and
	6073	@file{y.tab.h}. The purpose of this option is to imitate Yacc's output
	6074	file name conventions. Thus, the following shell script can substitute
	6075	for Yacc:
	6076
	6077	@example
	6078	bison -y "$@@"
	6079	@end example
	6080	@end table
	6081
	6082	@noindent
	6083	Tuning the parser:
	6084
	6085	@table @option
	6086	@item -S @var{file}
	6087	@itemx --skeleton=@var{file}
	6088	Specify the skeleton to use. You probably don't need this option unless
	6089	you are developing Bison.
	6090
	6091	@item -t
	6092	@itemx --debug
	6093	In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
	6094	already defined, so that the debugging facilities are compiled.
	6095	@xref{Tracing, ,Tracing Your Parser}.
	6096
	6097	@item --locations
	6098	Pretend that @code{%locations} was specified. @xref{Decl Summary}.
	6099
	6100	@item -p @var{prefix}
	6101	@itemx --name-prefix=@var{prefix}
	6102	Pretend that @code{%name-prefix="@var{prefix}"} was specified.
	6103	@xref{Decl Summary}.
	6104
	6105	@item -l
	6106	@itemx --no-lines
	6107	Don't put any @code{#line} preprocessor commands in the parser file.
	6108	Ordinarily Bison puts them in the parser file so that the C compiler
	6109	and debuggers will associate errors with your source file, the
	6110	grammar file. This option causes them to associate errors with the
	6111	parser file, treating it as an independent source file in its own right.
	6112
	6113	@item -n
	6114	@itemx --no-parser
	6115	Pretend that @code{%no-parser} was specified. @xref{Decl Summary}.
	6116
	6117	@item -k
	6118	@itemx --token-table
	6119	Pretend that @code{%token-table} was specified. @xref{Decl Summary}.
	6120	@end table
	6121
	6122	@noindent
	6123	Adjust the output:
	6124
	6125	@table @option
	6126	@item -d
	6127	@itemx --defines
	6128	Pretend that @code{%defines} was specified, i.e., write an extra output
	6129	file containing macro definitions for the token type names defined in
	6130	the grammar and the semantic value type @code{YYSTYPE}, as well as a few
	6131	@code{extern} variable declarations. @xref{Decl Summary}.
	6132
	6133	@item --defines=@var{defines-file}
	6134	Same as above, but save in the file @var{defines-file}.
	6135
	6136	@item -b @var{file-prefix}
	6137	@itemx --file-prefix=@var{prefix}
	6138	Pretend that @code{%file-prefix} was specified, i.e., specify the prefix to use
	6139	for all Bison output file names. @xref{Decl Summary}.
	6140
	6141	@item -r @var{things}
	6142	@itemx --report=@var{things}
	6143	Write an extra output file containing a verbose description of the
	6144	comma-separated list of @var{things} among:
	6145
	6146	@table @code
	6147	@item state
	6148	Description of the grammar, conflicts (resolved and unresolved), and
	6149	@acronym{LALR} automaton.
	6150
	6151	@item lookahead
	6152	Implies @code{state} and augments the description of the automaton with
	6153	each rule's lookahead set.
	6154
	6155	@item itemset
	6156	Implies @code{state} and augments the description of the automaton with
	6157	the full set of items for each state, instead of its core only.
	6158	@end table
	6159
	6160	For instance, on the following grammar
	6161
	6162	@item -v
	6163	@itemx --verbose
	6164	Pretend that @code{%verbose} was specified, i.e., write an extra output
	6165	file containing verbose descriptions of the grammar and
	6166	parser. @xref{Decl Summary}.
	6167
	6168	@item -o @var{filename}
	6169	@itemx --output=@var{filename}
	6170	Specify the @var{filename} for the parser file.
	6171
	6172	The other output files' names are constructed from @var{filename} as
	6173	described under the @samp{-v} and @samp{-d} options.
	6174
	6175	@item -g
	6176	Output a @acronym{VCG} definition of the @acronym{LALR}(1) grammar
	6177	automaton computed by Bison. If the grammar file is @file{foo.y}, the
	6178	@acronym{VCG} output file will
	6179	be @file{foo.vcg}.
	6180
	6181	@item --graph=@var{graph-file}
	6182	The behavior of @option{--graph} is the same as that of @samp{-g}. The only
	6183	difference is that it has an optional argument which is the name of
	6184	the output graph file.
	6185	@end table
	6186
	6187	@node Option Cross Key
	6188	@section Option Cross Key
	6189
	6190	Here is a list of options, alphabetized by long option, to help you find
	6191	the corresponding short option.
	6192
	6193	@tex
	6194	\def\leaderfill{\leaders\hbox to 1em{\hss.\hss}\hfill}
	6195
	6196	{\tt
	6197	\line{ --debug \leaderfill -t}
	6198	\line{ --defines \leaderfill -d}
	6199	\line{ --file-prefix \leaderfill -b}
	6200	\line{ --graph \leaderfill -g}
	6201	\line{ --help \leaderfill -h}
	6202	\line{ --name-prefix \leaderfill -p}
	6203	\line{ --no-lines \leaderfill -l}
	6204	\line{ --no-parser \leaderfill -n}
	6205	\line{ --output \leaderfill -o}
	6206	\line{ --token-table \leaderfill -k}
	6207	\line{ --verbose \leaderfill -v}
	6208	\line{ --version \leaderfill -V}
	6209	\line{ --yacc \leaderfill -y}
	6210	}
	6211	@end tex
	6212
	6213	@ifinfo
	6214	@example
	6215	--debug -t
	6216	--defines=@var{defines-file} -d
	6217	--file-prefix=@var{prefix} -b @var{file-prefix}
	6218	--graph=@var{graph-file} -g
	6219	--help -h
	6220	--name-prefix=@var{prefix} -p @var{name-prefix}
	6221	--no-lines -l
	6222	--no-parser -n
	6223	--output=@var{outfile} -o @var{outfile}
	6224	--token-table -k
	6225	--verbose -v
	6226	--version -V
	6227	--yacc -y
	6228	@end example
	6229	@end ifinfo
	6230
	6231	@c ================================================= FAQ
	6232
	6233	@node FAQ
	6234	@chapter Frequently Asked Questions
	6235	@cindex frequently asked questions
	6236	@cindex questions
	6237
	6238	Several questions about Bison come up occasionally. Here some of them
	6239	are addressed.
	6240
	6241	@menu
	6242	* Parser Stack Overflow:: Breaking the Stack Limits
	6243	@end menu
	6244
	6245	@node Parser Stack Overflow
	6246	@section Parser Stack Overflow
	6247
	6248	@display
	6249	My parser returns with error with a @samp{parser stack overflow}
	6250	message. What can I do?
	6251	@end display
	6252
	6253	This question is already addressed elsewhere; see @ref{Recursion,
	6254	,Recursive Rules}.
	6255
	6256	@c ================================================= Table of Symbols
	6257
	6258	@node Table of Symbols
	6259	@appendix Bison Symbols
	6260	@cindex Bison symbols, table of
	6261	@cindex symbols in Bison, table of
	6262
	6263	@deffn {Variable} @@$
	6264	In an action, the location of the left-hand side of the rule.
	6265	@xref{Locations, , Locations Overview}.
	6266	@end deffn
	6267
	6268	@deffn {Variable} @@@var{n}
	6269	In an action, the location of the @var{n}-th symbol of the right-hand
	6270	side of the rule. @xref{Locations, , Locations Overview}.
	6271	@end deffn
	6272
	6273	@deffn {Variable} $$
	6274	In an action, the semantic value of the left-hand side of the rule.
	6275	@xref{Actions}.
	6276	@end deffn
	6277
	6278	@deffn {Variable} $@var{n}
	6279	In an action, the semantic value of the @var{n}-th symbol of the
	6280	right-hand side of the rule. @xref{Actions}.
	6281	@end deffn
	6282
	6283	@deffn {Symbol} $accept
	6284	The predefined nonterminal whose only rule is @samp{$accept: @var{start}
	6285	$end}, where @var{start} is the start symbol. @xref{Start Decl, , The
	6286	Start-Symbol}. It cannot be used in the grammar.
	6287	@end deffn
	6288
	6289	@deffn {Symbol} $end
	6290	The predefined token marking the end of the token stream. It cannot be
	6291	used in the grammar.
	6292	@end deffn
	6293
	6294	@deffn {Symbol} $undefined
	6295	The predefined token onto which all undefined values returned by
	6296	@code{yylex} are mapped. It cannot be used in the grammar; rather, use
	6297	@code{error}.
	6298	@end deffn
	6299
	6300	@deffn {Symbol} error
	6301	A token name reserved for error recovery. This token may be used in
	6302	grammar rules so as to allow the Bison parser to recognize an error in
	6303	the grammar without halting the process. In effect, a sentence
	6304	containing an error may be recognized as valid. On a syntax error, the
	6305	token @code{error} becomes the current look-ahead token. Actions
	6306	corresponding to @code{error} are then executed, and the look-ahead
	6307	token is reset to the token that originally caused the violation.
	6308	@xref{Error Recovery}.
	6309	@end deffn
	6310
	6311	@deffn {Macro} YYABORT
	6312	Macro to pretend that an unrecoverable syntax error has occurred, by
	6313	making @code{yyparse} return 1 immediately. The error reporting
	6314	function @code{yyerror} is not called. @xref{Parser Function, ,The
	6315	Parser Function @code{yyparse}}.
	6316	@end deffn
	6317
	6318	@deffn {Macro} YYACCEPT
	6319	Macro to pretend that a complete utterance of the language has been
	6320	read, by making @code{yyparse} return 0 immediately.
	6321	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6322	@end deffn
	6323
	6324	@deffn {Macro} YYBACKUP
	6325	Macro to discard a value from the parser stack and fake a look-ahead
	6326	token. @xref{Action Features, ,Special Features for Use in Actions}.
	6327	@end deffn
	6328
	6329	@deffn {Macro} YYDEBUG
	6330	Macro to define to equip the parser with tracing code. @xref{Tracing,
	6331	,Tracing Your Parser}.
	6332	@end deffn
	6333
	6334	@deffn {Macro} YYERROR
	6335	Macro to pretend that a syntax error has just been detected: call
	6336	@code{yyerror} and then perform normal error recovery if possible
	6337	(@pxref{Error Recovery}), or (if recovery is impossible) make
	6338	@code{yyparse} return 1. @xref{Error Recovery}.
	6339	@end deffn
	6340
	6341	@deffn {Macro} YYERROR_VERBOSE
	6342	An obsolete macro that you define with @code{#define} in the Bison
	6343	declarations section to request verbose, specific error message strings
	6344	when @code{yyerror} is called. It doesn't matter what definition you
	6345	use for @code{YYERROR_VERBOSE}, just whether you define it. Using
	6346	@code{%error-verbose} is preferred.
	6347	@end deffn
	6348
	6349	@deffn {Macro} YYINITDEPTH
	6350	Macro for specifying the initial size of the parser stack.
	6351	@xref{Stack Overflow}.
	6352	@end deffn
	6353
	6354	@deffn {Macro} YYLEX_PARAM
	6355	An obsolete macro for specifying an extra argument (or list of extra
	6356	arguments) for @code{yyparse} to pass to @code{yylex}. The use of this
	6357	macro is deprecated, and is supported only for Yacc-like parsers.
	6358	@xref{Pure Calling,, Calling Conventions for Pure Parsers}.
	6359	@end deffn
	6360
	6361	@deffn {Macro} YYLTYPE
	6362	Macro for the data type of @code{yylloc}; a structure with four
	6363	members. @xref{Location Type, , Data Types of Locations}.
	6364	@end deffn
	6365
	6366	@deffn {Type} yyltype
	6367	Default value for @code{YYLTYPE}.
	6368	@end deffn
	6369
	6370	@deffn {Macro} YYMAXDEPTH
	6371	Macro for specifying the maximum size of the parser stack. @xref{Stack
	6372	Overflow}.
	6373	@end deffn
	6374
	6375	@deffn {Macro} YYPARSE_PARAM
	6376	An obsolete macro for specifying the name of a parameter that
	6377	@code{yyparse} should accept. The use of this macro is deprecated, and
	6378	is supported only for Yacc-like parsers. @xref{Pure Calling,, Calling
	6379	Conventions for Pure Parsers}.
	6380	@end deffn
	6381
	6382	@deffn {Macro} YYRECOVERING
	6383	Macro whose value indicates whether the parser is recovering from a
	6384	syntax error. @xref{Action Features, ,Special Features for Use in Actions}.
	6385	@end deffn
	6386
	6387	@deffn {Macro} YYSTACK_USE_ALLOCA
	6388	Macro used to control the use of @code{alloca}. If defined to @samp{0},
	6389	the parser will not use @code{alloca} but @code{malloc} when trying to
	6390	grow its internal stacks. Do @emph{not} define @code{YYSTACK_USE_ALLOCA}
	6391	to anything else.
	6392	@end deffn
	6393
	6394	@deffn {Macro} YYSTYPE
	6395	Macro for the data type of semantic values; @code{int} by default.
	6396	@xref{Value Type, ,Data Types of Semantic Values}.
	6397	@end deffn
	6398
	6399	@deffn {Variable} yychar
	6400	External integer variable that contains the integer value of the current
	6401	look-ahead token. (In a pure parser, it is a local variable within
	6402	@code{yyparse}.) Error-recovery rule actions may examine this variable.
	6403	@xref{Action Features, ,Special Features for Use in Actions}.
	6404	@end deffn
	6405
	6406	@deffn {Macro} yyclearin
	6407	Macro used in error-recovery rule actions. It clears the previous
	6408	look-ahead token. @xref{Error Recovery}.
	6409	@end deffn
	6410
	6411	@deffn {Variable} yydebug
	6412	External integer variable set to zero by default. If @code{yydebug}
	6413	is given a nonzero value, the parser will output information on input
	6414	symbols and parser action. @xref{Tracing, ,Tracing Your Parser}.
	6415	@end deffn
	6416
	6417	@deffn {Macro} yyerrok
	6418	Macro to cause parser to recover immediately to its normal mode
	6419	after a syntax error. @xref{Error Recovery}.
	6420	@end deffn
	6421
	6422	@deffn {Function} yyerror
	6423	User-supplied function to be called by @code{yyparse} on error. The
	6424	function receives one argument, a pointer to a character string
	6425	containing an error message. @xref{Error Reporting, ,The Error
	6426	Reporting Function @code{yyerror}}.
	6427	@end deffn
	6428
	6429	@deffn {Function} yylex
	6430	User-supplied lexical analyzer function, called with no arguments to get
	6431	the next token. @xref{Lexical, ,The Lexical Analyzer Function
	6432	@code{yylex}}.
	6433	@end deffn
	6434
	6435	@deffn {Variable} yylval
	6436	External variable in which @code{yylex} should place the semantic
	6437	value associated with a token. (In a pure parser, it is a local
	6438	variable within @code{yyparse}, and its address is passed to
	6439	@code{yylex}.) @xref{Token Values, ,Semantic Values of Tokens}.
	6440	@end deffn
	6441
	6442	@deffn {Variable} yylloc
	6443	External variable in which @code{yylex} should place the line and column
	6444	numbers associated with a token. (In a pure parser, it is a local
	6445	variable within @code{yyparse}, and its address is passed to
	6446	@code{yylex}.) You can ignore this variable if you don't use the
	6447	@samp{@@} feature in the grammar actions. @xref{Token Positions,
	6448	,Textual Positions of Tokens}.
	6449	@end deffn
	6450
	6451	@deffn {Variable} yynerrs
	6452	Global variable which Bison increments each time there is a syntax error.
	6453	(In a pure parser, it is a local variable within @code{yyparse}.)
	6454	@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
	6455	@end deffn
	6456
	6457	@deffn {Function} yyparse
	6458	The parser function produced by Bison; call this function to start
	6459	parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6460	@end deffn
	6461
	6462	@deffn {Directive} %debug
	6463	Equip the parser for debugging. @xref{Decl Summary}.
	6464	@end deffn
	6465
	6466	@deffn {Directive} %defines
	6467	Bison declaration to create a header file meant for the scanner.
	6468	@xref{Decl Summary}.
	6469	@end deffn
	6470
	6471	@deffn {Directive} %destructor
	6472	Bison declaration to specify how the parser should reclaim the memory
	6473	associated with discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	6474	@end deffn
	6475
	6476	@deffn {Directive} %dprec
	6477	Bison declaration to assign a precedence to a rule that is used at parse
	6478	time to resolve reduce/reduce conflicts. @xref{GLR Parsers, ,Writing
	6479	@acronym{GLR} Parsers}.
	6480	@end deffn
	6481
	6482	@deffn {Directive} %error-verbose
	6483	Bison declaration to request verbose, specific error message strings
	6484	when @code{yyerror} is called.
	6485	@end deffn
	6486
	6487	@deffn {Directive} %file-prefix="@var{prefix}"
	6488	Bison declaration to set the prefix of the output files. @xref{Decl
	6489	Summary}.
	6490	@end deffn
	6491
	6492	@deffn {Directive} %glr-parser
	6493	Bison declaration to produce a @acronym{GLR} parser. @xref{GLR
	6494	Parsers, ,Writing @acronym{GLR} Parsers}.
	6495	@end deffn
	6496
	6497	@deffn {Directive} %left
	6498	Bison declaration to assign left associativity to token(s).
	6499	@xref{Precedence Decl, ,Operator Precedence}.
	6500	@end deffn
	6501
	6502	@deffn {Directive} %lex-param @{@var{argument-declaration}@}
	6503	Bison declaration to specify an additional parameter that
	6504	@code{yylex} should accept. @xref{Pure Calling,, Calling Conventions
	6505	for Pure Parsers}.
	6506	@end deffn
	6507
	6508	@deffn {Directive} %merge
	6509	Bison declaration to assign a merging function to a rule. If there is a
	6510	reduce/reduce conflict with a rule having the same merging function, the
	6511	function is applied to the two semantic values to get a single result.
	6512	@xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}.
	6513	@end deffn
	6514
	6515	@deffn {Directive} %name-prefix="@var{prefix}"
	6516	Bison declaration to rename the external symbols. @xref{Decl Summary}.
	6517	@end deffn
	6518
	6519	@deffn {Directive} %no-lines
	6520	Bison declaration to avoid generating @code{#line} directives in the
	6521	parser file. @xref{Decl Summary}.
	6522	@end deffn
	6523
	6524	@deffn {Directive} %nonassoc
	6525	Bison declaration to assign non-associativity to token(s).
	6526	@xref{Precedence Decl, ,Operator Precedence}.
	6527	@end deffn
	6528
	6529	@deffn {Directive} %output="@var{filename}"
	6530	Bison declaration to set the name of the parser file. @xref{Decl
	6531	Summary}.
	6532	@end deffn
	6533
	6534	@deffn {Directive} %parse-param @{@var{argument-declaration}@}
	6535	Bison declaration to specify an additional parameter that
	6536	@code{yyparse} should accept. @xref{Parser Function,, The Parser
	6537	Function @code{yyparse}}.
	6538	@end deffn
	6539
	6540	@deffn {Directive} %prec
	6541	Bison declaration to assign a precedence to a specific rule.
	6542	@xref{Contextual Precedence, ,Context-Dependent Precedence}.
	6543	@end deffn
	6544
	6545	@deffn {Directive} %pure-parser
	6546	Bison declaration to request a pure (reentrant) parser.
	6547	@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	6548	@end deffn
	6549
	6550	@deffn {Directive} %right
	6551	Bison declaration to assign right associativity to token(s).
	6552	@xref{Precedence Decl, ,Operator Precedence}.
	6553	@end deffn
	6554
	6555	@deffn {Directive} %start
	6556	Bison declaration to specify the start symbol. @xref{Start Decl, ,The
	6557	Start-Symbol}.
	6558	@end deffn
	6559
	6560	@deffn {Directive} %token
	6561	Bison declaration to declare token(s) without specifying precedence.
	6562	@xref{Token Decl, ,Token Type Names}.
	6563	@end deffn
	6564
	6565	@deffn {Directive} %token-table
	6566	Bison declaration to include a token name table in the parser file.
	6567	@xref{Decl Summary}.
	6568	@end deffn
	6569
	6570	@deffn {Directive} %type
	6571	Bison declaration to declare nonterminals. @xref{Type Decl,
	6572	,Nonterminal Symbols}.
	6573	@end deffn
	6574
	6575	@deffn {Directive} %union
	6576	Bison declaration to specify several possible data types for semantic
	6577	values. @xref{Union Decl, ,The Collection of Value Types}.
	6578	@end deffn
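
The declarations summarized above usually appear together in the Bison
declarations section of a grammar file. As a minimal sketch only (the
names @code{NUM}, @code{exp} and @code{input} are hypothetical, chosen
for illustration and not taken from this table), such a section might
look like this:

@example
/* NUM, exp and input are hypothetical names used for illustration. */
%union @{
  double val;      /* one possible data type for semantic values */
@}
%token <val> NUM   /* declare a token without specifying precedence */
%type  <val> exp   /* declare the value type of a nonterminal */
%left '-' '+'      /* left-associative, lowest precedence */
%left '*' '/'      /* left-associative, higher precedence */
%right '^'         /* right-associative, highest precedence */
%start input       /* the start symbol */
@end example

Precedence declarations that appear later in the file bind more
tightly, so the order of the @code{%left} and @code{%right} lines
matters.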
	6579
	6580	@sp 1
	6581
	6582	These are the punctuation and delimiters used in Bison input:
	6583
	6584	@deffn {Delimiter} %%
	6585	Delimiter used to separate the grammar rule section from the
	6586	Bison declarations section or the epilogue.
	6587	@xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}.
	6588	@end deffn
	6589
	6590	@c Don't insert spaces, or check the DVI output.
	6591	@deffn {Delimiter} %@{@var{code}%@}
	6592	All code listed between @samp{%@{} and @samp{%@}} is copied directly to
	6593	the output file uninterpreted. Such code forms the prologue of the input
	6594	file. @xref{Grammar Outline, ,Outline of a Bison
	6595	Grammar}.
	6596	@end deffn
	6597
	6598	@deffn {Construct} /*@dots{}*/
	6599	Comment delimiters, as in C.
	6600	@end deffn
	6601
	6602	@deffn {Delimiter} :
	6603	Separates a rule's result from its components. @xref{Rules, ,Syntax of
	6604	Grammar Rules}.
	6605	@end deffn
	6606
	6607	@deffn {Delimiter} ;
	6608	Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}.
	6609	@end deffn
	6610
	6611	@deffn {Delimiter} |
	6612	Separates alternate rules for the same result nonterminal.
	6613	@xref{Rules, ,Syntax of Grammar Rules}.
	6614	@end deffn
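
As a rough sketch of how these delimiters fit together, an input file
has approximately the following shape. The token @code{NUM} and the
nonterminal @code{exp} are hypothetical names, and the prologue
declarations are assumed only for illustration; the sketch defines
neither @code{yylex} nor @code{yyerror}.

@example
%@{
/* Prologue: copied to the output file uninterpreted. */
#include <stdio.h>
int yylex (void);
void yyerror (char const *);
%@}

/* Bison declarations section. NUM is a hypothetical token. */
%token NUM

%%
/* Grammar rules section. */
exp : NUM             /* ':' separates the result from its components */
    | exp '+' NUM     /* '|' separates alternate rules for exp */
    ;                 /* ';' terminates the rule */
%%
/* Epilogue: additional C code, if any. */
@end example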
	6615
	6616	@node Glossary
	6617	@appendix Glossary
	6618	@cindex glossary
	6619
	6620	@table @asis
	6621	@item Backus-Naur Form (@acronym{BNF}; also called ``Backus Normal Form'')
	6622	Formal method of specifying context-free grammars originally proposed
	6623	by John Backus, and slightly improved by Peter Naur in his 1960-01-02
	6624	committee document contributing to what became the Algol 60 report.
	6625	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	6626
	6627	@item Context-free grammars
	6628	Grammars specified as rules that can be applied regardless of context.
	6629	Thus, if there is a rule which says that an integer can be used as an
	6630	expression, integers are allowed @emph{anywhere} an expression is
	6631	permitted. @xref{Language and Grammar, ,Languages and Context-Free
	6632	Grammars}.
	6633
	6634	@item Dynamic allocation
	6635	Allocation of memory that occurs during execution, rather than at
	6636	compile time or on entry to a function.
	6637
	6638	@item Empty string
	6639	Analogous to the empty set in set theory, the empty string is a
	6640	character string of length zero.
	6641
	6642	@item Finite-state stack machine
	6643	A ``machine'' that has discrete states in which it is said to exist at
	6644	each instant in time. As input to the machine is processed, the
	6645	machine moves from state to state as specified by the logic of the
	6646	machine. In the case of the parser, the input is the language being
	6647	parsed, and the states correspond to various stages in the grammar
	6648	rules. @xref{Algorithm, ,The Bison Parser Algorithm}.
	6649
	6650	@item Generalized @acronym{LR} (@acronym{GLR})
	6651	A parsing algorithm that can handle all context-free grammars, including those
	6652	that are not @acronym{LALR}(1). It resolves situations that Bison's
	6653	usual @acronym{LALR}(1)
	6654	algorithm cannot by effectively splitting off multiple parsers, trying all
	6655	possible parsers, and discarding those that fail in the light of additional
	6656	right context. @xref{Generalized LR Parsing, ,Generalized
	6657	@acronym{LR} Parsing}.
	6658
	6659	@item Grouping
	6660	A language construct that is (in general) grammatically divisible;
	6661	for example, `expression' or `declaration' in C@.
	6662	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	6663
	6664	@item Infix operator
	6665	An arithmetic operator that is placed between the operands on which it
	6666	performs some operation.
	6667
	6668	@item Input stream
	6669	A continuous flow of data between devices or programs.
	6670
	6671	@item Language construct
	6672	One of the typical usage schemas of the language. For example, one of
	6673	the constructs of the C language is the @code{if} statement.
	6674	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	6675
	6676	@item Left associativity
	6677	Operators having left associativity are analyzed from left to right:
	6678	@samp{a+b+c} first computes @samp{a+b} and then combines with
	6679	@samp{c}. @xref{Precedence, ,Operator Precedence}.
	6680
	6681	@item Left recursion
	6682	A rule whose result symbol is also its first component symbol; for
	6683	example, @samp{expseq1 : expseq1 ',' exp;}. @xref{Recursion, ,Recursive
	6684	Rules}.
	6685
	6686	@item Left-to-right parsing
	6687	Parsing a sentence of a language by analyzing it token by token from
	6688	left to right. @xref{Algorithm, ,The Bison Parser Algorithm}.
	6689
	6690	@item Lexical analyzer (scanner)
	6691	A function that reads an input stream and returns tokens one by one.
	6692	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	6693
	6694	@item Lexical tie-in
	6695	A flag, set by actions in the grammar rules, which alters the way
	6696	tokens are parsed. @xref{Lexical Tie-ins}.
	6697
	6698	@item Literal string token
	6699	A token which consists of two or more fixed characters. @xref{Symbols}.
	6700
	6701	@item Look-ahead token
	6702	A token already read but not yet shifted. @xref{Look-Ahead, ,Look-Ahead
	6703	Tokens}.
	6704
	6705	@item @acronym{LALR}(1)
	6706	The class of context-free grammars that Bison (like most other parser
	6707	generators) can handle; a subset of @acronym{LR}(1). @xref{Mystery
	6708	Conflicts, ,Mysterious Reduce/Reduce Conflicts}.
	6709
	6710	@item @acronym{LR}(1)
	6711	The class of context-free grammars in which at most one token of
	6712	look-ahead is needed to disambiguate the parsing of any piece of input.
	6713
	6714	@item Nonterminal symbol
	6715	A grammar symbol standing for a grammatical construct that can
	6716	be expressed through rules in terms of smaller constructs; in other
	6717	words, a construct that is not a token. @xref{Symbols}.
	6718
	6719	@item Parser
	6720	A function that recognizes valid sentences of a language by analyzing
	6721	the syntax structure of a set of tokens passed to it from a lexical
	6722	analyzer.
	6723
	6724	@item Postfix operator
	6725	An arithmetic operator that is placed after the operands upon which it
	6726	performs some operation.
	6727
	6728	@item Reduction
	6729	Replacing a string of nonterminals and/or terminals with a single
	6730	nonterminal, according to a grammar rule. @xref{Algorithm, ,The Bison
	6731	Parser Algorithm}.
	6732
	6733	@item Reentrant
	6734	A reentrant subprogram is a subprogram which can be invoked any
	6735	number of times in parallel, without interference between the various
	6736	invocations. @xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	6737
	6738	@item Reverse Polish notation
	6739	A language in which all operators are postfix operators.
	6740
	6741	@item Right recursion
	6742	A rule whose result symbol is also its last component symbol; for
	6743	example, @samp{expseq1: exp ',' expseq1;}. @xref{Recursion, ,Recursive
	6744	Rules}.
	6745
	6746	@item Semantics
	6747	In computer languages, the semantics are specified by the actions
	6748	taken for each instance of the language, i.e., the meaning of
	6749	each statement. @xref{Semantics, ,Defining Language Semantics}.
	6750
	6751	@item Shift
	6752	A parser is said to shift when it chooses to analyze further
	6753	input from the stream rather than immediately reducing some
	6754	already-recognized rule. @xref{Algorithm, ,The Bison Parser Algorithm}.
	6755
	6756	@item Single-character literal
	6757	A single character that is recognized and interpreted as is.
	6758	@xref{Grammar in Bison, ,From Formal Rules to Bison Input}.
	6759
	6760	@item Start symbol
	6761	The nonterminal symbol that stands for a complete valid utterance in
	6762	the language being parsed. The start symbol is usually listed as the
	6763	first nonterminal symbol in a language specification.
	6764	@xref{Start Decl, ,The Start-Symbol}.
	6765
	6766	@item Symbol table
	6767	A data structure where symbol names and associated data are stored
	6768	during parsing to allow for recognition and use of existing
	6769	information in repeated uses of a symbol. @xref{Multi-function Calc}.
	6770
	6771	@item Syntax error
	6772	An error encountered during parsing of an input stream due to invalid
	6773	syntax. @xref{Error Recovery}.
	6774
	6775	@item Token
	6776	A basic, grammatically indivisible unit of a language. The symbol
	6777	that describes a token in the grammar is a terminal symbol.
	6778	The input of the Bison parser is a stream of tokens which comes from
	6779	the lexical analyzer. @xref{Symbols}.
	6780
	6781	@item Terminal symbol
	6782	A grammar symbol that has no rules in the grammar and therefore is
	6783	grammatically indivisible. The piece of text it represents is a token.
	6784	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	6785	@end table
	6786
	6787	@node Copying This Manual
	6788	@appendix Copying This Manual
	6789
	6790	@menu
	6791	* GNU Free Documentation License:: License for copying this manual.
	6792	@end menu
	6793
	6794	@include fdl.texi
	6795
	6796	@node Index
	6797	@unnumbered Index
	6798
	6799	@printindex cp
	6800
	6801	@bye